Semarchy xDM semql reference guide
This article helps you find some useful content using the knowledge base full search. It is a copy of the official's guide.
Please go to the official Semarchy xDM documentation for more self-investigation on your problem.
Welcome to Semarchy xDM.
This guide contains information about the SemQL language which is used across Semarchy xDM.
PREFACE
Audience
If you want to learn about MDM or discover Semarchy xDM, you can watch our tutorials. |
The Semarchy xDM Documentation Library, including the development, administration and installation guides is available online. |
Document Conventions
This document uses the following formatting conventions:
Convention | Meaning |
---|---|
boldface | Boldface type indicates graphical user interface elements associated with an action, or a product specific term or concept. |
italic | Italic type indicates special emphasis or placeholder variable that you need to provide. |
| Monospace type indicates code example, text or commands that you enter. |
Other Semarchy Resources
In addition to the product manuals, Semarchy provides other resources available on its web site: http://www.semarchy.com.
Obtaining Help
There are many ways to access the Semarchy Technical Support. You can call or email our global Technical Support Center (support@semarchy.com). For more information, see http://www.semarchy.com.
Feedback
We welcome your comments and suggestions on the quality and usefulness of this documentation.
If you find any error or have any suggestion for improvement, please mail support@semarchy.com and indicate the title of the documentation along with the chapter, section, and page number, if available. Please let us know if you want a reply.
Overview
Using this guide, you will learn the SemQL language and its usage in Semarchy xDM.
INTRODUCTION TO SEMQL
SemQL is a language to express declarative rules in Semarchy xDM. It is used for example to define:
Enrichers, Matchers, Validations and Survivorship Rules involved in the Certification Process.
Composite and transformed attributes in form and collections appearing in MDM Application.
Filters when browsing data in the MDM Hub.
It has the following main characteristics:
The syntax is close to the SQL language and most SemQL functions map to Database functions.
SemQL is converted on the fly and executed by the hub database.
SemQL uses Qualified Attribute Names instead of columns names. The code remains implementation-independent.
The following sections describe the main characteristics of the SemQL language.
SQL-Like Clauses
The SemQL Language allows users to define SQL-Like clauses. Depending on the context, these clauses may be one of the following:
Condition: A clause that returns a boolean result from the evaluation of expressions using operators. A condition can be used for example for filtering or validating data records (if the clause is false, then the record is filtered out or considered invalid).
Expression: A clause that returns a value. In the context of a SemQL Enricher for example, an expression transforms, standardizes and enriches source attributes.
Order By Clause: An expression used to sort records. In consolidation rules, such a clause is used to manage the consolidation conflicts. For example, consider a consolidation made by Most Frequent Value. When multiple values occur with equal frequency, then the SemQL in the Ranking Expression determines which value is used.
Functions differ from comparison operators as they return a non-boolean value. They cannot be used as is in conditions unless used with a comparison operator. For example, TO_CHAR(IsValidCustomer) is a valid expression, but not a valid condition. TO_CHAR(IsValidCustomer)='1' is a valid condition. |
SemQL is used to create expressions, conditions and order by clauses. SELECT, UPDATE or INSERT queries are not supported, as well as joins, sub-queries, aggregates, in-line views and set operators. |
Users proficient with SQL should not be mistaken by the appearances. Even if SemQL looks like SQL, the SemQL expressions are entirely parsed and rewritten by the Semarchy xDM platform into SQL before their execution by the hub database. A simple SemQL expression may result into a complex SQL statement. As a consequence, it is not recommended to try injecting SQL statements within SemQL expressions. |
Qualified Attribute Names
SemQL clauses manipulate attributes and variables defined in the Semarchy xDM model. Attributes are accessed through an unambiguous Qualified Attribute Name. The Qualified Attribute Name is the path to an attribute from the entity being processed.
Built-in and Customized Functions
In expressions, conditions and order by clauses, it is possible to use the built-in SemQL functions. Most of these functions are functions built in the hub’s database and processed by it. Other functions (for example, matching functions) are specific to Semarchy xDM.
You can also use in SemQL customized functions implemented in the database. You must declare these functions in the model to have them appear in the list of functions. See the Declaring Database Functions and Procedures section in the Semarchy xDM Developer’s Guide for more information about declaring customized functions.
Functions that are not declared can still be used in SemQL, but will not be recognized by the SemQL parser and will cause validation warnings. |
The SemQL Editor lists all built-in SemQL functions, plus the customized functions that are declared in the model, with their syntax. |
SEMQL SYNTAX
Language Elements
The SemQL syntax supports the equivalent of SQL Expressions, Conditions or Order By Clause, which are a combination of one or more Values, Operators, and Functions.
Values, Operators and Functions
Values, operators and functions are the tokens in the SemQL language.
Values are simple expressions. They may be literals, attributes or variables.
Operators modify or compare expressions. SemQL support most SQL operators, including arithmetic and character operators (
+
,-
,*
,/
,||
), comparison operators (=
,!=
,>
,>=
,<
,⇐
,IN
,BETWEEN
,LIKE
,REGEXP_LIKE
,IS NULL
) and logical operators (AND
,OR
,NOT
).Functions & Expression Constructs combine other tokens to create new expressions. They include most functions available in the hub’s database, plus the functions implemented by the user.
Operators and Functions are not case-sensitive. Values are case-sensitive.
|
Expressions, Conditions, Order By Clause
Expressions, Conditions and Order By Clauses are the phrases supported by the SemQL Language.
Expressions combine values, operators and functions to return a non-boolean value.
Conditions combine values, operators and functions to return a boolean value (true or false).
Order By Clauses are expressions used to sort data by ascending or descending values of the expression.
TheASC
orDESC
post-fix define the sort order. Default sort order is ascending.
Order by clauses in the consolidation rules' Ranking Expression also support theNULLS FIRST
orNULLS LAST
clause to specifies whether NULL values should be ordered before or after non-NULL values. By default, the order by clause usesNULLS LAST
if the sort isASC
andNULLS FIRST
if the sort isDESC
.
Examples of Expressions
FirstName
is an attribute.'USA'
is a string literal.Initcap(FirstName)
is a function using an attribute.Initcap(FirstName) || ' ' || Initcap(LastName)
is a combination of operators, functions, attributes and literals.
Example of Conditions
1=1
compares two literals.Country='USA'
compares an attribute and a literal.Upper(Country) in ('USA', 'US', 'CANADA')
uses a function.
Example of Order By Clauses
Country
sorts by the Country attribute (ascending by default)Country DESC
sorts by the Country attribute (descending). Nulls values are sorted (default behavior forDESC
) before all non-null values.Country DESC NULLS LAST
sorts by the Country attribute (descending). Nulls values are sorted after all non-null values.CASE PublisherID WHEN 'MKT' THEN 1 WHEN 'CRM' THEN 2 ELSE 99 END ASC
sorts records where PublisherID equalsMKT
, thenCRM
, then the rest.
The following sections detail the elements of a SemQL clause.
Comments
Comments in SemQL are surrounded by /*
and */
.
An example of code with comments is provided below.
ANY Contacts HAVE (1=1) /* customer with contacts */ AND NOT ANY Contacts HAVE ( IsInfluencer = 1 ) /* and no contact is an influencer */
Values
The values in a SemQL expression may be literals, attributes or model variables:
literals are constant values. Numeric are provided as is, other literals must be surrounded by single quotes
'
.
Examples:
'John'
-42
'1994-11-07'
Attributes refer to attributes of the entities in the model.
Model Variables contain values that are used to customize the user experience or parameterize an integration job. Variable values are local to the user session or executed job. They are set either via a job parameter (for jobs), or retrieved from remote servers (declared as Variable Value Providers) when the user opens his session. Variables can be used in SemQL filters and expressions created at design and run-time.
Search Parameters store the values entered in search form and submitted to the search condition attached to the search form. Search parameters are available only in their own search form’s condition.
For more information about Attributes and Model Variables, refer to the Attributes and Variables/Search Parameters section. |
Operators
Operators are used to:
Combine expressions to create new expressions (Arithmetic or Character Operators).
Evaluate expressions to return a boolean value. Such operators are used to create conditions.
This section details the operators supported in SemQL.
Arithmetic Operators
Operator | Description |
---|---|
| Addition |
| Subtraction |
| Multiplication |
| Division |
Character Operators
The ||
(double pipe) is used for string concatenation.
Comparison Operators
Operator | Description | ||
---|---|---|---|
| Equality | ||
| Inequality | ||
| Greater than, greater than or equal | ||
| Smaller than, smaller than or equal | ||
| Compares a value with each value in the list, returns true if one value matches. | ||
| Greater than or equal to value_1 and less than or equal to value_2 | ||
| TRUE if value matches the pattern. Within the pattern, the character | ||
| returns true if the string matches the regular expression pattern.
| ||
| Tests for nulls | ||
| Condition that returns true if any of child records - in a one to many relationship - meet the given condition. For more information, see Using Related Entities’ Attributes. | ||
| Condition that returns true if all of child records - in a one to many relationship - meet the given condition. For more information, see Using Related Entities’ Attributes. |
Logical Operators
Operator | Description |
---|---|
| Return true if both conditions are true. |
| Return true if one condition of the other is true. |
| Returns true if the following condition is false. |
Functions & Expression Constructs
SemQL support functions and expression constructs, that is elements that return a value.
Functions differ from comparison operators as they return a non-boolean value. They cannot be used as is in conditions unless used with a comparison operator. For example, TO_CHAR(IsValidCustomer) is a valid expression, but not a valid condition. TO_CHAR(IsValidCustomer)='1' is a valid condition. |
Built-in Functions
The functions available in Semarchy xDM include functions in the following categories:
Strings
Comparison
Conversion
Date & Time
Matching
Miscellaneous
Null Management
Numeric
Useful & Noteworthy Functions
The following list contains noteworthy functions and expressions:
TO_CHAR
,TO_DATE
,TO_NUMBER
functions to perform conversion across data types.TRIM
,LTRIM
,RTRIM
,PAD
LPAD
,RPAD
to trip or pad with blanks.SUBSTR
to retrieve a part of a string.REPLACE
,REGEXP_REPLACE
to replace part of a strings.INSTR
to find the location of a substring in a string.NULLIF
,COALESCE
andNVL
to handle null values.GREATEST
andLEAST
to return the greatest and least of a list of expressions.SYSDATE
to retrieve the system date.
The complete set of built-in functions with their description is available in Appendix A
Functions for Matching
Certain functions are key in a fuzzy matching process.
Functions for normalizing of transforming values to reduce the noise during fuzzy matching:
UPPER
,LOWER
andINITCAP
absorb the case-sensitivity differences in strings.SOUNDEX
returns phonetic representations of strings, absorbing typos.SEM_NORMALIZE
returns a string with non-ASCII characters transformed to ASCII-equivalent or a blank.
Functions that implement fuzzy matching capabilities:
SEM_EDIT_DISTANCE
andSEM_EDIT_DISTANCE_SIMILARITY
respectively returns the distance and percentage of similarity between two strings according to the Levenshtein distance algorithm.SEM_JARO_WINKLER
andSEM_JARO_WINKLER_SIMILARITY
respectively return the distance and percentage of similarity between two strings according to the Jaro-Winkler distance algorithm.SEM_SEM_NGRAMS_SIMILARITY
returns the percentage of similarity of two strings according to the Dice’s coefficient similarity measure applied to the n-grams of the strings.
The *_SIMILARITY functions return a value between 0 (no match) and 100 (perfect match). If one or both strings are null, the returned value is 0. |
Other Constructs
CASE Expression
The CASE
expression selects a result from one or more alternatives, and returns this result.
This syntax returns the first result for which the expression matches the selector. If none match, it returns the default result.
CASE selector WHEN expression_1 THEN result_1 ... WHEN expression_n THEN result_n [ELSE default_result] END
This syntax returns the first result for which the condition is true. If none is true, it returns the default result.
CASE WHEN condition_1 THEN result_1 ... WHEN condition_n THEN result_n [ELSE default_result] END
The following example from an Enricher transforms the CustomerName attribute according to the Publisher of the record.
CASE PublisherID WHEN 'CRM' THEN Upper(CustomerName) WHEN 'MKT' THEN Upper(Replace(CustomerName, '-', ' ')) ELSE CustomerName END
The same example with the second syntax:
CASE WHEN PublisherID='CRM' THEN Upper(CustomerName) WHEN PublisherID='MKT' THEN Upper(Replace(CustomerName, '-', ' ')) ELSE CustomerName END
Table Functions
SemQL supports searching for an expression’s value in the values returned by a table function, using the following syntax:
expression IN table_function(parameter_1, parameter_2 ...)
For example the following condition uses a table function named SEARCH_FOR_IDS
that returns a list of IDs from a customer name, and checks whether the customer ID is in that list.
CUSTOMER_ID in SEARCH_FOR_IDS(CUSTOMER_NAME)
Customized Functions
SemQL allows you to access database functions implemented in the database instance hosting the hub.
Call these functions as regular functions by prefixing them with their schema and (optionnally) their package name: <schema>.<package>.<function>
.
For example, to call a CUSTFUNC() function, stored in a CUST001 package, in a COMMON_TOOLS schema, the syntax is:
COMMON_TOOLS.CUST001.CUSTFUNC(CustomerName)
The database user of the schema hosting the hub must have sufficient privileges to execute the customized functions. |
Database functions process data with the database engine. For certain processing involving for example algorithms, libraries or services not easily implemented with the database capabilities, it is preferable to opt for the plugin option. See the Semarchy xDM Plug-in Development Guide for more information. |
ATTRIBUTES AND VARIABLES/SEARCH PARAMETERS
Variables
Variables store values that are integration job-specific or user session-specific. Variable are either Built-in Platform Variables, or Model Variables.
Built-in Platform Variables
Built-in Platform Variables are built-in the platform. They contain information about the load or batch being processed or about the user connected or performing the operations. They are listed below:
Variable Name | Definition |
---|---|
| For the certification process, name of the user who has submitted the integration job. For a user data session, the name of the connected user. |
| Comma-separated list of roles of the connected user. Note that this variable only returns user roles that have corresponding roles declared in Semarchy xDM (in the Administration view, under the Roles node.). For the Tomcat application server specifically, all roles are returned, including those with no corresponding role declared. |
| ID of the external load that has submitted the integration job. This variable is not available in expressions used outside the certification process, such as enrichers or validations triggered in steppers. |
| ID of the batch running the integration job. This variable is not available in expressions used outside the certification process, such as enrichers or validations triggered in steppers. |
| User Address, configured by the user in his profile. |
| User City, configured by the user in his profile. |
| User Country, configured by the user in his profile. |
| User Decimal Separator, configured by the user in his profile. |
| User Departement, configured by the user in his profile. |
| User Email, configured by the user in his profile. |
| User First Name, configured by the user in his profile. |
| User Job Title, configured by the user in his profile. |
| User Language Tag, configured by the user in his profile. |
| User Last Name, configured by the user in his profile. |
| User Postal Code, configured by the user in his profile. |
| User Primary Phone, configured by the user in his profile. |
| User Secondary Phone, configured by the user in his profile. |
| User Thousand Separator, configured by the user in his profile. |
| User Timezone, configured by the user in his profile. |
Model Variables
Model Variables can be used:
In user sessions: They are set when the user accesses an application, using a variable value provider. In this context, variables are used to parameterize the user experience (for example, in filters restricting the user privileges).
In integration jobs: In an integration job, a variable value is usually set using a job parameter. If no job parameter is set, the value is set using the variable value provider. In this context, variables are used to parameterize the job execution (for example, in an enricher’s filter expression to prevent the enricher from processing any record depending on the value).
Model variables set from a Variable Value Providers are refreshed when the user accesses Semarchy xDM. In the context of a job, the variables are refreshed as if the user starting the job had accessed Semarchy xDM.
Using the built-in platform variables V_USERNAME
, it is possible to query (via a Variable Value Provider definition) the corporate LDAP directory and retrieve the email of the connected user, and then store this value in a model variable called CORPORATE_USER_EMAIL
.
For more information about Model Variables, see the Model Variables section in the Logical Modeling chapter of the Semarchy xDM Developer’s Guide. For more information about Variable Value Providers, see the Configuring Variable Value Providers in the Semarchy xDM Administration Guide. |
Using Variables
Variables are used with the following syntax: :<variable_name>
, for example: :CORPORATE_USER_EMAIL
.
Search Parameters
Search Parameters store the values entered into a search form and submitted to the search condition attached to that search form. They are available only for their own search form’s condition.
For more information about Search Parameters, see the Creating Search Forms section in the Semarchy xDM Developer’s Guide. |
Using Search Parameters
Parameters are used using their defined Binding, using the following syntax: :<binding_name>
, for example: :SEARCHED_NAME
.
When editing the SemQL condition of a search form, the available search parameters are listed in the Variablessection of the Expression Editor. |
Named Queries Parameters
Query Parameters store the values passed to a request made to a named query. It is used via its binding name, similarly to a Search parameter.
For more information about Query Parameters, see the Creating Named Queries section in the Semarchy xDM Developer’s Guide. |
Attribute Qualified Names
An Attribute Qualified Name is the path to an attribute from the current entity being processed. This path not only allows accessing the attributes of the entity. It also allows access to:
The attributes of the entities related to the current entity (parent and child entities).
The Lineage. For example, to access the attributes of the golden record a master record relates to.
In this section, Name always refers to the (internal) Name of an attribute, and not to the Label. The label may be translated in various languages, but the name is invariant. |
Attribute names are case sensitive. For example customerName and CustomERName do not represent the same attribute. |
Using Current Entity’s Attributes
A given SemQL clause is expressed for a given entity. For example, an enricher, a validation or a filter is performed on a pre-defined entity. Such a clause always has access to the attributes of this entity. These attributes may be simple or complex attributes.
Simple Attributes
Simple attributes can be accessed using their Name.
FirstName
returns the value of the FirstName simple attribute of the current entity (Employee).
Complex Attributes
Display Name
Complex attributes can be accessed using their Attribute Name. This returns the value of the complex attribute in the format of the corresponding complex type’s Display Name.
The SimpleAddressType complex type is defined with a display type that shows the Address, City and Country definition attributes separated by a space. This type is used for the InputAddress attribute of the Customer entity. The InputAddress
qualified name therefore returns a string containing <Address> <City> <Country>
value for each Customer.
Definition Attribute
It is also possible to retrieve the value of each definition attribute of a complex type by prefixing this definition attribute name by the name of the complex attribute.
The SimpleAddressType complex type includes the Country definition attribute. This type is used for the InputAddress attribute of the Custome entity. The InputAddress.Country
qualified name therefore returns the Country stored in the InputAddresscomplex type value for each Customer.
ID Attributes
SemQL exposes several ID attributes to identify a record at certain phases in the certification process.
For example:
A customer golden record is identified by the value in the
CustomerID
attribute.Each master record consolidated into this golden is identified by the
PublisherID
,SourceID
pair.
The table below lists the attributes representing a record ID and the value they take depending on the data access view. In this table, <IDAttribute>
refers to the name of the ID attribute of the entity.
ID Attribute | Label | Description |
---|---|---|
| Golden ID | The ID of the golden record or ID of the golden record related to the current record if any. This attribute is not available for basic entities. For fuzzy matched entities, this attribute is completed by the following attributes that track changes due to duplicates management operations:
|
| Record Source ID | The ID of the source record. This attribute is available only for views exposing master or source records, for fuzzy matched entities. |
| Record Publisher | The Code of the publisher of the source record. This attribute is available only for views exposing master or source records for fuzzy or ID matched matched. |
| Record ID | The ID of the current record. Its value depends on the view:
|
Foreign Attributes
SemQL exposes multiple attributes representing a related (parent) record ID at certain phases of the certification process.
The table below lists the various attributes representing a related record ID the value they take depending on the data access view. In this table, <ForeignAttribute>
refers to the name of the foreign attribute in the reference.
ID Attribute | Label | Description |
---|---|---|
| Referenced Record Source ID | The ID of the referenced record. This attribute is available when referring to fuzzy matched entities. |
| Referenced Record Publisher | The code of the publisher of referenced record. This attribute is available when referring to fuzzy matched entities. |
| Referenced Record ID | The ID of the referenced record. Its value depends on the referenced record type: When referencing a golden record or a basic entity record it exposes the ID of the referenced golden record. |
In forms, an attribute called FDN_<ForeignAttribute> returns a pointer to the referenced record. Use this attribute preferably to display the reference in user-friendly format. |
See also Using Related Entities’ Attributes for more information about using the attribute of the related entities. |
Built-in Attributes
Built-in attributes are provided by Semarchy xDM to support the certification process. They appear in addition to the attributes designed in the data model.
Data Access Views
Data is accessed or manipulated using SemQL via a data access view.
A data access view represents the data at a particular phase of its lifecycle in the MDM hub. For example, Golden Data, Master Data, Source Error, etc. Views are available depending on the type of the entity. For example, the Master Data view does not exist for a basic entity.
Each view is identified by an alias (GD for Golden Data, SE for source error, etc). This alias is used in the REST API to query this view, and as a prefix for the physical table containing this view’s data, if such a table exists.
The complete list of data access views with their description is available in Appendix B |
Available attributes depend on the data access view running the SemQL expression and on the type of entity (basic, ID of fuzzy matched).
For example, a clause that involves the Source Data view (for example, an enricher on source data) will support built-in attributes such as the SourceID
(ID of the source record) or the PublisherID
(Code of the application that published the source record). On Golden Records - which are not directly related to one source - these built-in attributes no longer make sense.
Built-In Attributes List
Built-in attributes are available depending on the type of entity (basic, ID or fuzzy matched), and view being used.
The complete list of built-in attributes with their description and views is available in Appendix B |
The SemQL Editor automatically lists the attributes available depending on the situation. Use this editor to make sure to use only the correct attributes. |
Using Related Entities’ Attributes
Related entities may be either parent entities (for a given relation, the current entity has zero or one parent), or child entities (for a given relation, the current entity has zero or more children).
Parent Entities
It is possible to access attributes of a parent entity by prefixing this attribute by the Role of this parent in the relation.
The current Customer entity references the Employee entity in a relation. In this relation, the Role of this second entity is AccountManager. The AccountManager.FirstName
attribute refers to the FirstName of the Employee that is the parent - in this relation, the AccountManager - of the current Customer.
Referring to parent entities can follow a chain of relations.
AccountManager.CostCenter.CostCenterName
follows two relations to return the CostCenterName of the CostCenter to which the AcountManager of the current Customer reports to.
Child Entities
Accessing child entities is possible in conditions only using the SemQL any
and all
syntax.
Any and All Syntax
The any
syntax is a condition that returns true if any of the child records meet the given condition.
any <child_entity_role> have ( <condition_on_child_entity> )
or
any <child_entity_role> has ( <condition_on_child_entity> )
To filter Customers having at least one Contact named John
any Contacts have ( FirstName = 'John' )
The all
syntax is a condition that returns true if all the child records meet the given condition.
all <child_entity_role> have ( <condition_on_child_entity> )
or
all <child_entity_role> has ( <condition_on_child_entity> )
To filter Customers having all their Contacts with the IsInfluencer flag set to '0' :
all Contacts have ( isInfluencer = '0' )
Cascading References
It is possible to cascade through several relations’ roles.
To filter Employees managing Customers having one contact with the IsInfluencer flag set to 1
:
any Customers.Contacts have ( IsInfluencer = '1' )
ParentRecord
In the Any and All syntax, it is possible to access the direct parent’s record from the condition on the child entity, through the ParentRecord
reserved keyword or through the parent’s role name in the relation.
The following condition returns all the customers having two contacts with a different ContactID but the same FirstName.
any Contacts have ( any ParentRecord.Contacts have ( ParentRecord.ContactID != ContactID and ParentRecord.FirstName = FirstName ) )
Using the Lineage
You can navigate records lineage using SemQL. Using this navigation, you can for example access the master and the golden records consolidated from a source record, or you can access all the attached master records from a golden record.
To understand the lineage structures, refer to the Data Certification section in the Semarchy xDM Integration Guide. |
Lineage Parent Records
You can access the attributes of a parent record related to your current record in the lineage.
This navigation is possible using a pseudo-role name representing the parent relation in the lineage, such as GoldenRecord
(of a master record for example), MasterRecord
or SourceRecord
.
To filter out the master records that are singletons, you can access the golden record using the GoldenRecord
pseudo-role and then use its number of masters MastersCount
:
GoldenRecord.MastersCount > 1
To get the CustomerName
consolidated in the golden record that results from a given source record:
MasterRecord.GoldenRecord.CustomerName
The complete list of built-in lineage navigation is available in Appendix B |
The SemQL Editor automatically lists the lineage navigation available depending on the situation. |
Lineage Child Records
You can access the attributes of child records related to your current record in the lineage.
This navigation is possible using a pseudo-role name representing the child records in the lineage relation, such as MasterRecords
(the master records attached to a golden record) or SourceRecords
(the source records attached to a master record).
You can use the lineage child records similarly to the Child Entities records, using the SemQL any
and all
syntax.
To filter golden records that have duplicates only from the CRM publisher:
MastersCount > 1 and all MasterRecords have ( PublisherID = 'CRM' )
To filter all master records created from source records older than 5 days.
all SourceRecords have (CreationDate < SYSDATE() - 5)
The complete list of built-in lineage navigation is available in Appendix B |
The SemQL Editor automatically lists the lineage navigation available depending on the situation. |
Special Cases
Attributes of Duplicates
Certain SemQL expressions manipulate two similar records simultaneously:
The SemQL condition that defines the match rule in a Matcher.
The SemQL condition used to filter duplicates.
For these expressions, the two similar records are identified by the RECORD1
and RECORD2
pseudo-record prefixes.
The following condition returns the duplicates with the same InputAddress.Address (complex type) but a different CustomerName
Record1.CustomerName <> Record2.CustomerName and Record1.InputAddress.Address = Record2.InputAddress.Address
Attributes in Reference Pickers Filters
In Steppers, SemQL conditions can filter the records selectable with reference pickers. These conditions usually manipulate two different records.
In this context, the records are identified using pseudo-record prefixes:
Record
represents the record being edited. It is the referencing record.Referenced
represents the selectable referenced record (the one filtered in the reference picker).
The following condition is set on a reference picker that enables selecting an Account Manager (Employee) from a Customerentity managed in a stepper. It reduces the selectable employee records to only those in the same country as the Customer being edited.
Record.Country = Referenced.Country
The following condition is set on a reference picker to select the Manager (an instance of Employee) of a given Employee. It filters the selectable managers (Referenced
) so that they are in the same CostCenter as the the current manager (Record.Manager
).
Record.Manager.FID_CostCenter = Referenced.FID_CostCenter
See the Reference Selection section in the Steppers chapter in the Semarchy xDM Developer’s Guide for more information.
USING SEMQL IN SEMARCHY XDM
This section describe the various uses of SemQL in Semarchy xDM, as well as the attributes available for these uses.
Using SemQL at Design-Time
At design-time, SemQL is used:
In the entities, to generate automatically the Source ID of an entity when creating or importing new records.
In the various rules defining the Certification Process, that is the enrichers, validations, matchers and survivorship rules.
In the Applications to define the filters applied to the lists in the business views and the form/collections attributes.
In properties that support SemQL in addition to literal values.
In the conditions used for securing data Row-Based Filtering.
SemQL for ID Generation
The Source ID of an entity can be generated using a SemQL expression. This expression is executed and the ID generated when a record form is saved for the first time in a stepper or when a record is imported from a file containing no IDs.
This expression can use:
Attributes from the entity.
Attributes of the referenced (parent) entities.
SemQL functions and operators. You can for example use the
SEQ_NEXTVAL
function to retrieve the next increment of a named sequence.
Examples of SemQL for ID Generation:
To create an ID using a literal concatenated with a sequence increment, for a customer entity:
'CUST_' || SEQ_NEXTVAL('CUST_SEQ')
To create an ID using values from the record itself, for a customer entity:
Upper(LastName) || '_' || CustomerNumber
To create an ID concatenating the IDs of two related records. Typically, for an entity that relates products and distribution markets:
DistributedProduct.ProductID || '_' || DistributionMarket.MarketID
An ID generated with a SemQL expression is immutable. This ID will not change after the initial record creation even if the value of the attributes used in the expression change. For example, if you generate the ID using the customer name and edit a saved customer, modifying its name will not alter the ID. |
SemQL in the Certification Process
The certification process takes source records pushed to the MDM Hub by identified Publishers and creates enriched, validated, matched and consolidated golden records. SemQL is involved in the various phases of this process.
Certification Process Clauses
The following table describes the expressions, condition and order by clauses used in the certification process.
Process Step | Clause Type | Description |
---|---|---|
SemQL Enricher Filter | Condition | This condition filters the source records of the entity that must go through this enricher. |
SemQL Enricher Expressions | Expression | The result of the expressions load the enriched entity’s attributes. |
Plug-in Enricher Filter | Condition | This condition filters the source records of the entity that must go through this enricher. |
Plug-in Enricher Inputs | Expression | Each expression result is pushed to a plug-in input. The outputs of the plug-in load the entity’s attributes with enriched values. |
SemQL Validation Condition | Condition | This condition defines which source or golden records pass the validation. |
Plug-in Validation Input | Expression | Each expression result is pushed to a plug-in input. The plug-in outputs a boolean defining the records that pass the validation. |
Matcher Binning Expression | Expression | Master records having the same value for all binning expressions are in the same bin, and will be matched using the matching condition. |
Matcher Matching Condition | Condition | Master records in the same bin for which the matching condition returns true are considered as matches. |
Consolidation Rule Ranking Expression | Order By | For Custom Ranking strategy, the consolidation uses values from the first of the records ordered by this expression. For other strategies, if the consolidation strategy returns two records with the same rank, the consolidation sorts them by this expression and uses the value of the first one. |
Available Attributes by Clause
In the certification process:
Attributes from the entity being processed are always available.
Built-in attributes availability depends on the type of entity and on the clause being created.
Attributes from related entities (parent or child entities as well as Lineage parent and child records) availablity also depends on the type of the entity and on the clause being created.
Use The SemQL Editor for the list of attributes available in each situation. |
Certification Process SemQL Example
Enricher Expressions
Examples of enricher conditions:
FirstName:
InitCap(FirstName)
Name:
InitCap(FirstName) || ' ' || Upper(LastName)
City:
Replace(Upper(InputAddress.City),'CEDEX','')
EmployeeCostCenterName :
CostCenter.CostCenterName
: Current Employee entity references the CostCenter entity and this expression returns an employee’s cost center name
Validation Conditions
Checking the Customer’s InputAddress complex attribute validity:
InputAddress.Address is not null and ( InputAddress.PostalCode is not null or InputAddress.City is not null)
In this example, the IS NOT NULL
, AND
and OR
SemQL operators are used to build the condition.
It is possible to use child records in validations ( See Child Entities for more information).
For example, a Customer is valid only if all its attached Contacts have an email:
ALL Contacts HAVE (EmailAddress is not null)
When running such validation pre-consolidation, only the contacts within the same load are considered. Post consolidation, contacts from the load as well as those already in the hub are taken into account. |
Matcher
Binning Expression to group customers by their Country/PostalCode:
InputAddress.Country || InputAddress.PostalCode
Matching Condition: Matching two customer records by name, address and city name similarity:
SEM_EDIT_DISTANCE_SIMILARITY( Record1.CustomerName, Record2.CustomerName ) > 65 and SEM_EDIT_DISTANCE_SIMILARITY( Record1.InputAddress.Address, Record2.InputAddress.Address ) > 65 and SEM_EDIT_DISTANCE_SIMILARITY( Record1.InputAddress.City, Record2.InputAddress.City ) > 65
In this second example, SEM_EDIT_DISTANCE_SIMILARITY
is a SemQL function. Record1
and Record2
are predefined names for qualifying the two records to match.
Matching on child entities is possible by configuring the rule to use Child Records.
For example, to match two Customers when they have a Contact with the same FirstName and LastName and a CEO title, and the Customer names are 50% similar, configure the match rule with the Child Records set to Contact and use the following clause :
Record1.FirstName = Record2.FirstName AND Record1.LastName = Record2.LastName AND Record1.JobTitle = 'CEO' and Record2.JobTitle = 'CEO' AND SEM_EDIT_DISTANCE_SIMILARITY (Record1.Customer.Name, Record2.Customer.Name) > 50
This rule matches Customers but its base entity is Contact. So the Record1.Customer.Name
expression returns the name of the customer the contact we match is attached to.
SemQL in Privilege Grants
Entity privileges support Row-Level Filtering, to apply privileges only to a subset of the records of the entity. The subsets are defined using SemQL filters.
Each entity privilege grant may be associated with a filtering Condition.
This condition has access to all attributes of the entity, its related entities and built-in attributes ClassName, CreationDate, UpdateDate, Creator and Updator.
To grant a given privilege to a user whose email address (stored in a CORPORATE_USER_EMAIL
variable) appears in the current record in the EmailAddress attribute, the filter is EmailAddress = :CORPORATE_USER_EMAIL
.
SemQL in Applications
Business View Transition Path
Business entities in business views are related using transition paths. A transition path is an expression pointing to records in a directly or indirectly related entity.
For example, to display all the siblings of an employee in a manager/report hierarchy, the following transition path is used:Manager.Reports
.
Business View Filters
Each business entity of a business view supports a Filter condition that filters the records displayed in this business entity.
This condition has access to all attributes of the entity, its related entities and built-in attributes ClassName, CreationDate, UpdateDate, Creator and Updator.
To filter at the root of the hierarchy of cost centers only those with no parent, the following filter is applied:FID_ParentCostCenter is null
. FID_ParentCostCenter is the attribute representing the relation between a cost center and its parent cost center.
Form Fields/Table Columns/Properties
Form fields and table columns, as well as application properties support SemQL Expressions. These expressions allow building composite attributes for display purposes and customizing the appearance of the application.
When a form is used for data authoring, form fields using SemQL expressions appear as read-only. |
This expression has access to all attributes of the entity, its related entities and built-in attributes.
If you use in a form field a built-in attribute specific to a certain view (for example, the PublisherID, which exists on source and master data but not on golden data), this form field is automatically hidden when the form is used with a view that does not support the built-in attribute. |
To display a custom attribute called ContactsStatus on a Customer form which displays whether this customer has contacts and whether these contacts have the IsInfluencer flag set to '1':
CASE WHEN ANY Contacts HAVE (IsInfluencer = 1 ) THEN 'One or More Influencers' WHEN NOT ANY Contacts HAVE (1=1) THEN 'No Contacts' ELSE 'No Influencer Contact' END
Predefined Sort
In Business Views, you can define an Expression that is used to sort the records under the node in the business view.
Reference Picker Filters
In Business Views, SemQL conditions can filter the records selectable with reference pickers. See Reference Pickers for more information.
Search Forms
In Search Forms, you define a SemQL condition to filter records based on the search parameters values entered in the form.
This condition has access to all attributes of the searched entity, its related entities and built-in attributes. Note that certain built-in attributes should be used with caution as some do not apply to all the views the search form may apply to.
Using SemQL at Run-Time
Filters
All date view in an application support filters in the form of Conditions.
This condition has access to all attributes of the entity, its related entities and built-il attributes that depend on the view that is accessed.
For example, on a Golden Record: BatchID, ClassName, CreationDate, UpdateDate, Creator and Updator are available.
Duplicates Filters
When filtering duplicates, for example to filter records to checkout in a duplicate management activity, it is possible to use Record1
and Record2
as in a Matcher, except that the comparison takes place on the pairs of records that have matched.
To show matching records, but with the CustomerName fields less similar by less than 95%:
SEM_EDIT_DISTANCE_SIMILARITY(Record1.CustomerName, Record2.CustomerName) < 95
THE SEMQL EDITOR
The SemQL editor can be called from the workbench when a SemQL expression, condition or clause needs to be built.
This editor is organized as follows:
Attributes available for the expression appear in left panel. Double-click an attribute to add it to the expression.
Functions declared in SemQL appear in the left bottom panel, grouped in function groups. Double-click a function to add it to the expression.
Variables available for the expression appear in the bottom center panel.
Messages appear in the right bottom panel, showing parsing errors and warnings.
Description for the selected function or attribute appear at the bottom of the editor.
The Toolbar allows to indent the code or hide/display the various panels of the editor and to undo/redo code edits.
APPENDIX A: SEMQL FUNCTIONS LIST
The following tables lists the built-in functions available in SemQL.
Functions for Oracle
The following functions are available when using Oracle.
Function | Description |
---|---|
| Returns the absolute value of number. |
| Returns the arc cosine of number. The argument number must be in the range of -1 to 1, and the function returns a value in the range of 0 to pi, expressed in radians. |
| Returns the date "date" plus "integer" months. The date argument can be a datetime value or any value that can be implicitly converted to DATE. The integer argument can be an integer or any value that can be implicitly converted to an integer. |
| Returns the decimal representation in the database character set of the first character of string. |
| Takes as its argument a string, or an expression that resolves to a string, in any character set and returns an ASCII version of the string in the database character set. Non-ASCII characters are converted to the form xxxx, where xxxx represents a UTF-16 code unit. |
| Returns the arc sine of number. The argument number must be in the range of -1 to 1, and the function returns a value in the range of -pi/2 to pi/2, expressed in radians. |
| Returns the arc tangent of number. The argument number can be in an unbounded range and the function returns a value in the range of -pi/2 to pi/2, expressed in radians. |
| Returns the arc tangent of number1 and number2. The argument number1 can be in an unbounded range and the function returns a value in the range of -pi to pi, depending on the signs of number1 and number2, expressed in radians. ATAN2(n1,n2) is the same as ATAN2(n1/n2). |
| Converts a bit vector to its equivalent number. Each argument to this function represents a bit in the bit vector. This function takes as arguments any numeric data type, or any nonnumeric data type that can be implicitly converted to a number. Each expr must evaluate to 0 or 1. |
| Computes an AND operation on the bits of expr1 and expr2, both of which must resolve to nonnegative integers, and returns an integer. |
| Returns the smallest integer greater than or equal to number. |
| Returns the character having the binary equivalent to number as a string value in the database character set. |
| Returns the first non-null expr in the expression list. At least one expr must not be the literal NULL. If all occurrences of expr evaluate to null, then the function returns null. |
| Takes as its argument a string, or an expression that resolves to a string, in any data type, and returns a Unicode string in its fully normalized form in the same character set as the input. |
| Returns string1 concatenated with string2. This function is equivalent to the concatenation operator. |
| Converts a character string from one character set to another. |
| Returns the cosine of number (an angle expressed in radians). |
| Returns the hyperbolic cosine of number. |
| Returns the current date in the session time zone, in a value in the Gregorian calendar. |
| Returns the current date and time in the session time zone. If you omit precision, then the default is 6. |
| Returns the value of the database time zone. The return type is a time zone offset (a character type in the format '[+|-]TZH:TZM') or a time zone region name. |
| Compares expr to each search value one by one. If expr is equal to a search, then it returns the corresponding result. If no match is found, then it returns default. If default is omitted, then Oracle returns null. |
| Takes as its argument a string in any data type and returns a Unicode string after decomposition in the same character set as the input. For example, an o-umlaut code point will be returned as the "o" code point followed by an umlaut code point. |
| Returns e raised to the number-th power, where e = 2.71828183 … The function returns a value of the same type as the argument. |
| Extracts and returns the day from expr. expr must be a valid ANSI date. |
| Extracts and returns the hour from expr. expr must be a valid ANSI datetime. |
| Extracts and returns the minute from expr. expr must be a valid ANSI datetime. |
| Extracts and returns the month from expr. expr must be a valid ANSI date. |
| Extracts and returns the second from expr. expr must be a valid ANSI datetime. |
| Extracts and returns the abbreviation of the timezone from expr. expr must be a valid ANSI datetime including a timezone. |
| Extracts and returns the hour of the timezone from expr. expr must be a valid ANSI datetime including a timezone. |
| Extracts and returns the minute of the timezone from expr. expr must be a valid ANSI datetime including a timezone. |
| Extracts and returns the region of the timezone from expr. expr must be a valid ANSI datetime including a timezone. |
| Extracts and returns the year from expr. expr must be a valid ANSI date. |
| Returns largest integer equal to or less than number. |
| Converts a timestamp value and a time zone to a TIMESTAMP WITH TIME ZONE value. Time_zone_value is a character string in the format 'TZH:TZM' or a character expression that returns a string in TZR with optional TZD format. |
| Returns the greatest of the list of one or more expressions. |
| Converts string containing hexadecimal digits to a raw value. |
| Returns string, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space or characters that are not alphanumeric. |
| Searches string for substring using the input character set. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for. |
| Searches string for substring using UC2 code points. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for. |
| Searches string for substring using UC4 code points. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for. |
| Searches string for substring using bytes instead of character. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for. |
| Searches string for substring using Unicode complete characters. The function returns an integer indicating the position of the first occurrence. Optionally, you can indicate the starting position for the search and the occurrence to search for. |
| Returns the date of the last day of the month that contains date. |
| Returns the least of the list of expressions. |
| Returns the length of string. Length is calculated using characters as defined by the input character set. |
| Returns the length of string. Length is calculated using characters as defined by the UC2 code point. |
| Returns the length of string. Length is calculated using characters as defined by the UC4 code point. |
| Returns the length of string. Length is calculated using bytes instead of characters. |
| Returns the length of string. Length is calculated using Unicode complete characters. |
| Returns the natural logarithm of number, where number is greater than 0. |
| Returns the current date and time in the session time zone. |
| Returns the logarithm, base number2, of number1. The base number1 can be any positive value other than 0 or 1 and number2 can be any positive value. |
| Returns char, with all letters lowercase. |
| Returns expr1, left-padded to length number characters with the sequence of characters in expr2. If you do not specify expr2, then the default is a single blank. |
| Removes from the left end of string all of the characters contained in set_of_chars. If you do not specify set_of_chars, it defaults to a single blank. |
| Returns the remainder of number2 divided by number1. Returns number2 if number1 is 0. |
| Returns number of months between dates date1 and date2. If date1 is later than date2, then the result is positive. If date1 is earlier than date2, then the result is negative. |
| If number1 is not a number (NaN) then NANVL returns number2. Otherwise, it returns number1. |
| Returns the character having the binary equivalent to number as a string value in the national character set. |
| Returns the date and time (given in timezone 1) converted in time zone timezone2. |
| Returns the date of the first weekday named by day_name that is later than the date date. |
| Returns a collation key for string, that is a string of bytes used to sort strings. lsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. This function manages language-specific sorting. |
| Returns string with the first letter of each word in uppercase and all other letters in lowercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. This function manages language-specific characters case changes. |
| Returns string with all letters in lowercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. |
| Returns string with all letters in uppercase. nlsparam is in the form NLS_SORT=sort where sort is either BINARY or a liguistic sort sequence. |
| Compares expr1 and expr2. If they are equal, then the function returns null. If they are not equal, then the function returns expr1. You cannot specify the literal NULL for expr1. The NULLIF function is logically equivalent to the following CASE expression: CASE WHEN expr1 = expr 2 THEN NULL ELSE expr1 END |
| Converts number to an INTERVAL DAY TO SECOND literal. The value for interval_unit specifies the unit of number and must resolve to one of the following string values: 'DAY', 'HOUR', 'MINUTE', 'SECOND'. |
| Converts number to an INTERVAL YEAR TO MONTH literal. The value for interval_unit specifies the unit of number and must resolve to one of the following string values: 'YEAR', 'MONTH'. |
| If expr1 is null, then NVL returns expr2. If expr1 is not null, then NVL returns expr1. |
| If expr1 is not null, then NVL2 returns expr2. If expr1 is null, then NVL2 returns expr3. |
| Computes a hash value for a given expression. The expr argument determines the data for which you want to compute a hash value. The optional max_bucket argument determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295. The optional seed_value argument produces many different results for the same set of data. |
| Returns number2 raised to the number1 power. The base number2 and the exponent number1 can be any numbers, but if number2 is negative, then number1 must be an integer. |
| Converts raw to a character value containing its hexadecimal equivalent. |
| Extends the functionality of the REGEXP_INSTR function by returning the number of times a pattern occurs in a source string, starting at position. match_param contains one of more of the following values: i - Case-insensitive match, c - Case-sensitive match, n - Allow '.' to match even the newline character, m - treat input as multiple lines, x - ignore whitespaces. |
| Extends the functionality of the INSTR function by letting you search a string for a regular expression pattern. It returns an integer indicating the beginning or ending position of the matched substring, depending on the value of the return_option argument. If no match is found, the function returns 0. |
| Extends the functionality of the REPLACE function by letting you search a string for a regular expression pattern. By default, the function returns source_char with every occurrence of the regular expression pattern replaced with replace_string. |
| Extends the functionality of the SUBSTR function by letting you search a string for a regular expression pattern. It is also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself. This function is useful if you need the contents of a match string but not its position in the source string. |
| Returns the remainder of number2 divided by number1. |
| Returns string with every occurrence of search_string replaced with replacement_string. If replacement_string is omitted or null, then all occurrences of search_string are removed. If search_string is null, then string is returned. |
| Returns date_or_number rounded to the unit specified by the format model fmt_or_integer. |
| Returns expr1, right-padded to length number characters with expr2, replicated as many times as necessary. If you do not specify expr2, then it defaults to a single blank. |
| Removes from the right end of string all of the characters that appear in set_of_chars. If you do not specify set_of_chars, then it defaults to a single blank. |
| Converts a boolean to its standard string representation. |
| Returns a string concatenating the input values with the separator, optionally removing null values |
| Calculates the distance between two strings, that is the number of insertions, deletions or substitutions required to transform string1 into string2. If one or both of the strings are null the distance will be the largest integer value (2147483647). Note that this function measures the distance in number of bytes and not in characters. As a consequence, strings stored using variable-width characters sets (UTF-8, for example) can cause counter-intuitive results. It is recommended to convert these strings to a fixed-width character set (AL16UTF16, for example) prior to passing them to this function. |
| Calculates the distance between string1 into string2 (as described in the SEM_EDIT_DISTANCE function), and returns the Normalized value of the Edit Distance between two Strings. The value is between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. |
| Finds the location of a substring within a specified string. |
| Calculates the measure of agreement between two strings using Jaro-Winkler method. The value is between 0 (no match) and 1 (perfect match). If one or both strings are null the result will be 0. |
| Calculates the measure of agreement between two strings using Jaro-Winkler method, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. |
| Calculates the measure of agreement between two strings using the Dice’s coefficient similarity measure applied to the n-grams of the strings, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. The ngrams_length parameter defines the length of the n-grams (2 by default). |
| Returns a string with Latin (supplement, Extended-A and Extended-B) characters converted into their ASCII equivalents, and other (space and non-alphanumeric) characters eliminated. |
| Converts a number to its standard string representation. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using characters as defined by the input character set. |
| Converts a timestamp to its standard string representation. |
| Converts a string or a number to its standard string representation. |
| Converts a UUID to its standard string representation. |
| Get the next value of a sequence. Note that this function is not supported in Enrichers. |
| Returns the sign of number. The sign is: -1 if n<0, 0 if n=0, 1 if n>0. |
| Returns the sine of number (an angle expressed in radians). |
| Returns the hyperbolic sine of number. |
| Returns a character string containing the phonetic representation of string. This function lets you compare words that are spelled differently, but sound alike in English. |
| Returns the square root of number. |
| Computes a hash value for a given expression and returns it in a RAW value. The optional method lets you choose the hash algorithm (defaults to SHA1) in the following list: SHA1, SHA256, SHA384, SHA512 and MD5. This function requires Oracle version 12c or above. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using characters as defined by the input character set. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using UCS2 code points. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using UCS4 code points. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using bytes instead of characters. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using Unicode complete characters. |
| Returns the current date and time set for the operating system on which the database resides. |
| Returns the system date, including fractional seconds and time zone, of the system on which the database resides. |
| Returns the value of parameter associated with the context namespace. |
| Extracts the UTC (Coordinated Universal Time—formerly Greenwich Mean Time) from a datetime value with time zone offset or time zone region name. |
| Generates and returns a globally unique identifier (RAW value) made up of 16 bytes. |
| Returns the tangent of number (an angle expressed in radians). |
| Returns the hyperbolic tangent of number. |
| Returns a double-precision floating-point number. |
| Returns a single-precision floating-point number. |
| Converts expr to its string representation optionally using fmt and nlsparam for the conversion. |
| Converts expr to a CLOB (large string) |
| Converts string to a date value. The fmt is a datetime model format specifying the format of string. If you omit fmt, then string must be in the default date format. If fmt is J, for Julian, then string must be an integer. |
| Converts a character string to an INTERVAL DAY TO SECOND value. |
| Returns string with all of its single-byte characters converted to their corresponding multibyte characters. |
| Converts expr to a number value using the optional format model fmt and nlsparam. |
| Returns string with all of its multibyte characters converted to their corresponding single-byte characters. |
| Converts string a timestamp value. The optional fmt specifies the format of string. |
| Converts string to a TIMESTAMP WITH TIME ZONE value. The optional fmt specifies the format of string. |
| Converts string to an INTERVAL YEAR TO MONTH type. |
| Returns expr with all occurrences of each character in from_string replaced by its corresponding character in to_string. Characters in expr that are not in from_string are not replaced. The argument from_string can contain more characters than to_string. In this case, the extra characters at the end of from_string have no corresponding characters in to_string. If these extra characters appear in string, then they are removed from the return value. |
| Removes from the left and right ends of string all of the characters contained in set_of_chars. If you do not specify set_of_chars, it defaults to a single blank. |
| When expr is a date, returns expr with the time portion of the day truncated to the unit specified by the format model fmt_or_number. If you omit fmt_or_number, then date is truncated to the nearest day. When expr is a number, returns expr truncated to fmt_or_number decimal places. If fmt_or_number is omitted, then expr is truncated to 0 places. |
| Takes as its argument a text literal or an expression that resolves to character data and returns it in the national character set. The national character set of the database can be either AL16UTF16 or UTF8. UNISTR provides support for Unicode string literals by letting you specify the Unicode encoding value of characters in the string. |
| Returns string with all letters uppercase. |
| Lets you construct equiwidth histograms, in which the histogram range is divided into intervals that have identical size. For a given expression, the function returns the bucket number into which the value of this expression would fall after being evaluated. Expr must evaluate to a numeric or datetime. Min_value and max_value are expressions that resolve to the end points of the acceptable range for expr. Both of these expressions must also evaluate to numeric or datetime values, and neither can evaluate to null. |
Functions for PostgreSQL
The following functions are available when using PostgreSQL.
Function | Description |
---|---|
| Returns the absolute value |
| Returns the inverse cosine |
| Subtracts arguments, producing a symbolic result that uses years and months, rather than just days |
| Concatenates input values, including nulls, into an array. If the inputs are arrays, concatenates them into an array of one higher dimension (inputs must all have same dimensionality, and cannot be empty or NULL) |
| ASCII code of the first character of the argument. For UTF8 returns the Unicode code point of the character. For other multibyte encodings, the argument must be an ASCII character. |
| Returns the inverse sine |
| Returns the inverse tangent |
| Returns the inverse tangent of number_1/number_2 |
| Returns the average (arithmetic mean) of all input values |
| Returns the bitwise AND of all non-null input values, or null if none |
| Returns the number of bits in string |
| Returns the bitwise OR of all non-null input values, or null if none |
| Returns true if all input values are true, otherwise false |
| Returns true if at least one input value is true, otherwise false |
| Removes the longest string consisting only of characters in characters (a space by default) from the start and end of string |
| Returns the cube root |
| Returns the nearest integer greater than or equal to argument |
| Returns the nearest integer greater than or equal to argument (same as ceil) |
| Returns the number of characters in string |
| Returns the character with the given code. For UTF8 the argument is treated as a Unicode code point. For other multibyte encodings the argument must designate an ASCII character. The NULL (0) character is not allowed because text data types cannot store such bytes. |
| Returns the current date and time (changes during statement execution) |
| The COALESCE function returns the first of its arguments that is not null. Null is returned only if all arguments are null. |
| Concatenates the text representations of all the arguments. NULL arguments are ignored. |
| Concatenates all but the first argument with separators. The first argument is used as the separator string. NULL arguments are ignored. |
| Converts string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. |
| Converts string to the database encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. |
| Converts string to dest_encoding. |
| Returns the cosine |
| Returns the cotangent |
| Returns the number of input rows for which the value of expression is not null |
| Returns the current date |
| Returns the current time of day |
| Returns the current date and time (at the beginning of current the transaction) |
| Return value most recently obtained with nextval for specified sequence |
| Creates a range of dates. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Get subfield from a timestamp or an interval. The part to extract is defined by the text. |
| Truncate timestamp or interval to the precision specified in the text. |
| Decodes binary data from textual representation in string. Options for format are same as in encode. |
| Converts radians to degrees |
| Returns the integer quotient of number_1/number_2 |
| Encodes binary data into a textual representation. Supported formats are: base64, hex, escape. escape converts zero bytes and high-bit-set bytes to octal sequences (\nnn) and doubles backslashes. |
| Equivalent to bool_and |
| Returns the exponential |
| Returns the nearest integer less than or equal to argument |
| Formats the arguments according to format_string. This function is similar to the C function sprintf. |
| Extract bit from string |
| Extract byte from string |
| The GREATEST function selects the largest value from a list of any number of expressions. |
| Converts the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. |
| Creates a range of integers. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Creates a range of bigints. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Returns a boolean indicating whether the range empty. |
| Tests for finite date, interval or timestamp (not +/-infinity) |
| Aggregates values as a JSON array |
| Aggregates name/value pairs as a JSON object |
| Aggregates values as a JSON array |
| Aggregates name/value pairs as a JSON object |
| Adjust interval so 30-day time periods are represented as months |
| Adjust interval so 24-hour time periods are represented as days |
| Adjust interval using justify_days and justify_hours, with additional sign adjustments |
| Return value most recently obtained with nextval for any sequence |
| The LEAST function selects the smallest value from a list of any number of expressions. |
| Returns the first n characters in the string. When n is negative, return all but last n characters. |
| Returns the number of characters in string with an optional given encoding. The string must be valid in this encoding. |
| Returns the natural logarithm. |
| Returns the current time of day |
| Return the current date and time (at start of current transaction) |
| Returns the logarithm in base 10. |
| Convert string to lower case |
| Returns a boolean indicating whether the lower bound of the range is inclusive. |
| Returns a boolean indicating whether the lower bound of the range is infinite. |
| Fills up the string to length length by prepending the characters fill_text (a space by default). If the string is already longer than length then it is truncated (on the right). |
| Removes the longest string containing only characters from characters (a space by default) from the start of string |
| Create date from integer year, month and day fields |
| Create interval from years, months, weeks, days, hours, minutes and seconds fields. If a field is left empty, it defaults to zero. |
| Create time from hour, minute and seconds fields |
| Create timestamp from year, month, day, hour, minute and seconds fields |
| Create timestamp with time zone from year, month, day, hour, minute and seconds fields; if timezone is not specified, the current time zone is used |
| Returns the maximum value of expression across all input values |
| Calculates the MD5 hash of string, returning the result in hexadecimal |
| Returns the minimum value of expression across all input values |
| Returns the remainder of number_1/number_2 |
| Advance sequence and return new value |
| Returns the current date and time (start of current transaction) |
| The NULLIF function returns a null value if value1 equals value2; otherwise it returns value1. |
| Creates a range of numerics. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Returns the number of non-null values |
| Returns the number of null values |
| Returns the number of bytes in string |
| Overlays string with overlay_string, starting at start_position and for length characters. |
| Returns the PI constant. |
| Returns number_1 raised to the power of number_2 |
| Returns the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. |
| Returns the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. Note that quote_literal returns null on null input; if the argument might be null, quote_nullable is often more suitable. |
| Returns the given string suitably quoted to be used as a string literal in an SQL statement string; or, if the argument is null, return NULL. Embedded single-quotes and backslashes are properly doubled. |
| Converts degrees to radians |
| Returns a random value in the [0, 1] range. |
| Returns the lower bound of range |
| Returns the smallest range which includes both of the given ranges |
| Returns the upper bound of range |
| Returns the captured substring(s) resulting from the first match of a POSIX regular expression to the string. |
| Replaces the substring(s) matching a POSIX regular expression. |
| Splits the string using a POSIX regular expression as the delimiter. |
| Repeats string the specified number of times |
| Replaces all occurrences in string of substring from with substring to |
| Returns the reversed string. |
| Returns last n characters in the string. When n is negative, return all but first |n| characters. |
| Rounds the number to the nearest integer or to int decimal places |
| Fills up the string to length length by appending the characters fill (a space by default). If the string is already longer than length then it is truncated. |
| Removes the longest string containing only characters from characters (a space by default) from the end of string |
| Returns the scale of the argument (the number of decimal digits in the fractional part) |
| Converts a boolean to its standard string representation. |
| Explicitly cast its argument as a boolean. This can help the database or JDBC driver to infer the type of a bound parameter. |
| Explicitly cast its argument as a numeric value (decimal or integer). This can help the database or JDBC driver to infer the type of a bound parameter. |
| Explicitly cast its argument as a string (varchar). This can help the database or JDBC driver to infer the type of a bound parameter. |
| Explicitly cast its argument as a timestamp. This can help the database or JDBC driver to infer the type of a bound parameter. |
| Explicitly cast its argument as a UUID. This can help the database or JDBC driver to infer the type of a bound parameter. |
| Returns a string concatenating the input values with the separator, optionally removing null values |
| Calculates the number of insertions, deletions or substitutions required to transform string1 into string2. If one or both of the strings are null the distance will be the largest integer value (2147483647). |
| Calculates the number of insertions, deletions or substitutions required to transform string1 into string2, and returns the Normalized value of the Edit Distance between two Strings. The value is between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. |
| Finds the location of a substring within a specified string. |
| Calculates the measure of agreement between two strings using Jaro-Winkler method. The value is between 0 (no match) and 1 (perfect match). If one or both strings are null the result will be 0. |
| Calculates the measure of agreement between two strings using Jaro-Winkler method, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. |
| Calculates the measure of agreement between two strings using the Dice’s coefficient similarity measure applied to the n-grams of the strings, and returns a score between 0 (no match) and 100 (perfect match). If one or both strings are null the result will be 0. The ngrams_length parameter defines the length of the n-grams (2 by default). |
| Returns a string with Latin (supplement, Extended-A and Extended-B) characters converted into their ASCII equivalents, and other (space and non-alphanumeric) characters eliminated. |
| Converts a number to its standard string representation. |
| Returns a portion of string, beginning at character position, substring_length characters long. If position is 0, then it is treated as 1. If position is positive, then the function counts from the beginning of string to find the first character. If position is negative, then the function counts backward from the end of string. If substring_length is omitted, then the function returns all characters to the end of string. Length is calculated using characters as defined by the input character set. |
| Converts a timestamp to its standard string representation. |
| Converts a string or a number to its standard string representation. |
| Converts a UUID to its standard string representation. |
| Get the next value of a sequence. Note that this function is not supported in Enrichers. |
| Sets the seed for subsequent random() calls (value between -1.0 and 1.0, inclusive) |
| Set sequence’s current value |
| Set bit in string |
| Set byte in string |
| Returns the sign of the argument (-1, 0, +1) |
| Returns the sine. |
| Splits string on delimiter and return the element at position (counting from one) |
| Returns the square root |
| Returns the current date and time (start of current statement) |
| Return input values concatenated into a string, separated by delimiter |
| Returns the location of specified substring in string |
| Extracts a substring from string starting at from position and for count characters. |
| Extracts from string a substring matching the pattern (a POSIX regular expression). |
| Extracts from string a substring matching an SQL regular expression. |
| Returns the sum of expression across all input values |
| Returns the tangent. |
| Returns the current date and time (like clock_timestamp, but as a text string) |
| Converts string to ASCII from another encoding (only supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings) |
| Convert the value (time stamp, interval, integer, real/double, numeric, string) to string according to the format. |
| Converts the string to date |
| Converts number to its equivalent hexadecimal representation |
| Converts the string to numeric |
| Converts the string to time stamp |
| Returns the current date and time (start of current transaction) |
| Any character in string that matches a character in the from set is replaced by the corresponding character in the to set. If from is longer than to, occurrences of the extra characters in from are removed. |
| Truncates the number toward zero or to int decimal places |
| Creates a range of timestamps without time zone. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Creates a range of timestamps with time zone. Inclusivity is a string with an opening/closing square bracket or parenthesis indicating inclusiveness, with a default of '(]'. |
| Converts string to upper case |
| Returns a boolean indicating whether the upper bound of the range is inclusive. |
| Returns a boolean indicating whether the upper bound of the range is infinite. |
| Returns the bucket number to which operand would be assigned in a histogram having count equal-width buckets spanning the range b1 to b2; returns 0 or count+1 for an input outside the range. |
| Returns the bucket number to which operand would be assigned given an array listing the lower bounds of the buckets; returns 0 for an input less than the first lower bound; the thresholds array must be sorted, smallest first, or unexpected results will be obtained. |
| Returns the concatenation of XML values |
APPENDIX B: DATA ACCESS VIEWS, BUILT-IN ATTRIBUTES AND LINEAGE NAVIGATION
The following tables lists the built-in views, attributes and lineage navigation transitions available in SemQL.
Data Access Views
The following table lists the available data access views. Each view is identified by an alias (for example SD for the Source Data) that corresponds to the physical table storing this data if such a table exists.
Alias | Name | Description |
---|---|---|
AE | Source Authoring Errors | Errors detected on source authoring data. |
DU | Duplicates | Duplicates detected by the matching process. |
GD | Golden Data | Golden consolidated and certified records. |
GDWE | Golden Data with Errors | Golden data with errors. |
GE | Post-Consolidation Errors | Errors detected after master data consolidation (post-consolidation). |
GH | Golden History | Golden records history. |
GH4B | Golden Data As of Batch | Golden data history as of batch or golden data if not historized. |
GI | Golden Integration | Latest consolidated (integrated) golden records before post-consolidation validation. |
GX | Deleted Golden | Deleted golden records logs (and data for soft delete). |
MD | Master Data | Enriched, validated and cleansed master records. |
MH | Master History | Master records history. |
MH4B | Master Data As of Batch | Master data history as of batch or master data if not historized. |
MI | Master Integration | Latest integrated master records. |
MX | Deleted Master | Deleted master records logs (and data for soft delete). |
SA | Source Authoring | Source data authored by users. |
SA4L | Source Authoring for Load | Source data authored by users for a given load. |
SA4LK | Source Authoring Lookup | References lookup for source authoring data. |
SAWE | Source Authoring with Errors | Source data authored by users with errors. |
SD | Source Data | Source data successively loaded by the publishers. |
SD4L | Source Data for Load | Source data loaded by the publishers for a given load. |
SD4LK | Source Data Lookup | References lookup for source data loaded by the publishers for a given load. |
SDWE | Source Data with Errors | Source data loaded by the publishers with errors. |
SE | Source Errors | Errors detected on source data loaded by the publishers (pre-consolidation). |
UG | Duplicates Management Consolidated Data | Data consolidated from user decisions taken in a duplicates manager. |
UM | Duplicates Management User Decisions | Master records changed by a user using duplicates management. |
Built-in Attributes
The following table lists the attributes, with the views into which they are available.
Name | Label | Views | Description |
---|---|---|---|
| Authoring Type | SA, SA4L | For matched entities, type of authoring operation:
|
| Batch Date | SA, SAWE, SD, SDWE | Date of execution of the batch. |
| Batch ID | AE, GD, GDWE, GE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SAWE, SD, SD4L, SDWE, SE, UM | ID of the batch into which the new data, data changes, overrides or duplicate decisions were applied, or during which errors were detected. |
| Batch Submitter | SA, SAWE, SD, SDWE | User who submitted this batch. |
| Class Name | AE, GD, GDWE, GE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, SE, UG, UM | Unique name of class / entity to which this record belongs |
| Confidence Score | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UG, UM | Confidence Score of the golden record. It is the average of the match scores in the match group. |
| Confirmation Status | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UG, UM | Confirmation status for duplicate management:
|
| Constraint Name | AE, GE, SE | For error records, name of the constraint causing this error. |
| Constraint Type | AE, GE, SE | For error records, type of the constraint causing this error. |
| Created On | GD, GDWE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, UM | Creation date of a record |
| Created By | GD, GDWE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, UM | User who created the record |
| Deleted By | GX, MX, SA, SA4L, SA4LK, SD, SD4L, SD4LK | User for deleted the record |
| Deleted On | GX, MX, SA, SA4L, SA4LK, SD, SD4L, SD4LK | Deletion date of a record |
| Delete Operation | GX, MX, SA, SA4L, SA4LK, SD, SD4L, SD4LK | Delete operation ID. |
| Delete Path | GX, MX, SA, SA4L, SA4LK, SD, SD4L, SD4LK | Cascade path through which the record was reached during a delete. Null for record directly selected for deletion. |
| Delete Type | GX, MX | Delete Type ( |
| Checkout Cause | UM | Cause that made the record part of the duplicate management transaction. Possible causes are:
|
| Dups Operation ID | UM | Identifier for a duplicate management operation |
| Error Status | GD, GDWE, GH4B, GI, SA, SAWE, SD, SDWE | Error Status of a record. This value indicates whether the source or golden record has passed successfully or not validations. Possible values are:
|
| Exclusion Group | MD, MH, MH4B, MI, MX, UM | Exclusion Group ID. An exclusion group represents a group of records for which a user has taken split decisions. |
| From Batch ID | GH, GH4B, MH, MH4B | Batch at which the history record was created. |
| Golden Type | GD, GDWE, GH4B | For fuzzy matching and ID matching entities, indicates whether the golden record was created and authored only in the MDM ( |
| Has Override | GD, GH4B | For fuzzy matching and ID matching entities, this flag ( |
| Has Sugg. Merge | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UM | Flag ( |
| Is Confirmed | GD, GDWE, GH4B, GI | Flag ( |
| Load ID | AE, SA, SA4L, SAWE, SD, SD4L, SDWE, SE, UM | Load Identifier used as the unique transaction ID for external application pushing data to the platform |
| Masters Count | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UG, UM | Number of master records contributing to the golden record. |
| Match Group ID | MD, MH, MH4B, MI, MX | ID of the match group for the master record. This column is set when matching takes place. |
| Old Match Group ID | MI | Previous identifier of the match group for the master record. |
| Original Batch ID | SA, SA4L, SAWE, SD, SD4L, SDWE, UM | Batch identifier of the record when it was originally edited out in a stepper or a duplicate manager. |
| Original Confidence Score | UM | Confidence Score of the original golden in a duplicate management operation. |
| Original Confirmation Status | UM | Original Confirmation Status in a duplicate management operation. |
| Original Exclusion Group | UM | Original Exclusion Group in a duplicate management operation. |
| Original Masters Count | UM | Number of master records in the original golden in a duplicate management operation. |
| Publisher ID | AE, GD, GE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SAWE, SD, SD4L, SD4LK, SDWE, SE, UG, UM | For matching entities, code of the publisher that published the record. |
| Source ID | AE, GD, GE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SAWE, SD, SD4L, SD4LK, SDWE, SE, UG, UM | ID of the source record in the source publisher system (Fuzzy Matched entities only). |
| Sugg. Merge Confidence Score | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UM | Confidence Score for the suggested match group. |
| Sugg. Merge ID | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UG, UM | ID of the merge suggested by the automated matching. |
| Sugg. Merge Masters Count | GD, GDWE, GH4B, GI, MD, MH, MH4B, MI, MX, UM | Number of master records in the suggested merge. |
| To Batch ID | GH, GH4B, MH, MH4B | Batch at which history record stopped being current or null if the records is still current. |
| Updated On | GD, GDWE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, UM | Update date of a record |
| Updated By | GD, GDWE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, UM | User who updated the record |
| View Type | AE, GD, GDWE, GE, GH, GH4B, GI, GX, MD, MH, MH4B, MI, MX, SA, SA4L, SA4LK, SAWE, SD, SD4L, SD4LK, SDWE, SE, UG, UM | Returns the current view type for the record. e.g. GD, MD… |
The following attributes are deprecated.
Name | Label | Views | Description |
---|---|---|---|
| Branch ID | AE, GD, GDWE, GE, GI, MD, MH4B, MI, SD, SD4L, SD4LK, SDWE, SE | Branch ID to which this record belongs. This attribute is deprecated and always returns |
| From Edition | GD, GDWE, MD, MH4B | Data Edition ID in which this record was created or last updated. This attribute is deprecated and always returns |
| To Edition | GD, GDWE, MD, MH4B | Data Edition ID into which this record was deleted or closed. This attribute is deprecated and always returns null. |
Built-in Lineage Navigation
Parent Navigation
The following table lists the built-in lineage navigation to a parent record.
Pseudo-Role | Navigation | Description | Navigation Path |
---|---|---|---|
| Current Golden Record | Current golden record a golden history record is attached to. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Current Master Record | Current master record a master history record is attached to. | Fuzzy matched entity:
ID matched entity:
|
| Golden Record | Golden record into which the master, source or golden integration record consolidates. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Master Record | Master record corresponding to a source record. | Fuzzy matched entity:
ID matched entity:
|
| Record with Error | Authoring record with error attached to a source authoring error. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Source Authoring Record | Authoring record attached to a source authoring error. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Source Record | Source Record attached to the current error record. | Fuzzy matched entity:
ID matched entity:
|
Child Navigation
The following table lists the built-in lineage navigation to child records.
Pseudo-Role | Navigation | Description | Navigation Path |
---|---|---|---|
| Errors | Errors detected for a given source or golden record. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Golden History Records | Golden History Records | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Integration Master Records | Master integration record attached to the golden integration record. Available in post-consolidation enrichers and validations. | Fuzzy matched entity:
ID matched entity:
|
| Master History Records | Master History Records | Fuzzy matched entity:
ID matched entity:
|
| Master Records | Master Records consolidated in the golden record. | Fuzzy matched entity:
ID matched entity:
|
| Source Authoring Records | Source data authored by users attached to a golden record. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
| Source Records | Source Records attached to the current golden record (for ID matching) or master record. | Basic entity:
Fuzzy matched entity:
ID matched entity:
|
Last updated 2018-06-14 11:52:52 UTC