Matching methods

Last published at: 2022-07-11 12:37:36 UTC

The matching methods are available in the Advanced and Premium editions. The free edition supports 'Exact' matching only.

A matching method is a (fuzzy) algorithm that we use to analyze field values. Because fields hold different types of values (e.g. numeric vs. text), it is important to apply a matching method that can analyze that value type. The matching method also determines whether exact or fuzzy logic (e.g. detecting spelling mistakes) is applied.

Exact

This matching method was formerly labeled as 'equal ignore case.' When using 'Exact' as a matching method, only exact matches will generate a score. The casing is not taken into account.

| Field A | Field B | Score |
| --- | --- | --- |
| Addressed wORLD | Addressed World | 100% |
| Addressed World | AddressedWorld | 0% |

Exact (Random Order)

When applying the Exact (Random Order) matching method, values that have the same characters, but potentially in a different order, score 100%. White spaces are taken into account.

| Field A | Field B | Score |
| --- | --- | --- |
| John Johnsen | Johnsen John | 100% |
| Johnjohnsen | John Johnsen | 0% |
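
The example values above suggest that the comparison works on whole words separated by white space. Under that assumption, a rough sketch is to sort the words of each value before comparing them. The function name `exact_random_order_score` is hypothetical, and ignoring case is also an assumption.

```python
def exact_random_order_score(a: str, b: str) -> int:
    """Return 100 when both values contain the same words in any order."""
    # Splitting on white space keeps word boundaries significant,
    # so "Johnjohnsen" is one word and "John Johnsen" is two.
    words_a = sorted(a.casefold().split())
    words_b = sorted(b.casefold().split())
    return 100 if words_a == words_b else 0


print(exact_random_order_score("John Johnsen", "Johnsen John"))
print(exact_random_order_score("Johnjohnsen", "John Johnsen"))
```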

Partial Exact

This matching method was formerly labeled as 'Company.' The Partial Exact matching method will flag a value as a duplicate when a single word in one string is found in the other string. This might give results that are too loose. If so, we recommend changing the Partial Exact matching method to 'Fuzzy' or 'Partial Fuzzy Heavy.'

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | Mc Donalds | 100% |
| Mc Donalds Ireland | Mc Donalds Spain | 100% |
| Mc Dnalds | Mac Donalds | 0% |
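
To illustrate why this method can be loose, here is a minimal sketch that scores 100 as soon as the two values share one word. It is an approximation of the description above, not the product's code; the function name `partial_exact_score` and the case-insensitive comparison are assumptions.

```python
def partial_exact_score(a: str, b: str) -> int:
    """Return 100 when at least one word occurs in both values."""
    words_a = set(a.casefold().split())
    words_b = set(b.casefold().split())
    # A non-empty intersection means a shared word was found.
    return 100 if words_a & words_b else 0


print(partial_exact_score("Mc Donalds Ireland", "Mc Donalds Spain"))
print(partial_exact_score("Mc Dnalds", "Mac Donalds"))
```

Because a single shared word is enough, values that share only a common word would also be flagged, which is the looseness mentioned above.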

Partial Fuzzy Light

This matching method was formerly labeled as 'Fuzzy Company.' The Partial Fuzzy Light matching method will flag a value as a duplicate when a word or words in one string are found in the other string but also takes into account spelling mistakes or different formats.

This might give results that are too loose. If so, we recommend changing the Partial Fuzzy Light matching method to 'Fuzzy' or 'Partial Fuzzy Heavy'.

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | Mac Donalds | 92% |
| Mc Donalds Ireland | Mc Donalds Spain | 84% |
| Mc Dnalds | Mac Donalds | 80% |

Partial Fuzzy Medium

This matching method was formerly labeled as 'Fuzzy Company Strict'. Partial Fuzzy Medium is an algorithm that combines Partial Fuzzy Light and Fuzzy.

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | Mac Donalds | 92% |
| Mc Donalds Ireland | Mc Donalds Spain | 84% |
| Mc Dnalds | Mac Donalds | 80% |

Partial Fuzzy Heavy or Company Name

These matching methods were formerly labeled as 'Fuzzy Company Extra Strict.' Partial Fuzzy Heavy is an algorithm that analyses the full value and generates a high score if there is a bigger partial match, taking into account spelling errors and different formats. Company Name does the same, but is optimized for matching company names.

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | Mac Donalds | 92% |
| Mc Donalds Ireland | Mc Donalds Spain | 78% |
| Mc Dnalds | Mac Donalds | 80% |

Person Name or Fuzzy

These matching methods were formerly labeled as 'Fuzzy Person.' Spelling errors, typing errors and sequence differences are taken into account. This matching method can also be very useful for comparing company names.

| Field A | Field B | Score |
| --- | --- | --- |
| Sten Ebenau | Sten Ebenau | 91% |
| Pyramid Construction PLC | Construct Pyramid PLC | 93% |
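
The product's fuzzy algorithm is not documented here. As a generic illustration of how sequence differences and typing errors can both be tolerated, the sketch below sorts the words of each value and then compares the results with Python's standard `difflib.SequenceMatcher`. This is a stand-in technique, and its scores will not necessarily equal the scores the product reports; the function name `token_sort_score` is hypothetical.

```python
from difflib import SequenceMatcher


def token_sort_score(a: str, b: str) -> float:
    """Similarity from 0 to 100 that ignores word order."""
    # Sorting the words removes sequence differences before comparing.
    norm_a = " ".join(sorted(a.casefold().split()))
    norm_b = " ".join(sorted(b.casefold().split()))
    # ratio() returns a value between 0.0 and 1.0 based on matching blocks.
    return round(100 * SequenceMatcher(None, norm_a, norm_b).ratio(), 1)


print(token_sort_score("Sten Ebenau", "Ebenau Sten"))
print(token_sort_score("Pyramid Construction PLC", "Construct Pyramid PLC"))
```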

Large Text

This matching method is specially designed for large texts such as emails or documents. It determines the degree of similarity between those texts.

| Field A | Field B | Score |
| --- | --- | --- |
| Ut quis risus orci. Integer in nisl eu massa rutrum dapibus non ut sem. Duis nec erat placerat, efficitur ligula sit amet, efficitur velit. Etiam vestibulum tortor et tempus viverra. Suspendisse consequat nibh nec justo efficitur, ut molestie quam lobortis. Nulla viverra ligula eu purus faucibus sollicitudin. Duis dignissim lorem eget sem dictum, at tincidunt mi faucibus. Etiam pellentesque, ante nec vestibulum posuere, ex lorem pellentesque leo, eu auctor lorem mi id velit. Proin id libero non purus egestas gravida. Aliquam erat volutpat. Etiam feugiat est et tellus tristique, ut pellentesque magna semper. | Ut quis risus orci. Integer in nisl eu massa rutrum dapibus non ut sem. Duis nec erat placerat, efficitur ligula sit amet, efficitur velit. Etiam vestibulum tortor et tempus viverra. Suspendisse consequat nibh nec justo efficitur, ut molestie quam lobortis. Nulla viverra ligula eu purus faucibus sollicitudin. Duis dignissim lorem eget sem dictum, at tincidunt mi faucibus. Etiam pellentesque, ante nec vestibulum posuere, ex lorem pellentesque leo, eu auctor lorem mi id velit. Proin id libero non purus egestas gravida. Etiam feugiat est et tellus tristique, ut pellentesque magna semper. | 93% |

Equal to Null

The Equal to Null matching method scores when the matched field is empty (Null). 

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | (empty) | 100% |
| Mc Donalds | Mc Donalds | 0% |

Not Equal

The Not Equal matching method scores when the values of the matched field are not equal.

| Field A | Field B | Score |
| --- | --- | --- |
| Mc Donalds | (empty) | 100% |
| Mc Donalds | Mc Donalds | 0% |

Email Address

This matching method compares email addresses and scores the addressee part (before the @) differently from the domain name part.

| Field A | Field B | Score |
| --- | --- | --- |
| sten.ebenau@ebenau.org | sten.ebenau@plauti.org | 64% |
| sten@ebenau.nl | sten@ebenau.com | 90% |

Domain

The Domain matching method can find whether a URL or email address has the same domain name. This matching method can be used to find records that might be related to the same company, or to find duplicate website URLs, even if they are written in a different format.

| Field A | Field B | Score |
| --- | --- | --- |
| www.duplicatecheck.com | http://www.duplicatecheck.com/tour | 100% |
| ruben.vandekamp@plauti.com | sten.ebenau@plauti.com | 100% |
| ruben.vandekamp@plauti.com | ruben.vandekamp@duplicatecheck.com | 0% |
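
One way to picture this method is as a two-step process: reduce each value to its domain, then compare the domains exactly. The sketch below does that with Python's standard `urllib.parse.urlsplit`. The function names `extract_domain` and `domain_score` are hypothetical, and dropping a leading `www.` is an assumption based on the example values.

```python
from urllib.parse import urlsplit


def extract_domain(value: str) -> str:
    """Reduce an email address or URL to its domain name."""
    value = value.strip()
    if "@" in value and "//" not in value:
        # Email address: the domain is everything after the last "@".
        return value.rsplit("@", 1)[1].lower()
    # urlsplit only recognises the host when a scheme or "//" is present.
    if "//" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""  # hostname is already lower case
    return host[4:] if host.startswith("www.") else host


def domain_score(a: str, b: str) -> int:
    domain_a, domain_b = extract_domain(a), extract_domain(b)
    return 100 if domain_a and domain_a == domain_b else 0


print(domain_score("www.duplicatecheck.com", "http://www.duplicatecheck.com/tour"))
print(domain_score("ruben.vandekamp@plauti.com", "sten.ebenau@plauti.com"))
```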

Domain Fuzzy

The Domain Fuzzy matching method can find if a URL or email address has a similar domain name. The matching method strips the email or URL to only leave the domain and then calculates a fuzzy match. 

| Field A | Field B | Score |
| --- | --- | --- |
| www.duplicatecheck.com | http://www.duplicatecheck.com/tour | 100% |
| ruben.vandekamp@plauti.com | sten.ebenau@plautie.com | 90% |
| ruben.vandekamp@plauti.com | ruben.vandekamp@duplicatecheck.com | 0% |

URL

The URL matching method can find duplicate URLs, even if they are written in a different format. Unlike the Domain matching method, this matching method also takes subdomains and the URI into account.

| Field A | Field B | Score |
| --- | --- | --- |
| www.duplicatecheck.com | http://www.duplicatecheck.com | 100% |
| duplicatecheck.com/tour | www.duplicatecheck.com/tour | 100% |
| duplicatecheck.com | http://duplicatecheck.com/tour | 0% |
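
A rough sketch of this kind of comparison is to normalize both URLs to "host plus path" and compare the results exactly. The function names `normalize_url` and `url_score` are hypothetical. Ignoring the scheme and a leading `www.` is inferred from the example values; how the product treats other subdomains, query strings or trailing slashes is not stated, so the choices below are assumptions.

```python
from urllib.parse import urlsplit


def normalize_url(value: str) -> str:
    """Reduce a URL to host + path, ignoring the scheme and a leading 'www.'."""
    value = value.strip()
    # urlsplit only recognises the host when a scheme or "//" is present.
    if "//" not in value:
        value = "//" + value
    parts = urlsplit(value)
    host = parts.hostname or ""  # hostname is already lower case
    if host.startswith("www."):
        host = host[4:]
    # Assumption: a trailing slash does not make two URLs different.
    return host + parts.path.rstrip("/")


def url_score(a: str, b: str) -> int:
    return 100 if normalize_url(a) == normalize_url(b) else 0


print(url_score("duplicatecheck.com/tour", "www.duplicatecheck.com/tour"))
print(url_score("duplicatecheck.com", "http://duplicatecheck.com/tour"))
```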

URL Fuzzy 

The URL Fuzzy matching method can find similar URLs, even if they are written in a different format. Like the URL matching method, it takes subdomains and the URI into account, but it calculates a fuzzy score instead of requiring an exact match.

| Field A | Field B | Score |
| --- | --- | --- |
| www.duplicatecheck.com | http://www.duplicatecheck.com | 100% |
| duplicatecheck.com/tour | www.duplicatecheck.com/tour/new | 91% |
| duplicatecheck.com | plauti.com | 0% |

Product Number

This matching method is specially designed for numbers and works best for serial numbers.

| Field A | Field B | Score |
| --- | --- | --- |
| 0345-2434 | 345-2434 | 100% |
| 0345-2434 | 03452434 | 100% |
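
The two examples above are consistent with a normalization that drops separators and leading zeros before comparing. The sketch below shows that idea only; whether the product also produces partial scores is not stated here. The function names `normalize_product_number` and `product_number_score` are hypothetical.

```python
import re


def normalize_product_number(value: str) -> str:
    """Strip separators and leading zeros from a product or serial number."""
    # Keep only digits and letters, so "0345-2434" becomes "03452434".
    compact = re.sub(r"[^0-9A-Za-z]", "", value)
    # Leading zeros are treated as insignificant: "03452434" -> "3452434".
    return compact.lstrip("0").upper()


def product_number_score(a: str, b: str) -> int:
    return 100 if normalize_product_number(a) == normalize_product_number(b) else 0


print(product_number_score("0345-2434", "345-2434"))
print(product_number_score("0345-2434", "03452434"))
```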

Phone Number

This matching method is specially designed for comparing phone numbers. It removes all special characters and puts the most weight on the last four digits.

| Field A | Field B | Score |
| --- | --- | --- |
| +316123456 | 06123456 | 81% |
| 234-235-5678 | 2342355678 | 100% |

Phone Number Advanced

The Phone Number Advanced matching method can score 100% on phone numbers that are written in a different format or standardization. It doesn't matter whether your phone number is in international format, national format, E.164 or RFC 3966. If a phone number has an extension, the extension is not taken into account.

| Field A | Field B | Score |
| --- | --- | --- |
| (415) 555-2761 | +1 415 555-2761 | 100% |
| +1 415 555-2761 | (415) 555-2761 ext. 5 | 100% |
| (415) 555-2761 ext. 5 | +1 415 555 2761, ext. 8 | 100% |
| +1 415 555 2761, ext. 8 | +1 415 555 2761, extension 8 | 100% |
| +1 415 555 2761 | +1 415 555 2762 | 0% |
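
To show the general idea of format-independent comparison, here is a deliberately simplified sketch that only covers the formats in the examples above: it drops a trailing extension, keeps the digits, and prepends a default country code to national numbers. It does not parse RFC 3966 `tel:` URIs. The names `normalize_phone` and `phone_advanced_score` and the default country code "1" are assumptions for illustration.

```python
import re

# Matches a trailing extension such as " ext. 5", ", ext. 8" or ", extension 8".
_EXTENSION = re.compile(r"[,;]?\s*(?:extension|ext\.?)\s*\d+\s*$", re.IGNORECASE)


def normalize_phone(value: str, default_country_code: str = "1") -> str:
    """Return the digits of a phone number including a country code."""
    without_ext = _EXTENSION.sub("", value.strip())
    is_international = without_ext.startswith("+")
    digits = re.sub(r"\D", "", without_ext)
    # National numbers get the assumed default country code prepended.
    return digits if is_international else default_country_code + digits


def phone_advanced_score(a: str, b: str) -> int:
    return 100 if normalize_phone(a) == normalize_phone(b) else 0


print(phone_advanced_score("(415) 555-2761", "+1 415 555-2761"))
print(phone_advanced_score("+1 415 555 2761", "+1 415 555 2762"))
```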

Contains Value

Enter a custom value at 'match config'. When the value of a matched record matches your custom value, it scores 100%. This matching method only looks at matched records; the source record can have any value and is not evaluated by the Contains Value matching method.

Please note that this matching method is not designed to function as a filter. If you want to use a filter, please apply a Job Filter.

Date Distance in Days

This matching method can only be used on date fields. When two dates are within the provided range of days (for example, '4' means four days), the field scores 100%. This is useful, for example, for finding opportunities that have a similar close date.
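
The rule can be sketched with Python's standard `datetime.date`. The function name `date_distance_score` and the parameter `max_days` are hypothetical, and treating the range as inclusive (a difference of exactly four days still scores when the value is 4) is an assumption.

```python
from datetime import date


def date_distance_score(a: date, b: date, max_days: int) -> int:
    """Return 100 when the two dates are at most max_days apart, otherwise 0."""
    # Subtracting two dates gives a timedelta; abs() makes the order irrelevant.
    return 100 if abs((a - b).days) <= max_days else 0


print(date_distance_score(date(2022, 7, 11), date(2022, 7, 15), max_days=4))
print(date_distance_score(date(2022, 7, 11), date(2022, 7, 16), max_days=4))
```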

Learn how to set up and configure a scenario by watching this video tutorial.
Learn how to find and merge duplicates by watching this video tutorial.

When you are using the Fuzzy Matching Method or another advanced matching method, a Search Index must be enabled and created for the Object.