The Company keywords (company_keywords) filter enables you to discover companies based on relevant keywords found on their website or in text descriptions from other sources.
Keyword search sources
The filtering process involves searching through various Veridion data points, including company business tags, company descriptions, and general HTML meta information.
Business tags
These tags encapsulate the core activities and functions of a company. Our machine learning algorithms analyze the company's website to identify and extract the most relevant business tags, representing the company's operations.
Company description
This attribute is a comprehensive description of the company's activity. Veridion machine learning algorithms generate it by analyzing and synthesising all available data on the company's website.
Meta information
The meta information consists of the title tags, meta descriptions, and language metadata that Veridion extracts from the HTML of the company’s website homepage.
Search modes
Expression match
The Company keywords filter value is modelled as an expression containing a set of instructions.
Syntax of an expression
- An expression must include exactly one "match" block (terms to be matched). This block is mandatory.
- An expression can include at most one "exclude" block (terms to be excluded from the search). This block is optional.
- In turn, both "match" and "exclude" blocks consist of one or more "operator" and "operands" blocks:
  - The "operator" can only take two values: "and" and "or".
  - The "operands" can either be a list of search terms or a list of "operator" and "operands" blocks, allowing for nested queries and more complex search use cases.
Note
When operands are grouped by an "and" rule, every single operand is mandatory, meaning all of them must be matched for a response to be returned. Conversely, with the "or" rule, any of the operands can be matched for a response to be returned.
For example, the filter below searches for companies that mention keywords related to Sustainability (excluding consulting & agency services):
{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "or",
      "operands": [
        "Sustainability",
        "ESG Compliance",
        "Environmental Social Governance",
        "Environmental & Social",
        "Environmental Stewardship",
        "Environmental Management",
        "Sustainable business"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  }
}
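The syntax rules above can be checked locally before a filter is sent. The following is a minimal sketch, not part of the Veridion API: the function names validate_block and validate_expression are hypothetical, and the requirement that "operands" be a non-empty list is an assumption, since the rules do not state how empty lists are handled.

```python
from typing import Any, List

VALID_OPERATORS = {"and", "or"}


def validate_block(block: Any, path: str) -> List[str]:
    """Return the problems found in one operator/operands block."""
    if not isinstance(block, dict):
        return [f"{path}: expected an object with 'operator' and 'operands'"]
    problems: List[str] = []
    if block.get("operator") not in VALID_OPERATORS:
        problems.append(f"{path}.operator: must be 'and' or 'or'")
    operands = block.get("operands")
    # Assumption: an empty operands list is treated as invalid.
    if not isinstance(operands, list) or not operands:
        problems.append(f"{path}.operands: must be a non-empty list")
        return problems
    for index, operand in enumerate(operands):
        if isinstance(operand, str):
            continue  # a plain search term
        # Anything else must be a nested operator/operands block.
        problems.extend(validate_block(operand, f"{path}.operands[{index}]"))
    return problems


def validate_expression(value: Any) -> List[str]:
    """Check the 'value' of a match_expression filter against the syntax rules."""
    if not isinstance(value, dict) or "match" not in value:
        return ["value: a 'match' block is mandatory"]
    problems = validate_block(value["match"], "match")
    if "exclude" in value:  # the exclude block is optional
        problems.extend(validate_block(value["exclude"], "exclude"))
    return problems
```

An empty list means the expression follows the rules as described here. The API may apply additional checks that this sketch does not cover.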
Search terms
A search term can be defined as:
- a single word, like "sustainability"
- an exact phrase, like "Sustainable business"
Note
Please keep in mind that exact phrases are searched exactly as they are provided in the input.
For example, given the exact phrase "Sustainable business", the API will identify a text passage like "Sustainable business practices". However, "Sustainable practices for your business" would not be recognized as a match. To enable this type of matching, please consult the Complex expressions section below.
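The exact-phrase behaviour described in the note can be illustrated locally. This is a rough sketch under stated assumptions, not the API's actual matching logic: it assumes case-insensitive, word-level comparison, and the helper names tokenize and contains_phrase are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(text: str, phrase: str) -> bool:
    """True if the words of the phrase appear in the text contiguously and in order."""
    words, target = tokenize(text), tokenize(phrase)
    if not target:
        return False
    size = len(target)
    # Slide a window of the phrase length over the text and compare word by word.
    return any(words[i:i + size] == target for i in range(len(words) - size + 1))


# The two passages from the note above:
print(contains_phrase("Sustainable business practices", "Sustainable business"))
print(contains_phrase("Sustainable practices for your business", "Sustainable business"))
```

The first call reports a match because the two words are adjacent; the second does not, because other words separate them.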
Grouping search terms
The "or" operator allows you to specify alternatives to the terms within a group. This is important when you need to optimize for results volume, as different companies might refer to the same concept in a slightly different way. For example, some companies might talk about "Sustainability", others might call it "ESG compliance", or they might mention just a segment of ESG (e.g. "Social responsibility" or "Environmental responsibility").
Tip
Adding variations to your search terms will also help increase the accuracy of the results.
Due to how the results Sorting Mechanism works, if multiple terms are matched for the same company, that company will be prioritized in the result list as being more relevant to your search.
The "and" operator is useful when you need to enforce the presence of multiple search terms or search groups.
For example, you need to identify companies that talk about their environmental footprint from an ESG perspective. You can search by just one keyword, such as "Environmental", but that might bring back false positives. Instead, you will want to qualify your search by adding another keyword such as "ESG compliance" to narrow down the universe of potential matches.
The filter below searches for all companies that mention both "ESG compliance" and "Environmental" (excluding consulting & agency services):
{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "and",
      "operands": [
        "ESG compliance",
        "Environmental"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  }
}
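Filters like the one above can also be assembled programmatically, which helps when keyword groups are generated from lists. The sketch below is one possible approach; the helper names group and keywords_filter are hypothetical and not part of any Veridion client library. Only the field names ("attribute", "relation", "value", "match", "exclude", "operator", "operands") come from the examples in this section.

```python
import json
from typing import Any, Dict, Optional, Union

Operand = Union[str, Dict[str, Any]]


def group(operator: str, *operands: Operand) -> Dict[str, Any]:
    """Build one operator/operands block from search terms or nested blocks."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    return {"operator": operator, "operands": list(operands)}


def keywords_filter(match: Dict[str, Any],
                    exclude: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Wrap a match block (and an optional exclude block) in a company_keywords filter."""
    value: Dict[str, Any] = {"match": match}
    if exclude is not None:
        value["exclude"] = exclude
    return {
        "attribute": "company_keywords",
        "relation": "match_expression",
        "value": value,
    }


# Rebuild the filter shown above: both terms required, two terms excluded.
esg_filter = keywords_filter(
    match=group("and", "ESG compliance", "Environmental"),
    exclude=group("or", "consulting", "agency"),
)
print(json.dumps(esg_filter, indent=2))
```

Because group accepts nested blocks as operands, the same helpers can be used to build nested expressions.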
Complex expressions
Complex expressions are useful when you need to cover a broader set of criteria for your search or define a more complex concept (for which you would need to use sets of keywords instead of a single word).
For example, the following filter searches for "Veteran/Women/Minority owned IT or software companies":
{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "and",
      "operands": [
        {
          "operator": "or",
          "operands": [
            "veteran owned",
            "women owned",
            "minority owned",
            "service disabled"
          ]
        },
        {
          "operator": "or",
          "operands": [
            "IT",
            "software"
          ]
        }
      ]
    }
  }
}
The filter defines an expression formed of two operands, wrapped in an "and" operator. This means that a returned match must include at least one search term defined for each of the operands. For example, "veteran owned" and "IT" result in a successful match. Similarly, "minority owned" and "software" will result in a successful match.
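To make the nested evaluation concrete, the sketch below evaluates an expression against a piece of text on the client side. It only illustrates the "and" / "or" / "exclude" semantics described in this section and is not how the API is implemented. It assumes case-insensitive, word-level phrase matching and assumes that any text matching the "exclude" block is dropped. The function names are hypothetical.

```python
import re
from typing import Any, Dict, List, Union

Operand = Union[str, Dict[str, Any]]


def _words(text: str) -> List[str]:
    """Lowercase word tokens; punctuation such as hyphens is dropped."""
    return re.findall(r"\w+", text.lower())


def term_matches(term: str, text: str) -> bool:
    """Assumption: a term matches when its words appear contiguously in the text."""
    words, target = _words(text), _words(term)
    size = len(target)
    return size > 0 and any(
        words[i:i + size] == target for i in range(len(words) - size + 1)
    )


def evaluate(operand: Operand, text: str) -> bool:
    """Recursively evaluate a search term or an operator/operands block."""
    if isinstance(operand, str):
        return term_matches(operand, text)
    results = (evaluate(child, text) for child in operand["operands"])
    # "and": every operand must match; "or": any operand may match.
    return all(results) if operand["operator"] == "and" else any(results)


def expression_matches(value: Dict[str, Any], text: str) -> bool:
    """Apply the match block, then reject text that satisfies the exclude block."""
    if not evaluate(value["match"], text):
        return False
    exclude = value.get("exclude")
    return exclude is None or not evaluate(exclude, text)
```

With the nested filter above, a text such as "A veteran owned IT services firm" satisfies both groups, while "Women owned bakery" satisfies only the first and is rejected. Note that the case-insensitive assumption would also treat the pronoun "it" as the term "IT", which is one reason this sketch should not be read as the API's real behaviour.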
Strictness level
Note
By default, the Company keywords filter searches all of the available Keyword sources (default strictness: 3).
You can control which fields get searched during filtering by changing the value of the strictness parameter:
- strictness set to 1: the filter scans only the available Business tags. This setting yields the highest precision but significantly reduces the number of records returned.
- strictness set to 2: the search encompasses both the company's Business tags and the Meta information extracted from the company’s website homepage HTML. This setting strikes a balance between the number of results and the level of accuracy.
- strictness set to 3: the search is the least restrictive and generates the highest number of results. While the accuracy might slightly decrease towards the end of the results list, this setting offers a comprehensive search through all three keyword sources: Business tags, Meta information, and Company description. This is the default strictness level when no specific parameter value is passed.
For example, the filter below searches for "sustainability"-like keywords present in the Business tags of a company:
{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "or",
      "operands": [
        "Sustainability",
        "ESG Compliance",
        "Environmental Social Governance",
        "Environmental & Social",
        "Environmental Stewardship",
        "Environmental Management",
        "Sustainable business"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  },
  "strictness": 1
}
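The relationship between strictness levels and keyword sources can be summarised in a small lookup. The sketch below is illustrative only: the source labels are plain names taken from this section rather than API identifiers, and with_strictness and sources_searched are hypothetical helpers. As in the example above, "strictness" is placed at the top level of the filter, next to "value".

```python
from typing import Any, Dict, Optional, Tuple

# Descriptive labels for the three keyword sources; not identifiers used by the API.
SOURCES_BY_STRICTNESS: Dict[int, Tuple[str, ...]] = {
    1: ("Business tags",),
    2: ("Business tags", "Meta information"),
    3: ("Business tags", "Meta information", "Company description"),
}
DEFAULT_STRICTNESS = 3  # applies when no strictness value is passed


def with_strictness(filter_obj: Dict[str, Any],
                    strictness: Optional[int] = None) -> Dict[str, Any]:
    """Return a copy of the filter with the given strictness, or none to use the default."""
    if strictness is not None and strictness not in SOURCES_BY_STRICTNESS:
        raise ValueError("strictness must be 1, 2, or 3")
    updated = {key: val for key, val in filter_obj.items() if key != "strictness"}
    if strictness is not None:
        updated["strictness"] = strictness
    return updated


def sources_searched(filter_obj: Dict[str, Any]) -> Tuple[str, ...]:
    """List the keyword sources a filter would search, per the levels described above."""
    return SOURCES_BY_STRICTNESS[filter_obj.get("strictness", DEFAULT_STRICTNESS)]
```

For instance, calling with_strictness(some_filter, 1) returns a copy restricted to Business tags, while omitting the second argument leaves the default of 3 in effect.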
Reference
To view all available filter relations and accepted types, please check the Filter Relations section.