The Company keywords (company_keywords) filter enables you to discover companies based on relevant keywords found on their website or in text descriptions from other sources.


Keyword search sources

The filtering process involves searching through various Veridion data points, including company business tags, company descriptions, and general HTML meta information.

Business tags

These tags encapsulate the core activities and functions of a company. Our machine learning algorithms analyze the company's website to identify and extract the most relevant business tags, representing the company's operations.

Company description

This attribute is a comprehensive description of the company's activity. Veridion machine learning algorithms generate it by analyzing and synthesizing all available data on the company's website.

Meta information

The meta information represents title tags, meta descriptions, and language metadata extracted by Veridion from the company’s website homepage HTML.


Search modes

Expression match

The Company keywords filter value is modelled as an expression containing a set of instructions.

Syntax of an expression

  1. An expression must include exactly one match block (mandatory), which holds the terms to be matched.
  2. An expression can include at most one exclude block (optional), which holds the terms to be excluded from the search.
  3. Both the match and exclude blocks consist of one or more operator and operands blocks:
    • The operator can take only two values: and and or
    • The operands can be either a list of search terms or a list of operator and operands blocks, which allows nested queries and more complex search use cases.
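
The syntax rules above can be checked locally before a request is sent. The following is a minimal Python sketch, not part of the Veridion API: the function names validate_block and validate_expression are hypothetical, and two behaviours are assumptions rather than documented rules (operands must be a non-empty list, and search terms and nested blocks may be mixed in one list).

```python
VALID_OPERATORS = {"and", "or"}


def validate_block(block):
    """Check one operator/operands block, recursing into nested blocks."""
    if not isinstance(block, dict):
        raise ValueError("a block must be an object")
    if block.get("operator") not in VALID_OPERATORS:
        raise ValueError("operator must be 'and' or 'or'")
    operands = block.get("operands")
    # Assumption: an empty operands list is treated as invalid.
    if not isinstance(operands, list) or not operands:
        raise ValueError("operands must be a non-empty list")
    for operand in operands:
        if isinstance(operand, str):
            continue  # a plain search term
        validate_block(operand)  # a nested operator/operands block


def validate_expression(expression):
    """Check the top-level expression: mandatory match, optional exclude."""
    if "match" not in expression:
        raise ValueError("an expression requires a match block")
    validate_block(expression["match"])
    if "exclude" in expression:
        validate_block(expression["exclude"])
    return True
```

A missing match block or an operator other than and or or raises ValueError; a well-formed expression returns True.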

📘

Note

When operands are grouped by an and rule, every operand is mandatory: all of them must be matched for a company to be returned.

Conversely, with the or rule, a match on any one of the operands is enough for a company to be returned.

For example, the filter below searches for companies that mention keywords related to Sustainability (excluding consulting & agency services):

{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "or",
      "operands": [
        "Sustainability",
        "ESG Compliance",
        "Environmental Social Governance",
        "Environmental & Social",
        "Environmental Stewardship",
        "Environmental Management",
        "Sustainable business"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  }
}
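
If filters are assembled in code rather than written by hand, a few small helpers keep the structure consistent. The sketch below builds the same shape as the filter above using plain Python dictionaries. The helper names any_of, all_of and keyword_filter are hypothetical and not part of any Veridion client library, and the term list is shortened for brevity.

```python
import json


def any_of(*operands):
    """Build an 'or' block: any one operand may match."""
    return {"operator": "or", "operands": list(operands)}


def all_of(*operands):
    """Build an 'and' block: every operand must match."""
    return {"operator": "and", "operands": list(operands)}


def keyword_filter(match, exclude=None):
    """Assemble a company_keywords filter as a plain dict."""
    value = {"match": match}
    if exclude is not None:
        value["exclude"] = exclude
    return {
        "attribute": "company_keywords",
        "relation": "match_expression",
        "value": value,
    }


sustainability_filter = keyword_filter(
    match=any_of("Sustainability", "ESG Compliance", "Sustainable business"),
    exclude=any_of("consulting", "agency"),
)

# Serialize to the JSON form shown in this section.
print(json.dumps(sustainability_filter, indent=2))
```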

Search terms

A search term can be defined as:

  • a single word, like sustainability
  • an exact phrase, like Sustainable business

📘

Note

Please keep in mind that exact phrases are searched exactly as they are provided in the input.

For example, given the exact phrase Sustainable business, the API will identify a text passage like Sustainable business practices.

However, Sustainable practices for your business would not be recognized as a match. To enable this type of matching, please consult the Complex expressions section below.
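
The difference between the two passages can be reproduced with a small local check. This is only an illustrative Python sketch of contiguous phrase matching, not the API's actual matching logic: the function name phrase_matches is hypothetical, and case-insensitive comparison on word boundaries is an assumption that this page does not confirm.

```python
import re


def phrase_matches(phrase, text):
    """Return True if the phrase occurs in text as a contiguous run of words."""
    words = [re.escape(word) for word in phrase.split()]
    # Words must appear in order, separated only by whitespace.
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    # Assumption: matching ignores letter case.
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(phrase_matches("Sustainable business", "Sustainable business practices"))
print(phrase_matches("Sustainable business", "Sustainable practices for your business"))
```

The first call finds the phrase; the second does not, because other words sit between Sustainable and business.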

Grouping search terms

The or operator allows you to specify alternatives to the terms within a group. This is important when you need to optimize for results volume, as different companies might refer to the same concept in slightly different ways. For example, some companies might talk about Sustainability, others might call it ESG compliance, or they might mention just a segment of ESG (e.g. Social responsibility or Environmental responsibility).

👍

Tip

Adding variations to your search terms will also help increase the accuracy of the results.

Due to how the results Sorting Mechanism works, if multiple terms are matched for the same company, that company will be prioritized in the result list as being more relevant to your search.
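
To see why variations can help with ordering, consider a toy ranking that counts how many search terms each company text contains. This Python sketch is purely illustrative: the real Sorting Mechanism is not described on this page, the function names are hypothetical, and the company names and texts are made-up sample data.

```python
def count_matched_terms(terms, text):
    """Count how many terms occur in the text (case-insensitive substring check)."""
    lowered = text.lower()
    return sum(1 for term in terms if term.lower() in lowered)


def rank_by_matched_terms(terms, companies):
    """Sort (name, text) pairs so that texts matching more terms come first."""
    return sorted(
        companies,
        key=lambda company: count_matched_terms(terms, company[1]),
        reverse=True,
    )


terms = ["Sustainability", "ESG compliance", "Environmental responsibility"]
# Made-up sample data, for illustration only.
companies = [
    ("Acme", "We care about sustainability."),
    ("Globex", "Sustainability and ESG compliance guide our environmental responsibility."),
]
ranked = rank_by_matched_terms(terms, companies)
print([name for name, _ in ranked])
```

In this toy model, the company whose text matches three terms is listed ahead of the one matching a single term.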

The and operator is useful when you need to enforce the presence of multiple search terms or search groups.

For example, suppose you need to identify companies that talk about their environmental footprint from an ESG perspective. You could search by just one keyword, such as Environmental, but that might bring back false positives. Instead, qualify your search by adding another keyword, such as ESG compliance, to narrow down the universe of potential matches.

The filter below searches for all companies that mention both ESG compliance and Environmental (excluding consulting & agency services):

{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "and",
      "operands": [
        "ESG compliance",
        "Environmental"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  }
}

Complex expressions

Complex expressions are useful when you need to cover a broader set of criteria for your search or define a more complex concept (one that requires sets of keywords instead of a single word).

For example, the following filter searches for "Veteran/Women/Minority owned IT or software companies":

{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "and",
      "operands": [
        {
          "operator": "or",
          "operands": [
            "veteran owned",
            "women owned",
            "minority owned",
            "service disabled"
          ]
        },
        {
          "operator": "or",
          "operands": [
            "IT",
            "software"
          ]
        }
      ]
    }
  }
}

The filter defines an expression formed of two operands, wrapped in an and operator. This means that a returned match must include at least one search term from each of the operands. For example, veteran owned and IT will result in a successful match. Similarly, minority owned and software will result in a successful match.
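
The and/or semantics of nested blocks can be mirrored with a short recursive evaluator. The Python sketch below runs an expression against a piece of text locally and is meant only to illustrate the logic; it is not how the API is implemented. The function names are hypothetical. Case-insensitive, word-boundary matching is an assumption, as is the reading that a text matching the exclude block is dropped.

```python
import re


def term_matches(term, text):
    """Assumption: a term matches as a contiguous, case-insensitive word sequence."""
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in term.split()) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def block_matches(block, text):
    """Evaluate one operator/operands block, recursing into nested blocks."""
    results = (
        term_matches(operand, text) if isinstance(operand, str)
        else block_matches(operand, text)
        for operand in block["operands"]
    )
    return all(results) if block["operator"] == "and" else any(results)


def expression_matches(expression, text):
    """The match block must hold and the exclude block, if present, must not."""
    if not block_matches(expression["match"], text):
        return False
    exclude = expression.get("exclude")
    return not (exclude and block_matches(exclude, text))


expression = {
    "match": {
        "operator": "and",
        "operands": [
            {"operator": "or", "operands": ["veteran owned", "women owned",
                                            "minority owned", "service disabled"]},
            {"operator": "or", "operands": ["IT", "software"]},
        ],
    }
}

print(expression_matches(expression, "A veteran owned IT services firm"))
print(expression_matches(expression, "A veteran owned bakery"))
```

The first text satisfies both or groups, so the and block holds; the second satisfies only the ownership group and is rejected. Note that under the case-insensitive assumption, a short term such as IT would also match the ordinary word "it".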


Strictness level

📘

Note

By default, the Company keywords filter searches all of the available keyword sources (default strictness: 3).

You can control which fields get searched during filtering by changing the value of the strictness parameter:

  • strictness set to 1: the filter scans only the Business tags. This setting returns the fewest records but yields the highest precision.
  • strictness set to 2: the search covers both the company's Business tags and the Meta information extracted from the company’s website homepage HTML. This setting balances the number of results against their accuracy.
  • strictness set to 3: the search is the least restrictive and returns the highest number of results. While accuracy might decrease slightly towards the end of the results list, this setting searches all three keyword sources: Business tags, Meta information, and Company description. This is the default strictness level when no parameter value is passed.
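
The three levels can be summarised as a lookup from strictness value to the keyword sources searched. The Python sketch below restates the list above in code. The source labels (business_tags, meta_information, company_description) and the function name sources_for are hypothetical names chosen for illustration, not API identifiers.

```python
# Hypothetical labels for the three keyword sources described on this page.
SOURCES_BY_STRICTNESS = {
    1: ("business_tags",),
    2: ("business_tags", "meta_information"),
    3: ("business_tags", "meta_information", "company_description"),
}
DEFAULT_STRICTNESS = 3  # used when no strictness value is passed


def sources_for(strictness=None):
    """Return the keyword sources searched at a given strictness level."""
    level = DEFAULT_STRICTNESS if strictness is None else strictness
    if level not in SOURCES_BY_STRICTNESS:
        raise ValueError("strictness must be 1, 2, or 3")
    return SOURCES_BY_STRICTNESS[level]


for level in (1, 2, 3):
    print(level, sources_for(level))
```

Each higher level adds one source to the previous one, which is why the result count grows as strictness increases.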

For example, the filter below searches for "sustainability"-like keywords present in the Business tags of a company:

{
  "attribute": "company_keywords",
  "relation": "match_expression",
  "value": {
    "match": {
      "operator": "or",
      "operands": [
        "Sustainability",
        "ESG Compliance",
        "Environmental Social Governance",
        "Environmental & Social",
        "Environmental Stewardship",
        "Environmental Management",
        "Sustainable business"
      ]
    },
    "exclude": {
      "operator": "or",
      "operands": [
        "consulting",
        "agency"
      ]
    }
  },
  "strictness": 1
}

📖

Reference

To view all available filter relations and accepted types, please check the Filter Relations section.