Extract Rich Text Field Values without HTML Tags using Saved Search Formula

Workaround: Using Formula (Text) Fields

To address this, you can utilize a Formula (Text) field in your Saved Search to strip the HTML tags from the exported data. This ensures that only the text values are extracted.

Edit your Saved Search

On the Results tab, add a Formula (Text) field with this formula:

REGEXP_REPLACE({fieldid},'<[^<>]*>’,”)

where {fieldid} is the ID of the Rich Text field you need

Click Save

When you export the results of the Search to either CSV or Excel, notice that only the text value remains and no HTML tags are included.

Formula Explained

  • REGEXP_REPLACE: This is a SQL function that performs regular expression-based replacements on text. It is used to find and replace patterns within a text string.
  • {fieldid}: This is a placeholder for the internal ID of the Rich Text field you want to modify.
  • <[^<>]*>: This is a regular expression pattern that matches any HTML tag.< and > are used to match the opening and closing brackets of an HTML tag.
  • [^<>]* is a pattern that matches any character except < or > (i.e., it captures everything inside the tags).
  • Together, this pattern identifies and captures the entire HTML tag, allowing the REGEXP_REPLACE function to remove it.
  • ” (Empty String): This is the replacement value. In this case, it’s an empty string, which means that the matched HTML tags will be replaced with nothing, effectively removing them.

When you use this formula in a Formula (Text) field, it processes the text in the specified Rich Text field ({fieldid}). The REGEXP_REPLACE function scans the text for any HTML tags (<[^<>]*>) and replaces them with an empty string, thus stripping the HTML formatting. The result is a clean text value without the HTML markup.

Leave a comment

Your email address will not be published. Required fields are marked *