Overview
During implementation, we encountered an issue where string values were not matching correctly due to inconsistencies in how they were stored and compared. Specifically:
- Some values contained HTML entities like &lt; and &gt; instead of < and >.
- Some values included HTML line breaks (<br>), carriage returns, or multiple spaces.
- As a result, direct string comparisons were failing, even though the actual content was intended to be the same.
To ensure reliable comparisons, we implemented a string normalization function.
Root Cause
- HTML Encoding: Certain values were stored with encoded symbols (&lt; vs <).
- Line Break Variations: Line breaks were represented as <br> tags or \r\n.
- Whitespace Inconsistencies: Multiple consecutive spaces were sometimes present.
These variations meant that two logically identical strings could appear different to the system.
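To make the problem concrete, here is a minimal sketch of the failure. The two sample values are hypothetical (the actual stored data is not shown here); they only illustrate how the three variations listed above break a direct comparison.

```javascript
// Hypothetical sample values, chosen to show all three variations at once.
const stored = "a &lt; b<br>c  d";   // encoded entity, <br> tag, double space
const expected = "a < b\r\nc d";     // literal symbol, CRLF line break, single space

// A direct comparison treats them as different strings.
const directMatch = stored === expected;
console.log(directMatch); // false
```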
Solution – Normalization Function
We introduced a normalize() function that cleans and standardizes strings before comparison.
function normalize(str) {
  // Treat empty and non-string input as an empty string
  if (!str || typeof str !== 'string') return '';
  // Decode HTML entities for < and >
  str = str.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
  return str
    // Replace <br>, <br/> and <br /> with a space
    .replace(/<br\s*\/?>/gi, ' ')
    // Replace newlines (\r, \n) with a space
    .replace(/[\r\n]+/g, ' ')
    // Collapse runs of whitespace into a single space
    .replace(/\s+/g, ' ')
    // Remove leading/trailing whitespace
    .trim();
}
How It Works
- Decode Entities – Converts &lt; → <, &gt; → >.
- Unify Line Breaks – Replaces <br> tags and \r\n with a single space.
- Standardize Whitespace – Collapses multiple spaces into one.
- Trim – Removes leading and trailing spaces.
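The four steps above can be traced one at a time. The sketch below applies each replacement separately to a hypothetical input so the intermediate values are visible; the variable names are illustrative and not part of the original function.

```javascript
// Hypothetical input that exercises every step.
const raw = "  a &lt; b<br/>c\r\n\r\nd   e  ";

// 1. Decode entities
const decoded = raw.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
// "  a < b<br/>c\r\n\r\nd   e  "

// 2. Unify line breaks: <br> tags first, then \r and \n
const noBreaks = decoded.replace(/<br\s*\/?>/gi, ' ');
// "  a < b c\r\n\r\nd   e  "
const noNewlines = noBreaks.replace(/[\r\n]+/g, ' ');
// "  a < b c d   e  "

// 3. Standardize whitespace
const collapsed = noNewlines.replace(/\s+/g, ' ');
// " a < b c d e "

// 4. Trim
const trimmed = collapsed.trim();
console.log(trimmed); // "a < b c d e"
```

One consequence of this ordering: because entities are decoded before <br> tags are replaced, an encoded &lt;br&gt; is also treated as a line break.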
Example
let input1 = "Hello<br>World";
let input2 = "Hello <br> World";
normalize(input1); // "Hello World"
normalize(input2); // "Hello World"
// Safe comparison
if (normalize(input1) === normalize(input2)) {
  console.log("Strings are equal");
}
Result: Both inputs normalize to "Hello World", ensuring consistent comparison.
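The example above only covers <br> and spacing. As a rough sketch, the same comparison can be wrapped in a small helper and tried against encoded entities and \r\n as well. The helper name equalNormalized and the sample strings are assumptions made for illustration; normalize() is repeated so the snippet runs on its own.

```javascript
function normalize(str) {
  if (!str || typeof str !== 'string') return '';
  str = str.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
  return str
    .replace(/<br\s*\/?>/gi, ' ')
    .replace(/[\r\n]+/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Hypothetical helper: compares two values after normalizing both.
function equalNormalized(a, b) {
  return normalize(a) === normalize(b);
}

// Encoded entity and <br /> vs. literal symbol and CRLF
console.log(equalNormalized("x &lt; y<br />z", "x < y\r\nz")); // true

// A line break becomes a space, so it still separates words
console.log(equalNormalized("Hello<br>World", "HelloWorld")); // false

// Non-string and empty inputs all normalize to ''
console.log(equalNormalized(null, "")); // true
```

The last case is worth keeping in mind: since normalize() returns an empty string for null, undefined, and non-string values, those inputs compare as equal to each other and to an empty string.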