This guide explains how to add filters to existing declarations to remove meaningless content that cannot be removed with CSS selectors, to prevent noise in the versions.
remove
property.Built-in filters are pre-defined functions that handle common noise patterns. They are the easiest way to clean up content.
Review the available built-in filters to find if one matches your needs.
If you find a suitable built-in filter, proceed to Step 3, otherwise you will need to create a custom filter.
If no built-in filter matches your needs, you will need to create a custom filter. This requires JavaScript knowledge and familiarity with DOM manipulation.
Create a JavaScript file in the same folder and with the same name as your service declaration, but with .filters.js
extension.
For example, if your declaration is
declarations/MyService.json
, createdeclarations/MyService.filters.js
.
Define your filter function with the following signature:
export function myCustomFilter(document, [parameters]) {
// Your filter logic here
}
document
: JSDOM document instance representing the web pageparameters
: values passed from the declaration (optional)For example, let’s say you want to remove session IDs from text content:
<p>We collect your data for the following purposes:</p>
<ul>
<li>To provide our services</li>
<li>To improve user experience</li>
</ul>
<p class="session-id">Last updated on 2023-12-07 (Session: abc123def456)</p>
You can implement this filter as follows:
export function removeSessionIds(document) {
// Find all paragraphs that might contain session IDs
const paragraphs = document.querySelectorAll('.session-id');
paragraphs.forEach(paragraph => {
let text = paragraph.textContent;
// Remove session ID patterns like "Session: abc123" or "(Session: def456)"
text = text.replace(/\s*\(?Session:\s*[a-zA-Z0-9]+\)?/g, '');
paragraph.textContent = text.trim();
});
}
Result after applying the filter:
<p>We collect your data for the following purposes:</p>
<ul>
<li>To provide our services</li>
<li>To improve user experience</li>
</ul>
- <p class="session-id">Last updated on 2023-12-07 (Session: abc123def456)</p>
+ <p class="session-id">Last updated on 2023-12-07</p>
Open your service declaration file (e.g. declarations/MyService.json
) and locate the filter
property of the specific terms you want to apply the filter to. If it doesn’t exist, add it as an array.
For filters that don’t require parameters, add the filter name as a string:
{
"name": "MyService",
"terms": {
"Privacy Policy": {
"fetch": "https://my.service.example/en/privacy-policy",
"select": ".textcontent",
"filter": [
"removeSessionIds"
]
}
}
}
For filters that take parameters, use an object format, for example with the built-in filter removeQueryParams
to remove query parameters from URLs:
{
"name": "MyService",
"terms": {
"Privacy Policy": {
"fetch": "https://my.service.example/en/privacy-policy",
"select": ".textcontent",
"filter": [
{
"removeQueryParams": ["utm_source", "utm_medium", "utm_campaign"]
}
]
}
}
}
You can combine multiple filters in the same declaration:
{
"name": "MyService",
"terms": {
"Privacy Policy": {
"fetch": "https://my.service.example/en/privacy-policy",
"select": ".textcontent",
"filter": [
{
"removeQueryParams": ["utm_source", "utm_medium"]
},
"removeSessionIds"
]
}
}
}
After adding the filter, test your declaration to ensure it works correctly: