Extract and rewrite URLs in text
Using cerb_extract_uris()
The cerb_extract_uris() function returns an array of the URLs found in HTML content, along with metadata for each one (e.g. tag, attributes, URI parts).
The response also includes a template: a copy of the content in which each URL is replaced by a token. The tokens can then be swapped for new values with the |replace filter.
For instance, this function can be used to rewrite all links in an email template for click tracking.
{% set html %}
This is some <b>HTML</b> with <a href="https://cerb.ai/">links</a>.
{% endset %}
{% set results = cerb_extract_uris(html) %}
{% set new_urls = results.tokens|map(
  (url, token) => "https://proxy.example/click?url=" ~ url|url_encode
) %}
{{results.template|replace(new_urls)}}
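For readers who want to see the token-and-replace idea outside of Cerb, the following is a minimal Python sketch of the same pattern. It is not Cerb's implementation: the helper name extract_uris_sketch, the #uri-N# token format, and the restriction to double-quoted href/src attributes are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class LinkCollector(HTMLParser):
    """Collects the values of href and src attributes in document order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)


def extract_uris_sketch(html):
    """Hypothetical stand-in: returns a tokenized template plus a token => URL map."""
    collector = LinkCollector()
    collector.feed(html)
    tokens = {}
    template = html
    # dict.fromkeys() removes duplicate URLs while keeping their order.
    for i, url in enumerate(dict.fromkeys(collector.urls)):
        token = f"#uri-{i}#"  # assumed token format, not Cerb's
        tokens[token] = url
        # Assumes attribute values are double-quoted and not entity-escaped.
        template = template.replace(f'"{url}"', f'"{token}"')
    return {"template": template, "tokens": tokens}


html = 'This is some <b>HTML</b> with <a href="https://cerb.ai/">links</a>.'
results = extract_uris_sketch(html)

# Build a token => new URL map, then substitute each token in the template.
new_urls = {
    token: "https://proxy.example/click?url=" + quote(url, safe="")
    for token, url in results["tokens"].items()
}
output = results["template"]
for token, new_url in new_urls.items():
    output = output.replace(token, new_url)
```

Because the rewrite works on tokens rather than on the raw URLs, a new URL that happens to contain an original URL is not rewritten a second time.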
-
start:
  set/init:
    message@text: Visit our website at https://cerb.ai/ to learn more.
    urls@json: {{array_unique(cerb_extract_uris(message|markdown_to_html(is_untrusted=true)).tokens)|json_encode}}
  set/filter:
    urls@json: {{urls|filter((v) => v|parse_url.host ends with 'cerb.ai')|json_encode}}
  set/sort:
    urls@json: {{urls|sort((a,b) => b|length <=> a|length)|values|json_encode}}
  set/combine:
    urls@json: {{array_combine(urls, urls|map((url) => 'https://click.example/?url=' ~ url|url_encode))|json_encode}}
  return:
    output@text: {{message|replace(urls)}}
-
__return:
  output: Visit our website at https://click.example/?url=https%3A%2F%2Fcerb.ai%2F to learn more.
message: Visit our website at https://cerb.ai/ to learn more.
urls:
  https://cerb.ai/: https://click.example/?url=https%3A%2F%2Fcerb.ai%2F
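The automation's steps (de-duplicate, filter by host, sort longest first, pair each URL with its replacement, then replace) can be mirrored in a short Python sketch. This is an illustration under stated assumptions rather than how Cerb works internally: a simple regular expression stands in for URL extraction, and the function name rewrite_links is hypothetical.

```python
import re
from urllib.parse import quote, urlsplit

# Assumption: a simple pattern stands in for Cerb's URL extraction.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def rewrite_links(message, host_suffix="cerb.ai"):
    """Rewrite URLs whose host ends with host_suffix into click-tracking URLs."""
    # init: collect the URLs and drop duplicates, keeping their order.
    urls = list(dict.fromkeys(URL_PATTERN.findall(message)))
    # filter: keep only URLs whose host ends with the given suffix.
    urls = [u for u in urls if (urlsplit(u).hostname or "").endswith(host_suffix)]
    # sort: longest first, so a URL is never matched by a shorter prefix.
    urls.sort(key=len, reverse=True)
    # combine: map each original URL to its rewritten form.
    rewrites = {
        u: "https://click.example/?url=" + quote(u, safe="") for u in urls
    }
    if not rewrites:
        return message
    # replace: one pass over the text, trying the longest alternatives first.
    pattern = re.compile("|".join(re.escape(u) for u in urls))
    return pattern.sub(lambda m: rewrites[m.group(0)], message)


output = rewrite_links("Visit our website at https://cerb.ai/ to learn more.")
```

Sorting by length matters when one URL is a prefix of another, such as https://cerb.ai/ and https://cerb.ai/docs: trying the longer URL first keeps the shorter one from being replaced inside it.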