Welcome to nh3’s documentation!

Installation

pip install nh3

Usage

import nh3

print(nh3.clean("<b><img src=\"\">I'm not trying to XSS you</b>"))

API Reference

Python binding to the ammonia HTML sanitizer Rust crate.

nh3.clean(html, tags=None, clean_content_tags=None, attributes=None, attribute_filter=None, strip_comments=True, link_rel='noopener noreferrer', generic_attribute_prefixes=None, tag_attribute_values=None, set_tag_attribute_values=None, url_schemes=None)

Sanitizes an HTML fragment in a string according to the configured options.

Parameters:
  • html (str) – Input HTML fragment

  • tags (set[str], optional) – Sets the tags that are allowed.

  • clean_content_tags (set[str], optional) – Sets the tags whose contents will be completely removed from the output.

  • attributes (dict[str, set[str]], optional) – Sets the HTML attributes that are allowed on specific tags. The key * means the attributes are allowed on any tag.

  • attribute_filter (Callable[[str, str, str], str | None], optional) – Allows rewriting of all attributes using a callback. The callback takes the name of the element, the name of the attribute, and the attribute's value. It returns None to remove the attribute, or a string to use as the new value.

  • strip_comments (bool) – Configures the handling of HTML comments. When True, comments are removed from the output. Defaults to True.

  • link_rel (str) –

    Configures a rel attribute that will be added on links, defaults to noopener noreferrer. To turn on rel-insertion, pass a space-separated list. If rel is in the generic or tag attributes, this must be set to None. Common rel values to include:

    • noopener: This prevents a particular type of XSS attack, and should usually be turned on for untrusted HTML.

    • noreferrer: This prevents the browser from sending the source URL to the website that is linked to.

    • nofollow: This prevents search engines from using this link for ranking, which disincentivizes spammers.

  • generic_attribute_prefixes (set[str], optional) – Sets the prefix of attributes that are allowed on any tag.

  • tag_attribute_values (dict[str, dict[str, set[str]]], optional) – Sets the values of HTML attributes that are allowed on specific tags. The value is structured as a map from tag names to a map from attribute names to a set of attribute values. If a tag is not itself whitelisted, adding entries to this map will do nothing.

  • set_tag_attribute_values (dict[str, dict[str, str]], optional) – Sets the values of HTML attributes that are to be set on specific tags. The value is structured as a map from tag names to a map from attribute names to an attribute value. If a tag is not itself whitelisted, adding entries to this map will do nothing.

  • url_schemes (set[str], optional) – Sets the URL schemes permitted on href and src attributes.

Returns:

Sanitized HTML fragment

Return type:

str

nh3.clean_text(html)

Turn an arbitrary string into unformatted HTML

This function is roughly equivalent to PHP’s htmlspecialchars and htmlentities. It is as strict as possible, encoding every character that has special meaning to the HTML parser.

Parameters:

html (str) – Input string

Returns:

Cleaned text

Return type:

str

nh3.is_html(html)

Determine if a given string contains HTML

This function parses the full string as HTML and checks whether the input contained any HTML syntax.

Note: This function will return positively for strings that contain invalid HTML syntax like <g> and even Vec::<u8>::new().

Parameters:

html (str) – Input string

Return type:

bool
