Welcome to nh3’s documentation!
Installation
pip install nh3
Usage
import nh3
print(nh3.clean("<b><img src=\"\">I'm not trying to XSS you</b>"))
API Reference
Python binding to the ammonia HTML sanitizer Rust crate.
- nh3.clean(html, tags=None, clean_content_tags=None, attributes=None, attribute_filter=None, strip_comments=True, link_rel='noopener noreferrer', generic_attribute_prefixes=None, tag_attribute_values=None, set_tag_attribute_values=None, url_schemes=None)
Sanitizes an HTML fragment in a string according to the configured options.
- Parameters:
  - html (str) – Input HTML fragment.
  - tags (set[str], optional) – Sets the tags that are allowed.
  - clean_content_tags (set[str], optional) – Sets the tags whose contents will be completely removed from the output.
  - attributes (dict[str, set[str]], optional) – Sets the HTML attributes that are allowed on specific tags. The * key means the attributes are allowed on any tag.
  - attribute_filter (Callable[[str, str, str], str | None], optional) – Allows rewriting of all attributes using a callback. The callback takes the name of the element, the name of the attribute, and the attribute's value. It returns None to remove the attribute, or a value to use.
  - strip_comments (bool) – Configures the handling of HTML comments, defaults to True.
  - link_rel (str) – Configures a rel attribute that will be added on links, defaults to noopener noreferrer. To turn on rel-insertion, pass a space-separated list. If rel is in the generic or tag attributes, this must be set to None. Common rel values to include:
    - noopener: This prevents a particular type of XSS attack, and should usually be turned on for untrusted HTML.
    - noreferrer: This prevents the browser from sending the source URL to the website that is linked to.
    - nofollow: This prevents search engines from using this link for ranking, which disincentivizes spammers.
  - generic_attribute_prefixes (set[str], optional) – Sets the prefix of attributes that are allowed on any tag.
  - tag_attribute_values (dict[str, dict[str, set[str]]], optional) – Sets the values of HTML attributes that are allowed on specific tags. The value is structured as a map from tag names to a map from attribute names to a set of attribute values. If a tag is not itself whitelisted, adding entries to this map will do nothing.
  - set_tag_attribute_values (dict[str, dict[str, str]], optional) – Sets the values of HTML attributes that are to be set on specific tags. The value is structured as a map from tag names to a map from attribute names to an attribute value. If a tag is not itself whitelisted, adding entries to this map will do nothing.
  - url_schemes (set[str], optional) – Sets the URL schemes permitted on href and src attributes.
- Returns:
Sanitized HTML fragment
- Return type:
str
- nh3.clean_text(html)
Turn an arbitrary string into unformatted HTML.
This function is roughly equivalent to PHP’s htmlspecialchars and htmlentities. It is as strict as possible, encoding every character that has special meaning to the HTML parser.
- Parameters:
html (str) – Input HTML fragment
- Returns:
Cleaned text
- Return type:
str
- nh3.is_html(html)
Determine if a given string contains HTML.
This function parses the full string as HTML and checks whether the input contained any HTML syntax.
Note: This function will return positively for strings that contain invalid HTML syntax like <g> and even Vec::<u8>::new().
- Parameters:
html (str) – Input string
- Return type:
bool