python-w3lib – Collection of web-related functions for Python (Python 2)

Related packages

python3-w3lib

A Python module of simple, reusable functions for working with URLs, HTML, forms, and HTTP that are not found in the Python standard library.

This module is used, for example, to:

remove comments or tags from HTML snippets
extract the base URL from HTML snippets
translate entities in HTML strings
encode multipart/form-data
convert raw HTTP headers to dicts and vice versa
construct HTTP auth headers
join URLs in an RFC-compliant way
sanitize URLs (like browsers do)
extract arguments from URLs

The w3lib code was originally part of the Scrapy framework. It was later split out of Scrapy to make it more reusable and to provide a useful library of web functions that does not depend on Scrapy.

This is the Python 2 version of the package.

Package availability chart
Distribution	Base version	Our version	Architectures
Debian GNU/Linux 10.0 (buster)	1.20.0-1
Debian GNU/Linux 11.0 (bullseye)	1.22.0-3
Debian GNU/Linux 12.0 (bookworm)	2.1.1-1
Debian GNU/Linux 13.0 (trixie)	2.3.1-1
Debian GNU/Linux 9.0 (stretch)	1.14.3-1	1.11.0-1~nd90+1	i386, amd64, sparc, armel
Debian testing (forky)	2.4.1-1
Debian unstable (sid)	2.4.1-1	1.11.0-1~nd+1	i386, amd64, sparc, armel
Ubuntu 16.04 “Xenial Xerus” (xenial)	1.11.0-1
Ubuntu 18.04 “Bionic Beaver” (bionic)	1.19.0-1
Ubuntu 20.04 “Focal Fossa” (focal)	1.21.0-1
Ubuntu 22.04 “Jammy Jellyfish” (jammy)	1.22.0-3
Ubuntu 24.04 “Noble Numbat” (noble)	2.1.2-1.1
Ubuntu 25.04 “Plucky Puffin” (plucky)	2.3.1-1
