python-w3lib – Collection of web-related functions for Python (Python 2)
A Python module of simple, reusable functions for working with URLs, HTML, forms, and HTTP that aren’t found in the Python standard library.
For example, this module can be used to:
- remove comments or tags from HTML snippets
- extract the base URL from HTML snippets
- translate entities in HTML strings
- encode multipart/form-data
- convert raw HTTP headers to dicts and vice versa
- construct HTTP auth headers
- join URLs in an RFC-compliant way
- sanitize URLs (like browsers do)
- extract arguments from URLs
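To make two of the tasks above concrete (building an HTTP auth header and converting raw headers to dicts and back), here is a minimal sketch that uses only the standard library. It is not w3lib's code and does not call w3lib. The function names `make_basic_auth_header`, `raw_headers_to_dict`, and `dict_to_raw_headers` are hypothetical, and the ISO-8859-1 default encoding is an assumption made for this illustration.

```python
import base64


def make_basic_auth_header(username, password, encoding="ISO-8859-1"):
    """Build the value of an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth is "user:password", base64-encoded and prefixed with "Basic ".
    credentials = "{}:{}".format(username, password).encode(encoding)
    return b"Basic " + base64.b64encode(credentials)


def raw_headers_to_dict(raw):
    """Parse a raw header block (bytes) into {name: [values]} (hypothetical helper)."""
    headers = {}
    for line in raw.splitlines():
        if b":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(b":", 1)
        # A header may repeat, so each name maps to a list of values.
        headers.setdefault(name.strip(), []).append(value.strip())
    return headers


def dict_to_raw_headers(headers):
    """Serialize {name: [values]} back into CRLF-separated raw header bytes."""
    lines = []
    for name, values in headers.items():
        for value in values:
            lines.append(name + b": " + value)
    return b"\r\n".join(lines)


raw = b"Content-Type: text/html\r\nAccept: gzip"
parsed = raw_headers_to_dict(raw)
auth = make_basic_auth_header("user", "pass")
```

Each header name maps to a list because HTTP allows the same header to appear more than once; a flat name-to-value dict would silently drop the repeats.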
The w3lib code was originally part of the Scrapy framework but was later split out, with the aim of making it more reusable and providing a useful library of web functions that does not depend on Scrapy.
This is the Python 2 version of the package.
Distribution | Base version | Our version | Architectures |
---|---|---|---|
Debian GNU/Linux 10.0 (buster) | 1.20.0-1 | | |
Debian GNU/Linux 11.0 (bullseye) | 1.22.0-3 | | |
Debian GNU/Linux 9.0 (stretch) | 1.14.3-1 | 1.11.0-1~nd90+1 | i386, amd64, sparc, armel |
Debian testing (bookworm) | 2.1.1-1 | | |
Debian unstable (sid) | 2.1.1-1 | 1.11.0-1~nd+1 | i386, amd64, sparc, armel |
Ubuntu 16.04 “Xenial Xerus” (xenial) | 1.11.0-1 | | |
Ubuntu 18.04 “Bionic Beaver” (bionic) | 1.19.0-1 | | |
Ubuntu 20.04 “Focal Fossa” (focal) | 1.21.0-1 | | |
Ubuntu 21.10 “Impish Indri” (impish) | 1.22.0-3 | | |
Ubuntu 22.04 “Jammy Jellyfish” (jammy) | 1.22.0-3 | | |