python-indexed-gzip – fast random access of gzip files in Python

Related packages

python3-indexed-gzip

Drop-in replacement IndexedGzipFile for the built-in Python gzip.GzipFile class that does not need to start decompressing from the beginning of the file for every seek(). It gets around this performance limitation by building an index of seek points, which are mappings between corresponding locations in the compressed and uncompressed data streams. Each seek point is accompanied by a chunk (32 KB) of uncompressed data that is used to initialise the decompression algorithm, so reading can start from any seek point. If the index is built with a seek point spacing of 1 MB, only 512 KB (on average) of data has to be decompressed to read from any location in the file.
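The seek point idea can be illustrated with the standard library alone. The sketch below is not the package's implementation: instead of storing a 32 KB window of uncompressed data per seek point, it snapshots the whole zlib decompressor state with `copy()`, which is simpler to show but uses more memory. The names `build_index`, `read_at` and `CHUNK`, and the sizes used, are hypothetical and chosen only for this example.

```python
import bisect
import gzip
import random
import zlib

CHUNK = 4096  # compressed bytes fed to zlib per step (arbitrary choice)


def build_index(compressed, spacing):
    """Scan the stream once, saving a seek point about every `spacing` uncompressed bytes."""
    d = zlib.decompressobj(wbits=31)  # wbits=31 selects the gzip container
    # Each entry: (uncompressed offset, compressed offset, decompressor snapshot)
    index = [(0, 0, d.copy())]
    u_off = 0
    for c_off in range(0, len(compressed), CHUNK):
        u_off += len(d.decompress(compressed[c_off:c_off + CHUNK]))
        if not d.eof and u_off - index[-1][0] >= spacing:
            index.append((u_off, c_off + CHUNK, d.copy()))
    return index


def read_at(compressed, index, offset, size):
    """Return (data, bytes_decompressed) for `size` bytes at uncompressed `offset`."""
    starts = [entry[0] for entry in index]
    # Pick the last seek point at or before the requested offset.
    u_off, c_off, state = index[bisect.bisect_right(starts, offset) - 1]
    d = state.copy()  # copy so the stored seek point stays reusable
    out = bytearray()
    need = offset - u_off + size
    while len(out) < need and c_off < len(compressed) and not d.eof:
        out += d.decompress(compressed[c_off:c_off + CHUNK])
        c_off += CHUNK
    skip = offset - u_off
    return bytes(out[skip:skip + size]), len(out)


# Demo on deterministic pseudo-random data (sizes are illustrative only).
rng = random.Random(0)
raw = bytes(rng.getrandbits(8) for _ in range(300_000))
compressed = gzip.compress(raw)

index = build_index(compressed, spacing=32_768)
data, work = read_at(compressed, index, offset=250_000, size=100)
print(f"{len(index)} seek points, {work} bytes decompressed for a 100-byte read")
```

The read near the end of the stream only decompresses data from the nearest preceding seek point onwards, rather than the 250,000 bytes that precede it. For a uniformly distributed target offset, the expected distance from the preceding seek point is half the spacing, which is where the 512 KB average for a 1 MB spacing comes from.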

This package provides the Python 2 module.

Package availability chart
Distribution	Base version	Our version	Architectures
Debian GNU/Linux 10.0 (buster)	0.8.6-1	0.8.6-1~nd100+1	i386, amd64
Debian GNU/Linux 12.0 (bookworm)	1.7.0-1
Debian GNU/Linux 9.0 (stretch)		0.8.6-1~nd90+1	i386, amd64
Debian testing (trixie)	1.8.7-3
Debian unstable (sid)	1.8.7-3	0.8.6-1~nd+1	i386, amd64
Ubuntu 16.04 “Xenial Xerus” (xenial)		0.6.1-1~nd16.04+1	amd64
Ubuntu 18.04 “Bionic Beaver” (bionic)	0.6.1-1	0.8.6-1~nd18.04+1	i386, amd64
Ubuntu 20.04 “Focal Fossa” (focal)	0.8.6-1.1build1
Ubuntu 22.04 “Jammy Jellyfish” (jammy)	1.6.4-2build1
Ubuntu 24.04 “Noble Numbat” (noble)	1.7.0-1.1ubuntu1
