python3-indexed-gzip – fast random access of gzip files in Python

IndexedGzipFile is a drop-in replacement for the built-in Python gzip.GzipFile class that does not need to restart decompression from the beginning of the file on every seek(). It gets around this performance limitation by building an index of seek points, which are mappings between corresponding locations in the compressed and uncompressed data streams. Each seek point is accompanied by a chunk (32KB) of uncompressed data which is used to initialise the decompression algorithm, allowing reading to start from any seek point. If the index is built with a seek point spacing of 1MB, only 512KB (on average) of data has to be decompressed to read from any location in the file.
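
The 512KB average can be checked with a toy model of the seek-point lookup. The sketch below is not the package's implementation: the helper names (build_toy_index, bytes_to_decompress) are hypothetical, and the index holds only uncompressed offsets, leaving out the compressed offsets and 32KB data chunks that a real index would carry.

```python
import bisect

SPACING = 1024 * 1024  # 1MB seek point spacing, as in the description


def build_toy_index(uncompressed_size, spacing=SPACING):
    """Hypothetical helper: uncompressed offsets of evenly spaced seek points."""
    return list(range(0, uncompressed_size, spacing))


def bytes_to_decompress(index, target):
    """Bytes to decompress from the nearest preceding seek point up to target."""
    # bisect_right finds the first seek point past target; step back one.
    i = bisect.bisect_right(index, target) - 1
    return target - index[i]


# Model an 8MB uncompressed stream and sample a read position every 4KB.
index = build_toy_index(8 * SPACING)
targets = range(0, 8 * SPACING, 4096)
average = sum(bytes_to_decompress(index, t) for t in targets) / len(targets)

print(f"average bytes decompressed per seek: {average / 1024:.0f}KB")
```

Under these assumptions the average comes out at roughly half the seek point spacing, which matches the figure quoted in the description.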

This package provides the Python 3 module.

Package availability chart

Distribution                            Base version      Our version          Architectures
Debian GNU/Linux 10.0 (buster)          0.8.6-1           0.8.6-1~nd100+1      i386, amd64
Debian GNU/Linux 9.0 (stretch)          -                 0.8.6-1~nd90+1       i386, amd64
Debian testing (bookworm)               1.6.4-2           -                    -
Debian unstable (sid)                   1.6.4-2           0.8.6-1~nd+1         i386, amd64
Ubuntu 16.04 “Xenial Xerus” (xenial)    -                 0.6.1-1~nd16.04+1    amd64
Ubuntu 18.04 “Bionic Beaver” (bionic)   0.6.1-1           0.8.6-1~nd18.04+1    i386, amd64
Ubuntu 20.04 “Focal Fossa” (focal)      0.8.6-1.1build1   -                    -
Ubuntu 21.04 “Hirsute Hippo” (hirsute)  0.8.6-1.2         -                    -
Ubuntu 21.10 “Impish Indri” (impish)    0.8.6-1.2         -                    -

