python-indexed-gzip – fast random access of gzip files in Python

IndexedGzipFile is a drop-in replacement for the built-in Python gzip.GzipFile class that does not need to start decompressing from the beginning of the file for every seek(). It gets around this performance limitation by building an index of seek points, which are mappings between corresponding locations in the compressed and uncompressed data streams. Each seek point is accompanied by a chunk (32KB) of uncompressed data which is used to initialise the decompression algorithm, so that reading can start from any seek point. If the index is built with a seek point spacing of 1MB, only 512KB (on average) of data has to be decompressed to read from any location in the file.
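The 512KB figure follows from the seek point spacing: a read at an arbitrary offset starts decompressing at the nearest preceding seek point, which is on average half a spacing away. The sketch below illustrates only that arithmetic, using the standard library. It is not the package's implementation: the toy index stores uncompressed offsets only (a real index also records the matching compressed offset and the 32KB window), and the names build_toy_index, nearest_seek_point and bytes_to_decompress are hypothetical.

```python
import bisect

SPACING = 1024 * 1024  # 1MB seek point spacing, as in the description above


def build_toy_index(uncompressed_size, spacing=SPACING):
    """Return the uncompressed offsets of hypothetical seek points.

    Assumption: one seek point every `spacing` bytes, starting at offset 0.
    """
    return list(range(0, uncompressed_size, spacing))


def nearest_seek_point(index, offset):
    """Return the last seek point located at or before `offset`."""
    i = bisect.bisect_right(index, offset) - 1
    return index[i]


def bytes_to_decompress(index, offset):
    """Bytes that must be decompressed and discarded before reaching `offset`."""
    return offset - nearest_seek_point(index, offset)


# Toy file of 8MB uncompressed data, sampled at every 4096th offset.
index = build_toy_index(8 * SPACING)
offsets = range(0, 8 * SPACING, 4096)
average = sum(bytes_to_decompress(index, o) for o in offsets) / len(offsets)

print(f"average bytes decompressed per seek: {average / 1024:.0f}KB")
```

With evenly spaced seek points the average comes out close to half the spacing, which is where the 512KB estimate for a 1MB spacing comes from. A smaller spacing reduces the work per seek at the cost of a larger index.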

This package provides the Python 2 module.

Package availability chart

Distribution                            Base version     Our version        Architectures
Debian GNU/Linux 10.0 (buster)          0.8.6-1          0.8.6-1~nd100+1    i386, amd64
Debian GNU/Linux 12.0 (bookworm)        1.7.0-1          -                  -
Debian GNU/Linux 9.0 (stretch)          -                0.8.6-1~nd90+1     i386, amd64
Debian testing (trixie)                 1.7.0-1.1        -                  -
Debian unstable (sid)                   1.7.0-1.1        0.8.6-1~nd+1       i386, amd64
Ubuntu 16.04 “Xenial Xerus” (xenial)    -                0.6.1-1~nd16.04+1  amd64
Ubuntu 18.04 “Bionic Beaver” (bionic)   0.6.1-1          0.8.6-1~nd18.04+1  i386, amd64
Ubuntu 20.04 “Focal Fossa” (focal)      0.8.6-1.1build1  -                  -
Ubuntu 22.04 “Jammy Jellyfish” (jammy)  1.6.4-2build1    -                  -
