The debdelta suite

Andrea C. G. Mennucci

debdelta is an application suite designed to compute changes between Debian packages. These changes (which we will call 'deltas') are similar to the output of the "diff" program in that they may be used to store and transmit only the changes between Debian packages. The suite contains 'debdelta-upgrade', which downloads deltas and uses them to create all Debian packages needed for an 'apt-get upgrade'.


Table of Contents
1. Overview
1.1. debdelta
1.2. debpatch
1.3. debdeltas
1.4. debdelta-upgrade
1.5. debforensic
2. a delta
2.1. the info in a delta
2.2. how to apply a delta
3. new debdelta-upgrade service
3.1. The framework
3.2. The goals
3.3. The repository structure
3.4. The repository creation
3.5. size limit
3.6. /etc/debdelta/sources.conf
3.7. indexes
3.7.1. indexes of debs in APT
3.7.2. no indexes of deltas in debdelta
3.8. no incremental deltas
3.8.1. What "incremental" would be, and why it is not
3.9. Repository howto
3.9.1. debdeltas_server
3.9.2. debmirror --debmarshal
3.9.3. hooks and repository of old_debs
4. Goals, tricks, ideas and issues
4.1. exact patching
4.2. exact recompression
4.3. speed
4.3.1. some (old) numbers
4.3.2. speeding up
4.3.3. the 10kb trick
4.3.4. the choice, the predictor
4.3.5. State of the art
4.4. better deb compression is a worse delta
4.5. long time recovery
4.6. streaming
4.7. --format=unzipped
4.8. --format=preunpacked
5. Todo
5.1. todo list
5.2. things are getting worse

1. Overview

The debdelta application suite actually consists of several applications.

1.1. debdelta

debdelta computes the delta, that is, a file that encodes the difference between two Debian packages. Example:


	$ a=/var/cache/apt/archives 
	$ debdelta -v $a/emacs-snapshot-common_1%3a20060512-1_all.deb \
	    $a/emacs-snapshot-common_1%3a20060518-1_all.deb /tmp/emacs.debdelta
      
The result is: deb delta is 12.5% of deb; that is, 15452kB would be saved.
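The two figures in that report are related by simple arithmetic. The following is a minimal sketch, not code from debdelta: 'delta_savings' is a hypothetical helper, and it assumes that sizes are given in bytes and that 1 kB means 1024 bytes (debdelta's own rounding may differ).

```shell
# Hypothetical helper, not part of the debdelta suite.
# Usage: delta_savings DELTA_BYTES DEB_BYTES
# Prints the delta size as a percentage of the deb, and the kB saved
# (assumption: 1 kB = 1024 bytes, fractional kB are truncated).
delta_savings() (
    delta_bytes=$1
    deb_bytes=$2
    awk -v d="$delta_bytes" -v n="$deb_bytes" 'BEGIN {
        printf "delta is %.1f%% of deb; %d kB saved\n", 100 * d / n, (n - d) / 1024
    }'
)

# A 128 kB delta for a 1024 kB deb:
delta_savings 131072 1048576
# prints: delta is 12.5% of deb; 896 kB saved
```

By the same arithmetic, a delta that is 12.5% of the deb and saves 15452kB corresponds to a deb of roughly 17659kB and a delta of roughly 2207kB (approximate, since the percentage is rounded).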

1.2. debpatch

debpatch can use the delta file and a copy of the old Debian package to recreate the new Debian package. (This process is called "applying the delta file".) If the old Debian package is not available but is installed on the host, debpatch can use the installed data; in this case, '/' is given in lieu of the old .deb.

Example:


	$ debpatch -A  /tmp/emacs.debdelta / /tmp/emacs.deb
      

1.3. debdeltas

debdeltas can be used to generate deltas for many debs at once. It will generate delta files with names such as package_old-version_new-version_architecture.debdelta. If the delta exceeds ~70% of the deb, 'debdeltas' will delete it and leave a stamp of the form package_old-version_new-version_architecture.debdelta-too-big. Example usages are in the man page; see also Section 3.9.
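The naming scheme and the size rule can be sketched as follows. This is only an illustration: 'delta_name' and 'delta_or_stamp' are hypothetical helpers, not part of the suite, and the threshold is taken as exactly 70% although the text only says "~70%".

```shell
# Hypothetical helpers illustrating the naming scheme; not part of debdelta.

# Usage: delta_name PACKAGE OLD_VERSION NEW_VERSION ARCHITECTURE
delta_name() {
    printf '%s_%s_%s_%s.debdelta\n' "$1" "$2" "$3" "$4"
}

# Usage: delta_or_stamp PACKAGE OLD NEW ARCH DELTA_BYTES DEB_BYTES
# Prints the name of the file that would be left behind: the delta itself,
# or a "-too-big" stamp when the delta exceeds 70% of the deb.
# (Assumption: the threshold is exactly 70%; the text says "~70%".)
delta_or_stamp() (
    name=$(delta_name "$1" "$2" "$3" "$4")
    delta_bytes=$5
    deb_bytes=$6
    if [ $((delta_bytes * 100)) -gt $((deb_bytes * 70)) ]; then
        printf '%s-too-big\n' "$name"
    else
        printf '%s\n' "$name"
    fi
)

delta_or_stamp foo 1.0-1 1.0-2 amd64 800 1000
# prints: foo_1.0-1_1.0-2_amd64.debdelta-too-big
```

Integer arithmetic is used for the comparison so that the sketch needs nothing beyond a POSIX shell.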

1.4. debdelta-upgrade

debdelta-upgrade will download the necessary deltas and apply them to create the debs for a subsequent apt-get upgrade. The deltas are available for upgrades in 'stable', 'stable-security', 'testing', 'unstable' and 'experimental', for the architectures all, i386 and amd64. As of 2024, there are also deltas to upgrade from any main release to its -backports or -security channel. Example usage:


	# apt-get update && debdelta-upgrade && apt-get upgrade
      
If run by a non-root user, debs are saved in /tmp/archives; do not forget to move them to /var/cache/apt/archives.
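Moving the debs can be done by hand; a minimal sketch follows. 'move_debs' is a hypothetical helper written for this example, not a command shipped with debdelta, and writing to /var/cache/apt/archives requires root.

```shell
# Hypothetical helper, not part of the debdelta suite.
# Usage: move_debs SOURCE_DIR DEST_DIR
# Moves every *.deb from SOURCE_DIR to DEST_DIR and prints how many were moved.
move_debs() (
    src=$1
    dst=$2
    count=0
    for f in "$src"/*.deb; do
        [ -e "$f" ] || continue    # the glob matched nothing
        mv -- "$f" "$dst"/ && count=$((count + 1))
    done
    echo "$count"
)

# Typical call, as root:
#   move_debs /tmp/archives /var/cache/apt/archives
```

Only files ending in .deb are touched; anything else in the source directory is left in place.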

debdelta-upgrade will also download .debs for which no delta is available (this is done in parallel with patching, to maximize speed). See the explanation of "debdelta-upgrade --deb-policy" in the man page for more information on which debs get downloaded and how to customize this.

More information is in the following sections.

1.5. debforensic

There is also another body of code, which was never distributed although it is available in the Git repository. debforensics creates and uses SQLite databases containing information about Debian binary packages. debforensics --add will scan Debian packages and add the list of their files (and the SHA1 hashes of those files) to the database. debforensics --scan will check a file against multiple databases, to see if that file is part of any package. debforensics --forensic will scan a filesystem and list the files that are part of a package, and the files that are not (or are misplaced, or have strange permissions...).
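The idea of matching files to packages by hash can be sketched in a few lines. This is not how debforensics is implemented: the sketch swaps the sqlite database for a plain tab-separated text file, and 'index_add' and 'index_scan' are hypothetical names chosen for the example. It assumes the 'sha1sum' utility from coreutils is available.

```shell
# Hypothetical helpers; the real tool uses sqlite databases instead.

# Usage: index_add INDEX PACKAGE FILE...
# Appends one "sha1<TAB>package<TAB>path" line per file to INDEX.
index_add() (
    index=$1
    package=$2
    shift 2
    for f in "$@"; do
        hash=$(sha1sum < "$f" | cut -d' ' -f1)
        printf '%s\t%s\t%s\n' "$hash" "$package" "$f" >> "$index"
    done
)

# Usage: index_scan INDEX FILE
# Prints the names of the packages that recorded a file with the same SHA1.
index_scan() (
    index=$1
    hash=$(sha1sum < "$2" | cut -d' ' -f1)
    awk -F'\t' -v h="$hash" '$1 == h { print $2 }' "$index" | sort -u
)
```

A file whose hash is not in the index produces no output, which corresponds to "not part of any known package".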

If debdelta-upgrade fails to apply a delta, and '-d' is passed, then a debug file is generated, and then debforensic may be used to understand what went wrong (theoretically).

Important

Beware: a full database for main/amd64 is ~350 MB, without indexes. So in practice I currently cannot keep a database on my host.