| Past Project |
|
In this project, we study delta compression, the algorithmic task of
finding common strings between versions of data and using the commonality
between versions to compactly encode one version by describing it
as a set of changes from its companion. Previous differencing algorithms
run in O(n^2) time and O(n) space.
We have developed
a new algorithm that runs in O(n) time and O(1) space that closely
approximates the compression performance of the previous algorithms.
We have also developed two additional algorithms based on heuristics
which improve the compression performance of the O(n) algorithm
and the runtime performance of the O(n^2) algorithm.
Our treatment
includes theoretical results that limit the compression power of
algorithms under natural performance assumptions. All of our algorithms
for differencing operate at a fine granularity and make no assumptions
about the format or alignment of input data. We have also explored
their performance properties experimentally.
In addition, we have
developed an algorithm for modifying delta compressed files so that
the compressed versions may be reconstructed without scratch space.
This allows network clients with limited resources to efficiently
update software by retrieving delta compressed versions over a network.
Delta compression
for binary files (compactly encoding a version of data with only
the changed bytes from a previous version) may be used to efficiently
distribute software over low bandwidth channels, such as the Internet.
Traditional methods for rebuilding these delta files require memory
or storage space on the target machine for both the old and new
versions of the file to be reconstructed.
With the advent
of network computing and Internet-enabled devices, many of these
network attached target machines have limited additional scratch
space. We have developed an algorithm for modifying a delta compressed
version file so that it may rebuild the new file version in the
space that the current version occupies.
|
|
Technical Papers
R. C. Burns and D. D. E. Long
In-place Reconstruction of Delta Compressed Files
In Proceedings of the 1998 Conference on the Principles of Distributed Computing
(PODC), ACM, 1998.
(Acrobat PDF, 90.6 KB)
R. C. Burns and D. D. E. Long
Efficient Distributed Backup and Restore with Delta Compression
In Proceedings of the 1997 Workshop on I/O in Parallel and Distributed Systems (IOPADS), ACM, 1997.
(Acrobat PDF, 91.5 MB)
R. C. Burns and D. D. E. Long
A Linear Time, Constant Space Differencing Algorithm
In Proceedings of the 1997 International Performance, Computing and Communications Conference
(IPCCC), IEEE, 1997.
(Acrobat PDF, 84.3 KB)
R. C. Burns
Differential Compression: A Generalized Solution for Binary Files
A thesis in completion of the Master's of Science degree, Department of Computer Science,
University of California, Santa Cruz, December 1997.
(Acrobat PDF, 190 KB)
|