
Comparing Files with md5

md5 is an algorithm for computing a "message digest", a fingerprint for a file. md5sum (part of the GNU coreutils package) computes md5 digests for the files given on the command line.

eg1$ md5sum sluug.tex
407e6f4d8c2d008e0a46def8595e8671  sluug.tex

Suppose two large files reside on different computers separated by a slow link, and you wish to know whether they are identical. It is much cheaper to compare their digests than to download one of the files and run cmp on the pair.
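
A minimal sketch of that comparison, assuming the digest of the remote copy has already been fetched as a string. The helper name same_as_digest and the file name big.dat are made up for illustration; the host dasher is borrowed from the example later in this section.

```shell
# same_as_digest FILE DIGEST
# Succeeds (exit status 0) if the md5 digest of FILE equals DIGEST.
# Hypothetical helper, not part of md5sum itself.
same_as_digest() {
    # md5sum prints "<digest>  <filename>"; keep only the digest field.
    local_sum=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$local_sum" = "$2" ]
}

# Usage sketch (host and file name are placeholders):
#   remote_sum=$(rsh dasher 'md5sum big.dat' | cut -d ' ' -f 1)
#   same_as_digest big.dat "$remote_sum" && echo identical
```

Only the 32-character digest crosses the slow link, regardless of how large the file is.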

md5 is frequently used to ensure that packages are distributed unmodified. Many sites distribute a checksum file formatted like

2c43e710dcea2089bcf139f5b1d3ecdd  README
b7d07cc5545baec15aa2539cf7e44e83  file1.c
9e68832ade6156495b5377a2bf2bdc3a  file2.c

To verify that the package has not been modified, run

$ md5sum -c md5-file
file1.c: FAILED
file2.c: OK
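
The whole round trip can be tried locally. The following is a sketch under stated assumptions: the file contents are invented, everything happens in a scratch directory, and GNU md5sum is assumed. It records digests, verifies them, then tampers with one file and verifies again.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Invented sample files standing in for a distributed package.
printf 'read me\n' > README
printf 'int main(void) { return 0; }\n' > file1.c

# Record the digests in the same "digest  filename" format shown above.
md5sum README file1.c > md5-file

# Unmodified files: every line reports OK and the exit status is 0.
if LC_ALL=C md5sum -c md5-file; then
    status_clean=0
else
    status_clean=$?
fi

# Tamper with one file, then check again.
printf '/* changed */\n' >> file1.c
if report=$(LC_ALL=C md5sum -c md5-file 2>/dev/null); then
    status_tampered=0
else
    status_tampered=$?
fi
echo "$report"
```

md5sum -c exits with a nonzero status when any file fails, so a script can test the status instead of parsing the report.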

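I use md5 to compare files at home with files at the lab. The command below computes digests for every file under src on the remote host (dasher), compresses the list for the trip across the link, and checks it against the local copies: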

$ rsh dasher 'find src -type f | xargs md5sum | gzip -c' \
  | gzip -cd | md5sum -c
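
The same pipeline can be rehearsed without a second machine. In this sketch two scratch directories, named lab and home here purely for illustration, stand in for the two hosts, and the rsh and gzip stages are dropped because both trees are local. The file contents are invented.

```shell
# Build two small trees; b.txt differs between them.
base=$(mktemp -d)
mkdir -p "$base/lab/src" "$base/home/src"
printf 'alpha\n' > "$base/lab/src/a.txt"
printf 'beta\n'  > "$base/lab/src/b.txt"
cp "$base/lab/src/a.txt" "$base/home/src/a.txt"
printf 'beta, edited at home\n' > "$base/home/src/b.txt"

# "Remote" side: digest every file under src, using relative names.
# "Local" side: check those digests against the files found here.
# grep -v drops the OK lines so only the differences remain.
mismatches=$(
    (cd "$base/lab" && find src -type f | xargs md5sum) |
    (cd "$base/home" && LC_ALL=C md5sum -c - 2>/dev/null) |
    grep -v ': OK$'
) || true
echo "$mismatches"
```

Because find emits relative names, the digest list lines up with the local tree as long as both sides start from the directory that contains src.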

Reece Kimball Hart