Freedup

freedup is a program that scans directories or file lists for duplicate files. The file lists may be supplied through an input pipe or generated internally by running find with the options you provide. Further options refine the search conditions. Other options control the actions taken, for example whether duplicates are only displayed or are replaced by links, and which kind of link is created under which circumstances. freedup first compares file sizes, then, for files of equal size, their MD5 signatures, and finally performs a byte-by-byte check for verification before taking any action. An interactive mode lets you decide for each file whether to replace it with a soft link or a hard link, or to delete it.
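
The staged check described above (size, then signature, then byte-by-byte) can be sketched roughly as follows. This is only an illustration under stated assumptions, not freedup's actual code: the function names are hypothetical, and a 64-bit FNV-1a hash stands in for MD5 so that the example needs no external library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: FNV-1a over the file contents (stand-in for MD5).
 * Returns 0 on success, -1 on I/O error. */
int file_digest(const char *path, uint64_t *out)
{
    FILE *f = fopen(path, "rb");
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    int c, err;

    if (!f)
        return -1;
    while ((c = getc(f)) != EOF) {
        h ^= (uint64_t)(unsigned char)c;
        h *= 0x100000001b3ULL;            /* FNV-1a 64-bit prime */
    }
    err = ferror(f);
    fclose(f);
    if (err)
        return -1;
    *out = h;
    return 0;
}

/* Byte-by-byte verification: 1 if identical, 0 if not, -1 on error. */
int same_bytes(const char *a, const char *b)
{
    FILE *fa = fopen(a, "rb");
    FILE *fb = fopen(b, "rb");
    int result = -1;

    if (fa && fb) {
        int ca, cb;
        do {
            ca = getc(fa);
            cb = getc(fb);
        } while (ca == cb && ca != EOF);
        if (!ferror(fa) && !ferror(fb))
            result = (ca == cb);
    }
    if (fa)
        fclose(fa);
    if (fb)
        fclose(fb);
    return result;
}

/* Cheapest test first: size, then digest, then the full comparison.
 * Returns 1 for duplicates, 0 for different files, -1 on error. */
int is_duplicate(const char *a, const char *b)
{
    struct stat sa, sb;
    uint64_t da, db;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    if (sa.st_size != sb.st_size)
        return 0;
    if (file_digest(a, &da) != 0 || file_digest(b, &db) != 0)
        return -1;
    if (da != db)
        return 0;
    return same_bytes(a, b);
}
```

The ordering matters because each stage is more expensive than the one before it: most non-duplicates are rejected by the size test alone, and the final byte-by-byte pass guards against hash collisions before anything is linked or deleted.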

Comparing files while ignoring metadata tags and comments is a unique feature of freedup. The file size and the start and end of the unique content are kept for later processing. When comparing sound files you may ignore the tags, e.g. when one file carries an ID3v1 tag while another file with identical music carries an ID3v2 tag. It also works if you copied a file and retagged the copy to fit into another album. The same works for JPEG files (Exif) and MP4 movies. An auto mode is supported that instructs freedup to ignore all tags it recognizes. The author will extend this function on demand, provided there is sufficient documentation on how to strip the tags.
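
To illustrate the idea of keeping the start and end of the unique content, here is a small sketch for ID3 tags only. It is an assumption-laden example rather than freedup's implementation: the names content_range, audio_payload and same_payload are hypothetical, and the data is held in memory instead of being read from files. It relies on two facts about the tag formats: an ID3v1 tag is a fixed 128-byte trailer beginning with "TAG", and an ID3v2 tag begins with a 10-byte header ("ID3", two version bytes, one flags byte, four syncsafe size bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Half-open byte range [start, end) holding the untagged content. */
struct content_range {
    size_t start;
    size_t end;
};

/* Hypothetical helper: locate the audio payload between ID3 tags. */
struct content_range audio_payload(const unsigned char *buf, size_t len)
{
    struct content_range r = { 0, len };

    assert(buf != NULL);

    /* ID3v2 at the front: the size field uses 7 bits per byte. */
    if (len >= 10 && memcmp(buf, "ID3", 3) == 0) {
        size_t body = ((size_t)(buf[6] & 0x7f) << 21)
                    | ((size_t)(buf[7] & 0x7f) << 14)
                    | ((size_t)(buf[8] & 0x7f) << 7)
                    |  (size_t)(buf[9] & 0x7f);
        /* Flag bit 0x10 announces an extra 10-byte footer (ID3v2.4). */
        size_t total = 10 + body + ((buf[5] & 0x10) ? 10 : 0);
        if (total <= len)
            r.start = total;
    }

    /* ID3v1 at the back: fixed 128-byte trailer starting with "TAG". */
    if (r.end - r.start >= 128 && memcmp(buf + r.end - 128, "TAG", 3) == 0)
        r.end -= 128;

    return r;
}

/* Returns 1 if both buffers carry the same payload once tags are ignored. */
int same_payload(const unsigned char *a, size_t alen,
                 const unsigned char *b, size_t blen)
{
    struct content_range ra = audio_payload(a, alen);
    struct content_range rb = audio_payload(b, blen);
    size_t na = ra.end - ra.start;
    size_t nb = rb.end - rb.start;

    return na == nb && memcmp(a + ra.start, b + rb.start, na) == 0;
}
```

With ranges like these, two files of different total size can still be treated as candidates for linking, because only the bytes between start and end take part in the comparison.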

freedup is written in POSIX-compliant C and is released under the GNU General Public License. Its complexity is O(n log n) in the number of files: the file list is first sorted by file size using qsort, and full file comparison is then performed only between files of equal size.
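
The sort-then-compare approach can be sketched as below. This is a minimal example under assumptions, not the program's real data structures: struct file_entry, by_size and group_by_size are hypothetical names. After sorting, files of equal size sit next to each other, so a single linear pass finds every run that needs a closer look.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical record for one scanned file. */
struct file_entry {
    const char *path;
    off_t size;
    int group;   /* index of the equal-size run, or -1 if the size is unique */
};

/* qsort comparator: ascending by size, written to avoid overflow. */
static int by_size(const void *pa, const void *pb)
{
    const struct file_entry *a = pa;
    const struct file_entry *b = pb;

    return (a->size > b->size) - (a->size < b->size);
}

/* Sort by size and label runs of equal size.
 * Returns the number of runs containing more than one file. */
size_t group_by_size(struct file_entry *files, size_t n)
{
    size_t groups = 0;
    size_t i = 0;

    assert(files != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(files, n, sizeof *files, by_size);

    while (i < n) {
        size_t j = i + 1;
        size_t k;
        int id;

        /* Extend the run while the neighbouring size is the same. */
        while (j < n && files[j].size == files[i].size)
            j++;
        id = (j - i > 1) ? (int)groups++ : -1;
        for (k = i; k < j; k++)
            files[k].group = id;
        i = j;
    }
    return groups;
}
```

Files whose group is -1 have a unique size and can be skipped without ever being read; only the remaining runs go on to the signature and byte-by-byte stages.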