Find duplicate files using MD5SUMS plaintext files



Post by ^rooker »

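First, create an MD5 list for each of the two sets of files you want to compare:

md5sum ... > list_1.txt
md5sum ... > list_2.txt

md5sum separates the two columns (MD5, filename) with two spaces. Replace that separator with a ";" in both lists, so that the script below can split each line into its columns at the ";".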
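The separator replacement could be done with sed, for example. This is only a sketch: the function name md5_to_semicolon is made up for illustration, and it assumes plain md5sum output, meaning a 32-digit hex checksum followed by two spaces (text mode) or by a space and "*" (binary mode). Only that first separator is touched, so spaces inside filenames survive.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original post):
# convert md5sum output on stdin to "checksum;filename" lines on stdout.
md5_to_semicolon() {
    # Replace only the separator right after the 32 hex digits.
    sed -E 's/^([0-9a-fA-F]{32}) [ *]/\1;/'
}

# Example with a made-up line in md5sum format:
printf '%s\n' 'd41d8cd98f00b204e9800998ecf8427e  my file.txt' | md5_to_semicolon
# prints: d41d8cd98f00b204e9800998ecf8427e;my file.txt
```

To convert a whole list, feed it through the function and write the result to a new file (the output name is up to you), e.g. "md5_to_semicolon < list_1.txt > list_1_converted.txt".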

Code:

#!/bin/bash
# Arguments: the two lists to compare, both in "checksum;filename" format.
LIST_1="$1"
LIST_2="$2"

# Split each line at the first ";". "read -r" keeps backslashes untouched.
while IFS=';' read -r CHECKSUM_1 FILENAME_1; do
    [ -z "$CHECKSUM_1" ] && continue    # skip empty lines

    # Match the checksum only at the start of a line, never inside a filename,
    # and report every matching line of the second list separately.
    grep "^$CHECKSUM_1;" "$LIST_2" | while IFS=';' read -r CHECKSUM_MATCH FILENAME_MATCH; do
        echo "MATCH:"
        echo "  $CHECKSUM_1 ($FILENAME_1)"
        echo "  $CHECKSUM_MATCH ($FILENAME_MATCH)"
    done
done < "$LIST_1"