Find duplicate files using MD5SUMS plaintext files

Posted: Tue Jan 22, 2013 4:41 pm
by ^rooker
md5sum ... > list_1.txt
md5sum ... > list_2.txt

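Then replace the separator between the two columns (MD5 checksum, filename) with a ";" so that the script below can split each line at the ";". Note that md5sum writes two characters between the columns: two spaces, or a space and "*" in binary mode.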
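The replacement can be done with sed. This is only a minimal sketch, assuming the lists come straight from md5sum (a 32-character hex checksum, then the two separator characters, then the filename). The function name to_semicolon_list is a placeholder, not part of any tool:

```shell
#!/bin/bash
# Hypothetical helper: read md5sum output ("<md5>  <file>" or "<md5> *<file>")
# from stdin and write "<md5>;<file>" lines to stdout.
# Only the separator right after the checksum is replaced, so spaces
# inside the filename are left alone.
to_semicolon_list() {
    sed -E 's/^([0-9a-fA-F]{32}) [ *]/\1;/'
}

# Example with one hand-written line (the checksum is the MD5 of an empty file):
printf '%s\n' 'd41d8cd98f00b204e9800998ecf8427e  empty file.txt' | to_semicolon_list
```

To convert a whole list, feed the file through the function, e.g. "to_semicolon_list < list_1.txt" redirected into a new file. Lines that do not start with a plain 32-character checksum are passed through unchanged.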

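#!/bin/bash
# Usage: <script> list_1.txt list_2.txt
# Both lists must contain lines of the form "checksum;filename".
LIST_1="$1"
LIST_2="$2"

# Split each line at the first ";". -r keeps backslashes in filenames intact.
while IFS=';' read -r CHECKSUM_1 FILENAME_1; do
    # Skip empty lines; an empty pattern would match everything in list 2.
    [ -z "$CHECKSUM_1" ] && continue

    # Only match the checksum column (start of line), not the filenames.
    MATCHES=$(grep -- "^${CHECKSUM_1};" "$LIST_2")

    if [ -n "$MATCHES" ]; then
        echo "MATCH: "
        echo "  $CHECKSUM_1 ($FILENAME_1)"
        # The same checksum may occur several times in list 2; print every hit.
        while IFS=';' read -r CHECKSUM_MATCH FILENAME_MATCH; do
            echo "  $CHECKSUM_MATCH ($FILENAME_MATCH)"
        done <<< "$MATCHES"
    fi
done < "$LIST_1"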