Golang : Find duplicate files with filepath.Walk
Sometimes we downloaded a lot of files in a directory and although the files have different names, they could be duplicates of the same file . This small Golang program will scan a target directory and create a hash map for each files. If any files have similar sha512 hash, then they are ... essentially the same.
package main
import (
"crypto/sha512"
"fmt"
"io/ioutil"
"os"
"path/filepath"
)
var files = make(map[[sha512.Size]byte]string)
func checkDuplicate(path string, info os.FileInfo, err error) error {
if err != nil {
fmt.Println(err)
return nil
}
if info.IsDir() { // skip directory
return nil
}
data, err := ioutil.ReadFile(path)
if err != nil {
fmt.Println(err)
return nil
}
hash := sha512.Sum512(data) // get the file sha512 hash
if v, ok := files[hash]; ok {
fmt.Printf("%q is a duplicate of %q\n", path, v)
} else {
files[hash] = path // store in map for comparison
}
return nil
}
func main() {
if len(os.Args) != 2 {
fmt.Printf("USAGE : %s <target_directory> \n", os.Args[0])
os.Exit(0)
}
dir := os.Args[1] // get the target directory
err := filepath.Walk(dir, checkDuplicate)
if err != nil {
fmt.Println(err)
os.Exit(1)
}
}
Sample output :
"/Users/sweetlogic/Applications/.localized" is a duplicate of "/Users/.localized"
"/Users/sweetlogic/Desktop/.localized" is a duplicate of "/Users/.localized"
"/Users/sweetlogic/Desktop/01.jpg" is a duplicate of "/Users/sweetlogic/01.jpg"
"/Users/sweetlogic/Desktop/02.jpg" is a duplicate of "/Users/sweetlogic/02.jpg"
"/Users/sweetlogic/Desktop/03.jpg" is a duplicate of "/Users/sweetlogic/03.jpg"
See also : Generate checksum for a file in Go
By Adam Ng
IF you gain some knowledge or the information here solved your programming problem. Please consider donating to the less fortunate or some charities that you like. Apart from donation, planting trees, volunteering or reducing your carbon footprint will be great too.
Advertisement
Tutorials
+28.7k Golang : Change a file last modified date and time
+9.2k Golang : Get curl -I or head data from URL example
+8.2k Findstr command the Grep equivalent for Windows
+19.5k Golang : Get RGBA values of each image pixel
+19.5k Golang : How to count the number of repeated characters in a string?
+16.2k Golang : Get sub string example
+11.6k Golang : Format numbers to nearest thousands such as kilos millions billions and trillions
+5.4k Python : Convert(cast) string to bytes example
+16.7k Golang : Merge video(OpenCV) and audio(PortAudio) into a mp4 file
+6.7k Golang : Embedded or data bundling example
+10.9k PHP : Convert(cast) bigInt to string
+11.7k Get form post value in Go