Golang : Recombine chunked files example
Problem :
You followed the previous tutorial on how to split (chunk) a big file into smaller pieces, and now you want to recombine the chunks into a single file again. How do you do that?
Solution :
In Linux/Unix, you can use the cat command to recombine the chunks with this example line:
cat bigfile_0 bigfile_1 bigfile_2 bigfile_3 bigfile_4 bigfile_5 > catNEWbigfile.zip
However, this becomes tedious when there are many chunks, because every chunk has to be listed in the correct order.
The Go program below splits a file into chunks and then recombines the chunks into a new file.
I've tested this solution on a 6 MB bigfile.zip file and was able to unzip the new file successfully. Please try it on a larger file to see whether this solution also works with a file larger than 4 GB.
Here you go!
package main

import (
	"fmt"
	"io"
	"math"
	"os"
	"strconv"
)

func main() {
	fileToBeChunked := "./bigfile.zip" // change here!

	file, err := os.Open(fileToBeChunked)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer file.Close()

	fileInfo, err := file.Stat()
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	var fileSize int64 = fileInfo.Size()

	const fileChunk = 1 * (1 << 20) // 1 MB, change this to your requirement

	// calculate total number of parts the file will be chunked into
	totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))

	fmt.Printf("Splitting to %d pieces.\n", totalPartsNum)

	for i := uint64(0); i < totalPartsNum; i++ {
		partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
		partBuffer := make([]byte, partSize)

		// io.ReadFull keeps reading until the buffer is full;
		// a bare file.Read may return fewer bytes than requested
		if _, err := io.ReadFull(file, partBuffer); err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		// write/save buffer to disk
		// os.WriteFile (Go 1.16+, ioutil.WriteFile before that) creates or
		// truncates the file itself, so a separate os.Create is not needed
		fileName := "bigfile_" + strconv.FormatUint(i, 10)
		if err := os.WriteFile(fileName, partBuffer, 0644); err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		fmt.Println("Split to : ", fileName)
	}

	// just for fun, let's recombine back the chunked files in a new file
	newFileName := "NEWbigfile.zip"

	// create the new file (truncating any old copy) and open it
	// write-only in APPEND MODE
	newFile, err := os.OpenFile(newFileName, os.O_CREATE|os.O_TRUNC|os.O_APPEND|os.O_WRONLY, 0644)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	// newFile is closed explicitly at the end of main (instead of with
	// defer) so that an error from Close can be reported

	// just information on which part of the new file we are appending
	var writePosition int64 = 0

	for j := uint64(0); j < totalPartsNum; j++ {
		// read a chunk
		currentChunkFileName := "bigfile_" + strconv.FormatUint(j, 10)

		newFileChunk, err := os.Open(currentChunkFileName)
		if err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		chunkInfo, err := newFileChunk.Stat()
		if err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		// calculate the bytes size of each chunk
		// we are not going to rely on previous data and constant
		var chunkSize int64 = chunkInfo.Size()
		chunkBufferBytes := make([]byte, chunkSize)

		fmt.Println("Appending at position : [", writePosition, "] bytes")
		writePosition = writePosition + chunkSize

		// read into chunkBufferBytes, then close the chunk right away;
		// a defer inside the loop would keep every chunk open until main returns
		_, err = io.ReadFull(newFileChunk, chunkBufferBytes)
		newFileChunk.Close()
		if err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		// DON'T USE os.WriteFile/ioutil.WriteFile here -- it will overwrite the previous bytes!
		// append the buffer to the new file instead
		n, err := newFile.Write(chunkBufferBytes)
		if err != nil {
			fmt.Println(err)
			os.Exit(1)
		}

		newFile.Sync() // flush to disk

		// chunkBufferBytes goes out of scope at the end of each cycle, so the
		// garbage collector can reclaim it without a manual reset.
		// memory use per cycle still equals the chunk size, which can be
		// resource hogging if the chunk size is huge.

		fmt.Println("Written ", n, " bytes")
		fmt.Println("Recombining part [", j, "] into : ", newFileName)
	}

	// now, we close the newFileName
	if err := newFile.Close(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}
Sample output:
Splitting to 6 pieces.
Split to : bigfile_0
Split to : bigfile_1
Split to : bigfile_2
Split to : bigfile_3
Split to : bigfile_4
Split to : bigfile_5
Appending at position : [ 0 ] bytes
Written 1048576 bytes
Recombining part [ 0 ] into : NEWbigfile.zip
Appending at position : [ 1048576 ] bytes
Written 1048576 bytes
Recombining part [ 1 ] into : NEWbigfile.zip
Appending at position : [ 2097152 ] bytes
Written 1048576 bytes
Recombining part [ 2 ] into : NEWbigfile.zip
Appending at position : [ 3145728 ] bytes
Written 1048576 bytes
Recombining part [ 3 ] into : NEWbigfile.zip
Appending at position : [ 4194304 ] bytes
Written 1048576 bytes
Recombining part [ 4 ] into : NEWbigfile.zip
Appending at position : [ 5242880 ] bytes
Written 907617 bytes
Recombining part [ 5 ] into : NEWbigfile.zip
NOTE: This example loops over the chunk files using the part count calculated during the split step. If you need a separate program that scans a directory to determine how many chunk files to loop through, start by looking at this tutorial: https://www.socketloop.com/tutorials/golang-increment-string-example
Happy coding!
References:
https://www.socketloop.com/tutorials/golang-how-to-split-or-chunking-a-file-to-smaller-pieces
https://socketloop.com/references/golang-os-file-write-writestring-and-writeat-functions-example
https://www.socketloop.com/tutorials/golang-reset-buffer-example
See also : Golang : Increment string example
By Adam Ng