Chunk size best practices

Etienne     Jun 16 7:25AM 2017 CLI

How do I choose the "best" value for the chunk size parameters?

For example: the number of files in the repository, the average file size, the kind of files (JPEG, raw, documents...), the size of the storage... All of these parameters have an impact on the optimal chunk size.

It would be convenient to have hints available in the documentation. Etienne


gchen    Jun 16 1:08PM 2017

The default average chunk size of 4MB was chosen mostly for performance reasons. Duplicacy performs 2 API calls for each new chunk: one to check whether the chunk already exists and the other to actually upload it. So the larger the chunk, the fewer chunks there are and the less per-chunk overhead there is.
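
As a rough illustration of the overhead argument above, here is a minimal Python sketch that estimates the number of API calls for an initial upload. It assumes every chunk is new, that every chunk is exactly the average size, and that 1 MB means 1024 * 1024 bytes; the function name and the 100 GB example are made up for illustration and are not part of Duplicacy.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def api_calls_for_upload(total_bytes, avg_chunk_bytes, calls_per_chunk=2):
    """Estimate API calls to upload total_bytes of entirely new data.

    Assumption: every chunk is exactly avg_chunk_bytes long and costs
    calls_per_chunk calls (one existence check plus one upload).
    """
    chunks = math.ceil(total_bytes / avg_chunk_bytes)
    return chunks * calls_per_chunk


if __name__ == "__main__":
    # Compare a few average chunk sizes for a hypothetical 100 GB backup.
    for avg in (1 * MB, 4 * MB, 16 * MB):
        calls = api_calls_for_upload(100 * GB, avg)
        print(f"{avg // MB:>2} MB chunks: {calls:,} API calls")
```

Under these assumptions the call count is inversely proportional to the chunk size, so going from 1MB to 4MB chunks cuts the number of calls by a factor of four.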

The average chunk size also determines the deduplication efficiency. Any file smaller than the average chunk size is unlikely to see deduplication between different versions. So if you frequently edit text files that are a few hundred kilobytes in size, you may want to consider a smaller chunk size. However, the benefit from deduplicating small files is limited, so I think the default average chunk size of 4MB is a reasonable choice.
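
To make the small-file point concrete, here is a toy content-defined chunker in Python. It is only a sketch under stated assumptions and is not Duplicacy's actual chunking code: it uses a simple Gear-style rolling hash, assumes the average size is a power of two, picks minimum and maximum chunk sizes of a quarter and four times the average, and looks at one file in isolation. All names in it are hypothetical.

```python
import hashlib
import random

# Fixed pseudo-random table for the Gear-style rolling hash (toy values).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk_hashes(data, avg_size):
    """Split data at content-defined boundaries; return each chunk's SHA-256.

    Assumptions: avg_size is a power of two, min size is avg_size // 4,
    max size is avg_size * 4.
    """
    bits = avg_size.bit_length() - 1
    min_size, max_size = avg_size // 4, avg_size * 4
    hashes, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shifting left drops bytes older than 32 positions from the hash.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= min_size and (h >> (32 - bits)) == 0
        if at_boundary or length >= max_size:
            hashes.append(hashlib.sha256(data[start:i + 1]).hexdigest())
            start, h = i + 1, 0
    if start < len(data):
        # Emit whatever is left as the final chunk.
        hashes.append(hashlib.sha256(data[start:]).hexdigest())
    return hashes


def reused_fraction(old, new, avg_size):
    """Fraction of the new version's chunks that already exist in the old one."""
    old_chunks = set(chunk_hashes(old, avg_size))
    new_chunks = chunk_hashes(new, avg_size)
    return sum(c in old_chunks for c in new_chunks) / len(new_chunks)


# A 300 KB pseudo-random "file" and a copy with one byte changed in the middle.
_data_rng = random.Random(1)
original = bytes(_data_rng.getrandbits(8) for _ in range(300 * 1024))
edited = bytearray(original)
edited[len(edited) // 2] ^= 0xFF
edited = bytes(edited)

if __name__ == "__main__":
    for avg in (4 * 1024 * 1024, 16 * 1024):
        frac = reused_fraction(original, edited, avg)
        print(f"avg chunk {avg // 1024} KB: {frac:.0%} of chunks reused")
```

With the 4MB average, the whole 300 KB file ends up in a single chunk, so the one-byte edit leaves nothing to reuse. With the 16 KB average, the chunks that lie entirely before the edit are unchanged and can be deduplicated.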

For photos or documents like Microsoft Word files, I don't think there is any deduplication between versions to exploit, since a small edit will render a completely different file from beginning to end.