thedaveCA Feb 4 2:59AM 2017
What are the upper bounds of what Duplicacy supports and/or has been tested against?
I have two main groups of data: one is 2.22TB (436,566 files) and the second is 12.4TB (94,229 files). Is Duplicacy likely to be capable of handling these?
Should I use a single large repository (well, two, one for each group), or split the data into separate repositories? If I split, what would the ideal repository target sizes be?
Note that we don't currently back up everything to cloud storage. We have our own internal replication and backups, with only "critical" data being backed up to the cloud, but we're strongly considering backing up at least the first (2.22TB) group of data. This data is already subdivided, and we could easily set up backups of smaller subsets.
I expect to use the CLI with Backblaze B2. Currently we run Windows Server 2012 R2 servers, although we expect to add at least one Windows Server 2016 machine over the next few months.
Is this a reasonable use case for Duplicacy or not?
gchen Feb 4 8:27PM 2017
I've done some test backups on 1 million files, although their total size never exceeded 1TB. I feel that the number of files is a more important indicator of the 'hardness' of a backup than the total size, so there shouldn't be any reason why you can't back up 2TB or 12TB of data.
Dividing the data into smaller subsets isn't really necessary, but you should probably first try to upload a small subset (for instance, around 100GB) for testing purposes, just to see how fast your connection to B2 is and to estimate the total backup time on the complete data.
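As a rough sketch of that estimate, the Python snippet below extrapolates linearly from a timed test upload to the two data sizes mentioned above. The function name and the 5-hour figure for the 100GB test are made up for illustration; plug in your own measurement. It assumes throughput stays constant and ignores deduplication, compression and per-file overhead, so treat the result as a ballpark only.

```python
GB = 10**9
TB = 10**12


def estimate_backup_hours(test_bytes, test_seconds, total_bytes):
    """Linearly extrapolate total upload time (in hours) from a timed test upload."""
    if test_bytes <= 0 or test_seconds <= 0:
        raise ValueError("test size and duration must be positive")
    throughput = test_bytes / test_seconds  # measured bytes per second
    return total_bytes / throughput / 3600


if __name__ == "__main__":
    # Hypothetical measurement: a 100GB test subset took 5 hours to upload.
    test_bytes = 100 * GB
    test_seconds = 5 * 3600

    for label, size in [("2.22TB group", 2.22 * TB), ("12.4TB group", 12.4 * TB)]:
        hours = estimate_backup_hours(test_bytes, test_seconds, size)
        print(f"{label}: about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Later backups of mostly unchanged data should take much less time than this first-upload estimate, since only new chunks need to be uploaded.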