software comparison

Richard    Mar 14 11:05AM 2017 GUI

Can you outline the advantages of your software over Arq and qBackup: https://www.qualeed.com/en/qbackup/?

Thanks


gchen    Mar 14 11:45AM 2017

Our biggest advantage over Arq, or any other backup software, is the ability to take advantage of cross-computer deduplication. If you have many computers sharing a similar set of files (such as a large code base) then with Duplicacy these computers can back up to the same storage folder -- deduplication occurs across computers. With Arq you'll have to use a separate storage folder for each computer.
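The mechanism behind this is content-addressed chunking: a chunk's ID is derived from its contents, so identical data uploaded from different computers maps to the same chunk in the shared storage. Below is a minimal Go sketch of the idea, using fixed-size chunks for simplicity -- Duplicacy's real implementation uses variable-size chunking and its own storage format, so treat this only as an illustration of the principle:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chunkIDs splits data into fixed-size chunks and returns each chunk's
// SHA-256 hash. A chunk's ID depends only on its content, not on which
// computer produced it.
func chunkIDs(data []byte, size int) []string {
	var ids []string
	for i := 0; i < len(data); i += size {
		end := i + size
		if end > len(data) {
			end = len(data)
		}
		sum := sha256.Sum256(data[i:end])
		ids = append(ids, hex.EncodeToString(sum[:]))
	}
	return ids
}

func main() {
	// The same file backed up from two different computers...
	fileOnA := []byte("shared code base contents")
	fileOnB := []byte("shared code base contents")

	// ...produces identical chunk IDs, so the second computer finds the
	// chunks already present in the shared storage and skips the upload.
	fmt.Println(chunkIDs(fileOnA, 8)[0] == chunkIDs(fileOnB, 8)[0]) // true
}
```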

Duplicacy also supports more cloud storages than Arq, the most notable being Backblaze B2.

qBackup seems to be quite primitive at this moment and lacks many features (for instance, no built-in scheduler). They don't provide any documentation on how their deduplication works so I can't comment on this aspect.


Richard    Mar 14 1:55PM 2017

Thanks. I just ran a test with Duplicacy, and while the backup was indexing the drive, I was seeing hundreds of these messages.

14:51:10.887 Failed to read the symlink: The system cannot find the file specified.


gchen    Mar 14 2:42PM 2017

Is this on Windows? Can you run Duplicacy as administrator to see if this is a permission issue?

Unfortunately this log message doesn't include the file path. I'll fix it in the next update.


Richard    Mar 14 2:47PM 2017

Yes, on Windows 2012 R2 and running as Admin. But I'm also using Windows native dedup at the drive level, so I'm wondering if something odd is happening there.


gchen    Mar 14 2:56PM 2017

I see. Each deduped file is actually a reparse point, but Duplicacy can only handle reparse points that are symlinks or mount points. For any other reparse tag it returns ENOENT.

I think the solution is to treat dedup reparse points as if they are regular files. I'll fix it.
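For readers curious about the mechanics: Windows marks each reparse point with a tag value, so the fix amounts to branching on that tag. The sketch below is a hypothetical illustration of the logic, not Duplicacy's actual code; the tag constants are the documented winnt.h values:

```go
package main

import "fmt"

// Windows reparse tag values (from winnt.h).
const (
	IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
	IO_REPARSE_TAG_SYMLINK     = 0xA000000C
	IO_REPARSE_TAG_DEDUP       = 0x80000013
)

// classifyReparsePoint sketches the proposed fix: symlinks and mount
// points keep their special handling, dedup reparse points are treated
// as ordinary files (their contents remain readable through the normal
// file APIs), and anything else is skipped with a warning.
func classifyReparsePoint(tag uint32) string {
	switch tag {
	case IO_REPARSE_TAG_SYMLINK, IO_REPARSE_TAG_MOUNT_POINT:
		return "link"
	case IO_REPARSE_TAG_DEDUP:
		return "file" // back up as a regular file
	default:
		return "skip"
	}
}

func main() {
	fmt.Println(classifyReparsePoint(IO_REPARSE_TAG_DEDUP)) // file
}
```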


Richard    Mar 14 3:03PM 2017

Thanks. BTW - if you install and test qBackup, I think you'll find the interface to be slicker than Duplicacy's. And there can be multiple jobs. Scheduling is done via the command line and Task Scheduler, so it seems to work fine. I don't work for this company, but I've been testing it and others.

R


gchen    Mar 15 10:18AM 2017

Version 1.2.1 fixed this deduplication reparse point issue.

As for our GUI, we have a different design rationale -- we wanted to have a simple configuration page that does most basic things. For advanced users there is a CLI version that is more powerful and more flexible.


Charles    Apr 27 11:07AM 2017

I wanted to chime in on this for others who find this page, since I have been testing for months now. Please note that this is all my own experience; I have been looking for what best suits my needs and have not given thought to how these products might suit others' needs. gchen, feel free to delete this post if you don't like it here. I understand this is the "issues" section.

*Skip to the bottom for my thoughts on Duplicacy.

I started with no backup even though I knew better. I finally kicked myself one day and went with the free CrashPlan app for site-to-site backup plus unlimited cloud storage. It seemed to do fine at first, but the deduplication would choke once the dataset size increased, and backups of new data would take weeks to months. I dealt with this for a long time and tried to manage it with multiple backup sets.

Duplicati looked nice, but on the first day of testing it also choked after the first few hundred GB or so of data, possibly due to its deduplication algorithms. I kept tweaking the settings, but I was never completely satisfied with any of the setups I was able to achieve. This might be fine for some, but it wasn't for me.

Arq worked and is even multithreaded, but I hardly noticed any deduplication (I've had similar results with compression alone), which I could live with. However, there were no options to manage how many versions to keep or for how long, the Windows UI was clunky, and there was no Linux version. From what I can tell, it was developed for Mac and then ported to Windows. Overall it didn't seem like a good fit, and I didn't feel I had good control over the data. The pricing was pretty nice for personal use, though. My experience with their support team for things that weren't working wasn't great. They had some documentation about the implementation.

Cloud Backo gave me hope for a bit since it posts documentation about how it works. However, it kept failing, the logging system was a nightmare, and having to switch between the many different screens was a pain. One of the craziest things about this software was that you needed a backup of your backup settings in order to restore to another computer; it had no way of recognizing existing backups without that, or without recreating the backup set on the other computer. I did particularly like Cloud Backo's pricing structure: I could buy the simple file backup now and purchase other modules as I needed them. Even with so many options, it wasn't confusing the way modular pricing like that often can be.

Cloudberry Backup did everything I needed. It is fast, supports many storage options, local encryption, versioning, deletion policies, and easy control over all of the settings for each backup set -- and boy was it fast. There were no deduplication options unless you wanted to purchase separate software and set up a dedupe server. It does have block-level backup, but that was documented as a feature for diff comparisons, not dedupe. My major problem was that the pricing structure didn't really seem to support advanced home users. There were all of these hard-coded restrictions preventing you from running certain versions of their software on certain machines, plus there was the data cap. Why in the world are they placing an additional cost based on how much data I back up when they aren't the ones storing it? A sales rep said he would give me a deal for some social media activity, and I said sure. I tested the software and decided to buy it. All of a sudden he comes back with a $200-plus-maintenance plan to keep the software updated. I tried explaining to him that I'm a personal home user and that I was led to believe he would give me a price somewhere between the personal edition and the server edition. This was just $100 shy of the most expensive package. All of a sudden he starts playing dumb and asking things like "so do you want the home edition?" I suppose they may be a fine choice for a business, but the hidden fees don't make sense for personal use. They had documentation, but the organization made much of it hard to find.

Richard mentioned qBackup. I am not familiar with it, but I like the promises it makes. I also like that it claims the same UI across all platforms and the ability to restore across platforms. I never confirmed Cloudberry could restore across platforms, but that is one of my requirements. I will not test this software, though, since its FAQ clearly states it does not support VSS. This is a deal breaker, since I have had a mountain of issues come from backup services not using VSS while I am actively trying to use the data.

After some initial tests, I am choosing to implement Duplicacy as my backup software. Why? It seems to excel at everything I've previously mentioned. There is plenty of design documentation which outlines what I believe to be a pretty clever implementation. It achieves deduplication (as far as I can tell) at a speed that stays roughly constant regardless of data size, and does so across multiple backup sets and computers without a deduplication server as an intermediary (which I have now seen as the solution for a few dedup products). It is moderately fast with single-threaded uploads, and from what I hear it will support multi-threading for all storages soon. Support has been fantastic even though I haven't purchased anything yet. The pricing model is easy to understand and reasonable. The licensing is awesome, and there are plans to release the source code.

Minor annoyances that may improve with time come mainly from the GUI. The GUI is nice and simple, but support for additional backup sets would be welcome, as would the option to restore from other backup repositories without switching the storage location -- though now that I trust the software more after testing and researching, maybe I will combine my backup repositories into the same location. It seemed odd at first that there was no folder selection tool, but considering I normally select the root folder and then add a lot of excludes for what I don't want (so that new folders are sure to get added), it wasn't really that bad. Not for a data drive, anyway. For a desktop computer with multiple root-level directories, I ended up setting up a pseudo repository folder with symlinks in it.


gchen    Apr 27 8:38PM 2017

Wow! I really appreciate that you shared your experience in such thorough detail. This can be very helpful to other users.

It is great that Duplicacy works for you. Duplicacy is unique because of the idea of Lock-Free Deduplication, and in my own opinion this is the way backup should be done in this cloud age -- any backup tool that does not follow this paradigm will have some flaws here and there. Of course we are still young and there is still room for improvement. In particular, the GUI version, which is a simple wrapper that relies on inter-process communication with the CLI version, may not run as smoothly as a program with built-in backup/restore functionality.

Here is the short-term development plan:

  • Multi-threaded uploading and downloading (should be ready in a week or two)
  • A new backend for Google Cloud Storage based on the official Google client
  • Fair Source License
  • Rewrite the GUI version with a Go GUI library so it can run backup/restore without inter-process communication

By the way, your post deserves its own thread. Would you mind creating a new issue and then copying and pasting your post to there?


Copyright © Acrosync LLC 2016