Alternative location for .duplicacy directory ?

Etienne     May 30 3:09PM 2017 CLI

Hello ,

I'm slowly designing my new backup system based on duplicacy.

Context:

  • All my data is stored on my QNAP NAS (home folders, shared folders).
  • I plan to use OneDrive as remote storage.
  • Quite a few directories are synced between my NAS and my Mac desktops thanks to Syncthing.
  • I rely heavily on Docker for deploying additional capabilities on my NAS.
  • I do not want the .duplicacy directory to be replicated to all the computers taking part in the sync.
  • Moreover, I'd like the container to have read-only access to the repository while doing the backups.

I thought about creating the repository on the NAS and adding a symbolic link to the real directory to back up, but because of Docker's volume mapping, I doubt that following symbolic links inside the container would work.

For those reasons, I'd like to be able to store the .duplicacy directory outside of the repository.

  • Is there a way to do it without modifying the duplicacy-cli source code?

Otherwise, I thought about forking duplicacy-cli and preparing a pull request with

  • a new string parameter valid for all the subcommands, e.g. -confdir absolute_path
  • a patch to the getRepositoryPreference function in src/main/duplicacy_main.go so that it uses the -confdir value instead of looking in the current directory and then all the way up the path hierarchy
  • alternatively, keeping the .duplicacy directory inside the repository but putting a config file in it that redirects to another path elsewhere (more robust in case the -confdir parameter is forgotten)

Any comments ?

Is anyone besides me interested in adding this extra feature?

Thanks for taking the time to read this ;-) Etienne


gchen    May 30 8:30PM 2017

I think it is a good idea to add an option to specify the location of the .duplicacy directory. But is there a better name than -confdir? Something like -preference-dir or -pref-dir. I would avoid using anything similar to config, as it is the name of the file used by the storage to store settings.


Etienne    May 31 2:18AM 2017

Thanks for your answer !

I'd go for -pref-dir then.

I'll study the code more deeply and prepare a pull-request "asap" ;-)

Thanks Etienne


Etienne    Jun 1 2:21PM 2017

I'm trying to build the duplicacy-cli but build instructions do not work, probably due to

  • github repo being named duplicacy-cli
  • source code moved into the src directory

Any advice ?


gchen    Jun 1 3:13PM 2017

Follow the build instructions on the github page:

git clone https://github.com/gilbertchen/duplicacy-cli.git ~/go/src/github.com/gilbertchen/duplicacy
cd ~/go/src/github.com/gilbertchen/duplicacy/src
go get ./...
go build main/duplicacy_main.go

I made a change to main/duplicacy_main.go to change the import path from github.com/gilbertchen/duplicacy to github.com/gilbertchen/duplicacy/src. You need this change for the build instructions to work.


Etienne    Jun 2 6:18AM 2017

Thanks !!

I'll check tonight ( I'm at work right now)

IMHO, it would be "better" to stick with golang culture

  • using the same name for your repo and packages
  • putting the Go files at the repo root and creating a doc directory...
  • I tried to configure glide on the repo and it fails because of the repo/package name difference....

My 0.02€ Etienne


gchen    Jun 2 3:39PM 2017

I wanted the README.md page to look neat so I moved all source files to the src directory. But now that so many people have difficulties building from source, I think this may be a bad decision. So I just moved them back and renamed the github repository to duplicacy. Here are the updated build instructions:

git clone https://github.com/gilbertchen/duplicacy.git
cd duplicacy
go get ./...
go build main/duplicacy_main.go


Etienne    Jun 3 3:19AM 2017

Great !!

I'll redo my fork and "really" get started !!

Thanks a lot for listening to feedback! Have a nice weekend. Etienne


Etienne    Jun 4 5:42AM 2017

Hello

Need feedback ;-)

After studying the code a little bit more, here is what I found out:

  • The .duplicacy directory serves two purposes: it holds config files/cache, and it serves as a marker "tagging" a directory as the "top level" directory of a specific repository (more or less the git way with .git).
  • If I move .duplicacy elsewhere thanks to the -pref-dir option, there is no longer any way to detect the "top" of a repository.

Here is my proposal

Add the -pref-dir option only to the init command. If the option is not specified, the behavior is unchanged. If the option is specified:

  • create the preference directory at the specified location
  • create a file named .duplicacy-location in the repository "top" directory. This is a simple text file containing only the directory specified in -pref-dir, converted to an absolute path.

Implementation

  • finding the "top" directory works by walking up the directory hierarchy, looking at each level for .duplicacy and then, if .duplicacy does not exist, for .duplicacy-location. If either is found -> stop: we found the top

Finding the .duplicacy directory:

  • find the top of the repository
  • if .duplicacy exists -> yay! we found it
  • else read .duplicacy-location and use the path stored in it.

In my opinion:

  • It's compatible with already initialized repositories
  • Only the init command is enhanced
  • no stuff like symbolic links

Any comment/suggestion is very welcome

Have a nice Day Etienne


gchen    Jun 4 10:04AM 2017

This .duplicacy-location is basically a symbolic link. So my question is, why not make .duplicacy a symbolic link if the -pref-dir option is passed to the init command? The logic would be much simpler this way.


Etienne    Jun 4 11:17AM 2017

Indeed, .duplicacy-location is basically a symbolic link. BUT...

First let me explain my use cases:

The data I want to back up is stored on my NAS (QNAP). duplicacy will only run on the NAS itself (in a Docker container) and back up data to a remote cloud provider.

My users (my kids: I don't want them fiddling with duplicacy UI) use software like "Syncthing" [1] and QSync[2] to sync data on their laptops. This means the "repository"( data to back up) will be replicated on laptops unaware of duplicacy.

In this use case, having .duplicacy-location be a simple file without any "semantics for the OS" (unlike a symbolic link) is more "kid-proof", because no link to a nonexistent directory will be replicated on the laptops.

In my case, I'd rather have a normal hidden file in my repository than a symbolic link pointing to a (sometimes) nonexistent location...

Is it clearer? Thanks for your patience... Etienne PS: two options, -pref-dir (my idea ;-)) and -pref-dir-link (yours), can always be implemented ....

[1] https://syncthing.net/
[2] https://www.qnap.com/nl-nl/utilities (scroll to find qsync)


gchen    Jun 4 8:16PM 2017

How about making .duplicacy a file containing the real preference location if you pass the -pref-dir option to the init command? This way we don't need to deal with another special file like .duplicacy-location.

The -pref-dir-link isn't really needed I think, because users can create the symbolic link themselves if this is what they want.


Etienne    Jun 5 3:29AM 2017

I'll study the .duplicacy idea more deeply, but it might cause compatibility issues if an old binary is used with a new repository format.

Scenario: an external pref dir with an older client "unaware" of -pref-dir. With .duplicacy-location, the old binary will just fail because it cannot find the .duplicacy directory (error: repo not initialized).

With .duplicacy possibly being either a file or a directory, how an older client would fail in that case is not yet clear to me...

I'll keep you posted..

Etienne PS: Please have a look at Syncthing: they have the .stfolder marker file (an empty file) and the .stversions directory for Syncthing's internal use (revision tracking). PS2: I'm very open to alternative approaches/file names, but I think the solution must either be compatible with older client versions or fail safely, in a predictable way and without side effects...


Etienne    Jun 5 4:35PM 2017

I just pushed an 'early adopter' branch on my fork

It's not ready for prime time yet, but, gchen, would you mind having a look to check whether the coding style and overall logic are OK? (It of course needs more tests...)

Could you check the file duplicacy_shadowcopy_windows.go? The path-building code there is different. Is this intentional?

Thanks for your time Etienne PS: This is my first golang code... please be indulgent ;-)

https://github.com/ech1965/duplicacy/tree/pref-dir


gchen    Jun 5 7:22PM 2017

My suggestions:

  • I would name the function GetDuplicationPreferencePath rather than GetDotDuplicacyPathName, and the variable holding the return value preferencePath.
  • It looks like the preference path variable can be a global variable, loaded first by LoadPreferences. And GetDuplicationPreferencePath can simply return the global variable.
  • I don't remember exactly why \\ is used when creating the symlink, but maybe we should just keep it, unless you have a test that verifies the change. The reason for creating the symlink is that Go can't handle the UNC path required to access the shadow copy, but I don't know if this is still the case with Go 1.8. We need to revisit this issue later.

Thank you for the good work. I think it is in good shape.


Etienne    Jun 6 2:47AM 2017

Thanks for the feedback,

I'll update my branch asap !

BTW, do you have anything like an integration test suite? Otherwise I'll try to enhance the bash script I wrote. Etienne


gchen    Jun 6 10:51AM 2017

I don't have an integration test other than those unit tests in the *_test.go files. Your bash script may be a good start.


Etienne    Jun 7 1:52PM 2017

... GetDuplicationPreferencePath rather than GetDotDuplicacyPathName, and the variable holding the return value preferencePath.

Are you sure ? Or did you mean GetDuplicacyPreferencePath ?


gchen    Jun 7 2:03PM 2017

Sorry. Yes, it should be GetDuplicacyPreferencePath.


Etienne    Jun 11 7:13AM 2017

Hello,

I rebased my pref-dir branch on the current master (2.0.2) and updated the GUIDE.md document.

From my side, it looks ready to be merged.

I'm currently using a custom-built duplicacy in my backup scripts and will report any issues.

Feel free to request here any changes you need before merging it into the "main code"..

Thanks Etienne


gchen    Jun 11 7:34PM 2017

I'll take a look tomorrow. Thank you for your work!


gchen    Jun 12 10:14AM 2017

It looks good to me. I think it is ready for a PR.


Etienne    Jun 12 12:34PM 2017

Here is the pull request ...

https://github.com/gilbertchen/duplicacy/pull/71

Just rebased on the very latest master ( main directory renamed)


Etienne    Jun 21 3:34PM 2017

Hi !

I just found a regression in the -pref-dir behavior and I'm asking for advice :

situation

  • repository connected to OneDrive storage
  • pref-dir is located on another partition (actually inside a Docker container, on two different volumes)
  • performed a full backup -> OK
  • deleted a bunch of files -> Oops ;-)
  • tried to restore the deleted files from the backup (both with and without the -overwrite option)
  • restore fails with Failed to rename the file /pref-dirs/TEST_BACKUP/temporary to /datahome/TEST_BACKUP/CalibreKOBOEtienne/Veronica Roth/Allegeance (704)/Allegeance - Veronica Roth.epub: rename /pref-dirs/TEST_BACKUP/temporary /datahome/TEST_BACKUP/CalibreKOBOEtienne/Veronica Roth/Allegeance (704)/Allegeance - Veronica Roth.epub: invalid cross-device link

It looks like the line

err = os.Rename(temporaryPath, fullPath)

in duplicacy_backupmanager, around line 1228, fails because os.Rename can't move files across filesystems. [1]

What's your preferred way to fix this ?

  • download the file into the repository itself (i.e. the directory where the .duplicacy file is located) instead of the .duplicacy directory?
  • cope with the error and try copy/delete instead of rename? (not so easy to do right ...)
  • another clever idea?

Sorry for the regression ...

Etienne PS: do you want me to file a bug on GitHub directly?

[1] https://groups.google.com/forum/#!topic/golang-dev/5w7Jmg_iCJQ


gchen    Jun 21 7:57PM 2017

I think the best option is to enforce the in-place mode if the preference path is not under the repository. Just set the variable inPlace to true and it should work.

Let me fix this as I've made some changes to the GetDuplicacyPreference function in my branch. Yes, please file a bug in github.


Etienne    Jun 22 5:34AM 2017

Issue created !

https://github.com/gilbertchen/duplicacy/issues/82


gchen    Jul 7 9:28PM 2017

Sorry about the delay. I've been busy working on the performance study of Duplicacy and finally had some time to fix bugs accumulating over the past two weeks!

Let me know if the fix works.