regex confusion with subdirectories

gregthegeek     Mar 15 10:33PM 2018 CLI

I'm confused about how to use regex patterns. I can't reproduce with them what I get from the standard patterns. For example, the following works using the original patterns. I want to get only the /bu/quick_dumps/current/ contents, relative to the system root. (I do have other subdirs I want, using the same setup; this is just one.)

+bu/
+bu/quick_dumps/
+bu/quick_dumps/current/*
-bu/*
-*

If I try this with regex filters, the below excludes /bu completely, not what I want.

i:^bu/quick_dumps/current/.*
e:.*

I tried replicating the original pattern style as well, which pulls in all /bu contents, again, not what I want.

i:^bu
i:^bu/quick_dumps
i:^bu/quick_dumps/current/.*
e:.*

If I include any sort of e:^bu/.* at all, it will exclude the entire /bu folder, even the one I previously included.

Also, if I don't use the ^ character at the start, it pulls in another folder /zbu.

Anyway, how do I accomplish this with regex patterns? Ultimately what I want is a few separate folders in the system.

/bu/quick_dumps/current/*
/data/web/cli/*
/home/greg/*

I can accomplish this fine with the original patterns, but I'd like to use regex for more flexibility later. Thanks for any help!


saspus    Mar 16 3:44AM 2018

Just a guess, I haven't tried it myself, but your regex needs to match all partial paths as well, same as with simple exclusions.

i:^bu/(quick_dumps/(current/.*)?)?

It gets ugly fast :)


gchen    Mar 16 9:12AM 2018

You need to anchor the patterns:

i:^bu/$
i:^bu/quick_dumps/$
i:^bu/quick_dumps/current/.*
e:.*

Otherwise, e:bu will match both zbu and bu/others.


Christoph    Mar 18 7:09AM 2018

The other day, a similar thing happened to me and I spent half an hour or so debugging my filters file until I remembered the anchoring thing. TBH, it's a bit annoying and people will keep running into this issue. I wonder if there is not a better way to solve this. Could duplicacy not do the anchoring internally by itself? In other words: make it interpret

i:foo/bar/.*

as

i:foo/bar/.*
i:foo/

unless, of course, there is an explicit exclusion rule that forbids that.
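
Until something like that exists in the tool, the expansion could be done outside it. Below is a minimal Go sketch that generates the anchored parent lines from a literal directory path. The helper name expandInclude is hypothetical, it only handles literal paths (no regex in the input), and the trailing e:.* line is my assumption about how the filters file would be closed off.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// expandInclude turns a literal directory path such as "bu/quick_dumps/current/"
// into include lines: one exactly-anchored line per parent directory,
// then one line covering the target directory and everything beneath it.
func expandInclude(dir string) []string {
	parts := strings.Split(strings.Trim(dir, "/"), "/")
	var lines []string
	prefix := "^"
	for i, part := range parts {
		// QuoteMeta escapes regex metacharacters in the folder name.
		prefix += regexp.QuoteMeta(part) + "/"
		if i < len(parts)-1 {
			lines = append(lines, "i:"+prefix+"$")
		} else {
			lines = append(lines, "i:"+prefix+".*")
		}
	}
	return lines
}

func main() {
	for _, dir := range []string{"bu/quick_dumps/current/", "data/web/cli/", "home/greg/"} {
		for _, line := range expandInclude(dir) {
			fmt.Println(line)
		}
	}
	fmt.Println("e:.*") // exclude everything that was not included above
}
```

Shared parents (for example data/ when several folders live under it) would be emitted once per folder. That is redundant but harmless if the first matching line wins.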


gregthegeek    Mar 19 2:47PM 2018

Thanks all! @saspus that actually does work with a $ anchor at the end.
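
For anyone wondering why the $ is needed there, a quick Go check (Go because that is the regex engine duplicacy uses, per the error message further down) compares the nested pattern with and without the end anchor. The sample paths are made up for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// The nested pattern as posted, with no anchor at the end.
	openEnded = regexp.MustCompile(`^bu/(quick_dumps/(current/.*)?)?`)
	// The same pattern with a trailing $ so the whole path must be consumed.
	closed = regexp.MustCompile(`^bu/(quick_dumps/(current/.*)?)?$`)
)

func main() {
	paths := []string{
		"bu/",
		"bu/quick_dumps/",
		"bu/quick_dumps/current/a.sql",
		"bu/quick_dumps/old/",
		"bu/others/",
	}
	for _, p := range paths {
		fmt.Printf("%-30s open=%v closed=%v\n", p, openEnded.MatchString(p), closed.MatchString(p))
	}
}
```

Without the $, both optional groups can match nothing, so the pattern is satisfied by the bu/ prefix alone and every path under bu/ gets through. With the $, only bu/, bu/quick_dumps/ and anything under bu/quick_dumps/current/ match.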

I ended up spelling out each parent folder and anchor as gchen describes. I just didn't realize that each needed the $ anchor. Once I started adding multiple sub-folders using that ( ) method it was really confusing to figure out. Like you said, it gets ugly fast! It's annoying and counter-intuitive to need to spell out all those parent folders just to reach the one you want, but doable.

Thanks for the help! I'll probably have more questions on the regex patterns though.


gregthegeek    Mar 19 3:26PM 2018

Yeah, I do have more questions on regex, but I suspect this won't work because of the requirement to specify all the parent folders. So I have this folder structure:

/data/web/cli/ <several files and subfolders>
/data/web/gwmain/ <same>

In each of those there are subfolders data/tmp, data/cache, and logs/ that I'd like to skip. What I am testing on Regexr.com is something like this:

^\/data\/web\/(?:cli|gwmain)\/(?!data\/cache)(?!data\/tmp)(?!logs).*$

The above matches everything under the cli or gwmain folders but skips the subfolders listed. On Regexr.com that tests perfectly. In duplicacy it errors:

Invalid regular expression encountered for filter: "i:^data\/web\/(?:cli|gwmain)\/(?!data\/cache)(?!data\/tmp)(?!logs).*", error: error parsing regexp: invalid or unsupported Perl syntax: `(?!`

I tried without escaping the / characters too, same issue. So anyway, it occurs to me that this cannot work given the need to define each parent folder (unless there's maybe a different syntax I need?). Which leads me to my next question: what's the advantage of having regex if you can't do this?

Note: I am able to get what I want with all the defined lines, for each possible subfolder include and exclude. It just seems like regex would be much better for complex structures if they could be used this way.

Thanks again for any input.


gchen    Mar 19 10:20PM 2018

Go's regex library doesn't support the full Perl regex syntax. The (?! lookahead in your example is not available (the (?: group is accepted), but I don't think these special patterns are needed. Would this work for you?

e:^data/web/(cli|gwmain)/(data/cache|data/tmp|logs)/
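
A small Go sketch to check this outside duplicacy: it confirms that regexp.Compile rejects the (?! lookahead while accepting the (?: group, which agrees with the error message quoted above naming only `(?!`, and then tries the exclude pattern on a few sample paths. The sample paths and the helper name compileErr are invented for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileErr returns the error from Go's regexp package for a pattern, or nil if it compiles.
func compileErr(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

// skip is the exclude pattern suggested above, without the "e:" prefix.
var skip = regexp.MustCompile(`^data/web/(cli|gwmain)/(data/cache|data/tmp|logs)/`)

func main() {
	// Negative lookahead is not part of Go's regex syntax, so this reports an error.
	fmt.Println(compileErr(`^data/web/(?:cli|gwmain)/(?!logs).*`))
	// A non-capturing group on its own compiles without error.
	fmt.Println(compileErr(`^data/web/(?:cli|gwmain)/.*`))

	for _, p := range []string{
		"data/web/cli/logs/access.log",
		"data/web/gwmain/data/cache/",
		"data/web/cli/data/tmp/upload.part",
		"data/web/cli/index.php",
		"data/web/cli/logstash/conf.yml",
	} {
		fmt.Printf("%-36s excluded=%v\n", p, skip.MatchString(p))
	}
}
```

Because the pattern ends in a /, it only matches the named folders themselves and their contents, so a sibling such as logstash/ is left alone.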