boolean search by the string in the beginning of the name

Leopoldus
Posts: 237
Joined: 24 Jun 2004 10:58

Post by Leopoldus »

jacky wrote:...if it is only a list of words that must all be present in the filenames, then maybe this would work for you :

>^(?=.*?\bdog\b)(?=.*?\bfood\b)(?=.*?\bcat).*$
This regexp should match all filenames that contain the words "dog" and "food" and "cat*" (so cat, cats or category, but not advocate)
Well, this expression is a bit too complex for everyday use, don't you think?
However, could you please explain its syntax? I can recognize only the beginning and end of line and .* (any character with a quantifier), but not those subexpressions inside the brackets :?
And... are you sure that regular expressions are not sensitive to the order of words? I'm afraid your expression will return only "Dog and food for cats.txt" (the same order of words as in the query), but not "Food and cats for a dog.txt".

jacky
XYwiki Master
Posts: 3106
Joined: 23 Aug 2005 22:25
Location: France

Post by jacky »

No, regexps are sensitive to the order of what is in the pattern, for sure. But this is why this regexp uses lookahead - the (?=....) bits. For the regexp to match, each of the three lookaheads must succeed, and each one looks for one of the words you want. Since all three are tried from the start of the name, the order of the words won't matter in this particular case.

I'm no regexp-expert, but I found this from the great regular-expressions.info :
regular-expressions.info wrote:If a line can meet any out of series of requirements, simply use alternation in the regular expression. ^.*\b(one|two|three)\b.*$ matches a complete line of text that contains any of the words "one", "two" or "three".

If a line must satisfy all of multiple requirements, we need to use lookahead. ^(?=.*?\bone\b)(?=.*?\btwo\b)(?=.*?\bthree\b).*$ matches a complete line of text that contains all of the words "one", "two" and "three". Again, the anchors must match at the start and end of a line and the dot must not match line breaks. Because of the caret, and the fact that lookahead is zero-width, all of the three lookaheads are attempted at the start of the each line. Each lookahead will match any piece of text on a single line (.*?) followed by one of the words. All three must match successfully for the entire regex to match. Note that instead of words like \bword\b, you can put any regular expression, no matter how complex, inside the lookahead. Finally, .*$ causes the regex to actually match the line, after the lookaheads have determined it meets the requirements.
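
To check the order question directly, here is a minimal sketch in Python rather than XYplorer itself. It uses the pattern from this thread without the leading ">" (XYplorer's regexp marker) and a few made-up filenames. Case-insensitive matching is an assumption added here so that "Dog" and "Food" are found.

```python
import re

# Pattern from the thread, minus XYplorer's leading ">" marker.
# re.IGNORECASE is an assumption made for this sketch.
ALL_WORDS = re.compile(
    r"^(?=.*?\bdog\b)(?=.*?\bfood\b)(?=.*?\bcat).*$",
    re.IGNORECASE,
)

# Hypothetical filenames, including the two from the question above.
names = [
    "Dog and food for cats.txt",
    "Food and cats for a dog.txt",
    "Advocate of dog food.txt",   # "cat" only inside another word
    "Dog food.txt",               # no "cat" at all
]

# Each lookahead is tested from the start of the name, so word order
# in the name does not affect the result.
matches = [name for name in names if ALL_WORDS.match(name)]
print(matches)
```

Both word orders are kept, while the names lacking a word that starts with "cat" are dropped.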

Leopoldus
Posts: 237
Joined: 24 Jun 2004 10:58

Post by Leopoldus »

jacky wrote:I'm no regexp-expert, but I found this from the great regular-expressions.info
Thank you for this very interesting resource! I'll try to study some of the materials there.

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Post by admin »

Leopoldus wrote:
admin wrote:This should work [!A-Za-z] (or this? [!A-Z!a-z] ... no time to try now...)
I'm afraid neither of them works (both return nothing). I've tried using <^> ("not" in regular expressions, AFAIK) instead of <!>, but that does not work either.
If and when you have a bit of free time to play with it, maybe you'll find the right construction. But for practical needs one can use this clumsy syntax: cat* or *[ ,.-_ ;'\!\(\&@#£$€¤%{[%=+~§“”«»1234567890]cat*.
Thank you again for help!
In Boolean mode one has to escape the "!", that's all. These work fine:
cat* or *[\!a-z]cat*
cat* or *[\!a-zA-Z]cat*
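
For readers who want to see what this pair of patterns selects, here is a rough sketch using Python's fnmatch module, which also supports the [!...] negated set. It is only an approximation of XYplorer's behaviour: the backslash before "!" is specific to XYplorer's Boolean mode and is not needed in Python, and lowercasing the name first is an assumption about case handling. The function name and the filenames are made up for illustration.

```python
from fnmatch import fnmatchcase

def has_word_starting_with_cat(name: str) -> bool:
    """Approximate 'cat* or *[!a-z]cat*' (hypothetical helper)."""
    lowered = name.lower()  # assumption: matching ignores case
    # Either the name begins with "cat", or "cat" follows a non-letter.
    return fnmatchcase(lowered, "cat*") or fnmatchcase(lowered, "*[!a-z]cat*")

# Hypothetical filenames for illustration.
names = [
    "Cats and dogs.txt",
    "My cat.txt",
    "2cats.txt",
    "Advocate.txt",
    "bobcat.txt",
]

results = {name: has_word_starting_with_cat(name) for name in names}
for name, hit in results.items():
    print(f"{name}: {hit}")
```

Note that a digit counts as a non-letter, so "2cats.txt" is accepted, while "Advocate.txt" and "bobcat.txt" are rejected because a letter precedes "cat".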

Leopoldus
Posts: 237
Joined: 24 Jun 2004 10:58

Post by Leopoldus »

admin wrote:In Boolean mode one has to escape the "!", that's all. These work fine:
cat* or *[\!a-z]cat*
cat* or *[\!a-zA-Z]cat*
Now it really works! Thank you!
