Similar filename matches in a directory
I have a set of filenames in a directory that are all formatted the same way (i.e.: {descriptor}{3 character code}{3 character code}~{remaining codes}.ext), and I have already broken them down with a regular expression. Is there a way for a script to check each filename in the list against all the other filenames and list those where two or more filenames match on the first two groups of the regular expression?
If it helps, the regular expression I am using is: ([0-9a-zA-Z_.#\-]*)([a-zA-Z0-9]{3})([0-9]{3})(~)([0-9a-z\-]*)(\.)([a-z]*)$
Thank you for any help!
Re: Similar filename matches in a directory
And now post a real world example of file names...
One of my scripts helped you out? Please donate via Paypal or highend (at) web (dot) de
Re: Similar filename matches in a directory
A real world example would be:
12345678a05001~a.ext
So the regular expression should be able to group this as:
12345678 a05 001 ~ a .ext
The first group (12345678) could be longer or shorter than 8 characters and also contain letters.
The second group (a05) will always have 3 characters.
The third group (001) will always have 3 digits.
A tilde (~) separator.
Then the last group before the extension could be alphanumeric characters.
So if 2 or more files shared the same alphanumerics in the first two groups, the script would note those files. In the list below, the two filenames in the middle would get noted.
12345678a05001~a.ext
87654321a05002~b.ext
87654321a05003~c.ext
65432178b09000~b.ext
I hope I explained that enough to make some sense.
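For anyone who wants to check the grouping logic outside XYplorer, here is a minimal Python sketch of the idea described above. It is only an illustration, not the script used in this thread: the function name find_shared_prefixes is made up, the pattern is the one from the first post with a leading ^ added, and the sample list is the four filenames from this post.

```python
import re
from collections import defaultdict

# Pattern from the first post; the leading ^ is an addition for this sketch.
PATTERN = re.compile(
    r"^([0-9a-zA-Z_.#\-]*)([a-zA-Z0-9]{3})([0-9]{3})(~)([0-9a-z\-]*)(\.)([a-z]*)$"
)


def find_shared_prefixes(filenames):
    """Group names by (group 1 + group 2) and keep groups with 2+ members."""
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match:  # names that do not fit the format are skipped
            groups[match.group(1) + match.group(2)].append(name)
    return {key: names for key, names in groups.items() if len(names) >= 2}


sample = [
    "12345678a05001~a.ext",
    "87654321a05002~b.ext",
    "87654321a05003~c.ext",
    "65432178b09000~b.ext",
]
result = find_shared_prefixes(sample)
for key, names in result.items():
    print(key, names)
```

With the sample list, only the two middle filenames end up in a shared group, under the key 87654321a05.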
Re: Similar filename matches in a directory
Code:
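// List the items in the current folder (files only, names without paths), one per line
$files = listfolder(, , 1+4, <crlf>);
$log = "";
while ($files) {
    // ID of the first remaining file = regex group 1 + group 2
    $id = regexreplace(gettoken($files, 1, <crlf>), "^([0-9a-zA-Z_.#-]*)([a-zA-Z0-9]{3})([0-9]{3})(.*)", "$1$2", 1);
    // Escape regex metacharacters so the ID can be matched literally
    $escaped = regexreplace($id, "([\\.+(){\[^$])", "\$1");
    // Collect every remaining filename that starts with this ID
    $matches = regexmatches($files, "^" . $escaped . ".*?(?=\r?\n|$)", <crlf>, 1);
    if (gettoken($matches, "count", <crlf>) >= 2) {
        $log .= $matches . <crlf 2> . strrepeat("-", 20) . <crlf 2>;
    }
    // Remove the processed filenames from the list and drop the empty lines left behind
    $files = formatlist(regexreplace($files, "^" . $escaped . ".*?(?=\r?\n|$)", , 1), "e", <crlf>);
}
if ($log) {
    text "Matching files...<crlf>" . strrepeat("=", 17) . <crlf 2> . $log;
} else {
    text "No matches found!";
}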
Re: Similar filename matches in a directory
highend,
I don't even know what to say except for amazing and thank you.
Just as an added bonus: if there were certain sets of three characters in the second group (e.g. btr or imp) that I wanted to leave out, is there a way to exclude a set or two if needed? If not, I can very much work with what you have already done.
Re: Similar filename matches in a directory
Add another check in the
if (gettoken($matches, "count", <crlf>) >= 2) {
block that tests via regexmatches if the second group does NOT contain any of the ignored patterns and only do the
$log .= $matches . <crlf 2> . strrepeat("-", 20) . <crlf 2>;
stuff if that's true.
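To show the effect of such an exclusion check outside XYplorer, here is a rough Python sketch. It is an illustration under assumptions, not the XYplorer code: the names find_shared_prefixes and ignored_codes are made up, the sample filenames are hypothetical, and comparing the codes case-insensitively is an assumption as well. It skips ignored codes before grouping instead of inside the count check, which leads to the same list of reported files.

```python
import re
from collections import defaultdict

# First three groups of the pattern from the first post, up to the tilde.
PATTERN = re.compile(r"^([0-9a-zA-Z_.#\-]*)([a-zA-Z0-9]{3})([0-9]{3})~")


def find_shared_prefixes(filenames, ignored_codes=()):
    """Group names by (group 1 + group 2), skipping ignored second-group codes."""
    ignored = {code.lower() for code in ignored_codes}  # assumption: ignore case
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if not match or match.group(2).lower() in ignored:
            continue  # wrong format, or second group is on the ignore list
        groups[match.group(1) + match.group(2)].append(name)
    return {key: names for key, names in groups.items() if len(names) >= 2}


# Hypothetical filenames, made up for this example
sample = [
    "reportAbtr001~a.ext",
    "reportAbtr002~b.ext",
    "reportBimp001~a.ext",
    "reportBimp002~c.ext",
    "reportCa05001~a.ext",
    "reportCa05002~b.ext",
]
result = find_shared_prefixes(sample, ignored_codes=("btr", "imp"))
for key, names in result.items():
    print(key, names)
```

Here the btr and imp pairs are left out, so only the two reportCa05 files are reported.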
Re: Similar filename matches in a directory
highend,
Thank you so much!