Command Number for "Copy Containing Folder(s)"?

highend
Posts: 14950
Joined: 06 Feb 2011 00:33
Location: Win Server 2022 @100%

Re: Command Number for "Copy Containing Folder(s)"?

Post by highend »

A slightly changed version...

Tested on 26,547 files in 4,279 folders (SSD).

The last version needs 2350 msecs, the new one 1400, so the run time drops by about 40%.
Most of the gain comes from deriving the folder list from the $files variable instead of
running a second folderreport().

The last version wasn't working correctly because it cut off parts of matching paths.
The new version adds a trailing pattern to avoid this.


    $startingFolder = inputfolder("C:\", "Please select folder to search");
    $excludedFiles = input("Enter the file name(s) that should NOT be in any of the folders", "File names must include their extension but NOT the path! Separate all items with a pipe '|'. Wildcards are not allowed!");

    $files = folderreport("files", "r", $startingFolder, "r", , "<crlf>");
    // Derive folders from $files (faster than an extra folderreport)
    $folders = formatlist(regexmatches($files, "^.*(?=\\)", "<crlf>"), "dents", "<crlf>");

    $metaCharacters = "(\\|\*|\^|\$|\.|\+|\(|\)|\[|\{)";
    $escapedCharacters = "\$1";

    $excludedFiles = regexreplace($excludedFiles, $metaCharacters, $escapedCharacters);
    $matches = regexmatches($files, "^.*?(" . $excludedFiles . ")$", "<crlf>");

    if ($matches) {
        // Get everything in each line up to (but not including) the last backslash -> path component
        $pattern = regexreplace(formatlist(regexmatches($matches, "^.*(?=\\)", "|"), "dents"), $metaCharacters, $escapedCharacters);
        // To remove only full paths we have to add an additional trailing pattern
        // It omits the need for the formerly used formatlist at the end as well
        $pattern = trim(regexreplace($pattern, "(\||$)", "(\r?\n|$)|"), "|", "R");
        $folders = regexreplace($folders, "($pattern)");
    }
    text $folders;
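
The effect of the trailing pattern can be reproduced outside XYplorer. The following is a minimal Python sketch, not a port of the script above: the function names (`naive_remove`, `folders_without`) and the sample paths are made up for illustration, and Python's `re` module stands in for XYplorer's regex engine, which may differ in details.

```python
import re


def naive_remove(listing, hit):
    """Remove a folder path wherever it occurs (no trailing pattern).

    This also cuts the prefix off longer paths that start with the hit.
    """
    return re.sub(re.escape(hit), "", listing)


def folders_without(files, excluded_names):
    """Return the sorted, unique folders that hold none of the excluded names."""
    # Derive the folder list from the file list: everything before the last backslash.
    folders = sorted({f.rsplit("\\", 1)[0] for f in files})
    listing = "\n".join(folders)

    # Folders that directly contain one of the excluded file names.
    hits = sorted({f.rsplit("\\", 1)[0] for f in files
                   if f.rsplit("\\", 1)[1] in excluded_names})
    if not hits:
        return folders

    # Escape regex metacharacters, then require a line break or the end of
    # the text after each path, so only whole lines are removed.
    alternatives = "|".join(re.escape(h) for h in hits)
    pattern = r"(?:%s)(?:\r?\n|$)" % alternatives
    remaining = re.sub(pattern, "", listing)
    return [line for line in remaining.split("\n") if line]


if __name__ == "__main__":
    sample = [
        "C:\\data\\a\\keep.txt",
        "C:\\data\\a\\sub\\x.txt",
        "C:\\data\\b\\y.txt",
    ]
    print(folders_without(sample, {"keep.txt"}))
```

With the sample data, the folder holding keep.txt is dropped while its subfolder survives intact; `naive_remove` on the same listing would leave a damaged `\sub` fragment behind.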

Jeff Bellune
Posts: 284
Joined: 13 Dec 2007 12:55

Re: Command Number for "Copy Containing Folder(s)"?

Post by Jeff Bellune »

As long as we're going for speed, wouldn't a negated character class be much faster than the lazy quantifier? Like this:


//Old command:
$matches = regexmatches($files, "^.*?(" . $excludedFiles . ")$", "<crlf>");
//New command:
$matches = regexmatches($files, "^.*[^\r\n](" . $excludedFiles . ")$", "<crlf>");
In my simple tests, it cuts out at least one-third of the engine's backtracking steps.
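
For anyone who wants to try the comparison outside XYplorer, here is a rough Python sketch. It is only an approximation: Python's `re` engine is not the one XYplorer uses, the helper names (`build_patterns`, `make_listing`) and the generated paths are invented for the test, and the listing is joined with `\n` because Python's multiline `$` does not match before `\r`. Timings will vary by machine, so none are quoted here.

```python
import re
import timeit


def build_patterns(excluded_names):
    """Compile the lazy pattern and the negated-class variant for the same names."""
    names = "|".join(re.escape(n) for n in excluded_names)
    lazy = re.compile(r"^.*?(" + names + r")$", re.MULTILINE)
    negated = re.compile(r"^.*[^\r\n](" + names + r")$", re.MULTILINE)
    return lazy, negated


def make_listing(folder_count=2000, files_per_folder=10):
    """Build a synthetic file listing; every third folder holds thumbs.db."""
    lines = []
    for i in range(folder_count):
        for j in range(files_per_folder):
            name = "thumbs.db" if j == 0 and i % 3 == 0 else "file%d.txt" % j
            lines.append("C:\\root\\dir%d\\%s" % (i, name))
    return "\n".join(lines)


def matched_lines(pattern, text):
    """Return the full text of every matching line."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    listing = make_listing()
    lazy, negated = build_patterns(["thumbs.db"])
    for label, pattern in (("lazy", lazy), ("negated class", negated)):
        seconds = timeit.timeit(lambda: matched_lines(pattern, listing), number=5)
        print("%-14s %.3f s for 5 runs" % (label, seconds))
```

Both patterns return the same lines on this data, so any difference shows up only in the timings.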

What do you think?

Jeff

highend

Re: Command Number for "Copy Containing Folder(s)"?

Post by highend »

Run a real-life speed test (20-30k files) on both of them and then show us the results :)
Don't know if it really matters considering how fast regexmatches() is.

Jeff Bellune

Re: Command Number for "Copy Containing Folder(s)"?

Post by Jeff Bellune »

highend wrote: Run a real-life speed test (20-30k files) on both of them and then show us the results :)
Is that your way of saying, "It doesn't matter."? :)

Jeff Bellune

Re: Command Number for "Copy Containing Folder(s)"?

Post by Jeff Bellune »

Working my way through this with RegexBuddy, I have to say that this section of code is brilliant:


        // Get everything in each line up to (but not including) the last backslash -> path component
        $pattern = regexreplace(formatlist(regexmatches($matches, "^.*(?=\\)", "|"), "dents"), $metaCharacters, $escapedCharacters);
        // To remove only full paths we have to add an additional trailing pattern
        // It omits the need for the formerly used formatlist at the end as well
        $pattern = trim(regexreplace($pattern, "(\||$)", "(\r?\n|$)|"), "|", "R");
        $folders = regexreplace($folders, "($pattern)");
From my reading it seems that "\r?\n" makes the carriage return optional. Is that correct, and if so, is that for Linux or other OS compatibility?

Jeff

highend

Re: Command Number for "Copy Containing Folder(s)"?

Post by highend »

Jeff Bellune wrote: Is that your way of saying, "It doesn't matter."?
Nope. It's my way of saying: I don't know how much this affects the execution time of the command on a large set of files. If it's 10 milliseconds, who cares, but if it's a few seconds...
Jeff Bellune wrote: it seems that "\r?\n" makes the carriage return optional. Is that correct, and if so, is that for Linux or other OS compatibility?
Windows text files use \r\n to terminate lines while UNIX text files use only \n. So by making the \r optional, the pattern matches either a Windows or a Linux line terminator.
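
A small Python illustration of the optional carriage return (the function name `split_lines` and the sample strings are made up for this sketch; the `\r?\n` pattern itself behaves the same way in most regex engines):

```python
import re


def split_lines(text):
    """Split on either a Windows (\\r\\n) or a UNIX (\\n) line terminator."""
    return re.split(r"\r?\n", text)


if __name__ == "__main__":
    windows_text = "C:\\a\r\nC:\\b\r\nC:\\c"
    unix_text = "C:\\a\nC:\\b\nC:\\c"
    # Both inputs produce the same list of lines, with no stray \r left over.
    print(split_lines(windows_text))
    print(split_lines(unix_text))
```

Splitting on a plain `\n` instead would leave a trailing `\r` on every line of the Windows text.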

Jeff Bellune

Re: Command Number for "Copy Containing Folder(s)"?

Post by Jeff Bellune »

highend wrote:
Is that your way of saying, "It doesn't matter."?
Nope. It's my way of saying: I don't know how much this affects the execution time of the command on a large set of files. If it's 10 milliseconds, who cares, but if it's a few seconds...
it seems that "\r?\n" makes the carriage return optional. Is that correct, and if so, is that for Linux or other OS compatibility?
Windows text files use \r\n to terminate lines while UNIX text files use only \n. So by making the \r optional, the pattern matches either a Windows or a Linux line terminator.
Regarding the use of the lazy ".*?" versus ".*[^\r\n]": on a set of 7,900 folders containing 28,000 files, the difference is about 0.5 seconds when only a single file is listed in the excluded files list. (NB: the single file is found in many of the test folders.)

As the excluded files list grows to 10-12 files, the time difference is essentially nil. I assume that's because more excluded files means more folders will be excluded, and the time to replace, format, trim and remove folders from the list far exceeds the time to process the string of excluded folders. I can't think of any other reason for the time difference to shrink as more excluded files are input by the user.

Thanks for confirming my hypothesis about OS compatibility.

Cheers,
Jeff
