Folder Flatten & Segregate script - Has anyone...?

Papoulka
Posts: 455
Joined: 13 Jul 2013 23:41

Post by Papoulka »

I need a script to do two things, starting in a given 'home' folder:

1) Flatten the home folder by finding all files in all folders beneath it and moving them to 'home'. I don't expect many collisions and would like auto-increment to handle that because I don't mind a few dupes. But if manual resolution is required that's OK. Then delete all the now-empty subfolders.

2) Segregate the files now in 'home' according to extension. All .ABC files will go in new auto-created subfolder 'home'\ABC. All .DEF into new folder 'home'\DEF. Some files won't be either and will stay in the 'home' folder.
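
For readers who want the two steps spelled out, here is a minimal sketch in Python rather than XYplorer script. It is only an illustration of the request: the helper names (`unique_path`, `flatten`, `segregate`) and the " (n)" auto-increment suffix are assumptions made for this example, not anything XYplorer provides.

```python
import os
import shutil


def unique_path(path):
    """Return path unchanged, or with " (n)" before the extension if it is taken."""
    if not os.path.exists(path):
        return path
    base, ext = os.path.splitext(path)
    n = 2
    while os.path.exists(f"{base} ({n}){ext}"):
        n += 1
    return f"{base} ({n}){ext}"


def flatten(home):
    """Move every file below home into home, then remove the emptied subfolders."""
    # Bottom-up walk: deeper folders are emptied and removed before their parents.
    for current, _dirs, files in os.walk(home, topdown=False):
        if current == home:
            continue  # files already in home stay where they are
        for name in files:
            shutil.move(os.path.join(current, name),
                        unique_path(os.path.join(home, name)))
        os.rmdir(current)


def segregate(home, extensions):
    """Move files whose extension is listed into home/<EXT>; others stay put."""
    for name in os.listdir(home):
        src = os.path.join(home, name)
        ext = os.path.splitext(name)[1].lstrip(".").upper()
        if os.path.isfile(src) and ext in extensions:
            os.makedirs(os.path.join(home, ext), exist_ok=True)
            shutil.move(src, os.path.join(home, ext, name))
```

Calling `flatten(home)` followed by `segregate(home, {"ABC", "DEF"})` would perform parts 1 and 2 in order. Because `unique_path` treats an existing folder as a collision too, a file that shares its name with a subfolder gets a suffix instead of failing.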


Part 1 above may well have been done already since it would probably be useful to many people, but I couldn't find it. I did run across the following which will help a lot if I have to write this myself:

http://www.xyplorer.com/xyfc/viewtopic. ... der#p72610


Part 2 also may have been done but again I couldn't find anything.

--> If there are no scripts already made, any suggestions about what commands to use will be appreciated. I'm thinking that "moveto" using a list (created as in the example link) will be most likely to succeed.

I do realize that XY has some features that would facilitate doing all this manually on each folder. But I have many hundreds of folders to process and the mouse clicks really add up.

Thx

Papoulka

highend
Posts: 10053
Joined: 06 Feb 2011 00:33

Re: Folder Flatten & Segregate script - Has anyone...?

Post by highend »

Code:

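    $fileExtensions = "|abc|def|"; // extensions to segregate, wrapped in pipes

    setting "BackgroundFileOps", 0; // turn off background file operations
    $filesRoot = listfolder("<curpath>", , 1); // files directly in the root dir
    $foldersRoot = listfolder("<curpath>", , 2); // direct subfolders of the root dir
    $allFiles  = folderreport("files", "r", "<curpath>", "r", , "|"); // all files, recursively
    $filesInSubfolders = formatlist(replacelist($allFiles, $filesRoot, "", "|"), "dents"); // drop the root files from the list
    backupto "<curpath>", $filesInSubfolders, 4, , 0, 0, 0, 0, 0; // copy into root dir, collisions get a suffix
    delete 1, 0, $foldersRoot; // send the subfolders to the recycle bin

    $filesRoot = listfolder("<curpath>", , 1); // Get all files in root dir (second time)
    foreach($file, $filesRoot) {
        $curExtension = getpathcomponent($file, "ext");
        // Only move files whose extension is in the list
        if (strpos($fileExtensions, "|$curExtension|") != -1) {
            moveto "<curpath>\$curExtension", $file, , 2;
        }
    }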
Quick and dirty...

Edit the file extensions variable like this:
$fileExtensions = "|jpg|png|psd|";
In other words, DO NOT REMOVE the leading and trailing pipe ("|") symbol!
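
Why the pipes matter: the script looks for "|ext|" inside the list, so every entry needs a pipe on both sides. A small Python illustration of the same substring test (not XYplorer script; `is_listed` is a made-up helper for this example):

```python
def is_listed(ext, file_extensions):
    """True if ext is a whole entry of a pipe-wrapped list such as "|jpg|png|psd|"."""
    return f"|{ext}|" in file_extensions


wrapped = "|jpg|png|psd|"
unwrapped = "jpg|png|psd"

print(is_listed("png", wrapped))    # True: a middle entry has a pipe on both sides
print(is_listed("jpg", unwrapped))  # False: the first entry lacks its leading pipe
print(is_listed("ps", wrapped))     # False: partial names never match
```

Without the outer pipes, the first and last extensions in the list would silently be skipped.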

TEST IT BEFORE YOU USE IT ON REAL FILES!
One of my scripts helped you out? Please donate via Paypal or paypal_donate (at) stdmail (dot) de

Papoulka

@highend

Thank you highend! I'm impressed with how rapidly some people can put together ideas and code them.

Also note the donation in your Paypal account, for this and the other times you've helped.

This script does almost exactly what I was looking for, and is very instructive. I did change the "backupto" to "moveto", because each folder can have several hundred MB of files, and the copy process takes a long time. I understand that you chose "backupto" because it can handle collisions automatically, whereas "moveto" has fewer options and can't do that. But I don't have many collisions in this process.

A glitch in using "moveto" is that I occasionally get a "Rich Move" prompt. If I say "No" to that, all works fine. If I say "Yes", I get duplicates in a new folder but no problems. However, if I "cancel" the Rich Move, all the files are deleted! So if anyone is using my modified script below, be careful with that.

Here is the "Flatten Folder" part only, as modified to move rather than copy files:

Code:

    msg "Flatten? [Say NO to any Rich Move]", 1;
    focus "L";
    setting "BackgroundFileOps", 0;
    $filesRoot = listfolder("<curpath>", , 1);
    $foldersRoot = listfolder("<curpath>", , 2);
    $allFiles  = folderreport("files", "r", "<curpath>", "r", , "|");
    $filesInSubfolders = formatlist(replacelist($allFiles, $filesRoot, "", "|"), "dents");
    moveto "<curpath>", $filesInSubfolders, , 3;
    delete 1, 0, $foldersRoot;

    echo "all files collected in root";
I wish there was a way to turn off Rich Move in general. In most of my work I don't need it so the prompts become an annoyance.

Thanks again - this code is saving me a lot of time already.

highend

Papoulka wrote: Also note the donation in your Paypal account, for this and the other times you've helped.
Thanks a lot!

Papoulka wrote: I understand that you chose "backupto"
I wanted to be on the safe side. Deleting files / folders in scripts is always a risk (even if you delete them to the recycle bin -> what happens if there isn't enough space...).

Papoulka

Hello highend -

Well, I've used the Flatten script you wrote last year thousands of times, and it has helped me a lot. However, it has some fundamental issues which weren't apparent at first, mostly because the problem cases are relatively rare. When they do appear, though, the result is pretty dramatic, e.g. all files vanish...

The basic problem is clearly with same-name files and folders. Any of the following will cause issues: the root (parent) folder has the same name as a subfolder; two subfolders have the same name; one or more files have the same name as a subfolder (especially if this includes an extension); two files in the same folder have the same base name; folders (especially the root) contain no files.

These situations often occur when unpacking archives; I could speculate on why that's common but in any case it happens a lot. Of course these can be manually fixed beforehand but that's tedious and unreliable. In my own working copy of this script I have added some checks and protections, but they are inadequate for heavy use. Just to note: simple filename collisions aren't the problem; XY handles those fine. The problem arises in the processing before the Moveto command.

I really wish XY had a native Flatten command; this has been debated before and maybe I'll revive that request. But before that I wanted to bring this back up to you and see if you have any ideas. Maybe a different basic approach is needed, with the found filenames stored in separate strings rather than in one long one...?

Thanks again for your help

highend

Test setup:
R:\a.folder\a.tx1
R:\a.folder\a.txt
R:\a.folder\a.folder
R:\a.folder\a.folder\a.folder
R:\a.folder\a.folder\a.tx1
R:\a.folder\a.folder\a.txt
R:\a.folder\b.folder
R:\a.folder\b.folder\a.folder
R:\a.folder\b.folder\a.folder\a.tx1
R:\a.folder\b.folder\a.folder\a.txt
Root folder:
R:\a.folder

2 files in it:
R:\a.folder\a.tx1
R:\a.folder\a.txt

2 direct subfolders:
R:\a.folder\a.folder
R:\a.folder\b.folder

The first one contains three files: two have the same names as the files in the root dir, and one has the name of the root folder (which is also the name of one of its direct subfolders):
R:\a.folder\a.folder\a.folder
R:\a.folder\a.folder\a.tx1
R:\a.folder\a.folder\a.txt

The second subfolder contains only one subfolder again:
R:\a.folder\b.folder\a.folder

This one again contains two files with the same names as the ones in the root dir:
R:\a.folder\b.folder\a.folder\a.tx1
R:\a.folder\b.folder\a.folder\a.txt

This setup should reflect all of your problem cases:
// Root (parent) folder has same name as a subfolder
// two subfolders have the same name
// one or more files has the same name as a subfolder (especially if this includes an extension)
// two files in the same folder have the same base names

I can identify one major (script related) problem and a minor one (more related to XY's scripting commands).

1. The minor one:
replacelist stalls if $filesRoot is an empty string (-> the root folder contains only subfolders but no files).
Easy to fix, just a simple check if $filesRoot is not empty

2. The major one
R:\a.folder\a.folder\a.folder

This file would be backed up to R:\a.folder\a.folder, which will always fail because there is already a subfolder with that name in the root folder.
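
To spot this case before running a flatten, one could scan for files in subfolders whose name equals that of a direct subfolder of the root. A minimal sketch in Python (not XYplorer script); `files_blocked_by_folders` is a hypothetical helper, and the case-insensitive comparison is an assumption based on Windows file name rules:

```python
import os


def files_blocked_by_folders(root):
    """List files below root whose name equals the name of a direct subfolder of root."""
    # Names of the direct subfolders, lowercased for case-insensitive comparison.
    subfolders = {entry.name.lower() for entry in os.scandir(root) if entry.is_dir()}
    blocked = []
    for current, _dirs, files in os.walk(root):
        if current == root:
            continue  # files already in the root are not moved
        for name in files:
            if name.lower() in subfolders:
                blocked.append(os.path.join(current, name))
    return blocked
```

On the test setup above, this would report only R:\a.folder\a.folder\a.folder, the file that cannot be copied into the root.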

There would be several ways to fix this:

1. Replace all dots in all direct subfolders of the root dir with e.g. "_"
-> This could still lead to a problem if you have files in subfolders that have no extension (e.g. from a .zip archive with linux files in it)

2. Replace all dots AND rename all direct subfolders with a random suffix (appended with "_", not with a dot)
E.g.:
R:\a.folder\a.folder
-> R:\a.folder\a_folder_37385212
The chance that there is a file in any of the subfolders that is called "a_folder_37385212" is rather low
-> This should resolve this problem

3. Going through each file (only in subfolders) to see if their name matches one of the direct subfolders of the root dir and rename it (again, with some kind of random pattern)
-> Way slower than nr. 2
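
Option 2 could look roughly like this, sketched in Python rather than XYplorer script. `rename_direct_subfolders` is a hypothetical helper, and the 8-digit range is only an assumption chosen to match the example name above:

```python
import os
import random


def rename_direct_subfolders(root):
    """Rename each direct subfolder: dots become "_" and a random number is appended."""
    renamed = []
    for entry in list(os.scandir(root)):
        if not entry.is_dir():
            continue  # files in the root are left alone
        stem = entry.name.replace(".", "_")
        # Draw again in the unlikely case that the new name is already taken.
        while True:
            new_path = os.path.join(root, f"{stem}_{random.randint(10000000, 99999999)}")
            if not os.path.exists(new_path):
                break
        os.rename(entry.path, new_path)
        renamed.append(new_path)
    return renamed
```

The returned list holds the new folder paths, which is what a later delete step would need once the files have been moved out.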


I can't see any problems with:
// Root (parent) folder has same name as a subfolder
// two subfolders have the same name
// two files in the same folder have the same base names

If you have problems with any of these, provide a test case with file and folder names.

Neither problem is related to how file or folder names are stored (in one long string, separated by "|").
replacelist is rather slow, I can speed that up with a regexreplace...

highend

Try this and report any problems...

Don't forget to adjust the $fileExtensions variable. To see if any problems occur change this line:

Code:

backupto "<curpath>", regexreplace($filesInSubfolders, $sep, "|"), 4, , 0, 0, , , 0;
to

Code:

backupto "<curpath>", regexreplace($filesInSubfolders, $sep, "|"), 4, , 1, 0, , , 0;
It will ask you to save a report. Do so, look over it, any failures?

For debugging purposes I used a linefeed instead of a bar as the delimiter. Easier to read file / folder structures...

Code:

    $sep = "<crlf>";
    $fileExtensions = "|abc|def|";

    setting "BackgroundFileOps", 0;
    $filesRoot = listfolder("<curpath>", , 1, $sep);
    $foldersRoot = listfolder("<curpath>", , 2, $sep);

    // Rename direct subfolders
    if ($foldersRoot) {
        $newRootSubFolders = "";
        foreach($folder, $foldersRoot, $sep) {
            $newName = replace(getpathcomponent($folder, "component", -1), ".", "_") . "_" . rand(1000000, 99000000);
            rename "b", $newName, , $folder;
            $newRootSubFolders = $newRootSubFolders . "<curpath>\" . $newName . "|";
        }

        $allFiles = folderreport("files", "r", "<curpath>", "r", , $sep);
        $filesInSubfolders = $allFiles;
        if ($filesRoot) { $filesInSubfolders = formatlist(replacelist($allFiles, $filesRoot, "", "|"), "dents", $sep); }

        backupto "<curpath>", regexreplace($filesInSubfolders, $sep, "|"), 4, , 0, 0, , , 0;
        delete 1, 0, $newRootSubFolders;
    }

    $filesRoot = listfolder("<curpath>", , 1); // Get all files in root dir (second time)
    foreach($file, $filesRoot) {
        $curExtension = getpathcomponent($file, "ext");
        if (strpos($fileExtensions, "|$curExtension|") != -1) {
            moveto "<curpath>\$curExtension", $file, , 2;
        }
    }

Papoulka

Highend -

Thanks for your attention to this. I'm always impressed with your coding skills; and this utility is important to me. Please see your donation account as there should be enough for a few :beer: :D

This script is definitely more robust than the previous, and could be the general "Flatten" util that XY needs. One small fix is required which I have made to my copy: the script fails if there is no subfolder in the base folder.

Beyond that, I have to use "moveto" instead of "backupto". Each flatten I do involves at least 300MB, sometimes 3GB, and it's all on external USB drives. I wish "moveto" could totally avoid any RichMove prompts but there it is.

I'll let you know if any issues arise but it seems strong now. Thx.
Last edited by Papoulka on 14 Jun 2015 21:32, edited 1 time in total.

highend

Thanks :)

Papoulka wrote: the script fails if there is no subfolder in the base folder.
Yeah, I added it to the last script.

Papoulka wrote: Beyond that, I have to use "moveto" instead of "backupto".
Mh, how does that handle file collisions?

Papoulka

MoveTo pops up the "Newer / older / keep both" option box when there are collisions. That's adequate for my use. I could also go with the Backupto behavior you had scripted which adds a suffix, but that choice doesn't exist for Moveto.

highend

So it's a tradeoff between clicking or waiting (depending on how large the filesize is). An additional flag for adding a suffix would be a nice option in that case (to get away without clicking).

highend

1. Defining $fileExtensions is now a bit easier (you do not have to add a leading or trailing "|" to this list)
2. Rename all direct subfolders of the root dir in one batch (no foreach loop anymore)
3. No slow replacelist command anymore
4. Files are no longer moved one by one into their respective subfolders (derived from $fileExtensions); instead the script first builds lists of files via regexes (yeah!) and moves them together. Way faster than before.

The only thing that would make the whole thing a bit faster would be a moveto command that supports suffixes if a file / folder already exists.
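
The idea behind point 4, shown in Python rather than XYplorer script: escape the extension so regex metacharacters in it are taken literally, then pull all matching lines out of the file list in one pass. `files_with_extension` is a hypothetical helper; the newline-separated list and the case-insensitive match are assumptions made for this sketch.

```python
import re


def files_with_extension(file_list, ext):
    """Return the lines of a newline-separated file list that end in ".<ext>"."""
    # re.escape keeps characters such as "+" or "." in ext from acting as regex syntax.
    pattern = r"^.*\." + re.escape(ext) + r"$"
    return re.findall(pattern, file_list, flags=re.MULTILINE | re.IGNORECASE)


files = "\n".join([
    r"R:\home\a.txt",
    r"R:\home\b.xys",
    r"R:\home\c.txt.bak",
    r"R:\home\d.TXT",
])

for ext in "txt|xys".split("|"):
    print(ext, files_with_extension(files, ext))
```

Each extension yields one list of files, so one move per extension replaces one move per file.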

Code:

    $fileExtensions = "txt|xys";

    setting "BackgroundFileOps", 0;
    $sep = "<crlf>";
    $fRegExPat   = "([\\^$.+*|?(){\[])";
    $filesRoot   = listfolder("<curpath>", , 1, $sep);
    $foldersRoot = listfolder("<curpath>", , 2, $sep);

    // Rename direct subfolders
    if ($foldersRoot) {
        rename "b", "*_" . rand(1000000, 9000000), , replace($foldersRoot, "<crlf>", "|"), 12, "_";
        $foldersRoot = listfolder("<curpath>", , 2);

        // Remove all files from rootdir from the list of all files
        $allFiles = folderreport("files", "r", "<curpath>", "r", , $sep);
        $filesInSubfolders = $allFiles;
        if ($filesRoot) {
            $filesRoot = regexreplace($filesRoot, $fRegExPat, "\$1");
            $filesRoot = replace("(" . regexmatches($filesRoot, "^.*(\r?\n|$)", "|") . ")", $sep);
            $filesInSubfolders = formatlist(regexreplace($allFiles, "$filesRoot"), "e", $sep);
        }

        if ($filesInSubfolders) { backupto "<curpath>", regexreplace($filesInSubfolders, $sep, "|"), 4, , 0, 0, , , 0; }
        delete 1, 0, $foldersRoot;
    }

    // Get all files in root dir (second time)
    $filesRoot = listfolder("<curpath>", , 1, $sep);
    foreach($fileExt, $fileExtensions) {
        if ($fileExt) {
            $files = regexmatches($filesRoot, "^.*\." . regexreplace($fileExt, $fRegExPat, "\$1") . "$");
            if ($files) { moveto "<curpath>\$fileExt", $files, , 2; }
        }
    }

Papoulka

Hello highend -

FYI, your latest script (of 14 Jun 2015 23:50), although better internally and theoretically faster, has a subtle weakness. When it is used on large external (USB) drives with big trees (10K+ folders and 200K+ files), its method of renaming things apparently shocks XY into one of its "reexamine everything" refresh fits. So each flatten, though it involves only one of these folders and a few subfolders, will often take 1 to 2 min to complete.

Fortunately for me, since this is my primary use case, your earlier script of 14 Jun 2015 09:12 does not have this problem. So I have adopted it for my work. Again, I am forced to use "moveto" rather than "backupto" because my folders are 300MB+ and recopying all that stuff is not practical. These will flatten in around 5 sec and XY rarely hangs during the process.

The only point to note is that this earlier script throws an error if all subfolders are empty. That's not real-world and I only happened on it while testing, so that's FYI too.

Thanks

highend

Papoulka wrote: So each flatten, though it involves only one of these folders and a few subfolders, will often take 1 to 2 min to complete

Code:

    setting "autorefresh", 0;

Papoulka wrote: that this earlier script throws an error if all subfolders are empty
A simple check if $allFiles is empty would fix that...

hermhart
Posts: 160
Joined: 13 Jan 2015 18:41

Post by hermhart »

highend,

I like what Papoulka did with trying to break down the script that you did to just flatten the directory, but I'm having trouble getting this to work. Would you be able to tweak this so that it just flattens the current directory?

Thanks...

Code:

    msg "Flatten then recycle all sub-folders to currently selected tree item?", 1;
    setting "BackgroundFileOps", 0;
    $filesRoot = listfolder("<curpath>", , 1);
    $foldersRoot = listfolder("<curpath>", , 2);
    $allFiles  = folderreport("files", "r", "<curpath>", "r", , "|");
    $filesInSubfolders = formatlist(replacelist($allFiles, $filesRoot, "", "|"), "dents");
    backupto "<curpath>", $filesInSubfolders, 4, , 0, 0, 0, 0, 0;
    delete 1, 0, $foldersRoot;

    msg "All files flattened.";
