
Need a script to find file extensions with conditions

Posted: 08 Feb 2012 00:50
by kelwin
Help: Need a script to find file extensions with conditions

I haven't been able to find a program to do this, as it is not quite a "find duplicate file" scenario. Can anyone help?

Scenario:
I need to convert drawings from .pdf to .tiff/tif to be able to import into another program.
However, I often also receive the drawing as a .tiff so don't always have a corresponding .pdf.
Otherwise it would be a simple solution of deleting all .tiff.

I have about 3500 project folders, with a squillion subfolders, and backups are getting too large. If I could filter out the duplicate .tiff, that would significantly reduce the total space required.

Hence, I want to search for all .tiff/.tif files that have a matching .pdf in the same subfolder, and move them to the local Recycle Bin (i.e. delete them) to reduce the size of the backups.


So ideally the folder path would be treated as part of the file name when looking for duplicates, as the file name alone is not necessarily unique. The file extension would then be compared to find the "duplicates".


Example files:

1 d:\projects\House1\floorplan.pdf
2 d:\projects\House1\floorplan.tif
3 d:\projects\House2\floorplan.pdf
4 d:\projects\House2\floorplan.tif
5 d:\projects\House2\revised plan\floorplan.tif
6 d:\projects\House3\floorplan.pdf
7 d:\projects\House3\floorplan.tif

Result:

I would only want to find (and then delete) files 2, 4 and 7.

I don't want to delete file 5 as there is no matching .pdf in that subfolder...


A bonus would be to include a variable so that only files older than, say, 3 months are searched.


I have no scripting ability, so if someone is able to help, it would need to be the full monty.


Can anyone assist?


Thanks in advance.

Re: Need a script to find file extensions with conditions

Posted: 08 Feb 2012 02:21
by highend
Quick shot...


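	// Recursive report of all files below the current folder, one per line;
	$files = folderreport("files:{fullname}|{modified yyyy-mm-dd}", "r", , "r", , "<crlf>");
	$list = "";

	foreach($entry, $files, "<crlf>") {
		//step;
		// Split the full name into folder, base name and extension;
		$firstToken = gettoken($entry, 1, "|");
		$lastBS = strpos($firstToken, "\", -1);
		$curDir = substr($firstToken, 0, $lastBS);
		$curFile = substr($firstToken, $lastBS +1);
		$lastDot = strpos($curFile, ".", -1);
		$curName = substr($curFile, 0, $lastDot);
		$curExt = substr($curFile, $lastDot+1);
		// For each PDF, queue a same-named .tif or .tiff from the same folder;
		if($curExt == "pdf") {
			if(exists("$curDir\$curName.tif") == 1) {
				$list = $list . "$curDir\$curName.tif" . "|";
			}
			if(exists("$curDir\$curName.tiff") == 1) {
				$list = $list . "$curDir\$curName.tiff" . "|";
			}
		}
	}
	// Show the queued files, then delete them to the Recycle Bin;
	text $list;
	delete 1, 1, $list;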
Hadn't time to think about the time thing, but the script should do the basic task.
Test it extensively (take a look at the text output) before you use it on your folders...
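
For the "older than 3 months" bonus that the script above leaves open, here is a rough sketch of the same matching rule with an age filter. It is written in Python rather than XYplorer script, purely as an illustration. The function name `find_redundant_tiffs` and the `min_age_days` parameter are made up for this example, and "age" is assumed to mean the TIFF's last-modified time. It only lists candidates and deletes nothing, since the Python standard library has no Recycle Bin call.

```python
import os
import time

TIFF_EXTENSIONS = (".tif", ".tiff")


def find_redundant_tiffs(root, min_age_days=0):
    """List TIFFs under root that have a same-named PDF in the same folder.

    Hypothetical helper: min_age_days keeps only TIFFs whose last-modified
    time is at least that many days in the past (0 disables the filter).
    """
    cutoff = time.time() - min_age_days * 86400
    matches = []
    for folder, _subfolders, names in os.walk(root):
        # Base names of every PDF in this one folder, compared case-insensitively.
        pdf_stems = {
            os.path.splitext(name)[0].lower()
            for name in names
            if name.lower().endswith(".pdf")
        }
        for name in names:
            stem, ext = os.path.splitext(name)
            if ext.lower() not in TIFF_EXTENSIONS:
                continue
            if stem.lower() not in pdf_stems:
                continue  # no matching PDF here, so keep the TIFF
            full_path = os.path.join(folder, name)
            if os.path.getmtime(full_path) <= cutoff:
                matches.append(full_path)
    return sorted(matches)
```

Calling `find_redundant_tiffs(r"d:\projects", min_age_days=90)` would return the list to review before anything is removed by hand or by another tool.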

Re: Need a script to find file extensions with conditions

Posted: 08 Feb 2012 05:31
by kelwin
Thankyou sooo much

I have done some preliminary testing on mock project files/folders, and have increased the complexity of the subfolders, and so far it works 100% perfect.
"Restoring" from Recycle Bin also works a treat.

I am currently jumping around the room with joy!

I can't wait to run it on the ~3500 folders, ~14500 subfolders, and ~ 260000 files and see what savings in diskspace I can achieve.

PS: I have "backed up" the backups just in case, too.

Re: Need a script to find file extensions with conditions

Posted: 08 Feb 2012 05:37
by j_c_hallgren
kelwin wrote: I can't wait to run it on the ~3500 folders, ~14500 subfolders, and ~ 260000 files and see what savings in diskspace I can achieve.
I would certainly suggest you do this in several phases: while XY is great, running it on everything at once might be a bit stressful on your system.

Re: Need a script to find file extensions with conditions

Posted: 08 Feb 2012 05:43
by kelwin
top tip

thanks

I'll take that on board - maybe break up the folders into A-E, F-K etc. and run it on a smaller scale
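
One way to picture the A-E / F-K split is the sketch below. It is Python and purely illustrative. The helper name `batch_by_initial` is invented here, and the ranges after F-K are an assumption, since only the first two were mentioned. Folders whose names start with a digit or symbol land in an extra "other" batch.

```python
import os


def batch_by_initial(root, ranges=("A-E", "F-K", "L-P", "Q-Z")):
    """Group the top-level folders of root into batches by first letter.

    Hypothetical helper: the default ranges beyond F-K are assumed.
    """
    batches = {letter_range: [] for letter_range in ranges}
    batches["other"] = []
    for name in sorted(os.listdir(root)):
        if not os.path.isdir(os.path.join(root, name)):
            continue  # only project folders are batched, loose files are skipped
        first = name[0].upper()
        for letter_range in ranges:
            low, high = letter_range.split("-")
            if low <= first <= high:
                batches[letter_range].append(name)
                break
        else:
            # First character is outside every range (digit, symbol, ...).
            batches["other"].append(name)
    return batches
```

Each batch can then be processed on its own, so a problem in one run only affects that group of folders.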

Re: Need a script to find file extensions with conditions

Posted: 08 Feb 2012 05:50
by Muroph
seems like i'm a bit too late.
just to avoid wasting the effort, here's a slight modification of a script i use to find any mkv with a matching mp4.
remove the "//" before #359 to make the script filter the list and show only the files that matched.
BTW, this script only selects the files, then you can just hit del to get rid of them.


#263;
  selfilter "*.pdf",f;
  $items=report("{fullpath}\{basename}.tif|",1);
  $items2=report("{fullpath}\{basename}.tiff|",1);
  selectitems $items,0,1,n;
  selectitems $items2,0,1,a;
  //#359;sel a;
  $total=getinfo(countselected);
  status "$total files selected",77ff00;
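
As a cross-check of the matching rule that both scripts aim for, here is a small Python sketch that works on path strings only and never touches the disk. The function name `tiffs_with_pdf` is invented for this example. It treats folder, base name and extension case-insensitively, which is an assumption about how Windows paths should be compared, and it accepts both .tif and .tiff.

```python
from pathlib import PureWindowsPath

# The seven example files from the first post.
EXAMPLE_PATHS = [
    r"d:\projects\House1\floorplan.pdf",
    r"d:\projects\House1\floorplan.tif",
    r"d:\projects\House2\floorplan.pdf",
    r"d:\projects\House2\floorplan.tif",
    r"d:\projects\House2\revised plan\floorplan.tif",
    r"d:\projects\House3\floorplan.pdf",
    r"d:\projects\House3\floorplan.tif",
]


def tiffs_with_pdf(paths):
    """Return the TIFF paths that have a same-named PDF in the same folder."""
    parsed = [PureWindowsPath(p) for p in paths]

    def key(path):
        # Folder plus base name identifies a drawing, the extension does not.
        return (str(path.parent).lower(), path.stem.lower())

    pdf_keys = {key(p) for p in parsed if p.suffix.lower() == ".pdf"}
    return [
        str(p)
        for p in parsed
        if p.suffix.lower() in (".tif", ".tiff") and key(p) in pdf_keys
    ]
```

`tiffs_with_pdf(EXAMPLE_PATHS)` returns files 2, 4 and 7 from the example list. File 5 is left out because its subfolder contains no floorplan.pdf.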

Re: Need a script to find file extensions with conditions

Posted: 13 Feb 2012 00:22
by kelwin
Thanks, highend, for the script.

Ran it over the weekend and it worked a treat (especially after breaking it down into smaller lots).