
Re: Script Request - Save Offline Web Pages easily

Posted: 03 Dec 2010 13:16
by SkyFrontier
Improved version.
This one covers practically all my needs regarding the OT.
EDIT (forgot to add):
-it removes a period at the end of the name, if any (titles often end with one, which would otherwise give you a file named "File..htm", depending on your workflow);
-it detects extensions and leaves them alone (so if the input is "any page.htm", the output will be "AnyPage.htm");
-it turns dots in any other position of the base name into underscores (note: it treats things like test.yyyy [out: Test_Yyyy] and Tst.zz [out: Tst_Zz] as part of the base name, and treats ".xxx" as the extension);
-it removes a parenthesis at the end of the name (the behavior I expect) and converts question marks and similar characters to letter or word substitutes ("?" becomes "G", "&" becomes "And"), as in:

Code:

XYcopy 1.00.0028 missing from latest beta? > XYcopy1_00_0028MissingFromLatestBetaG
XYcopy 1.00.0028 missing from latest beta?.htm > XYcopy1_00_0028MissingFromLatestBetaG.htm
BETA version (with detailed history information) > BETAVersion_WithDetailedHistoryInformation
BETA version (with detailed history information > BETAVersion_WithDetailedHistoryInformation
BETA version (with detailed history information).htm > BETAVersion_WithDetailedHistoryInformation.htm
-it has a version check, since some features from 9.70.0001 onward are required (mainly related to camel casing and fixing extension case).
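A side note on the version check: the script below compares `<xyver>` with the text "9.70.0001". The post does not say whether that comparison is done part by part or as plain text, so here is a minimal Python sketch (not XYplorer script) of a part-by-part comparison. The helper names `parse_version` and `meets_minimum` are made up for illustration.

```python
def parse_version(text):
    """Split a dotted version such as '9.70.0001' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(current, minimum="9.70.0001"):
    """Return True when `current` is at least `minimum`, compared part by part."""
    return parse_version(current) >= parse_version(minimum)


if __name__ == "__main__":
    for version in ("9.60.0009", "9.70.0001", "10.00.0000"):
        print(version, meets_minimum(version))
    # As plain text, "10.00.0000" sorts before "9.70.0001",
    # which is why the parts are converted to integers first.
    print("10.00.0000" < "9.70.0001")
```

If the check ever misfires on a two-digit major version, a plain-text comparison would be the first thing to rule out.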
Enjoy... :wink:

Code:


   end (<xyver> < "9.70.0001"), "Sorry - this script requires XYplorer version 9.70.0001 or higher.<crlf>Click 'Ok' to exit.";
  $str = "<clipboard>";

/*

  //after Capital there MUST be a lower case
  while(1)
  {
    $test = regexreplace($str, "[A-Z][A-Z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$1", 1 );
       $b = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$2", 1 );
       $c = recase(regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$3", 1 ), "lower");
       $d = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$4", 1 );
       $str = $a$b$c$d;
    }
  }

*/

  $str = recase("$str", "camel");
//  $str = recase("$str", "title", 1);
  $str = replace($str," ","");
   $str = RegexReplace($str, '[()"{}\[\]]', '_');
   $str = RegexReplace($str, '[-'':=\\/<>ῳ|*^$#@~’,]', '_');
   $str = replace($str,"!","I");
   $str = replace($str,"?","G");
   $str = replace($str,"&","And");
   $str = replace($str,"+","Plus");
   //placeholder: put here any fixed phrase that should be stripped from the name
   $str = RegexReplace($str, "any excessive expression which must be deleted - ", "");
  //start a sentence with capitals
  $str = recase( substr($str, 0, 1), "upper") . substr($str, 1);

  //make capitals after "_"
  while(1)
  {
    $test = regexreplace($str, "_[a-z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*?)_([a-z])(.*)", "$1", 1 ) . "_";
       $b = recase(regexreplace($str, "(.*?)_([a-z])(.*)", "$2", 1 ), "upper");
       $c = regexreplace($str, "(.*?)_([a-z])(.*)", "$3", 1 );
       $str = $a$b$c;
    }
  }


  //make capitals after numbers
  while(1)
  {
    $test = regexreplace($str, "\d[a-z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*?\d+)([a-z])(.*)", "$1", 1 );
       $b = recase(regexreplace($str, "(.*?\d+)([a-z])(.*)", "$2", 1 ), "upper");
       $c = regexreplace($str, "(.*?\d+)([a-z])(.*)", "$3", 1 );
       $str = $a$b$c;
    }
  }

  //remove an underscore that directly precedes the extension: File_.exe >> File.exe
   $str = regexreplace($str, "(.+)_(\..+)", "$1$2");
   $str = replace($str,"_.",".");
  //a dot in the 4th position from the end is treated as the start of a 3-character extension
   $dot = substr($str, -4, 1);
   if ($dot == ".") {
     $base = regexreplace($str, "(.+)(\..+)", "$1");
     $ext = regexreplace($str, "(.+)(\..+)", "$2");
     //dots inside the base name become underscores; the extension is kept
     $base2 = replace($base, ".", "_");
     $str2 = "$base2$ext";
     $cmp = substr($str2, -1);
     if ($cmp == "_") { $grr = substr($str2, 0, -1); $grr = recase("$grr", "camel", 1); copytext "$grr"; status "OK: $grr"; }
     else { $grr2 = recase("$str2", "camel", 1); copytext "$grr2"; status "OK: $grr2"; }
   }
   else {
     //no extension found: every dot becomes an underscore
     $str3 = replace($str, ".", "_");
     $cmp = substr($str3, -1);
     //drop a trailing underscore left by a closing parenthesis or a final period
     if ($cmp == "_") { $grr = substr($str3, 0, -1); copytext "$grr"; status "OK: $grr"; }
     else { copytext "$str3"; status "OK: $str3"; }
   }
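
For anyone who wants to try the naming rules outside XYplorer, here is a rough Python sketch of the behavior described in this post. It is not a port of the script. The name `clean_name` and the step order are my own assumptions, the character list is only a subset of what the script replaces, and it does not touch the case of the extension. It does reproduce the sample conversions listed above.

```python
import re

# Subset of the characters the script turns into underscores (assumption: illustrative only).
UNDERSCORE_CHARS = "()\"{}[]-':=\\/<>|*^$#@~’,"
UNDERSCORE_PATTERN = "[" + re.escape(UNDERSCORE_CHARS) + "]"

# Letter or word substitutes taken from the script.
SUBSTITUTES = {"!": "I", "?": "G", "&": "And", "+": "Plus"}


def clean_name(raw):
    """Turn a page title into a compact file name following the rules in the post."""
    # Like the script, treat a dot in the 4th position from the end
    # as the start of a 3-character extension.
    if len(raw) > 4 and raw[-4] == ".":
        base, ext = raw[:-4], raw[-4:]
    else:
        base, ext = raw, ""
    # Capitalize the first character of each word, keep the rest, drop the spaces.
    base = "".join(word[0].upper() + word[1:] for word in base.split())
    # Brackets and assorted punctuation become underscores.
    base = re.sub(UNDERSCORE_PATTERN, "_", base)
    for char, substitute in SUBSTITUTES.items():
        base = base.replace(char, substitute)
    # Dots inside the base name become underscores; trailing ones are dropped.
    base = base.replace(".", "_").rstrip("_")
    # Capitalize a lower-case letter that follows an underscore or a digit.
    base = re.sub(r"(?<=[_\d])[a-z]", lambda m: m.group().upper(), base)
    return base + ext


if __name__ == "__main__":
    print(clean_name("XYcopy 1.00.0028 missing from latest beta?.htm"))
    print(clean_name("BETA version (with detailed history information)"))
```

Splitting off the extension first keeps the later dot and underscore handling from touching it, which is the main difference in structure from the script above.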