RegEx Help - CamelCasing.

Posted: 22 Oct 2010 19:22
by SkyFrontier
Hi.
Can someone please help me with this?
I intend to use it with regexReplace.
-A regEx to start a sentence with a capital, to capitalize the letter after "_" and after numbers (camelCased sentences), and to make sure that a capital is always followed by a lower-case letter: _test >> _Test; test23succeed >> Test23Succeed; thisIsAsample_testTExtWith8numbersToo >> ThisIsAsample_TestTextWith8NumbersToo;

-A regEx to get rid of underscores when they are followed by any extension: File_.exe >> File.exe.

Thanks in advance!

Re: RegEx Help - CamelCasing.

Posted: 14 Nov 2010 01:41
by Stefan
---------------------------
XYplorer
---------------------------
thisIsAsample_testTExt_with8numbe7rsToo_test23succ_.eed_.exten
ThisIsAsample_TestText_With8Numbe7RsToo_Test23Succ_.eed.exten
---------------------------
OK
---------------------------


  $str = "thisIsAsample_testTExt_with8numbe7rsToo_test23succ_.eed_.exten";
  $orig = $str;

  //start a sentence with capitals
  $str = recase( substr($str, 0, 1), "upper") . substr($str, 1);

  //make capitals after "_" 
  while(1)
  {
    $test = regexreplace($str, "_[a-z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*?)_([a-z])(.*)", "$1", 1 ) . "_";
       $b = recase(regexreplace($str, "(.*?)_([a-z])(.*)", "$2", 1 ), "upper");
       $c = regexreplace($str, "(.*?)_([a-z])(.*)", "$3", 1 );
       $str = $a . $b . $c;
    }
  }


  //make capitals after numbers
  while(1)
  {
    $test = regexreplace($str, "\d[a-z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*?\d+)([a-z])(.*)", "$1", 1 );
       $b = recase(regexreplace($str, "(.*?\d+)([a-z])(.*)", "$2", 1 ), "upper");
       $c = regexreplace($str, "(.*?\d+)([a-z])(.*)", "$3", 1 );
       $str = $a . $b . $c;
    }
  }

  //after Capital there MUST be a lower case
  while(1)
  {
    $test = regexreplace($str, "[A-Z][A-Z]", "XXX", 1);
    if ($test==$str){break;}
    else{
       $a = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$1", 1 );
       $b = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$2", 1 );
       $c = recase(regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$3", 1 ), "lower");
       $d = regexreplace($str, "(.*)([A-Z])([A-Z])(.*)", "$4", 1 );
       $str = $a . $b . $c . $d;
    }
  }


  //get rid of the underscore right before an extension: File_.exe >> File.exe
  //(.+) is greedy, so only the last "_." in the name is changed
  $str = regexreplace($str, "(.+)_(\..+)", "$1$2");

  msg $orig . <crlf> . $str;
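
For readers who want to try the same steps outside XYplorer, here is a minimal sketch in Python (not XYplorer script). The function name `camel_case` is made up for illustration, and the callbacks replace the while loops above, so it is an approximation of the script rather than a port of it.

```python
import re

def camel_case(name: str) -> str:
    """Hypothetical helper that mirrors the five steps of the script above."""
    # 1. Start the sentence with a capital.
    name = name[:1].upper() + name[1:]
    # 2. Capitalize a lower-case letter that follows "_".
    name = re.sub(r"_([a-z])", lambda m: "_" + m.group(1).upper(), name)
    # 3. Capitalize a lower-case letter that follows a digit.
    name = re.sub(r"(\d)([a-z])", lambda m: m.group(1) + m.group(2).upper(), name)
    # 4. Lower-case any capital that directly follows another capital.
    name = re.sub(r"(?<=[A-Z])[A-Z]", lambda m: m.group(0).lower(), name)
    # 5. Drop the underscore before the extension; (.+) is greedy, so only
    #    the last "_." in the name is affected, as in the script.
    name = re.sub(r"(.+)_(\..+)", r"\1\2", name, count=1)
    return name

print(camel_case("thisIsAsample_testTExt_with8numbe7rsToo_test23succ_.eed_.exten"))
```

With the test string from the message box above, this should print the same second line that the dialog shows.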

Re: RegEx Help - CamelCasing.

Posted: 14 Nov 2010 09:04
by Stefan
SkyFrontier wrote:Also this: a regex to "remove this expression before", keep this, and "remove everything after - including the preceding space, plus .htm". The "remove expression" part I can probably handle with a simple regexreplace "", but the " -..." part...
Input:
Fixed Part. Main Title - Variable Comment.htm
Output:
Main Title
("Fixed Part.<space>" can be dealt via regexReplace)
FROM:
Fixed Part. Main Title - Variable Comment.htm
TO:
Main Title

DO:
First part: match everything up to a dot followed by a space ==> .+\.\s
Second part: match all characters up to a space followed by a dash ==> (.+)\s-
Last part: match all characters up to the end ==> .+
The second part is the one we want to keep, so we put that expression
into parentheses to use its match as a backreference via $1

Expression: .+\.\s(.+)\s-.+
Replace with: $1


$a = "Fixed Part. Main Title - Variable Comment.htm";
  $b = regexreplace($a, "^.+\.\s(.+)\s-.+$", "$1");
  msg "-$b-";
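
The same expression can be tried in Python as a quick check. This is only a sketch: the function names are hypothetical, and the second variant with a lazy `(.+?)` is an assumption about what one might want when the title itself contains " - ".

```python
import re

def extract_title(filename: str) -> str:
    """Keep only the part between '. ' and ' -' (same regex as above)."""
    return re.sub(r"^.+\.\s(.+)\s-.+$", r"\1", filename)

def extract_title_lazy(filename: str) -> str:
    """Variant with a lazy group: stops at the first ' -' instead of the last."""
    return re.sub(r"^.+\.\s(.+?)\s-.+$", r"\1", filename)

print(extract_title("Fixed Part. Main Title - Variable Comment.htm"))
print(extract_title("Fixed Part. A - B - C.htm"))
print(extract_title_lazy("Fixed Part. A - B - C.htm"))
```

Because `(.+)` is greedy, the original expression keeps everything up to the last " -", so a title that contains a dash of its own survives intact. The lazy variant cuts at the first " -" instead.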

Re: RegEx Help - CamelCasing.

Posted: 16 Nov 2010 03:44
by SkyFrontier
Works like a charm - tested and approved! Helped me a lot since yesterday.
Thanks much!

Re: RegEx Help - CamelCasing.

Posted: 19 Nov 2010 20:07
by SkyFrontier
@Stefan:
Is it possible for the "//after Capital there MUST be a lower case" section to have an exception for cases where one or more of the words in a phrase is a single vowel?
Real usage shows that cases like the following would benefit from this exception:


This is a unique expression - ThisIsAUniqueExpression.

This is a El-Niño year - ThisIsAEl_NiñoYear.
Thank you!

Re: RegEx Help - CamelCasing.

Posted: 19 Nov 2010 22:54
by Stefan
SkyFrontier wrote:@Stefan:
Is it possible for the "//after Capital there MUST be a lower case" section
have an exception for [...] a single vowel?
How about using this regex before you remove the spaces?
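
The regex referred to here is not shown in the thread. The following Python sketch is not that regex, only one hypothetical way to get the exception: apply the capital rules to each word while the spaces are still there, then join the words. The function name and the replacement of "-" by "_" are assumptions taken from the two examples in the request.

```python
import re

def camel_case_phrase(phrase: str) -> str:
    """Hypothetical helper: camel-case a phrase word by word."""
    fixed = []
    for word in phrase.split(" "):
        if not word:
            continue  # skip empty pieces left by repeated spaces
        # Capitalize the first letter of the word.
        word = word[0].upper() + word[1:]
        # Lower-case a capital that follows another capital, but only inside
        # this word, so a single-letter word such as "A" is never touched.
        word = re.sub(r"(?<=[A-Z])[A-Z]", lambda m: m.group(0).lower(), word)
        fixed.append(word)
    # Join without spaces; "-" becomes "_" as in the El-Niño example.
    return "".join(fixed).replace("-", "_")

print(camel_case_phrase("This is a unique expression"))
print(camel_case_phrase("This is a El-Niño year"))
```

Working per word means the "A" of "a unique" is already finished before it ends up next to the "U", so the capital-after-capital rule never sees the pair.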

Re: RegEx Help - CamelCasing.

Posted: 19 Nov 2010 23:20
by SkyFrontier
Stefan wrote:How about using this regex before you remove the spaces?
-Tried it and it works perfectly! (I have yet to analyze how, though...)
It also led me to revise an old script (Save Offline Pages), which I use almost every day, depending on the task I'm working on (like last week).
Thank you very much!