Better Auto Replace Invalid Characters

Post by **fishgod** » 04 Feb 2020 18:02

May I propose a better solution than replacing all invalid characters with one and the same replacement character:

Since the list of invalid characters is rather small, one could keep a list of replacement characters, one for every invalid character (or combination), and use similar-looking Unicode variants like this. (I already use this in scripts that take name inputs.)

Here is my list (invalid ASCII character or combination on the left, possible Unicode replacement character on the right):

?!  →  ⁈
?   →  ？ (preferred) or ﹖ or ︖
<   →  ﹤
>   →  ﹥
\   →  ﹨
:   →  ꞉
*   →  ﹡
/   →  ∕
"   →  ‟ or “ or ” (maybe context-sensitive on word boundaries, see script)
|   →  │

The exact visual appearance depends on the font, but with Tahoma this looks good to me.
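
For scripts outside XYplorer, the same table can be written down as a plain lookup. This is only a rough sketch in Python, not the script from this post: the function name `to_unicode_filename` and the two table names are made up for illustration, and the double quote is always mapped to ‟ here, without any context logic.

```python
# Hypothetical helper, not part of the original post.
# Multi-character combinations are replaced first so that "?!" is
# still intact when the single-character table runs.
COMBINATIONS = {"?!": "⁈"}

# Single invalid ASCII characters and their Unicode look-alikes,
# taken from the list above (double quote without context logic).
SINGLE_CHARS = str.maketrans({
    "?": "？",
    "<": "﹤",
    ">": "﹥",
    "\\": "﹨",
    ":": "꞉",
    "*": "﹡",
    "/": "∕",
    '"': "‟",
    "|": "│",
})


def to_unicode_filename(name: str) -> str:
    """Replace characters that are invalid in Windows file names."""
    for bad, good in COMBINATIONS.items():
        name = name.replace(bad, good)
    return name.translate(SINGLE_CHARS)


if __name__ == "__main__":
    print(to_unicode_filename('What?! A "test": 1/2 * <3>'))
```

Doing the combination first is the same ordering trick the XYplorer script uses for "?!" and "?".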

Here is the script variant with smart logic for double quotes:

"Make UnicodeFileName : _unicode_file"
  global $CLEAN_NICE_NAME;
  // ASCII characters that are illegal in file names; replace each with a Unicode variant.
  // "?!" has to be replaced before "?" so that the combination is still intact.
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "?!",  "⁈", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "?",  "？", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "<",  "﹤", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, ">",  "﹥", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "\",  "﹨", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, ":",  "꞉", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "*",  "﹡", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "/",  "∕", 1);

  // Double quote at the start of the name opens, double quote before a space closes.
  $CLEAN_NICE_NAME = regexreplace($CLEAN_NICE_NAME, '^"',  "“", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, '" ',  "” ", 1);

  // Double quote at the end of the name closes, double quote after a space opens.
  $CLEAN_NICE_NAME = regexreplace($CLEAN_NICE_NAME, '"$',  "”", 1);
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, ' "',  " “", 1);

  // Any double quote left over (e.g. inside a word) gets the neutral variant.
  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, '"',  "‟", 1);

  $CLEAN_NICE_NAME = replace($CLEAN_NICE_NAME, "|",  "│", 1);
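
To make the double-quote logic easier to try out in isolation, here is the same sequence of steps as a small Python sketch. It is an illustration under assumptions, not a translation that was tested against XYplorer: the function name `smart_quotes` is made up, and it only covers the quote handling, not the other characters.

```python
import re


def smart_quotes(name: str) -> str:
    """Hypothetical helper mirroring the quote steps of the script above."""
    # Quote at the very start of the name: opening quote.
    name = re.sub(r'^"', "“", name)
    # Quote directly before a space: closing quote.
    name = name.replace('" ', "” ")
    # Quote at the very end of the name: closing quote.
    name = re.sub(r'"$', "”", name)
    # Quote directly after a space: opening quote.
    name = name.replace(' "', " “")
    # Whatever is left (e.g. inside a word) gets the neutral variant.
    return name.replace('"', "‟")


if __name__ == "__main__":
    print(smart_quotes('"Hello" world'))
    print(smart_quotes('say "hi"'))
    print(smart_quotes('5"disk'))
```

The order matters in the same way as in the script: the start and end anchors and the space checks have to run before the final catch-all replacement, otherwise every quote would already have been turned into ‟.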