admin wrote:
What about a variable for single characters in this format:
Code:
<U+200E>

Looks ok to me! So I could input (for example) a tab either as <tab> or <U+0009>, right?
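To make the idea concrete, here is a rough Python sketch (not XYplorer script, and not how admin would necessarily implement it) of expanding such tokens into the characters they name. The `TOKEN` pattern and the `expand_tokens` helper are hypothetical names made up for this illustration, and the sketch assumes exactly four hex digits per token; named tokens like <tab> are left untouched.

```python
import re

# Hypothetical token pattern: <U+XXXX> with exactly four hex digits.
TOKEN = re.compile(r"<U\+([0-9A-Fa-f]{4})>")

def expand_tokens(text: str) -> str:
    """Replace every <U+XXXX> token with the character it names."""
    return TOKEN.sub(lambda m: chr(int(m.group(1), 16)), text)

# <U+0009> becomes a real tab character.
print(repr(expand_tokens("a<U+0009>b")))
```

Under these assumptions <U+0009> would just be another spelling of the tab character, so both forms could coexist.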
admin wrote:
OK, good for reading and writing, but I can support only Unicode between -32,768 and +32,767, so no 5-digit hex codes.

Oh yes, I remember. Well, iirc there's an easy workaround for that, will post it asap...
Code:
function surrogatepairexpander($codepoint) {
    // Returns a representation in the form U+XXXX;U+YYYY (i.e. using
    // surrogate pairs) of a codepoint in the range U+10000 to U+10FFFF (see
    // https://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF).
    //
    // Codepoint must be given in the form U+ZZZZZ or U+ZZZZZZ (five or six
    // hex digits), with a value between 0x10000 and 0x10FFFF. If anything
    // else is given as input, the function throws an error and aborts.
    //
    // Source of the code: http://www.russellcottrell.com/greek/utilities/SurrogatePairCalculator.htm
    //
    // $codepoint the codepoint to convert
    assert regexmatches("$codepoint", "^U\+[0-9A-F]{5,6}$") != "", "Invalid argument";
    $codepoint = eval("0x" . substr("$codepoint", 2));
    assert ($codepoint >= 0x10000) AND ($codepoint <= 0x10FFFF), "Invalid argument";
    // High and low surrogate as decimal numbers.
    $h_dec = ($codepoint - 0x10000) \ 0x400 + 0xD800;
    $l_dec = ($codepoint - 0x10000) % 0x400 + 0xDC00;
    $surrogatepair = "";
    // Walk the list in reverse (low surrogate first) and prepend each result.
    foreach ($dec, "$h_dec,$l_dec", ",", "r") {
        // Convert $dec to hex, one digit at a time.
        $hex = "";
        while ("$dec" != "0") {
            $digit = $dec % 16;
            if ("$digit" Like "#") {break 0;}
            elseif ("$digit" == "10") {$digit = "A";} elseif ("$digit" == "11") {$digit = "B";} elseif ("$digit" == "12") {$digit = "C";}
            elseif ("$digit" == "13") {$digit = "D";} elseif ("$digit" == "14") {$digit = "E";} elseif ("$digit" == "15") {$digit = "F";};
            $hex = "$digit$hex";
            $dec = $dec \ 16;
        };
        $surrogatepair = "U+$hex;$surrogatepair";
    };
    $surrogatepair = trim("$surrogatepair", ";", "r");
    return "$surrogatepair";
}

Code:
name:<U+0425>;<U+200E>;<U+200F>;<U+202A>;<U+202B>;<U+202C>;<U+202D>;<U+202E>
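
For anyone who wants to cross-check the function's output outside XYplorer, here is the same surrogate pair arithmetic as a small Python sketch. It is not XYplorer script, and `surrogate_pair` is just a name chosen for this example; it takes the code point as an integer rather than as a U+ZZZZZ string.

```python
def surrogate_pair(codepoint: int) -> str:
    """Return 'U+XXXX;U+YYYY' for a code point in U+10000..U+10FFFF."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point must be in U+10000..U+10FFFF")
    offset = codepoint - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits, same as offset \ 0x400
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits, same as offset % 0x400
    return f"U+{high:04X};U+{low:04X}"

print(surrogate_pair(0x1F600))  # U+D83D;U+DE00
```

Both halves always land between 0xD800 and 0xDFFF, so each one is a 4-digit hex code on its own, which is the point of the workaround.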