Unicode in scripting

Posted: 23 Aug 2009 05:09
by nf_xp
[Attached screenshot: 2009-8-23 10-36-19.gif]
Don, I just tried Muroph's Tag Manager v2.2 and encountered the above error. After line #23 was processed, the first keyword 'input' of line #24 was split across lines #23 and #24, and the same happened to 'replace' in line #25, 'substr' in line #26, etc.
After I replaced ▲ and ▼ in line #23 (in the raw script, they're in lines #190, #191, #211 and #212), the script runs well. So I guess Unicode is the cause of this bug.

Re: Unicode in scripting

Posted: 23 Aug 2009 16:01
by admin
nf_xp wrote: After I replaced ▲ and ▼ in line #23 (in the raw script, they're in lines #190, #191, #211 and #212), the script runs well. So I guess Unicode is the cause of this bug.
Yes, looks like it. Currently I have no idea how to fix that because I don't even know exactly where or why the script was broken. It seems that the initial line parsing already chokes, but why is "input" broken between "in" and "put"... :? What happens if you put any other Unicode character there? Same error?

Re: Unicode in scripting

Posted: 23 Aug 2009 18:21
by nf_xp
Test 1:

Code:

    $a = "▲";
    msg "test";
[Attached screenshot: 2009-8-23 23-36-47.gif]
Test 2:

Code:

    $a = "▲▲";
    msg "test";
[Attached screenshot: 2009-8-23 23-37-06.gif]
Raw view of the first test file on an MBCS system:
[Attached screenshot: 2009-8-23 23-55-20.gif]
My guess: as the raw view shows, the MBCS string '$a = "▲";' takes 10 bytes, but it is only 9 Unicode chars once it's read into memory. If the byte count is used to break lines, extra chars (from the next line) will be read.
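
The byte/char mismatch described above can be reproduced outside the application. The following is only a sketch in Python, not the application's actual parser. It assumes the MBCS code page is GBK (the thread does not say which one is in use), and the helper names `split_by_byte_offsets` and `split_by_chars` are made up for illustration.

```python
# Assumption: the MBCS code page is GBK; the thread does not name it.
CODEC = "gbk"

# The first test script from this thread, with Windows line endings.
script = '$a = "▲";\r\nmsg "test";'

first_line = '$a = "▲";'
char_len = len(first_line)                # length in Unicode chars
byte_len = len(first_line.encode(CODEC))  # length in MBCS bytes


def split_by_byte_offsets(text, codec):
    """Hypothetical buggy splitter: line lengths are measured in bytes,
    but the offsets are then applied to the decoded (Unicode) text."""
    lines = []
    pos = 0
    for raw_line in text.encode(codec).split(b"\r\n"):
        n = len(raw_line)               # byte count of this line
        lines.append(text[pos:pos + n]) # used as if it were a char count
        pos += n + 2                    # skip the CRLF
    return lines


def split_by_chars(text):
    """Split the decoded text directly, so offsets are always in chars."""
    return text.split("\r\n")


print(char_len, byte_len)
print(split_by_byte_offsets(script, CODEC))
print(split_by_chars(script))
```

Under these assumptions the first line is 9 chars but 10 bytes, so the byte-based splitter runs one char too far on line 1 and starts line 2 one char late: the 'm' of 'msg' is lost. Each additional double-byte character shifts the break by one more char, which would fit 'input' being cut after 'in' when a line holds two such characters.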

Re: Unicode in scripting

Posted: 23 Aug 2009 19:01
by admin
Thanks, really interesting! I forgot that there are Unicode characters that resolve to 2 chars when converted to ANSI.

Looks like I have to rewrite a couple of heavily used functions.

Re: Unicode in scripting

Posted: 24 Aug 2009 15:37
by nf_xp
Fixed :)