Unicode in scripting

nf_xp
Posts: 35
Joined: 10 Jul 2009 08:05

Unicode in scripting

Post by nf_xp »

[Screenshot: 2009-8-23 10-36-19.gif]
Don, I just tried Muroph's Tag Manager v2.2 and encountered the error above: after line #23 was processed, the first keyword 'input' of line #24 was split across lines #23 and #24, and the same happened to 'replace' in line #25, 'substr' in line #26, etc.
After I replaced ▲ and ▼ in line #23 (in the raw script, they're in lines #190, #191, #211 and #212), the script runs fine. So I guess Unicode is the cause of this bug.

admin
Site Admin
Posts: 65338
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%

Re: Unicode in scripting

Post by admin »

nf_xp wrote: After I replaced ▲ and ▼ in line #23 (in the raw script, they're in lines #190, #191, #211 and #212), the script runs fine. So I guess Unicode is the cause of this bug.
Yes, looks like it. I currently have no idea how to fix that, because I don't even know exactly where or why the script was broken. It seems that the initial line parsing already chokes, but why is "input" broken between "in" and "put"... :? What happens if you put any other Unicode character there? Same error?

nf_xp
Posts: 35
Joined: 10 Jul 2009 08:05

Re: Unicode in scripting

Post by nf_xp »

Test 1:

    $a = "▲";
    msg "test";
[Screenshot: 2009-8-23 23-36-47.gif]
Test 2:

    $a = "▲▲";
    msg "test";
[Screenshot: 2009-8-23 23-37-06.gif]
Raw view of the first test file on an MBCS system:
[Screenshot: 2009-8-23 23-55-20.gif]
My guess: As you can see, the MBCS string '$a = "▲";' takes 10 bytes in the raw view, but it is actually only 9 Unicode chars once it's read into memory. If the byte count is used to break lines, extra chars (from the next line) will be read.
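
To illustrate the guess above, here is a minimal Python sketch of the suspected mismatch. It is only a hypothetical reconstruction, not XYplorer's actual code: the function names are made up, and it assumes the MBCS code page is GBK, where ▲ is stored as 2 bytes. The exact symptom (which characters end up on which line) depends on how the real routine walks the buffer, so the broken output here will not match the screenshots character for character.

```python
# Hypothetical reconstruction of the suspected bug; NOT XYplorer's real code.
# ASSUMPTION: the system's ANSI/MBCS code page is GBK, where ▲ takes 2 bytes.
ENCODING = "gbk"

script = '$a = "▲";\r\nmsg "test";'

first_line = script.split("\r\n")[0]
char_len = len(first_line)                   # 9 Unicode characters
byte_len = len(first_line.encode(ENCODING))  # 10 bytes in the MBCS file


def split_by_byte_offsets(text, encoding=ENCODING):
    """Buggy: locates line breaks in the byte buffer, then uses those
    byte offsets to slice the in-memory (character-based) string."""
    raw = text.encode(encoding)
    lines, start = [], 0
    while True:
        end = raw.find(b"\r\n", start)
        if end == -1:
            lines.append(text[start:])
            break
        lines.append(text[start:end])
        start = end + 2  # skip the CR LF pair
    return lines


def split_by_char_offsets(text):
    """Correct: finds and applies the offsets on the same character string."""
    return text.split("\r\n")


buggy_lines = split_by_byte_offsets(script)
good_lines = split_by_char_offsets(script)

print(char_len, byte_len)
print(buggy_lines)
print(good_lines)
```

In this sketch each 2-byte character shifts every later offset by one, so the second line starts one character too late and the keyword 'msg' gets cut, much like 'input' in the original report.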

admin
Site Admin
Posts: 65338
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%

Re: Unicode in scripting

Post by admin »

Thanks, really interesting! I forgot that there are Unicode characters that resolve to 2 chars when converted to ANSI.

Looks like I have to rewrite a couple of heavily used functions.

nf_xp
Posts: 35
Joined: 10 Jul 2009 08:05

Re: Unicode in scripting

Post by nf_xp »

Fixed :)
