SC: UTF-8 decoding the '§'?

Filehero
Posts: 2644
Joined: 27 Feb 2012 18:50
Location: Windows 10 Pro x64

Post by Filehero »

Code pages, locales and transformations: my beloved topic. So I'll add the :oops: right up front.

When run from XY's script editor (Run script...) this

msg("§§ $ &");

shows what it should.

When this string is read from a UTF-8 encoded script or ini file, it (the section sign '§') shows up as

msg("§§ $ &");

Why is that?

highend
Posts: 13317
Joined: 06 Feb 2011 00:33
Location: Win Server 2022 @100%

Post by highend »

A bug. It's read correctly from a UTF-8 file with BOM.

admin
Site Admin
Posts: 60567
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Post by admin »

If you want BOM-less UTF-8 detection you have to tick this:
Configuration | Preview | Text preview | UTF-8 auto-detection

Post by highend »

It's not about previewing files

Create two .xys files:
<xyscripts>\1.xys
<xyscripts>\2.xys

Save the first with UTF-8 BOM
and the second with UTF-8 (no BOM!)

Content of both:

msg("§§ $ &");

Execute them with "Scripting - Load Selected Script File"

The first displays the § correctly, the second does not.
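
A minimal Python sketch of why the two files can behave differently. It is not XYplorer's code: it assumes a reader that treats a file as UTF-8 only when it starts with a BOM and otherwise falls back to the ANSI code page (cp1252 is assumed here, as on a typical Western Windows setup). The name decode_bom_only is made up for illustration.

```python
import codecs

SCRIPT = 'msg("§§ $ &");'

def decode_bom_only(raw: bytes, ansi_codepage: str = "cp1252") -> str:
    """Decode as UTF-8 only if a BOM is present; otherwise fall back to ANSI.

    Assumption: this mirrors a BOM-only reader, not XYplorer's real logic.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(ansi_codepage)

with_bom = codecs.BOM_UTF8 + SCRIPT.encode("utf-8")  # like 1.xys
without_bom = SCRIPT.encode("utf-8")                 # like 2.xys

print(decode_bom_only(with_bom))
print(decode_bom_only(without_bom))
```

Under those assumptions the BOM-less file goes wrong because '§' is stored in UTF-8 as the two bytes C2 A7, and a cp1252 reader turns each byte into its own character ('Â' followed by '§').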

Post by admin »

The "UTF-8 auto-detection" setting affects not only the preview but the reading of UTF-8 files in general.
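
A common way to detect BOM-less UTF-8 is to attempt a strict UTF-8 decode and fall back to the ANSI code page only when that fails. The sketch below shows that heuristic in Python. Whether XYplorer's "UTF-8 auto-detection" works this way is an assumption; decode_with_autodetect and the cp1252 fallback are illustrative choices.

```python
import codecs

def decode_with_autodetect(raw: bytes, ansi_codepage: str = "cp1252") -> str:
    """Prefer UTF-8 (with or without BOM); fall back to ANSI otherwise."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    try:
        return raw.decode("utf-8")  # strict: raises on invalid byte sequences
    except UnicodeDecodeError:
        # errors="replace" because cp1252 leaves a few byte values undefined
        return raw.decode(ansi_codepage, errors="replace")

text = 'msg("§§ $ &");'
# The same script saved as BOM-less UTF-8 and as ANSI (cp1252).
for raw in (text.encode("utf-8"), text.encode("cp1252")):
    print(decode_with_autodetect(raw))
```

The heuristic is not perfect: a short ANSI text can happen to be valid UTF-8 by coincidence, which is one reason a BOM remains the unambiguous marker.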

Post by highend »

That may be so, but it fails in this specific case.

UTF-8, no BOM .xys script file and
Configuration | Preview | Text preview | [x] UTF-8 auto-detection
-> Wrong display of §

Post by admin »

Okay, got it. It's a bug. A fix is coming.

Post by Filehero »

Thanks, highend, for the deeper analysis, and thanks in advance, Don, for the fix.

Post by Filehero »

v18.40.0001 - 2017-09-21 16:14
    ! Scripting | Load Selected Script File: Did not decode BOM-less 
      UTF8-encoded script files when "Configuration | Preview | Text preview | 
      UTF-8 auto-detection" was ON. Fixed.

Hmm, I still have this issue with 18.40.0002.

Post by highend »

Filehero wrote: Hmm, I still have this issue with 18.40.0002
Works fine here since 0001.

Zip that script file and attach it?

Post by Filehero »

Key <key_var_group> / line 21
Attachments
launcher.zip

Post by highend »

Displaying the content after using readfile() works fine
so I guess you're using e.g. getkey()?

Post by Filehero »

highend wrote: so I guess you're using e.g. getkey()?
Yes. I should have mentioned it, sorry.

Post by admin »

BOM-less UTF-8 ini-reading is not supported. Simply add the BOM and all is good.
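
If an editor does not offer a "UTF-8 with BOM" save option, the BOM can also be added by prepending the three bytes EF BB BF. A minimal Python sketch, assuming the file already is valid UTF-8; add_utf8_bom and the file name in the demo are hypothetical:

```python
import codecs
import tempfile
from pathlib import Path

def add_utf8_bom(path: Path) -> bool:
    """Prepend a UTF-8 BOM if missing. Returns True if the file was changed."""
    raw = path.read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        return False  # already has a BOM, leave the file untouched
    raw.decode("utf-8")  # sanity check: raises UnicodeDecodeError if not UTF-8
    path.write_bytes(codecs.BOM_UTF8 + raw)
    return True

# Demo on a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    ini = Path(tmp) / "launcher.ini"
    ini.write_bytes("[demo]\nkey=§§ $ &\n".encode("utf-8"))
    print(add_utf8_bom(ini))  # first call adds the BOM
    print(add_utf8_bom(ini))  # second call leaves the file alone
    print(ini.read_text(encoding="utf-8-sig"))  # utf-8-sig strips the BOM on read
```

The decode check is there so that a file in some other encoding is not mislabeled as UTF-8 by accident.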

Post by Filehero »

Like this?
utf-8-bom.png
I know why I h*te this darn encoding hell. Sorry for bothering.