File information prompts Chinese garbled

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%
Contact:

Re: File information prompts Chinese garbled

Post by admin »

Did you see/try this?
viewtopic.php?p=168137#p168137

zhchgao
Posts: 101
Joined: 25 May 2012 09:53

Re: File information prompts Chinese garbled

Post by zhchgao »

?é1?ìì?? ¢ò

zhchgao

Re: File information prompts Chinese garbled

Post by zhchgao »

After trying the script, the text was still garbled, as shown above.

admin

Re: File information prompts Chinese garbled

Post by admin »

So, it cannot be fixed.

sj515064
Posts: 51
Joined: 21 Feb 2019 06:14

Re: File information prompts Chinese garbled

Post by sj515064 »

I've done some boring tests:

"蒙古天韵 Ⅱ" saved with GB2312 (936), opened with Windows-1252: "ÃɹÅÌìÔÏ ¢ò"
"ÃɹÅÌìÔÏ ¢ò" saved with GB2312 (936), reopened with GB2312 (936): "?é1?ìì?? ¢ò" (This is what "text dbcsdecode("ÃɹÅÌìÔÏ ¢ò", 936);" returns, in beta 19.70.0122)
"ÃɹÅÌìÔÏ ¢ò" saved with Windows-1252, opened with GB2312 (936): "蒙古天韵 Ⅱ" (correct)

Still, not worth the time solving it.
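The three tests above can be reproduced outside of XY with a few lines of Python. This is only a sketch: it assumes Python's cp936 and cp1252 codecs behave like the Windows code pages of the same number, and the variable names are made up for illustration. One known difference: Python substitutes "?" for every character that has no GBK equivalent, while Windows applies best-fit mapping, so the lossy result will not match the exact string shown in test 2.

```python
# Codec names are Python's: cp936 (GBK, superset of GB2312) and cp1252.
original = "蒙古天韵 Ⅱ"

# Test 1: saved as GB2312 (936), opened as Windows-1252 -> garbled text.
mojibake = original.encode("cp936").decode("cp1252")

# Test 2: garbled text saved as GB2312 (936), reopened as GB2312 (936).
# Most of the accented Latin letters have no GBK equivalent, so information
# is lost at the encode step; Python puts "?" in their place.
lossy = mojibake.encode("cp936", errors="replace").decode("cp936")

# Test 3: garbled text saved as Windows-1252, opened as GB2312 (936).
# The original bytes are restored exactly, so the text comes back intact.
recovered = mojibake.encode("cp1252").decode("cp936")
```

Only the third path is reversible, because Windows-1252 maps every byte in the GB2312 range to exactly one character.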
Last edited by sj515064 on 12 Mar 2019 15:45, edited 1 time in total.

admin

Re: File information prompts Chinese garbled

Post by admin »

Thanks, I will look into this later again...

admin

Re: File information prompts Chinese garbled

Post by admin »

I switched my system to Chinese locale to track the bug, but it was not there. All worked fine. :veryconfused:

So I re-added the code to v19.80.0003.

Try this new script:
text dbcsdecode("¹þÁÕ", 936, 1); //CP_GB2312, debug
You should see this line once:

00000000: B9 00 FE 00 C1 00 D5 00                           ¹.þ.Á.Õ.        
... and this line 4 times:

00000000: B9 FE C1 D5                                       ¹þÁÕ            
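
For anyone following along without XY, the two hex lines above can be reproduced in Python. This is a sketch under an assumption: that the first line is the UTF-16LE form of the script string and the second is its single-byte Windows-1252 form. That matches the hex values shown, but the debug output itself does not say so. hex_bytes is a hypothetical helper, not an XY function.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

text = "¹þÁÕ"

# Wide-character view: each character takes two bytes, low byte first.
as_utf16 = hex_bytes(text.encode("utf-16-le"))

# Single-byte view: each character is one Windows-1252 byte.
as_cp1252 = hex_bytes(text.encode("cp1252"))

print(as_utf16)   # B9 00 FE 00 C1 00 D5 00
print(as_cp1252)  # B9 FE C1 D5
```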
Attachments
2019-03-12_202131.png
2019-03-12_202047.png

sj515064

Re: File information prompts Chinese garbled

Post by sj515064 »

It's a bit different (one line per output, 6 times):

00000000: B9 00 FE 00 C1 00 D5 00                           ¹.þ.Á.Õ.        
00000000: 31 74 A8 A2 3F                                    1t¨¢?           
00000000: B9 FE C1 D5                                       ¹þÁÕ            
00000000: B9 FE C1 D5                                       ¹þÁÕ            
00000000: B9 FE C1 D5                                       ¹þÁÕ            
1tá?
It seems that B9 FE C1 D5 are the bytes correctly read from the *.mp3.
Decoding B9 FE C1 D5 with Windows-1252 gives: ¹þÁÕ (wrong)
Decoding with GB2312 (the system encoding) gives: 哈琳 (correct)

So the root of the problem is that the bytes are decoded with the wrong encoding (see below for details).
Last edited by sj515064 on 13 Mar 2019 07:00, edited 4 times in total.

sj515064

Re: File information prompts Chinese garbled

Post by sj515064 »

A guess of what's going on.jpg

sj515064

Re: File information prompts Chinese garbled

Post by sj515064 »

Guess 1: XY is able to read the ID3 tags stored in *.mp3 files (as raw bytes).
Guess 2: XY is hardcoded to decode those bytes as Windows-1252. A quick and direct fix for the garbled text would be to replace the hardcoded Windows-1252 with the detected system encoding, in which case the dbcsdecode SC may not be necessary. If the hardcoded Windows-1252 cannot be changed, simply add a two-step postprocessing: 1) encode the output (e.g. "ÃɹÅÌìÔÏ ¢ò" or "¹þÁÕ") back into bytes with Windows-1252; 2) decode those bytes with the system encoding.
Guess 3: The dbcsdecode SC takes a string, encodes it with the system encoding, and then decodes it again with the same system encoding.
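
The two-step postprocessing from Guess 2 could look roughly like this in Python. It is a sketch only: repair_mojibake and its parameter names are hypothetical, cp936 stands in for "the system encoding", and none of this is claimed to be how XY works internally. The fallbacks are my own addition so that text which was never garbled passes through unchanged.

```python
def repair_mojibake(text: str, assumed: str = "cp1252", actual: str = "cp936") -> str:
    """Undo a decode done with `assumed` when the bytes were really `actual`."""
    try:
        # Step 1: turn the garbled characters back into the original bytes.
        raw = text.encode(assumed)
    except UnicodeEncodeError:
        # Characters outside `assumed`: the text was not produced by that
        # wrong decode, so leave it alone.
        return text
    try:
        # Step 2: decode those bytes with the encoding they were written in.
        return raw.decode(actual)
    except UnicodeDecodeError:
        # The bytes are not valid in `actual`; keep the input as it was.
        return text

fixed = repair_mojibake("¹þÁÕ")        # bytes B9 FE C1 D5 read as cp936
untouched = repair_mojibake("哈琳")    # already correct, returned as-is
```

Note that plain ASCII text survives both steps unchanged, since ASCII bytes mean the same thing in both code pages.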

The truth: “Correctly detecting the encoding all times is impossible.” (cited from https://stackoverflow.com/questions/436 ... ng-of-text)

So what can XY do? The best possible attempt is to decode the bytes with the most likely encoding, that is, the one used by the user’s system. But what if the text is encoded as Unicode? In that case decoding it with the system encoding is also problematic.

So the conclusion: forget about the workaround, simply remove it! And forget about the problem. (Or just let the users choose the proper encoding themselves, with the default one set to the system encoding)
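
To make the "let the users choose, default to the system encoding" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: decode_tag_bytes is a hypothetical name, the check for a UTF-16 byte order mark is just one cheap way to catch the Unicode case mentioned above, and locale.getpreferredencoding is used as a stand-in for "the system encoding".

```python
import locale
from typing import Optional

def decode_tag_bytes(raw: bytes, user_encoding: Optional[str] = None) -> str:
    """Decode tag bytes with the user's choice, else the system encoding."""
    candidates = []
    # A UTF-16 byte order mark is a strong hint that the text is Unicode.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        candidates.append("utf-16")
    # The user's explicit choice wins over the system default.
    candidates.append(user_encoding or locale.getpreferredencoding(False))
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # invalid bytes or unknown codec name: try the next one
    # Last resort: Latin-1 maps every byte to a character and never fails.
    return raw.decode("latin-1")

from_gb = decode_tag_bytes(bytes.fromhex("B9FEC1D5"), "cp936")
from_utf16 = decode_tag_bytes("哈琳".encode("utf-16"), "cp936")
```

This does not detect the encoding, which the quote above says cannot be done reliably. It only applies a fixed order of preference and never fails outright.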

Anyway, thank you for the hard work and hope it hasn't cost you too much time :)

zhchgao

Re: File information prompts Chinese garbled

Post by zhchgao »

Well done to the poster above; success is not far off.

admin

Re: File information prompts Chinese garbled

Post by admin »

sj515064 wrote: 13 Mar 2019 02:39 A guess of what's going on.jpg
Yeah, well done! (BTW, how did you create this nice graphic?)

I know what to do now. Should be fixed in the next version.

sj515064

Re: File information prompts Chinese garbled

Post by sj515064 »

Thanks :)
(the figure was created with Microsoft PowerPoint :mrgreen: )

Fix confirmed.
Snipaste_2019-03-13_19-19-25.png
