writefile() default mode bug: text is unexpectedly chopped

zhaowu
Posts: 30
Joined: 24 Oct 2016 16:03

Post by zhaowu »

Example (the data includes Chinese characters):

Code: Select all

"Main"
	$text = "纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)";
	writefile("<xydata>\Log\XYplorer.log", $text.<crlf>, 'a', 't');
The result text is unexpectedly chopped:

Code: Select all

纽约灾星 The Jinx: The Life and Deaths of Robert Durst (201

Post by zhaowu »

The same code works correctly when writefile() is called with mode `tu`.

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Post by admin »

Interesting!

Please run these two lines:

Code: Select all

echo isunicode("纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)");

Code: Select all

echo isunicode("纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)", 1);
What do they return (0 or 1)?

Post by zhaowu »

admin wrote:Interesting!

Code: Select all

echo isunicode("纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)");
Returns: 1

Code: Select all

echo isunicode("纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)", 1);
Returns: 0

Post by admin »

Cool, that's enough for me to fix it. :tup:

Post by zhaowu »

v17.30.0102 - 2016-11-09 20:53
! SC writefile: On DBCS locales (using double-byte character sets: Chinese,
Japanese, Korean) text with Unicode characters would be cropped at the end
when you passed 't' as mode parameter. Fixed.
Thanks for the fix. However, this update writes the file as UTF-16 LE, which is the same as `tu` mode. It should write the file as UTF-8 :!: Is there any way to write the file as UTF-8?
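
Editor's sketch: the cropped output at the top of the thread is missing exactly four bytes (`5`, `)`, CR, LF), which equals the number of Chinese characters in the string. That would be consistent with the length being taken in characters while the data is written as double-byte (DBCS) bytes. This is only a guess at the mechanism; the thread does not say how XYplorer computes the length. The code is Python rather than XYplorer script, and the `gbk` codec is an assumed stand-in for the Chinese ANSI code page.

```python
# Hypothetical illustration of the cropping; this is not XYplorer's actual code.
text = "纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)\r\n"

# In a DBCS code page each of the four Chinese characters takes two bytes,
# so the byte length exceeds the character count by four.
dbcs_bytes = text.encode("gbk")

# Assumed bug: the byte buffer is cut off at the character count.
cropped = dbcs_bytes[:len(text)]

print(len(text), len(dbcs_bytes))   # 59 63
print(cropped.decode("gbk"))        # ends with "(201", as in the first post
```

If this reading is right, every double-byte character in the text costs one byte at the end of the written data, which is why pure ASCII strings were never affected.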

Post by admin »

SC writefile does not support utf-8 conversion. You can use utf8encode() on the string to convert it before passing it to writefile.

Post by zhaowu »

admin wrote:SC writefile does not support utf-8 conversion. You can use utf8encode() on the string to convert it before passing it to writefile.

Code: Select all

	$text = "纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)";
	writefile("<xydata>\Log\XYplorer.log", utf8encode($text.<crlf>), 'a', 't');
This method does not work.

Code: Select all

?o??o|?????? The Jinx: The Life and Deaths of Robert Durst (2015)

Post by admin »

This should be the Raw View of the created file:

Code: Select all

00000000: E7 BA BD E7 BA A6 E7 81 BE E6 98 9F 20 54 68 65 ; 纽约灾星 The
00000010: 20 4A 69 6E 78 3A 20 54 68 65 20 4C 69 66 65 20 ;  Jinx: The Life 
00000020: 61 6E 64 20 44 65 61 74 68 73 20 6F 66 20 52 6F ; and Deaths of Ro
00000030: 62 65 72 74 20 44 75 72 73 74 20 28 32 30 31 35 ; bert Durst (2015
00000040: 29 0D 0A                                        ; )..             
I attach an image because some characters are shown differently in the browser:
Attachment: 2016-11-10_094754.png (6.63 KiB)
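
Editor's sketch: the Raw View above can be checked independently of XYplorer. The Python below (not XYplorer script) encodes the same string plus CRLF as UTF-8 and prints the bytes in a similar offset-and-hex layout. The helper `hex_dump` is a hypothetical name introduced only for this illustration.

```python
text = "纽约灾星 The Jinx: The Life and Deaths of Robert Durst (2015)\r\n"
utf8_bytes = text.encode("utf-8")

def hex_dump(data, width=16):
    """Return offset-prefixed hex rows, similar in layout to the Raw View."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        rows.append(f"{offset:08X}: " + " ".join(f"{b:02X}" for b in chunk))
    return rows

for row in hex_dump(utf8_bytes):
    print(row)
```

Each Chinese character occupies three bytes in UTF-8 (for example `E7 BA BD` for 纽), so the 59-character string comes to 67 bytes, matching the final offset `00000040` plus three bytes in the dump.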

Post by zhaowu »

admin wrote:This should be the Raw View of the created file:

Code: Select all

00000000: E7 BA BD E7 BA A6 E7 81 BE E6 98 9F 20 54 68 65 ; 纽约灾星 The
00000010: 20 4A 69 6E 78 3A 20 54 68 65 20 4C 69 66 65 20 ;  Jinx: The Life 
00000020: 61 6E 64 20 44 65 61 74 68 73 20 6F 66 20 52 6F ; and Deaths of Ro
00000030: 62 65 72 74 20 44 75 72 73 74 20 28 32 30 31 35 ; bert Durst (2015
00000040: 29 0D 0A                                        ; )..             
That is what the raw view of the file should look like. However, writefile() converts those wide characters into `?`, so the resulting file is:

Code: Select all

00000000: 3F 6F 3F 3F 6F 7C 3F 3F 3F 3F 3F 3F 20 54 68 65 ; ?o??o|?????? The
00000010: 20 4A 69 6E 78 3A 20 54 68 65 20 4C 69 66 65 20 ;  Jinx: The Life 
00000020: 61 6E 64 20 44 65 61 74 68 73 20 6F 66 20 52 6F ; and Deaths of Ro
00000030: 62 65 72 74 20 44 75 72 73 74 20 28 32 30 31 35 ; bert Durst (2015
00000040: 29 0D 0A E7 BA BD E7 BA A6 E7 81 BE E6 98 9F 20 ; )

Post by admin »

I assume it makes no difference if you use 'ta' instead of 't'?

(BTW, all this only happens under Chinese locale -- in case anybody here tries to reproduce it.)

Post by admin »

Next beta has a new "utf8" mode. :)

...

ADD: I finally solved it. Actually, this has nothing to do with UTF-8. Although it looks similar, DBCS is totally different from UTF-8. I assume what you wanted was non-cropped DBCS, not UTF-8. And it will work in the next beta. :tup:

highend
Posts: 13274
Joined: 06 Feb 2011 00:33

Post by highend »

v17.30.0108 - 2016-11-10 19:07
+ SC writefile enhanced. Added new mode "utf8".
Syntax: writefile(filename, data, [on_exist], [mode])
mode:
utf8: Converts data to UTF-8 before writing it to file. A UTF-8 BOM is
not added.
Can you add a [flags] parameter to let it write a BOM for UTF-8? UTF-16 LE is written with a BOM by default.

Post by admin »

OK, I'll add a mode "utf8bom".
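
Editor's sketch: for readers unsure what the BOM changes on disk, the Python below (not XYplorer script) compares the leading bytes of UTF-8 without a BOM, UTF-8 with a BOM, and UTF-16 LE with a BOM. It uses Python's own codecs as a stand-in; the exact output of XYplorer's "utf8bom" mode is not shown in this thread and is assumed here to be the standard three-byte UTF-8 BOM followed by the UTF-8 data.

```python
import codecs

text = "纽约灾星"

utf8_plain = text.encode("utf-8")           # no BOM, as described for mode "utf8"
utf8_with_bom = text.encode("utf-8-sig")    # Python codec that prepends the UTF-8 BOM
utf16le_with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

def as_hex(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

print(as_hex(utf8_plain[:3]))        # E7 BA BD
print(as_hex(utf8_with_bom[:6]))     # EF BB BF E7 BA BD
print(as_hex(utf16le_with_bom[:4]))  # FF FE BD 7E
```

The BOM is only a marker at the start of the file; the encoded text that follows it is identical to the BOM-less variant.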

Post by highend »

Thanks :ninja: