Text Decoder - UTF-8 Decoder.

Posted: 28 Aug 2010 23:37
by SkyFrontier
(This may not be fully understood, since some machines will display the characters as they are and others will translate them all, but... let's try!)

People:

Code:

::writefile("<curpath>\Decoded.txt", ÇÃ, ,"b"); open "<curpath>\Decoded.txt";
takes this ÇÃ as input (it looks something like "A+Af"; on some machines it will display the same as the next string, "çã") and produces "çã" (c-cedilla, a-tilde) as output.
So I'd like to have something like:

Code:

::writefile("<curpath>\Decoded.txt", <clipboard>, ,"b"); open "<curpath>\Decoded.txt";
but no matter which parameter I use (t, b, u), it doesn't properly write a "decoded" file in which stuff like "ÇÃ" ("A+Af") is visually translated to "çã" (c-cedilla, a-tilde).
How do I incorporate the "utf8decode" function into this?
Using

Code:

::text report(utf8decode(<clipboard>))
displays a numbered loop of the clipboard content, correctly decoded. Confusing, and therefore useless.

Code:

::text report "utf8decode <clipboard>"
shows clipboard content UN-decoded.

Code:

::msg(utf8decode(<clipboard>))
pops up a window with the content correctly decoded, but I can't grab the text to edit or copy it.
Also, the decoded text displays both "Á" and "Í" as "?".
I'd simply like to have a text decoder that grabs UTF-8 encoded text from the clipboard and saves it, decoded, as a .txt file in the current folder.
Any help, please...?
Thanks!

TAG: UTF-8 decoder - UTF8 - weird character - corrupted - unreadable

Re: Text Decoder.

Posted: 28 Aug 2010 23:54
by SkyFrontier
Heh.
My very 100% own first script, from scratch.

Code:

::$decoded = utf8decode(<clipboard>); writefile("<curpath>\Decoded.txt", $decoded, ,"u"); open "<curpath>\Decoded.txt";
Nice one, I'd say... useful, too!
Now tell your 8-year-old scripter he has competition! :D

EDIT: Changed parameter "b" for "u".

Re: Text Decoder.

Posted: 29 Aug 2010 00:15
by SkyFrontier
...but I'd still like a little help with this part:
decoded text displays both "Á" and "Í" as "?"

Re: Text Decoder.

Posted: 29 Aug 2010 00:43
by SkyFrontier
v1.1: Adds a separator and an appending function. Writes in "t" mode (see below: "auto-detects whether text can be written as ASCII or needs to be written as UNICODE"). Includes comments on the available modes.

Code:

   $decoded = utf8decode(<clipboard>); writefile("<curpath>\Decoded.txt", "<crlf>$decoded<crlf><crlf>====<crlf>", a,"t"); open "<curpath>\Decoded.txt";
   //
   //v7.70.0004 - 2008-10-21 14:41
   //
   // * SC WriteFile(): Now, the filename may be relative to the current 
   //   path. For example, this line will generate a standard report of the 
   //   current folder in the current folder:
   //     ::writefile("report.txt", report());
   // * SC WriteFile(): Changed the mode argument to better match 
   //   ReadFile() (see below).
   //   Syntax: writefile(filename, data, [on_exist], [mode])
   //     mode:
   //       t:  [default] text;
   //           auto-detects whether text can be written as ASCII or needs
   //           to be written as UNICODE
   //       ta: text ASCII (1 byte per char);
   //           wide chars (upper Unicode) are represented by "?"
   //       tu: text UNICODE (2 bytes per char);
   //           with LE BOM at file beginning
   //           LE BOM = Little Endian Byte Order Mark: 0xFFFE
   //       b:  binary: raw bytes
   //           each byte is internally stored as double-byte character 
   //           with a zero big byte
   //           corresponds to mode "b" in ReadFile()
   // + Scripting got a new function.
   //   Name:   ReadFile
   //   Action: Read data from file into string.
   //   Syntax: readfile(filename, [mode])
   //     filename: file full path/name, or relative to current path
   //     mode:
   //       t:  [default] text
   //           whether file is ASCII or UNICODE is auto-detected
   //       b:  binary: raw bytes
   //           each byte is internally stored as double-byte character 
   //           with a zero big byte
   //           corresponds to mode "b" in WriteFile()
   //OLDER: v7.70.0007 - 2008-10-24 17:40
   //
   //     mode:
   //       t:  [default] text ASCII (1 byte per char);
   //           wide chars (upper Unicode) are represented by "?"
   //       u:  utf16: 2 bytes per char; with LE BOM at file beginning
   //           LE BOM = Little Endian Byte Order Mark: 0xFFFE
   //       b:  binary: raw bytes (also 2 bytes per char, but no BOM)
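
For anyone who wants to see what those modes mean at the byte level, here is a rough Python analogue (not XYplorer code). The helper names write_tu, write_ta and read_b are made up for this illustration, and it assumes that the 1-byte-per-char mode behaves like Latin-1 with "?" as the replacement character, which the change log does not actually spell out.

```python
import codecs


def write_tu(text: str) -> bytes:
    """Analogue of mode "tu": UTF-16 little endian, preceded by the LE BOM."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def write_ta(text: str) -> bytes:
    """Analogue of mode "ta": one byte per char, "?" for chars that do not fit.

    Assumption: Latin-1 stands in for the 1-byte character set here.
    """
    return text.encode("latin-1", errors="replace")


def read_b(data: bytes) -> str:
    """Analogue of ReadFile mode "b": each byte becomes one char (zero high byte)."""
    return data.decode("latin-1")


utf8_bytes = "çã".encode("utf-8")            # the bytes a UTF-8 file would hold
raw = read_b(utf8_bytes)                     # what the text looks like undecoded
decoded = raw.encode("latin-1").decode("utf-8")  # what a UTF-8 decode step recovers

print(write_tu("çã"))
print(write_ta("çã ✓"))
print(raw, "->", decoded)
```

Note that the change log writes the BOM as 0xFFFE; in a little-endian file the mark appears on disk as the two bytes FF FE, which is what the first line of output starts with.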

Re: Text Decoder.

Posted: 29 Aug 2010 01:38
by SkyFrontier
Can someone please
1) revise this little script?
2) tell me what dumbness am I doing to *not* get this thing open after creation?

Code:

   $filename = input("Enter the name of the .txt file:");
   $decoded = utf8decode(<clipboard>);
   $xfile = writefile("<curpath>\$filename.txt", $decoded, ,"t");
   open $xfile;
Thanks!

Note: This is v1.0.a. It provides a GUI prompt for naming the .txt file and writes in "t" mode (see previous post). Except for the "open newly created file" part, which is currently broken, everything else (ooooh, such complex subroutines it has to watch out for... :| ) works just fine.

Re: Text Decoder.

Posted: 29 Aug 2010 04:44
by serendipity
SkyFrontier wrote: Can someone please
1) revise this little script?
2) tell me what dumbness am I doing to *not* get this thing open after creation?

Code:

   $filename = input("Enter the name of the .txt file:");
   $decoded = utf8decode(<clipboard>);
   $xfile = writefile("<curpath>\$filename.txt", $decoded, ,"t");
   open $xfile;
Thanks!
The problem is that you assigned the return value of the writefile function to a variable. The return values of writefile are:
0 = failed
1 = data written
2 = file existed

so your next line, "open $xfile;", is trying to open 0, 1, or 2, which is why it fails.
Instead, you should open the file you wrote, like this:

Code:

  
   open "<curpath>\$filename.txt",w;

Re: Text Decoder.

Posted: 29 Aug 2010 06:51
by SkyFrontier
Now it works as expected. Thank you!
Still trying to recover "Á" and "Í" (possibly others, too) so I don't get question marks among the letters in the output. The problem seems to lie in the specifications, since online tools like this do an even worse job.
If anyone comes up with something, it'll be welcome!
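
One possible explanation, offered as a guess rather than a confirmed diagnosis of what XYplorer does: if the garbled text reached the clipboard by having its UTF-8 bytes displayed as Windows-1252 characters, then "Á" (bytes C3 81) and "Í" (bytes C3 8D) each contain a byte that Windows-1252 leaves undefined, so that byte may already be lost before any decoder sees it. The Python sketch below (not XYplorer code; the function name is made up for this illustration) checks which characters survive such a round trip.

```python
def survives_cp1252_round_trip(ch: str) -> bool:
    """True if ch's UTF-8 bytes can all be shown as Windows-1252 characters
    and then turned back into the original character."""
    try:
        # Simulate the garbling: UTF-8 bytes misread as Windows-1252 text.
        mojibake = ch.encode("utf-8").decode("cp1252")
    except UnicodeDecodeError:
        # At least one byte has no Windows-1252 character at all.
        return False
    # Simulate the repair: back to bytes, then decode as UTF-8.
    return mojibake.encode("cp1252").decode("utf-8") == ch


for ch in "çãÁÍ":
    print(ch, ch.encode("utf-8").hex(), survives_cp1252_round_trip(ch))

# Every character from U+00A0 to U+00FF that cannot survive the round trip.
lost = [chr(cp) for cp in range(0xA0, 0x100)
        if not survives_cp1252_round_trip(chr(cp))]
print(lost)
```

Under that assumption only five characters in this range are affected (Á, Í, Ï, Ð, Ý), which would match seeing "?" for "Á" and "Í" while "ç" and "ã" decode fine. If that is the cause, the fix has to happen where the text is copied from, since the missing byte cannot be rebuilt afterwards.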