Re: Replace Umlauts
Posted: 15 Jan 2013 11:56
by Marco
Seems ok here too. Other possible replacements:
™ as TM;
• as .;
« as <<; (and the mirrored ones)
‘ as `; (up to you whether ’ should be replaced the same way or as ')
So Greek and Russian turn into a bunch of dashes?
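The replacements suggested in this post, together with the dash fallback asked about in the last line, could be sketched in Python roughly as follows. This is only an illustration: the names `SYMBOLS` and `to_ascii` are hypothetical, and the fallback behaviour is an assumption about how unmapped characters might be handled, not XYplorer's actual code.

```python
# Hypothetical replacement table built from the suggestions in this post.
SYMBOLS = {
    "\u2122": "TM",  # ™
    "\u2022": ".",   # •
    "\u00ab": "<<",  # «
    "\u00bb": ">>",  # » (the mirrored one)
    "\u2018": "`",   # ‘
    "\u2019": "'",   # ’ (the post leaves this choice open)
}


def to_ascii(text, fallback="-"):
    """Replace known symbols; turn any other non-ASCII character into `fallback`."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        else:
            out.append(SYMBOLS.get(ch, fallback))
    return "".join(out)
```

With this table, Greek or Cyrillic letters all fall through to the fallback character. Note that `<` and `>` are not allowed in Windows file names, so the `<<` mapping only suits plain text.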
Re: Replace Umlauts
Posted: 15 Jan 2013 12:06
by admin
Marco wrote:Seems ok here too. Other possible replacements:
™ as TM;
• as .;
« as <<; (and the mirrored ones)
‘ as `; (up to you whether ’ should be replaced the same way or as ')
So Greek and Russian turn into a bunch of dashes?
1. Yep.
2. I tried my luck with transliterating Cyrillic and Greek, but it turned out to be difficult and a source of trouble. It seems to be impossible to find a transliteration table that everybody agrees upon. If anybody can help me here: I'd love to offer transliteration!
Re: Replace Umlauts
Posted: 15 Jan 2013 12:13
by Marco
1. I just noticed that single quotes have a different tilt depending on the font. Maybe you could replace all of them with the same thing, i.e. '. And double quotes with ". And all kinds of hyphens and dashes with -.
2. Ok, then you should wait for native speakers' opinions on the most widely agreed-upon transliteration tables.
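Point 1 amounts to collapsing each family of typographic characters onto one ASCII character. A minimal Python sketch, assuming a hand-picked set of code points (the lists are not exhaustive, and the names `SINGLE_QUOTES`, `DOUBLE_QUOTES`, `DASHES` and `normalize_punctuation` are hypothetical):

```python
# Code points chosen for illustration only; extend as needed.
SINGLE_QUOTES = "\u2018\u2019\u201a\u201b"  # curly and low single quotes
DOUBLE_QUOTES = "\u201c\u201d\u201e\u201f"  # curly and low double quotes
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"  # hyphens, dashes, minus sign

# One translation table that maps every listed character to its ASCII stand-in.
TABLE = str.maketrans({
    **dict.fromkeys(SINGLE_QUOTES, "'"),
    **dict.fromkeys(DOUBLE_QUOTES, '"'),
    **dict.fromkeys(DASHES, "-"),
})


def normalize_punctuation(text):
    """Map all listed quote and dash variants to ', " and -."""
    return text.translate(TABLE)
```

Characters outside the three lists are left untouched, so this step can run before or after any other replacement.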
Re: Replace Umlauts
Posted: 15 Jan 2013 12:25
by SkyFrontier
For dictionary-based solutions, these threads may make things easier:
http://www.xyplorer.com/xyfc/viewtopic. ... dictionary
http://www.xyplorer.com/xyfc/viewtopic. ... y&start=15
There's a 3rd one with a bigger list but I currently can't find it, sorry.
As a note, I'd like the XY custom replace char (mine is "_") to be used instead of "-". Exactly because it's a matter of personal taste, of course.
Re: Replace Umlauts
Posted: 15 Jan 2013 12:44
by admin
Thanks, but no Cyrillic there.
SkyFrontier wrote:As a note, I'd like the XY custom replace char (mine is "_") to be used instead of "-". Exactly because it's a matter of personal taste, of course.
Yep, was my plan anyway.
Re: Replace Umlauts
Posted: 15 Jan 2013 13:02
by Borut
admin wrote:It seems to be impossible to find a transliteration table that everybody agrees upon. If anybody can help me here: I'd love to offer transliteration!
Yep, there are surely many possibilities for a given language; some people use one, some another, also depending on the purpose of the transliteration.
I think it would be nice to have the possibility of transliterating files too, perhaps via scripting, where one could also point to a specific transliteration table file. (One such file could be factory supplied and referable in the XYplorer Configuration or as a Tweak.)
Some time ago I had to transliterate Russian (and a few more special characters from other languages) into Latin. It had to be a 1:1 character mapping, and anything was better than nothing. I used a special program, which in turn used the following transliteration table (might be a test bed for you):
Re: Replace Umlauts
Posted: 15 Jan 2013 13:22
by admin
Thanks!
You know that SC replacelist IS a transliteration function already existing in XY?
For the GUI I want to make it totally simple. One click and done. No tables, no included files.
Re: Replace Umlauts
Posted: 15 Jan 2013 13:33
by Borut
admin wrote:You know that SC replacelist IS a transliteration function already existing in XY?
Ah, you mean:
Code:
::RTFM "idh_scripting_comref.htm#idh_sc_replacelist"

No, I did not know - thanks!
Re: Replace Umlauts
Posted: 15 Jan 2013 14:02
by admin
OK; I found something for Cyrillic here. Let's see if "you" (you cyrillicophiles) like it.

Re: Replace Umlauts
Posted: 15 Jan 2013 14:32
by admin
OK, I found this (I had to apply some changes) on a Windows 8 support page from Microsoft (http://code.msdn.microsoft.com/windowsd ... n-f7e88b29):
Code:
' Uppercase modern Cyrillic characters.
.Add ChrW(&H410), "A"
.Add ChrW(&H411), "B"
.Add ChrW(&H412), "V"
.Add ChrW(&H413), "G"
.Add ChrW(&H414), "D"
.Add ChrW(&H415), "E"
.Add ChrW(&H416), "Zh"
.Add ChrW(&H417), "Z"
.Add ChrW(&H418), "I"
.Add ChrW(&H419), "I"
.Add ChrW(&H41A), "K"
.Add ChrW(&H41B), "L"
.Add ChrW(&H41C), "M"
.Add ChrW(&H41D), "N"
.Add ChrW(&H41E), "O"
.Add ChrW(&H41F), "P"
.Add ChrW(&H420), "R"
.Add ChrW(&H421), "S"
.Add ChrW(&H422), "T"
.Add ChrW(&H423), "U"
.Add ChrW(&H424), "F"
.Add ChrW(&H425), "Kh"
.Add ChrW(&H426), "Ts"
.Add ChrW(&H427), "Ch"
.Add ChrW(&H428), "Sh"
.Add ChrW(&H429), "Shch"
.Add ChrW(&H42A), "'" ' Hard sign
.Add ChrW(&H42B), "Ye"
.Add ChrW(&H42C), "'" ' Soft sign
.Add ChrW(&H42D), "E"
.Add ChrW(&H42E), "Iu"
.Add ChrW(&H42F), "Ia"
' Lowercase modern Cyrillic characters.
.Add ChrW(&H430), "a"
.Add ChrW(&H431), "b"
.Add ChrW(&H432), "v"
.Add ChrW(&H433), "g"
.Add ChrW(&H434), "d"
.Add ChrW(&H435), "e"
.Add ChrW(&H436), "zh"
.Add ChrW(&H437), "z"
.Add ChrW(&H438), "i"
.Add ChrW(&H439), "i"
.Add ChrW(&H43A), "k"
.Add ChrW(&H43B), "l"
.Add ChrW(&H43C), "m"
.Add ChrW(&H43D), "n"
.Add ChrW(&H43E), "o"
.Add ChrW(&H43F), "p"
.Add ChrW(&H440), "r"
.Add ChrW(&H441), "s"
.Add ChrW(&H442), "t"
.Add ChrW(&H443), "u"
.Add ChrW(&H444), "f"
.Add ChrW(&H445), "kh"
.Add ChrW(&H446), "ts"
.Add ChrW(&H447), "ch"
.Add ChrW(&H448), "sh"
.Add ChrW(&H449), "shch"
.Add ChrW(&H44A), "'" ' Hard sign
.Add ChrW(&H44B), "yi"
.Add ChrW(&H44C), "'" ' Soft sign
.Add ChrW(&H44D), "e"
.Add ChrW(&H44E), "iu"
.Add ChrW(&H44F), "ia"
I think this is fairly self-explanatory. Now if anybody could provide such a list for Greek, or Turkish, or whatever other script, it would be a snap for me to add it.
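The table above maps each Cyrillic character to one or more Latin characters, so it can be applied character by character. A minimal Python sketch, assuming only a handful of entries copied from the list; the names `CYRILLIC` and `transliterate` are hypothetical, and the "_" fallback for unmapped characters is an assumption borrowed from the replace-character discussion earlier in the thread, not XYplorer's actual behaviour.

```python
# A few entries copied from the table above, keyed by code point.
CYRILLIC = {
    0x41C: "M",
    0x429: "Shch",
    0x430: "a",
    0x432: "v",
    0x43A: "k",
    0x43E: "o",
    0x441: "s",
    0x443: "u",
}


def transliterate(text, table=CYRILLIC, fallback="_"):
    """Apply the table per character; unmapped non-ASCII becomes `fallback`."""
    out = []
    for ch in text:
        code = ord(ch)
        if code in table:
            out.append(table[code])  # one character may expand to several
        elif code < 128:
            out.append(ch)           # ASCII passes through unchanged
        else:
            out.append(fallback)     # assumption: user-chosen replace character
    return "".join(out)
```

Adding another script would then only mean adding more entries to the dictionary.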
Re: Replace Umlauts
Posted: 15 Jan 2013 18:11
by admin
OK, in the meantime I added non-Russian Cyrillic characters used in South Slavic languages and Turkish. This should be enough for now.
Re: Replace Umlauts
Posted: 16 Jan 2013 13:42
by Marco
Silly me that I didn't consider this before...
In Italian, vowels (a, e, i, o, u) may have accents (acute and grave). This function currently strips them out, but what about converting "à" to "a'"? This is currently very common in SMS and on Facebook. Also keep in mind that here in Italy we don't have "`" on keyboards, so we always rely on the apostrophe, for acute accents as well as grave ones. Another example is "È" (a bad beast because it is quite frequent), which is almost always typed as "E'" unless you are on a Mac or enter Alt+0200 (numpad)...
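The proposed conversion can be sketched with Unicode decomposition: split each accented vowel into its base letter plus a combining accent, then swap the accent for an apostrophe. This is a rough Python illustration only; the names `ACCENTS` and `accents_to_apostrophe` are hypothetical, and only the grave and acute accents are handled.

```python
import unicodedata

ACCENTS = {"\u0300", "\u0301"}  # combining grave accent, combining acute accent


def accents_to_apostrophe(text):
    """Turn grave/acute accented letters into base letter + apostrophe."""
    out = []
    # NFD splits e.g. "È" into "E" followed by the combining grave accent.
    for ch in unicodedata.normalize("NFD", text):
        out.append("'" if ch in ACCENTS else ch)
    # Recompose whatever was decomposed but not replaced (e.g. umlauts).
    return unicodedata.normalize("NFC", "".join(out))
```

Both accents end up as the same apostrophe, matching the typing habit described above.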
Re: Replace Umlauts
Posted: 16 Jan 2013 13:48
by admin
Marco wrote:Silly me that I didn't consider this before...
In Italian, vowels (a, e, i, o, u) may have accents (acute and grave). This function currently strips them out, but what about converting "à" to "a'"? This is currently very common in SMS and on Facebook. Also keep in mind that here in Italy we don't have "`" on keyboards, so we always rely on the apostrophe, for acute accents as well as grave ones. Another example is "È" (a bad beast because it is quite frequent), which is almost always typed as "E'" unless you are on a Mac or enter Alt+0200 (numpad)...
OK for SMS, but filenames? Not sure...
Re: Replace Umlauts
Posted: 16 Jan 2013 13:50
by Marco
SMS was just an example. It is becoming frequent in computer-written texts too.
Re: Replace Umlauts
Posted: 06 Feb 2013 10:06
by Pagat
admin wrote:Now if anybody could provide such a list for Greek, or Turkish, or whatever other script, it would be a snap for me to add it.
There is a character conversion table for financial transactions (SEPA) provided by the European Payments Council which includes Greek:
http://www.europeanpaymentscouncil.eu/k ... nts_id=332
This list covers all languages that are spoken in the "Single Euro Payments Area", so it may be of use for your transliteration function.
edit: btw, the accompanying document states that "this table is based on international rules for romanisation, in particular ISO 9 and ISO 843 and input of the Greek community."