
How to find files with a Non Western European Character?

Posted: 08 Apr 2025 18:17
by XYZeternity
Dear club members of the best file manager in the known universe,
:info:
Is there a way to find files whose names contain characters other than plain letters (ABCDabcd), digits (01234), and the dash [-]?
I want to "purify and simplify" file names for my MP3 player, which does not understand anything besides Western European characters.
File names with strange characters get corrupted and cannot be displayed.
I want to delete all "unreadable" characters or replace them with simple basic characters.
It would be nice if there was a way to find them all with a smart search query in my favouritest file manager. :om:

So that for example it finds me files like:
となりのトトロ My Neighbor Totoro! SoundTrack — Ghibli.mp3

Which I then can manually and safely rename to:
My Neighbor Totoro SoundTrack - Ghibli.mp3

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 21:41
by admin
A quick makeshift solution that will probably go some way for you: enter the pattern *[ÿ-휀]* /matching=e into Quick Search (F3). It matches files whose names contain characters from 0x00FF to 0xD700 (this includes Japanese, Chinese, and most of Korean).

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 22:12
by jupe
Here is another alternative that should work as a search term to find non-ASCII characters:

>[^\x00-\x7F]+
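
For readers who want to try the same idea outside the file manager, here is a minimal Python sketch of the pattern above. It assumes Python's `re` module, whose syntax may differ in detail from the engine the file manager uses; the helper name `has_non_ascii` and the sample list are only illustrative.

```python
import re

# The pattern from the post: one or more characters outside the ASCII range 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

def has_non_ascii(name: str) -> bool:
    """Return True if the file name contains at least one non-ASCII character."""
    return NON_ASCII.search(name) is not None

# Sample names taken from the first post (\u2014 is the long dash in the original name).
names = [
    "となりのトトロ My Neighbor Totoro! SoundTrack \u2014 Ghibli.mp3",
    "My Neighbor Totoro SoundTrack - Ghibli.mp3",
]
flagged = [n for n in names if has_non_ascii(n)]
for n in flagged:
    print(n)
```

Only the first name is flagged. Note that the ASCII exclamation mark does not count as non-ASCII, so this pattern alone would not catch a name whose only "strange" character is `!`.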

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 22:51
by admin
:tup:

Does it also support higher values like \x7FFF? Does not seem so...
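
As a point of comparison only: in Python's `re`, `\x` takes exactly two hex digits, so `\x7FFF` is read as the character 0x7F followed by the literal text `FF`, and code points above 0xFF are written as `\uXXXX`. Whether the file manager's regex engine behaves the same way is an assumption, not something confirmed in this thread.

```python
import re

# In Python, \x consumes exactly two hex digits: this is U+007F followed by "FF".
two_digit = re.compile(r"\x7FFF")
matches_three_chars = bool(two_digit.fullmatch("\x7fFF"))  # DEL, "F", "F"
matches_single_char = bool(two_digit.fullmatch("\u7fff"))  # the one character U+7FFF
print(matches_three_chars, matches_single_char)

# Four-digit code points use \u instead, so a range can extend past 0xFF.
wide_range = re.compile(r"[\u0100-\uFFFF]")
in_wide_range = bool(wide_range.search("\u7fff"))
print(in_wide_range)
```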

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 23:03
by jupe
It might need extending if those chars are in use.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 09:18
by XYZeternity
Wonderful :tup: Both work perfectly. I will save both suggestions. Thanks very much!
Is there a (video) tutorial that shows how the magic behind these powerful shorthands works?

*[ÿ-휀]* /matching=e
>[^\x00-\x7F]+

I suspect there is a steep learning curve but being able to stand on the very first step on that ladder would be probably enough for most of us :cup:

(PS. Dear Admins, please extend the set of icons on the right side of this message window a little. Most icons are so dark :blackstorm: that they don't portray the thankfulness I am trying to convey here haha :om: Also, replacing the first Mickey glove (/smilies/tup.png) with a recognizable standard yellow thumbs-up icon would make it stand out better against the forum background.)

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:08
by admin
[char - char] defines a from-to range of characters. In jupe's regexp pattern the characters are represented by their ordinal number in hex (e.g. \x00).
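
To make the idea of a range concrete, the following Python sketch prints the ordinal numbers behind the range endpoints used so far in this thread. It only illustrates how ordinals define a range and makes no claim about how the file manager evaluates its patterns; `in_range` is a hypothetical helper.

```python
def in_range(ch: str, low: str, high: str) -> bool:
    """True if ch's ordinal lies between the ordinals of low and high, inclusive."""
    return ord(low) <= ord(ch) <= ord(high)

# Endpoints of the wildcard range in the first pattern: 0x00FF and 0xD700.
low, high = "\u00ff", "\ud700"
print(hex(ord(low)), hex(ord(high)))

# The regexp range \x00-\x7F covers ordinals 0 to 127, i.e. plain ASCII.
print(in_range("A", "\x00", "\x7f"))        # "A" is ordinal 65
print(in_range("\u3068", "\x00", "\x7f"))   # hiragana "to", ordinal 0x3068
print(in_range("\u3068", low, high))        # inside the 0x00FF-0xD700 range
```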

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:25
by admin
Here is a slightly improved version of my first pattern (range 256-65289):
*[Ā-）]* /matching=e

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:28
by admin
jupe wrote: 08 Apr 2025 22:12
Here is another alternative that should work as a search term to find non ascii,

>[^\x00-\x7F]+

When I compared the results with my pattern I found a strange anomaly. This regexp pattern fails at the Turkish Uppercase Dotted İ (ordinal 304). :veryconfused:
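
One way to look into this is to inspect how the character behaves under case conversion. The sketch below is in Python and is only a guess at a possible cause: it assumes the anomaly might be related to case-insensitive matching, which nothing in this thread confirms, and Python's `re` may behave differently from the file manager's engine.

```python
import re

dotted_i = "\u0130"  # Turkish uppercase dotted I, ordinal 304
print(ord(dotted_i))

# Lowercasing yields two code points: ASCII "i" (0x69) plus a combining dot (0x307).
lowered = [hex(ord(c)) for c in dotted_i.lower()]
print(lowered)

# Case-sensitive: the character is outside 0x00-0x7F, so the pattern matches.
case_sensitive_hit = bool(re.search(r"[^\x00-\x7F]+", dotted_i))
print(case_sensitive_hit)

# Case-insensitive: the outcome depends on the engine's case-folding rules,
# so it is printed here rather than assumed.
print(bool(re.search(r"[^\x00-\x7F]+", dotted_i, re.IGNORECASE)))
```

If the last line prints a different result from the one before it, case folding to the ASCII letter "i" would be a plausible suspect.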

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 13:21
by XYZeternity
That looks a lot like the upside down exclamation point:

İ ¡ ! ¿

Thanks for the improved version. I have made four files, each containing one of the four characters above.
None of the patterns so far finds all four of these files.

I want only Western European characters and numbers, without any special characters besides the normal dash -.

Is there a way to broaden the search pattern so that it finds more unwanted characters, such as İ ¡ ! ¿ " ' ` @ $ # © ° « » etc?

That way I can find and intercept more strange characters that I can delete manually from the file names.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 14:14
by admin
Try this: *[!a-zA-Z0-9 ._()&-]* /matching=e
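
The pattern above works as a whitelist: it matches any name containing at least one character that is not in the allowed set. A rough Python counterpart is sketched below, assuming that `[!...]` in the wildcard pattern negates the set in the same way `[^...]` does in a regular expression; `disallowed_chars` is a hypothetical helper and the sample names come from later in this thread.

```python
import re

# Regex counterpart of the wildcard class [!a-zA-Z0-9 ._()&-]:
# any single character that is NOT a letter, digit, space, or one of . _ ( ) & -
DISALLOWED = re.compile(r"[^a-zA-Z0-9 ._()&-]")

def disallowed_chars(name: str) -> list:
    """Return the characters in name that fall outside the allowed set."""
    return DISALLOWED.findall(name)

for name in [
    "CVLTVRΣ, 明るい夜 Bright NiGHT.mp3",
    "CVLTVRE - Bright Night.mp3",
]:
    bad = disallowed_chars(name)
    print(name, "->", bad if bad else "clean")
```

Unlike the non-ASCII patterns earlier in the thread, this approach also flags ASCII punctuation such as `!`, `,` or `@`, because anything not explicitly listed is treated as unwanted.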

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 16:57
by XYZeternity
Awesome!

That did the trick! All files with potentially corruptible characters for my MP3 player have been found :appl:

*[!a-zA-Z0-9 ._()&-]* /matching=e << Finds everything that I sought

Even found songs like: CVLTVRΣ, 明るい夜 Bright NiGHT.mp3
that was renamed to : CVLTVRE - Bright Night.mp3

Thank you very much!

(Icon of happy grandma with reading glasses offering cake she cooked with fire stove and rocking chair in background.)

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 19:59
by phred
XYZeternity wrote: 09 Apr 2025 16:57
(Icon of happy grandma with reading glasses offering cake she cooked with fire stove and rocking chair in background.)

Sometimes AI is spot on...
[Attached image: image.jpeg]

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 20:38
by admin
The rocking chair is burning in the background, so cozy.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 20:41
by jupe
½ Rocking chair, ½ Stove, it's the original heated seat. :P