How to find files with a Non Western European Character?

Posted: 08 Apr 2025 18:17
by XYZeternity
Dear club members of the best file manager in the known universe,
:info:
Is there a way to find files whose names contain characters other than basic letters (A-Z, a-z), digits (0-9), and the dash [-]?
I want to "purify and simplify" file names for my MP3 player that does not understand anything besides Western European characters.
Files with strange characters get corrupted and cannot be displayed.
I want to delete all "unreadable" characters or replace them with simple basic characters.
It would be nice if there was a way to find them all with a smart search query in my favouritest file manager. :om:

So that for example it finds me files like:
となりのトトロ My Neighbor Totoro! SoundTrack — Ghibli.mp3

Which I then can manually and safely rename to:
My Neighbor Totoro SoundTrack - Ghibli.mp3

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 21:41
by admin
Quick makeshift solution, but it will probably go some way for you. Enter this pattern into Quick Search (F3): *[ÿ-휀]* /matching=e
It matches files whose names contain characters from 0x00FF to 0xD700 (includes Japanese, Chinese, and most of Korean).

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 22:12
by jupe
Here is another alternative that should work as a search term to find non-ASCII characters:

>[^\x00-\x7F]+
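
For anyone who wants to try the same idea outside the file manager, here is a minimal Python sketch of this regex. It assumes the leading > is only the search box's marker for a regular expression and not part of the regex itself; the helper name `has_non_ascii` is made up for illustration.

```python
import re

# One or more consecutive characters outside the ASCII range 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

def has_non_ascii(name: str) -> bool:
    """Return True if the file name contains at least one non-ASCII character."""
    return NON_ASCII.search(name) is not None

# Sample names taken from the first post of this thread.
names = [
    "となりのトトロ My Neighbor Totoro! SoundTrack — Ghibli.mp3",
    "My Neighbor Totoro SoundTrack - Ghibli.mp3",
]
flagged = [n for n in names if has_non_ascii(n)]
```

Note that this only flags characters above 0x7F. ASCII punctuation such as ! is still inside the range and will not be reported.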

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 22:51
by admin
:tup:

Does it also support higher values like \x7FFF? Does not seem so...

Re: How to find files with a Non Western European Character?

Posted: 08 Apr 2025 23:03
by jupe
It might need extending if those chars are in use.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 09:18
by XYZeternity
Wonderful :tup: Both work perfectly. I will save both suggestions. Thanks very much!
Is there a (video) tutorial that shows how the magic behind these powerful shorthands works?

*[ÿ-휀]* /matching=e
>[^\x00-\x7F]+

I suspect there is a steep learning curve, but being able to stand on the very first step of that ladder would probably be enough for most of us :cup:

(PS. Dear Admins, please extend the icons on the right side of this message window a little. Most icons are so dark :blackstorm: they don't portray the thankfulness I am trying to convey here haha :om: Also, replacing the first Mickey glove (/smilies/tup.png) with a recognizable standard yellow thumbs-up icon would make them stand out better against the forum background).

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:08
by admin
[char-char] defines a range of characters, from the first to the last. In jupe's regexp pattern the characters are represented by their ordinal numbers in hex (e.g. \x00).
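
To make the range idea concrete, here is a small Python sketch (Python is just a convenient stand-in, not what the file manager uses). It prints nothing; it only builds the values so you can inspect them. The variable names are made up for illustration.

```python
import re

# Ordinal (code point) of each endpoint of the first pattern's range,
# as (character, decimal, hex).
low, high = "ÿ", "휀"
endpoints = [(ch, ord(ch), hex(ord(ch))) for ch in (low, high)]

# The same character class written two ways:
# with literal characters, and with hex escapes for the stated bounds.
literal_range = re.compile("[ÿ-휀]")
escaped_range = re.compile(r"[\u00FF-\uD700]")

# A character matches if its ordinal lies between the two endpoints.
sample = "となり"
in_literal = literal_range.search(sample) is not None
in_escaped = escaped_range.search(sample) is not None
```

Python's `\x` escape takes exactly two hex digits, so larger ordinals are written with `\u` plus four digits there. Whether the file manager's regex engine accepts a similar notation is not confirmed in this thread.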

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:25
by admin
Here is a slightly improved version of my first pattern (range 256-65289):
*[Ā-）]* /matching=e

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 12:28
by admin
jupe wrote: 08 Apr 2025 22:12
Here is another alternative that should work as a search term to find non ascii,

>[^\x00-\x7F]+

When I compared the results with my pattern I found a strange anomaly. This regexp pattern fails at the Turkish Uppercase Dotted İ (ordinal 304). :veryconfused:
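
One possible explanation, offered purely as a guess since nothing in this thread confirms it: if the search is case-insensitive, İ may be folded to a plain ASCII i before the character class is tested, so the negated ASCII class no longer sees it. The Python sketch below only illustrates that case mapping. It says nothing definite about how the file manager's regex engine behaves.

```python
import re

dotted_i = "\u0130"  # Turkish uppercase dotted İ, ordinal 304

# Matched case-sensitively, the character is clearly outside 0x00-0x7F.
case_sensitive_hit = re.search(r"[^\x00-\x7F]+", dotted_i) is not None

# Lowercasing yields an ASCII "i" followed by a combining dot above (U+0307),
# which is how a case-folding step could pull the character into ASCII.
lowered = dotted_i.lower()
lowered_code_points = [hex(ord(ch)) for ch in lowered]
```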

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 13:21
by XYZeternity
That looks a lot like the upside down exclamation point:

İ ¡ ! ¿

Thanks for the improved version. I have made four test files, each with one of the characters above in its name.
None of the patterns so far finds all four files.

I want only Western European characters and numbers, without any special characters besides the normal dash -.

Is there a way to broaden the search pattern so it also finds other unwanted characters, such as İ ¡ ! ¿ " ' ` @ $ # © ° « » etc?

That way I can find and intercept more strange characters that I can delete manually from the file names.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 14:14
by admin
Try this: *[!a-zA-Z0-9 ._()&-]* /matching=e
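
For anyone scripting the same check outside the file manager, here is a minimal Python sketch of this allow-list idea. It assumes `[!...]` in the pattern above means "any character not in this set", which regex syntax writes as `[^...]`; the helper name `offending_chars` is made up for illustration.

```python
import re

# Any character that is NOT a letter, a digit, a space, or one of . _ ( ) & -
# The dash sits last in the class so it is read as a literal, not a range.
NOT_ALLOWED = re.compile(r"[^a-zA-Z0-9 ._()&-]")

def offending_chars(name: str) -> list:
    """Return the characters in a file name that fall outside the allow-list."""
    return NOT_ALLOWED.findall(name)

# Sample names taken from the first post of this thread.
before = "となりのトトロ My Neighbor Totoro! SoundTrack — Ghibli.mp3"
after = "My Neighbor Totoro SoundTrack - Ghibli.mp3"

needs_rename = bool(offending_chars(before))   # name contains disallowed characters
already_clean = not offending_chars(after)     # name passes the allow-list
```

Listing what is allowed, rather than what is forbidden, means that punctuation like ! and ¿ is caught along with the Japanese characters, without having to enumerate every unwanted symbol.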

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 16:57
by XYZeternity
Awesome!

That did the trick! All files with potentially corruptible characters for my MP3 player have been found :appl:

*[!a-zA-Z0-9 ._()&-]* /matching=e << Finds everything that I sought

Even found songs like: CVLTVRΣ, 明るい夜 Bright NiGHT.mp3
that was renamed to : CVLTVRE - Bright Night.mp3

Thank you very much!

(Icon of happy grandma with reading glasses offering cake she cooked with fire stove and rocking chair in background.)

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 19:59
by phred
XYZeternity wrote: 09 Apr 2025 16:57 (Icon of happy grandma with reading glasses offering cake she cooked with fire stove and rocking chair in background.)
Sometimes AI is spot on...
[Attached image: image.jpeg]

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 20:38
by admin
The rocking chair is burning in the background, so cozy.

Re: How to find files with a Non Western European Character?

Posted: 09 Apr 2025 20:41
by jupe
½ Rocking chair, ½ Stove, it's the original heated seat. :P