
pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 01:03
by Marco
Proof of concept: use a better regex engine (https://bitbucket.org/mrabarnett/mrab-regex) for SC regexmatches() and regexreplace().

This is a work in progress, and I don't know if it will ever become stable, since it has some bugs I can't figure out how to solve.

How to use, or better, how I use it.
- Download and extract the latest WinPython from https://sourceforge.net/projects/winpyt ... 4/3.4.3.4/
- Install the regex package by running this in the WinPython Command Prompt:

Code:

pip install regex
- Find the hwnd of the XY window you want to use and paste it at line 85 (where you see the question marks) of pYreX.py
- Open IDLE, press Ctrl+O to open pYreX.py, then press F5
- Now run this in XY:

Code:

text substr(<get CopiedData>, 4, -1);
and you should get the hwnd of the Python script (more or less). You can use it in the UDFs provided below.

How it works.
The script relies on WM_COPYDATA as a means to exchange strings between XY and Python, so no temporary files are necessary.
The script starts and communicates its hwnd to XY, which in turn knows where to send the strings and regexes to manipulate.
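
As a rough illustration of the message layout the UDFs build (a header of pipe-separated lengths, a space, then the fields concatenated without delimiters), here is a minimal Python sketch. The helper names `pack_message` and `unpack_message` are made up for this example and are not part of pYreX, and the hwnd value is only a placeholder.

```python
# Hypothetical helpers illustrating the length-prefixed layout; not part of pYreX.

def pack_message(xyhwnd, mode, string, pattern, seprep, flags):
    """Build '<hwnd>|len|len|len|len|mode body', the layout the UDFs send."""
    fields = (string, pattern, seprep, flags)
    header = "|".join([str(xyhwnd)] + [str(len(f)) for f in fields] + [mode])
    return header + " " + "".join(fields)


def unpack_message(message):
    """Split a packed message back into its parts using the header lengths."""
    header, _, body = message.partition(" ")
    hwnd, *lengths, mode = header.split("|")
    parts, offset = [], 0
    for length in map(int, lengths):
        parts.append(body[offset:offset + length])
        offset += length
    string, pattern, seprep, flags = parts
    return int(hwnd), mode, string, pattern, seprep, flags


packed = pack_message(123456, "matches", "a1 b2", r"\d", "|", "IMX")
print(packed)
print(unpack_message(packed))
```

Because every field carries its own length, spaces or "|" characters inside the string, pattern or separator should not confuse the parsing.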

UDF

Code:

function pYreXmatches($string, $pattern, $separator="|", $flags = "IMX", $pyrexhwnd) {
// See XYplorer Help at regexmatches(), since the behaviour is almost identical.
// See https://bitbucket.org/mrabarnett/mrab-regex for further information on
// Python RE syntax.
//
//   $flags   flags let you modify some aspects of how regular expressions work.
//            Multiple flags can be specified.
//
//            A 	Makes several escapes like \w, \b, \s and \d match only
//                      on ASCII characters with the respective property.
//            S 	Make "." match any character, including newlines
//            I 	Do case-insensitive matches
//            L 	Do a locale-aware match
//            M 	Multi-line matching, affecting "^" and "$"
//            X 	Enable verbose REs, which can be organized more cleanly and understandably

 $string_len    = strlen($string);
 $pattern_len   = strlen($pattern);
 $separator_len = strlen($separator);
 $flags_len     = strlen($flags);

 $message = "<hwnd>|$string_len|$pattern_len|$separator_len|$flags_len|matches $string$pattern$separator$flags";

 copydata $pyrexhwnd, $message, 1;

 return substr(<get CopiedData>, 4, -1);
}

function pYreXreplace($string, $pattern, $replacement="", $flags = "IMX", $pyrexhwnd) {
// See XYplorer Help at regexreplace(), since the behaviour is almost identical.
// See https://bitbucket.org/mrabarnett/mrab-regex for further information on
// Python RE syntax.
//
//   $flags   flags let you modify some aspects of how regular expressions work.
//            Multiple flags can be specified.
//
//            A 	Makes several escapes like \w, \b, \s and \d match only
//                      on ASCII characters with the respective property.
//            S 	Make "." match any character, including newlines
//            I 	Do case-insensitive matches
//            L 	Do a locale-aware match
//            M 	Multi-line matching, affecting "^" and "$"
//            X 	Enable verbose REs, which can be organized more cleanly and understandably

 $string_len      = strlen($string);
 $pattern_len     = strlen($pattern);
 $replacement_len = strlen($replacement);
 $flags_len       = strlen($flags);

 $message = "<hwnd>|$string_len|$pattern_len|$replacement_len|$flags_len|replace $string$pattern$replacement$flags";

 copydata $pyrexhwnd, $message, 1;

 return substr(<get CopiedData>, 4, -1);
}

pYreX.py

Code:

# Sending part thanks to       : http://stackoverflow.com/questions/19886633/sending-wm-copydata-with-python-3
# Receiving part thanks to     : http://stackoverflow.com/questions/5249903/receiving-wm-copydata-in-python
# Exiting part thanks to       : http://stackoverflow.com/questions/5113791/in-windows-using-python-how-do-i-kill-my-process
# Tuple-style parsing thanks to: http://stackoverflow.com/questions/5749195/how-can-i-split-and-parse-a-string-in-python

import win32con, win32api, win32gui, ctypes, ctypes.wintypes, sys, regex

class Listener:
    def __init__(self):
        message_map = {
            win32con.WM_COPYDATA: self.OnCopyData
        }
        wc = win32gui.WNDCLASS()
        wc.lpfnWndProc = message_map
        wc.lpszClassName = 'MyWindowClass'
        hinst = wc.hInstance = win32api.GetModuleHandle(None)
        classAtom = win32gui.RegisterClass(wc)
        self.hwnd = win32gui.CreateWindow (
            classAtom,
            "win32gui test",
            0,
            0, 
            0,
            win32con.CW_USEDEFAULT, 
            win32con.CW_USEDEFAULT,
            0, 
            0,
            hinst, 
            None
        )

    def OnCopyData(self, hwnd, msg, wparam, lparam):
        pCDS = ctypes.cast(lparam, PCOPYDATASTRUCT)
        input = ctypes.wstring_at(pCDS.contents.lpData)
        
        if input.startswith('EXIT'):
            sys.exit()
        
        header = input.partition(" ")[0]
        body = input.partition(" ")[2]
        
        header = header.split("|")  # hwnd, four field lengths, mode
        xyhwnd = int(header[0])
        string_len = int(header[1])
        pattern_len = int(header[2])
        seprep_len = int(header[3])
        flags_len = int(header[4])
        mode = header[5]
        
        string = body[:string_len]
        pattern = body[string_len:string_len+pattern_len]
        seprep = body[string_len+pattern_len:string_len+pattern_len+seprep_len]
        flags = body[len(body) - flags_len:]  # body[-0:] would return the whole body
        flag_bits = 0  # OR together regex.I, regex.M, ... without eval()
        for letter in flags: flag_bits |= getattr(regex, letter)
        
        if mode == 'matches':
            message = seprep.join(regex.compile(pattern, flag_bits).findall(string))
        else:
            message = regex.compile(pattern, flag_bits).sub(seprep, string)

        cds = COPYDATASTRUCT()
        cds.dwData = 0
        cds.cbData = ctypes.sizeof(ctypes.create_unicode_buffer(message))
        cds.lpData = ctypes.c_wchar_p(message)

        SendMessage(xyhwnd, win32con.WM_COPYDATA, 0, ctypes.byref(cds))
        
        return 1
        
class COPYDATASTRUCT(ctypes.Structure):
    _fields_ = [
        ('dwData', ctypes.wintypes.LPARAM),
        ('cbData', ctypes.wintypes.DWORD),
        ('lpData', ctypes.c_wchar_p) 
    ]
    
PCOPYDATASTRUCT = ctypes.POINTER(COPYDATASTRUCT)

l = Listener()

SendMessage = ctypes.windll.user32.SendMessageW

pyhwnd = str(l.hwnd)
xyhwnd = ??? #sys.argv[1]

cds = COPYDATASTRUCT()
cds.dwData = 0
cds.cbData = ctypes.sizeof(ctypes.create_unicode_buffer(pyhwnd))
cds.lpData = ctypes.c_wchar_p(pyhwnd)

SendMessage(xyhwnd, win32con.WM_COPYDATA, 0, ctypes.byref(cds))

win32gui.PumpMessages()
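
For anyone who wants to try the matching and replacing logic without the Windows messaging, the sketch below approximates what the handler does with the flag letters. It rests on two assumptions: it uses the standard library `re` module as a stand-in for `regex` (the flag names A, S, I, L, M and X exist in both), and the function names `parse_flags`, `matches` and `replace` are invented for this example.

```python
import re  # stand-in for the third-party `regex` module (assumption)


def parse_flags(letters):
    """OR together the flag constants named by each letter, e.g. 'IMX'."""
    bits = 0
    for letter in letters:
        bits |= getattr(re, letter)  # unknown letters raise AttributeError
    return bits


def matches(string, pattern, separator="|", flags="IMX"):
    """Join all matches with the separator, like the 'matches' mode."""
    return separator.join(re.compile(pattern, parse_flags(flags)).findall(string))


def replace(string, pattern, replacement="", flags="IMX"):
    """Substitute every match, like the 'replace' mode."""
    return re.compile(pattern, parse_flags(flags)).sub(replacement, string)


print(matches("Foo 12, bar 345", r"\d+"))
print(replace("Foo 12, bar 345", r"foo | bar", "_"))
```

Note that with the X flag, whitespace inside the pattern is ignored, so `foo | bar` behaves like `foo|bar`.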

Re: pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 07:37
by bdeshi
:appl: :appl: :appl:
A laudable contribution!

what bugs?

Re: pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 08:42
by highend
Interesting concept! :appl: :appl: :appl:

My personal small problem: 1GB installed file size (WinPython x86) to use a better regex engine? :mrgreen:

Let's hope that Don will add a better regex.dll some day :cup:

Re: pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 11:23
by Marco
Thank you both!!!

@Sammy:
Not all flags work right. Replacing with IMX works fine, but just MX doesn't.
In general, XY shows the messages with an extra trailing NULL, which must be stripped with substr.
Also, using "|" as the separator in pYreXmatches causes trouble with <get copieddata 3>, because XY itself uses "|" as the separator in <get copieddata>, so I have to use substr again.
I can't find a clean way to pass the XY hwnd to the script.
And, while not a bug, I can't find a way to compile all this reliably to a single exe.
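
A possible explanation for the extra NULL, not verified against XY: `ctypes.create_unicode_buffer(message)` allocates room for a terminating null character, so `ctypes.sizeof()` of that buffer is one character larger than the text itself, and the script passes that size as `cbData`. A small sketch of the difference (the variable names are only for illustration):

```python
import ctypes

message = "12|345"
char_size = ctypes.sizeof(ctypes.c_wchar)  # 2 bytes on Windows

# Size as computed in the script: the buffer includes a terminating null.
with_null = ctypes.sizeof(ctypes.create_unicode_buffer(message))

# Size of the text alone, without the terminator.
without_null = len(message) * char_size

print(with_null - without_null == char_size)
```

If that is indeed the cause, passing the smaller size as `cbData` might make the substr workaround unnecessary, but this is a guess and has not been tested with XY.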

@highend:
I know! I would just use EditPadPro if it could send/receive messages.

Unfortunately this is the best I can do so far.

Re: pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 18:01
by bdeshi
Marco wrote: And, while not a bug, I can't find a way to compile all this reliably to a single exe.
You might as well include a complete Python environment. Even simple Python scripts compiled to exe generally take up 5-6 MB or more. Since this one uses pywin32 modules (in addition to whatever WinPython loads by default), the size can easily jump to 15 MB or more! :blackstorm:

Re: pYreX — Rev. 0.10 / 2015/07/31

Posted: 31 Jul 2015 18:55
by bdeshi
me again.
Did you know AutoIt3 and AutoHotkey both use the PCRE library for their regex?
So a simple UDF "interface" to their StringRegExpReplace()/RegExReplace() and StringRegExp()/RegExMatch() functions via WM_COPYDATA will be easy and lightweight. :ugeek: