xypcre: PCRE functions for XYplorer

Posted: 24 Aug 2015 07:00
by bdeshi
xypcre.xyi: https://github.com/bdeshi/xypcre

This is a collection of user-defined functions for xyscripts that adds PCRE support to XYplorer.

PCRE [1][2][3][4] is an advanced regular expression standard that allows many search/replace operations not possible with XYplorer's default regexp engine.
These functions act as alternatives to the built-in regexmatches() and regexreplace(), and allow XYplorer scripts to use a PCRE-compatible regular expression engine instead of the limited Visual Basic implementation. This is achieved by offloading regexp operations to a small handler program, currently written in AutoIt3. See the usage notes for details.

  • Function Reference.
  • Instructions.
  • Downloads.
Note: Please read the usage instructions carefully before including and/or using the functions.

This may have some complicated or downright ridiculous quirks, but I still hope this helps relieve some of our regexp woes. :)

functions included in xypcre

Posted: 24 Aug 2015 07:00
by bdeshi
[up-to-date reference is here: https://github.com/bdeshi/xypcre/blob/master/XYPCRE.md]
CORE functions
  • pcrematch()
    Returns regexp pattern match(es) in a given string.
    Syntax: pcrematch(string, pattern, sep='||', index=0, format=2)
        string     string to work on (haystack).
        pattern  the regexp pattern to match (needle).
        sep        separator between returned matches. Must be at least two characters long.
        index     1-based index of one match to return when there are multiple matches. Ineffective if < 1. Returns the last match if > total count.
        format   format of return data. Can be 0, 1, or 2. Values cannot be combined. See Remarks.
    Returns matching substring(s) in defined format.
  • pcrereplace()
    Replaces regexp pattern match(es) in a given string.
    Syntax: pcrereplace(string, pattern, replace)
        string     string to work on (haystack).
        pattern  the regexp pattern to match (needle).
        replace  The string or pattern to replace match with.
    Returns resulting string after replacement.
  • pcrecapture()
    Returns matches of a capturing group in the regexp pattern
    Syntax: pcrecapture(string, pattern, index=1, sep='||', format=2)
        string     string to work on (haystack).
        pattern  the regexp pattern to match (needle). Should have at least one capturing group.
        sep        separator between returned matches. Must be at least two characters long.
        index     1-based index of the capturing group to return. Returns the 1st group if < 1, or the last one if > total count. Pass named groups by their ordinal index.
        format   format of return data. Can be 0, 1, or 2. Values cannot be combined. See Remarks.
    Returns matching substring(s) of the group in defined format.
  • pcresplit()
    Splits a string at each regexp pattern match, and returns resulting substrings.
    Syntax: pcresplit(string, pattern, sep='||', format=2)
        string     string to work on (haystack).
        pattern  the regexp pattern to split at (needle). Matching text is destroyed while splitting. Use lookahead/lookbehinds to retain portions.
        sep        separator between returned substrings. Must be at least two characters long.
        format   format of return data. Can be 0, 1, or 2. Values cannot be combined. See Remarks.
    Returns split substrings in defined format.
  • pcretoken()
    Returns a substring/match/token in its original form, from a tokenlist returned by xypcre functions.
    This is equivalent to gettoken() for the special xypcre return data formats.
    Syntax: pcretoken(data, index=1, format=2, sep='||')
        data       The source tokenlist.
        index     1-based index of the substring to return, or the total count of tokens if the index value is 'count'.
        format   format of data. Can be 0, 1, or 2. This must be the same format used in data.
        sep        separator used in data. This must be the same sep used in data.
    Returns the requested token/substring or the total count. The token is returned in its original form (i.e., unescaped).
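
To make the index rule of pcrematch() and the lookahead tip for pcresplit() a bit more concrete, here is a rough sketch in Python. It uses Python's own re module (3.7+), which is not PCRE and is not what xypcre runs. The helper select_match() is hypothetical, and reading "ineffective if < 1" as "return every match" is an assumption.

```python
import re

def select_match(matches, index=0):
    """Hypothetical model of the documented index rule.

    index < 1      -> assumed to mean "return every match"
    index > count  -> return the last match
    """
    if index < 1:
        return list(matches)
    if not matches:
        return []
    return [matches[min(index, len(matches)) - 1]]

haystack = "a1,b22,c333"
found = re.findall(r"\d+", haystack)

all_matches = select_match(found)      # index=0: every match
second = select_match(found, 2)        # 1-based: the second match
clamped = select_match(found, 99)      # past the end: the last match

# Splitting on a plain pattern destroys the matched comma...
plain_split = re.split(r",", haystack)
# ...while a zero-width lookahead splits before the comma and keeps it.
kept_split = re.split(r"(?=,)", haystack)

print(all_matches, second, clamped)
print(plain_split, kept_split)
```

The lookahead version keeps each comma at the start of the following piece, which is the kind of "retain portions" effect the pcresplit() pattern note describes.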
HELPER functions
(CORE functions depend on these and will not work without)
  • xypcrefind(): Finds a valid xypcre.exe and returns its path. The utility is downloaded if not found.
  • xypcrewaiter(): Synchronizes communication between xyscript and xypcre. Also handles aborting when xypcre becomes nonresponsive.

Remarks
  • These functions do not have a matchcase parameter, but case sensitivity and a host of other options can be set in the regexp pattern itself (for example, (?i) for case-insensitive matching).
  • Most if not all of PCRE syntax is available. See [3] and [4] for syntax that's sure to be supported. These pages also describe some assumptions or defaults of the syntax.
  • For functions that can return multiple substrings as a tokenlist:
    • sep must be at least two characters long to work around the dilemma of sep characters already existing in the source string.
      It's recommended that sep be a single character, repeated twice. sep is irrelevant if format is set to 2.
    • format decides the format of returned data. Possible values are 0 or 1 or 2.
      • 0: returned tokens are separated by sep and are not processed in any way, even if sep characters already exist in the strings.
        In this format, a gettoken() call might not be able to retrieve a complete token.
        But this format is the fastest when it's known that no sep character exists in the source string,
        for example when the sep is <crlf 2> and the source string is all on one line.
      • 1: sep characters are escaped with square brackets in each token.
        For example, if sep is '<>', a token 'abc>def' becomes 'abc[>]def'.
        In this format, a gettoken() call is able to retrieve a complete token, but it will have to be unescaped later.
      • 2: the return has this format: 'token1length+token2length|token1token2'
        E.g., if the tokens are 'data', '' and 'info|intel', the return becomes: '4+0+10|datainfo|intel'
        In this format, the sep parameter is irrelevant.
    • Regardless of which format is used or how complicated it may look, the pcretoken() function is able to return any one token in its original form.
    The reason behind all this elaborate escaping and formatted return data is to retrieve complete matches even when the matched text may contain the separator characters.
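
The format 1 and format 2 layouts described above can be modelled in a few lines of Python. This is only a sketch of the data formats as documented here, not xypcre's actual code. All function names are made up for illustration, and lengths are assumed to be counted in characters.

```python
def encode_format2(tokens):
    """Build 'len1+len2|token1token2' from a list of strings."""
    lengths = "+".join(str(len(t)) for t in tokens)
    return lengths + "|" + "".join(tokens)

def decode_format2(data):
    """Recover the original tokens from a format-2 string."""
    # Only the first '|' ends the length header; later ones are token text.
    header, _, body = data.partition("|")
    if header == "":
        return []
    tokens, pos = [], 0
    for length in header.split("+"):
        n = int(length)
        tokens.append(body[pos:pos + n])
        pos += n
    return tokens

def escape_format1(token, sep):
    """Wrap every character that occurs in sep in square brackets."""
    return "".join("[" + ch + "]" if ch in sep else ch for ch in token)

tokens = ["data", "", "info|intel"]
packed = encode_format2(tokens)
unpacked = decode_format2(packed)
escaped = escape_format1("abc>def", "<>")

print(packed)    # 4+0+10|datainfo|intel
print(unpacked)  # ['data', '', 'info|intel']
print(escaped)   # abc[>]def
```

Because format 2 slices the body by length, a '|' or any sep character inside a token never confuses the decoder, which is why sep is irrelevant in that format.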

Re: xypcre: PCRE functions for XYplorer

Posted: 09 Sep 2015 20:41
by bdeshi
bugfix update: FIXED: XYplorer's copydata cannot send empty strings, so a call like pcrereplace('[abc]', '[\[\]]', '') wouldn't work. This is fixed in v1.1.0.9.
The hack/fix chosen for now is to simply prefix one character ($op) to every outgoing string; xypcre in turn trims the first character from the strings it receives. A more beautiful fix may come later.
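
For anyone curious, the prefix trick amounts to something like this Python sketch. The names and the marker character are placeholders for illustration, not the real script or handler code.

```python
PREFIX = "$"  # placeholder marker; the actual character used is not shown here

def pack(arg):
    """Sender side: prefix one character so the payload is never empty."""
    return PREFIX + arg

def unpack(raw):
    """Receiver side: drop the first character to restore the argument."""
    return raw[1:]

sent = pack("")            # an empty argument still travels as one character
received = unpack(sent)    # and comes back as the empty string
roundtrip = unpack(pack("[abc]"))

print(repr(sent), repr(received), repr(roundtrip))
```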

1.1.0.9 is rendered unnecessary due to native bug fix. Rolled back to 1.1.0.8.

Re: xypcre: PCRE functions for XYplorer

Posted: 09 Sep 2015 20:51
by admin
SammaySarkar wrote: xyplorer's copydata cannot send empty strings...
You could have told me that. :) Next version it can.

Re: xypcre: PCRE functions for XYplorer

Posted: 30 Sep 2017 17:43
by bdeshi
Anyone using this? Committed to a git repository: https://github.com/smsrkr/xypcre

Re: xypcre: PCRE functions for XYplorer

Posted: 21 Sep 2018 00:39
by Dustydog
Just an FYI - I'd definitely allow the script to download the .exe - it avoids a very annoying warning popup that's a PITA to get rid of manually (if you go that route, I'd suggest you put it in your documents folder, unblock it from the right-click menu, and set the perm variable - a documents folder (or similar) is easiest for a single file, imho). And no, I have no idea why letting the script download it avoids the popup - or is somehow safer, for that matter! And yes, I like keeping the warning set. I've been surprised once, and that was enough.

Thanks for some great work, Sammy! And yes, I'm using it.

Re: xypcre: PCRE functions for XYplorer

Posted: 21 Sep 2018 09:21
by bdeshi
Thanks for the feedback!

The warning you speak of, where does it come from? Your antivirus? The untrusted downloaded file warning from Windows?

(btw, if you already have a copy of the AutoIt interpreter installed, you can use the au3 script itself; there's no difference at all.)

Re: xypcre: PCRE functions for XYplorer

Posted: 11 Mar 2019 07:12
by bdeshi
There was a serious bug regarding empty parameters. It's fixed in the new v1.3.1 release.

Downloads: https://github.com/bdeshi/xypcre/releases/tag/v1.3.1

Changelog:
v1.3.0
* fixed bug with 0-length copydata arguments.
* missing binary is now downloaded from this github repo.
* the xypcrefind function detects both binary and source xypcre correctly.
* invalid xypcre now stops script with a failed assertion instead of returning empty string.
* removed minified version. Please remove any local xypcre.min.xyi to avoid script version mismatch.
v1.3.1
* fixed download URL.