Backreferences in codeless language modules

iain barnett

Apr 2, 2020, 9:18:26 AM
to bbe...@googlegroups.com
Hi,

I made a CLM for Awk[1] just for convenience and I came up with a regex to capture functions.

From the CLM reference[2]:

> Because of the way BBEdit processes regular-expression patterns internally, if you need to make backreferences in a pattern, you must use named subpatterns instead of positional (numbered) backreferences

The pattern I've used has named backreferences of the (?&name) flavour, but the functions aren't being picked up. I've checked that it works as a pattern in Find, so I'm wondering whether this kind of backref is supported in a CLM. The pattern fails if I use (?P=name) to refer back to the group instead.

Any help or insights are much appreciated.

    <key>BBLMScansFunctions</key><true/>
    <key>Function Pattern</key>
    <string><![CDATA[
      (?x:
        (?>function[^\S]+)(?>\w+[^\S]*\()
        (?P<args>\s*(?:(?:\[\w+\])|\w+)?\s*(?:,(?&args))?)
        \)
        (?P<funcbody>
        \s*\{
            (?:
              [^}{]+
               |
              (?&funcbody)
            )*+
          \}
        )
      )
    ]]></string>


Regards,
iain
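
A note on the two syntaxes in the question, offered as a minimal sketch rather than a statement about what BBEdit's CLM scanner accepts: in PCRE-style patterns, (?&name) is a subroutine call that re-runs the named subpattern (which is what makes recursion possible), whereas (?P=name) is a backreference that only matches the exact text the group has already captured. The two are not interchangeable, which is consistent with the pattern failing once ?P= is substituted. Python's standard re module is used here purely for illustration; it supports (?P=name) but has no (?&name), so only the backreference behaviour is shown. The pattern and sample strings are made up for the example.

```python
import re

# (?P=word) is a backreference: it must match the exact text that the
# group named "word" captured earlier, not "anything the group could match".
# (Illustrative pattern, not taken from the CLM above.)
backref = re.compile(r"(?P<word>\w+),(?P=word)$")

same = backref.match("foo,foo") is not None       # repeated text
different = backref.match("foo,bar") is not None  # different text

print(same, different)
```

A subroutine call in the same position would accept "foo,bar" as well, because it applies \w+ again instead of repeating the captured text.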

jj

Apr 2, 2020, 2:01:24 PM
to BBEdit Talk
Hi Iain,

Some of the required BBLM keys in your example appear to be misplaced or absent.

Here is an example that seems to work, though it has not been thoroughly tested.

The modified <key>Function Pattern</key> is commented out because the standard function-related keys in <key>Language Features</key> will probably do a better job than a regular expression for C-style functions.

Best regards.

Jean Jourdain

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">

<!--  Copyright (c) 2020 Iain Barnett -->
<dict>
<!-- You must identify the plist as a CLM: -->
<key>BBEditDocumentType</key>
<string>CodelessLanguageModule</string>
<key>BBLMLanguageDisplayName</key>
<string>Awk</string>
<key>BBLMLanguageCode</key>
<string>Awk!</string>
    <key>BBLMPreferredFilenameExtension</key>
    <string>awk</string>
    
<key>BBLMSuffixMap</key>
    <array>
        <dict>
            <key>BBLMLanguageSuffix</key>
            <string>.awk</string>
        </dict>
        <dict>
            <key>BBLMLanguageSuffix</key>
            <string>.sh</string>
        </dict>
    </array>
    
<key>BBLMColorsSyntax</key>
<true/>

<key>BBLMKeywordList</key>
<array>
<string>BEGIN</string>
<string>END</string>
<string>print</string>
<string>printf</string>
<string>next</string>
<string>exit</string>
<string>if</string>
<string>else</string>
        <string>OFS</string><!-- The Output Field Separator Variable -->
        <string>NF</string><!-- The Number of Fields Variable -->
        <string>NR</string><!-- The Number of Records Variable -->
        <string>RS</string><!-- The Record Separator Variable -->
        <string>ORS</string><!-- The Output Record Separator Variable -->
        <string>FILENAME</string><!-- The Current Filename Variable -->
        <string>cos</string><!-- cosine GAWK,AWK,NAWK -->
        <string>exp</string><!-- Exponent GAWK,AWK,NAWK -->
        <string>int</string><!-- Integer GAWK,AWK,NAWK -->
        <string>log</string><!-- Logarithm GAWK,AWK,NAWK -->
        <string>sin</string><!-- Sine GAWK,AWK,NAWK -->
        <string>sqrt</string><!-- Square Root GAWK,AWK,NAWK -->
        <string>atan2</string><!-- Arctangent GAWK,NAWK -->
        <string>rand</string><!-- Random GAWK,NAWK -->
        <string>srand</string><!-- Seed Random GAWK,NAWK -->
        <string>index</string><!-- index(string,search) AWK, NAWK, GAWK -->
        <string>length</string><!-- length(string) AWK, NAWK, GAWK -->
        <string>split</string><!-- split(string,array,separator) AWK, NAWK, GAWK -->
        <string>substr</string><!-- substr(string,position) AWK, NAWK, GAWK -->
        <string>sub</string><!-- sub(regex,replacement) NAWK, GAWK -->
        <string>gsub</string><!-- gsub(regex,replacement) NAWK, GAWK -->
        <string>match</string><!-- match(string,regex) NAWK, GAWK -->
        <string>tolower</string><!-- tolower(string) GAWK -->
        <string>toupper</string><!-- toupper(string) GAWK -->
        <string>asort</string><!-- asort(string,[d]) GAWK -->
        <string>asorti</string><!-- asorti(string,[d]) GAWK -->
        <string>gensub</string><!-- gensub(r,s,h [,t]) GAWK -->
        <string>strtonum</string><!-- strtonum(string) GAWK -->
        <string>getline</string><!-- getline AWK, NAWK, GAWK -->
        <string>system</string><!-- system(command) NAWK, GAWK -->
        <string>close</string><!-- close(command) NAWK, GAWK -->
        <string>systime</string><!-- systime() GAWK -->
        <string>strftime</string><!-- strftime(string) GAWK -->
        <string>while</string>
        <string>do</string>
        <string>for</string>
    </array>

<key>BBLMCommentLineDefault</key>
<string>#</string>
    <key>BBLMSupportsTextCompletion</key>
    <true/>

    <key>BBLMScansFunctions</key>
    <true/>

<key>Language Features</key>
<dict>
        <key>Prefix for Functions</key>
        <string>function</string>
   <key>Open Parameter Lists</key>
   <string>(</string>
   <key>Close Parameter Lists</key>
   <string>)</string>
   <key>Open Statement Blocks</key>
   <string>{</string>
   <key>Close Statement Blocks</key>
   <string>}</string>
        <key>Open Line Comments</key>
        <string>#</string>
<key>Identifier and Keyword Character Class</key>
<string>A-Za-z0-9_</string>
<!--
        <key>Function Pattern</key>
        <string><![CDATA[(?x)                               (?# Ignore white space and comments.            )
(?s)                                                        (?# Dot does match newlines.                    )
(?n)                                                        (?# No auto capture.                            )
(?P<indent>^[[:blank:]]*)                                   (?# Capture indentation for closing bracket.    )
(?P<function>                                               (?# Capture function.                           )
    function                                                (?# Function keyword.                           )
    \s+
    (?P<function_name>[a-zA-Z0-9_]+)                        (?# Capture function name.                      )
    \s*
    \([^\)]+?\)                                             (?# Arguments list.                             )
    \s*
    \{                                                      (?# Opening bracket.                            )
    .+?
    [\r\n](?P=indent)\}
)
]]></string>
-->
        <key>String Pattern</key>
        <string><![CDATA[(?x)
(?:"(?:\\"|[^"\r]|\\\r)*") | (?# Double-quote)
(?:'(?:\\'|[^'\r]|\\\r)*')   (?# Single-quote)
]]></string>

        <key>Comment Pattern</key>
        <string>\#.*?$</string>

        <key>Skip Pattern</key>
        <string><![CDATA[(?x)
(?P>comment) | (?P>string)
]]></string>
</dict>
</dict>
</plist>
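
The commented-out Function Pattern relies on the (?P=indent) backreference to find the closing brace that sits at the same indentation as the function keyword. The sketch below tries that idea outside BBEdit, using Python's re module as a stand-in. It is an adaptation, not the CLM pattern itself: (?n) and [[:blank:]] are dropped because Python's re does not support them, (?m) is added so that ^ matches at the start of each line, and the Awk sample is invented for the test.

```python
import re

# Python adaptation of the commented-out Function Pattern (assumptions:
# no (?n), [ \t] instead of [[:blank:]], and (?m) so ^ matches per line).
FUNCTION_RE = re.compile(r"""(?xsm)
(?P<indent>^[ \t]*)                 # indentation before the function keyword
(?P<function>
    function \s+
    (?P<function_name>[a-zA-Z0-9_]+)
    \s* \([^\)]+?\) \s*             # argument list (at least one character)
    \{
    .+?
    [\r\n](?P=indent)\}             # closing brace at the same indentation
)
""")

# Invented sample: one top-level function with a nested block,
# and one function indented by four spaces.
SAMPLE = """\
function add(a, b) {
    if (a) {
        return a + b
    }
    return b
}

    function twice(x) {
        return 2 * x
    }
"""

matches = list(FUNCTION_RE.finditer(SAMPLE))
names = [m.group("function_name") for m in matches]
first_body = matches[0].group("function")

print(names)
```

In this sample the nested closing brace inside add is skipped because it is indented more deeply than the function keyword, so the match runs on to the brace in column one. The approach depends on consistent indentation, and the argument-list part of the pattern does not match an empty parameter list.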
