Introduction
The re: module wraps Go's regexp package. See Go's documentation for the supported regular expression syntax.
Function usage notation follows the same convention as the builtin module doc.
The following options are supported by multiple functions in this module:
- &posix=$false: Use POSIX ERE syntax. See also doc in Go package.
- &longest=$false: Prefer leftmost-longest match. See also doc in Go package.
- &max=-1: If non-negative, limits the maximum number of results.
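As a rough illustration of how these options change results, here is a small sketch. It assumes the module has been imported with use re; the patterns and input strings are made up for the example.

```elvish
use re

# Default semantics: among alternatives, the first one that matches wins.
re:find 'a|ab' ab | each {|m| echo $m[text] }                  # prints: a

# With &longest, the leftmost-longest match is preferred instead.
re:find &longest=$true 'a|ab' ab | each {|m| echo $m[text] }   # prints: ab

# &max caps the number of results; here only the first digit is reported.
re:find &max=1 '[0-9]' '1 2 3' | each {|m| echo $m[text] }     # prints: 1
```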
Functions
re:awk
re:awk &sep='[ \t]+' &sep-posix=$false &sep-longest=$false $f $inputs?
For each value input, calls $f with the input followed by all its fields.
The &sep option is a regular expression for the field separator. For the &sep-posix and &sep-longest options, see the introduction; the sep- prefix is added for clarity.
Calling break in $f exits both $f and re:awk, and can be used to stop processing inputs early. Calling continue exits $f but not re:awk, and can be used to stop $f early but continue processing inputs.
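The effect of break and continue can be sketched as follows. The input lines, the '#' comment convention and the END marker are invented for this example; a rest argument is used so the callback does not depend on the number of fields.

```elvish
use re

echo "alpha 1\n# skipped\nbeta 2\nEND 0\ngamma 3" | re:awk {|line @fields|
  # continue: abandon this line, but keep processing later lines.
  if (eq $fields[0] '#') { continue }
  # break: stop re:awk altogether, so "gamma 3" is never seen.
  if (eq $fields[0] END) { break }
  put $fields[0]
}
# ▶ alpha
# ▶ beta
```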
This command allows you to write code resembling AWK scripts, using an anonymous function instead of a string containing AWK code. A simple example:
~> echo " lorem ipsum\n1 2" | awk '{ print $1 }'
lorem
1
~> echo " lorem ipsum\n1 2" | re:awk {|line a b| put $a }
▶ lorem
▶ 1
Note: Since Elvish allows variable names consisting solely of digits, you can do something like this to emulate AWK even more closely:
~> echo " lorem ipsum\n1 2" | re:awk {|0 1 2| put $1 }
▶ lorem
▶ 1
If the number of fields differs between lines, use a rest argument:
~> echo "a b\nc d e" | re:awk {|@a| echo (- (count $a) 1)' fields' }
2 fields
3 fields
This command is roughly equivalent to the following Elvish function:
use re
use str

fn my-awk {|&sep='[ \t]+' &sep-posix=$false &sep-longest=$false f @rest|
  each {|line|
    # Trim leading and trailing blanks, then split the line into fields.
    var @fields = (re:split $sep &posix=$sep-posix &longest=$sep-longest (str:trim $line " \t"))
    $f $line $@fields
  } $@rest
}
re:find
re:find &posix=$false &longest=$false &max=-1 $pattern $source
Find all matches of $pattern in $source.
Each match is represented by a map-like value $m; $m[text], $m[start] and $m[end] are the text, start and end positions (as byte indices into $source) of the match; $m[groups] is a list of submatches for capture groups in the pattern. A submatch has a similar structure to a match, except that it does not have a groups key. The entire pattern is an implicit capture group, and it always appears first.
Examples:
~> re:find . ab
▶ [&end=(num 1) &groups=[[&end=(num 1) &start=(num 0) &text=a]] &start=(num 0) &text=a]
▶ [&end=(num 2) &groups=[[&end=(num 2) &start=(num 1) &text=b]] &start=(num 1) &text=b]
~> re:find '[A-Z]([0-9])' 'A1 B2'
▶ [&end=(num 2) &groups=[[&end=(num 2) &start=(num 0) &text=A1] [&end=(num 2) &start=(num 1) &text=1]] &start=(num 0) &text=A1]
▶ [&end=(num 5) &groups=[[&end=(num 5) &start=(num 3) &text=B2] [&end=(num 5) &start=(num 4) &text=2]] &start=(num 3) &text=B2]
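In scripts it is usually more convenient to index into the match value than to read the full printed structure. A minimal sketch, assuming use re and an invented key=value input string:

```elvish
use re

re:find '([a-z]+)=([0-9]+)' 'x=1, yy=22' | each {|m|
  # groups[0] is the whole match; groups[1] and groups[2] are the
  # two parenthesized capture groups.
  echo $m[groups][1][text] $m[groups][2][text]
}
# prints:
# x 1
# yy 22
```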
re:match
re:match &posix=$false $pattern $source
Determine whether $pattern matches $source. The pattern is not anchored.
Examples:
~> re:match . xyz
▶ $true
~> re:match . ''
▶ $false
~> re:match '[a-z]' A
▶ $false
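Because the pattern is not anchored, a match anywhere in $source counts. To require that the whole string matches, the pattern can be anchored explicitly with ^ and $, as in this small sketch (sample strings are arbitrary):

```elvish
use re

re:match b abc         # ▶ $true, b occurs somewhere in abc
re:match '^b$' abc     # ▶ $false, the whole string is not b
re:match '^abc$' abc   # ▶ $true
```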
re:quote
re:quote $string
Quote $string for use in a pattern. Examples:
~> re:quote a.txt
▶ a\.txt
~> re:quote '(*)'
▶ '\(\*\)'
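A typical use is building a pattern around text that should be matched literally. The sketch below assumes use re; the variable name and file names are hypothetical.

```elvish
use re

var name = a.txt

# Unquoted, the dot matches any character, so abtxt also matches.
re:match '^'$name'$' abtxt                # ▶ $true

# Quoted, the dot only matches a literal dot.
re:match '^'(re:quote $name)'$' abtxt     # ▶ $false
re:match '^'(re:quote $name)'$' a.txt     # ▶ $true
```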
re:replace
re:replace &posix=$false &longest=$false &literal=$false $pattern $repl $source
Replace all occurrences of $pattern in $source with $repl.
The replacement $repl can be any of the following:
- A string-typed replacement template. The template can use $name or ${name} patterns to refer to capture groups, where name consists of letters, digits and underscores. A purely numeric pattern like $1 refers to the capture group with the corresponding index; other names refer to capture groups named with the (?P<name>...) syntax.
  In the $name form, the name is taken to be as long as possible; $1x is equivalent to ${1x}, not ${1}x; $10 is equivalent to ${10}, not ${1}0.
  To insert a literal $, use $$.
- A function that takes a string argument and outputs a string. For each match, the function is called with the content of the match, and its output is used as the replacement.
If &literal is true, $repl must be a string and is treated literally instead of as a replacement template.
Examples:
~> re:replace '(ba|z)sh' '${1}SH' 'bash and zsh'
▶ 'baSH and zSH'
~> re:replace '(ba|z)sh' elvish 'bash and zsh rock'
▶ 'elvish and elvish rock'
~> re:replace '(ba|z)sh' {|x| put [&bash=BaSh &zsh=ZsH][$x] } 'bash and zsh'
▶ 'BaSh and ZsH'
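The template rules are easy to get wrong, so here is a short sketch of the cases described above. It assumes use re; the patterns and inputs are invented, and the outputs follow from the expansion rules of Go's regexp package.

```elvish
use re

# ${1} delimits the group name explicitly.
re:replace '(\d+)' '${1}px' 'width 10'      # ▶ 'width 10px'

# $1px is read as the group named 1px, which does not exist,
# so it expands to the empty string.
re:replace '(\d+)' '$1px' 'width 10'        # ▶ 'width '

# $$ inserts a literal dollar sign before group 1.
re:replace '(\d+)' '$$$1' 'cost 10'         # ▶ 'cost $10'

# With &literal, the replacement is not expanded at all.
re:replace &literal=$true '(\d+)' '$1' 'cost 10'   # ▶ 'cost $1'

# Named groups are referenced by name.
re:replace '(?P<user>\w+)@(?P<host>\w+)' '${host}:${user}' 'me@box'   # ▶ box:me
```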
re:split
re:split &posix=$false &longest=$false &max=-1 $pattern $source
Split $source, using $pattern as separators. Examples:
~> re:split : /usr/sbin:/usr/bin:/bin
▶ /usr/sbin
▶ /usr/bin
▶ /bin
~> re:split &max=2 : /usr/sbin:/usr/bin:/bin
▶ /usr/sbin
▶ /usr/bin:/bin
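Since the separator is a regular expression, it can also absorb surrounding whitespace. A minimal sketch, assuming use re and an invented comma-separated string:

```elvish
use re

# Each separator is a comma together with any whitespace around it.
var @parts = (re:split '\s*,\s*' 'a , b,c')
put $@parts
# ▶ a
# ▶ b
# ▶ c
```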