
Perl Regular Expression

One of the most difficult parts of coding is learning how pattern matching works. In Mudlet, you have several options ranging from 'exact match' and 'substring' to 'Lua Function'. This tutorial focuses on the 'Regex' match type.

Perl regular expressions (regex) are what you use for alias and trigger patterns. Unlike zScript (zMUD and cMUD), where you can simply enter a pattern without any special symbols, regex-type aliases and triggers in Mudlet require special symbols to function properly. These range from carets to dollar signs and backslashes. Each symbol has a distinct function, even though some of the differences seem negligible. I will begin by covering some of the common symbols you will see in most scripts.

^

This is one of two anchors. This one in particular is for the beginning of the line. When it is added to the start of a pattern, it tells Mudlet that nothing can come before the pattern on that line.

$

This is one of two anchors. This one in particular is for the end of the line. When it is added to the very end of a pattern, it tells Mudlet that nothing can come after the pattern on that line.

^Hello world$

These two may be used together. When used together, it means that nothing can come before or after a pattern.
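
To see the anchors at work outside of Mudlet, here is a minimal sketch in Python. It assumes Python's re module behaves like Mudlet's regex engine for these simple patterns, and the second test line is made up for illustration.

```python
import re

# Assumption: Python's re module stands in for Mudlet's regex engine.
anchored = re.compile(r"^Hello world$")
unanchored = re.compile(r"Hello world")

line_exact = "Hello world"
line_extra = "He said Hello world to me"  # made-up line for illustration

print(bool(anchored.search(line_exact)))    # True
print(bool(anchored.search(line_extra)))    # False: text before and after
print(bool(unanchored.search(line_extra)))  # True: no anchors, so it fires
```

Without the anchors, the pattern fires on any line that merely contains the text, which is usually not what you want in a trigger.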

()

This sets a capture group. By putting something into a capture group, it tells Mudlet that you want to save that bit to the matches table (explained later).

^(Hello) world$

This will save "Hello" to matches[2].
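
The numbering starts at 2 because Mudlet keeps the whole matched text in matches[1]. The sketch below imitates that layout in Python. The helper mudlet_style_matches is hypothetical and is not part of Mudlet; it only pads a Python list so the indexes line up with Mudlet's 1-based table.

```python
import re

def mudlet_style_matches(pattern, line):
    """Hypothetical helper: return a list laid out like Mudlet's matches table.

    Index 0 is a placeholder so the list reads with 1-based numbering:
    [1] is the whole match, [2] is the first capture group, and so on.
    Returns None when the pattern does not match.
    """
    m = re.search(pattern, line)
    if m is None:
        return None
    return [None, m.group(0), *m.groups()]

matches = mudlet_style_matches(r"^(Hello) world$", "Hello world")
print(matches[1])  # Hello world
print(matches[2])  # Hello
```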

\

A backslash is the default escape character for Regex. It tells Mudlet that the character following it is not a special symbol so that Mudlet will read it properly.

^(Hello) world\.$

Since a period is already a special symbol, a \ is used to make it read as a normal part of the pattern. It would also send "Hello" to the matches table as matches[2] because it is in parentheses.
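
Here is a small sketch of what goes wrong when the backslash is left out. It uses Python's re module as a stand-in for Mudlet's engine, and the test lines are made up.

```python
import re

unescaped = re.compile(r"^(Hello) world.$")   # "." is still a wildcard here
escaped = re.compile(r"^(Hello) world\.$")    # "\." is a literal period

print(bool(unescaped.search("Hello world!")))  # True: "." matched the "!"
print(bool(escaped.search("Hello world!")))    # False
print(bool(escaped.search("Hello world.")))    # True
```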

Wildcards

Wildcards allow you to expand on your normal pattern matching to include things that change. An alias that changes targets is a good example, because you do not want to have to make an alias for every single thing and person in the game. Wildcards are also used in triggers to match things that change, like the amount of gold you are moving around your inventory or the names of elixirs or tonics.

\w

This wildcard will match a single letter, number, or underscore. It will not match whitespace or symbols like an asterisk, hyphen, or comma.

^\wello world$ 

This will match if the 'h' in 'hello' is replaced by any single letter, number, or underscore.

\d

This wildcard will match a single digit (0-9) in a pattern. It will not match any letters, symbols, or whitespace.

^hello \d world$ 

This will match if hello is followed by a space, any single digit 0-9, and then " world".

\s

This wildcard will match a single whitespace character (usually a space, but also a tab).

^hello\sworld$ 

The \s between the words acts like a space.

.

A period is a special wildcard. It will match any single character: a space, symbol, letter, or number. Literally anything. This also means that it requires boundaries to keep from matching things you do not want to match.

^hello .orld$ 

This pattern will match no matter what is used in place of the letter 'w'.

As an added bonus, you can capitalize the letter in \w, \d, or \s to mean its inverse. For example, where \d means a digit, \D means anything that is NOT a digit. Just something to keep in mind.
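
The wildcards above can be tried side by side with a short sketch. It assumes Python's re module treats \w, \d, \s, and . the same way Mudlet does for plain English text, and is_match is just a hypothetical convenience function.

```python
import re

def is_match(pattern, text):
    """Hypothetical helper: True if the pattern matches the text."""
    return re.search(pattern, text) is not None

print(is_match(r"^\wello world$", "jello world"))      # True
print(is_match(r"^\wello world$", "_ello world"))      # True: underscore counts
print(is_match(r"^\wello world$", "-ello world"))      # False: hyphen is a symbol
print(is_match(r"^hello \d world$", "hello 0 world"))  # True: 0 is a digit too
print(is_match(r"^hello \d world$", "hello 12 world")) # False: \d is ONE digit
print(is_match(r"^hello\sworld$", "hello\tworld"))     # True: a tab is whitespace
print(is_match(r"^hello .orld$", "hello #orld"))       # True: "." takes anything
print(is_match(r"^hello \D world$", "hello 7 world"))  # False: \D rejects digits
print(is_match(r"^hello \D world$", "hello x world"))  # True
```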

Quantifiers

Obviously, we don't want to repeat the wildcards ten times to match a single word, which is what quantifiers are for. They allow you to extend the influence of a wildcard to include more than one character. Below I will cover some of the common quantifiers used in conjunction with the wildcards.

*

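An asterisk means 0 or more. The pattern will still match even if the piece that this quantifier applies to is missing.

It is typically not a good idea to use .* because it is greedy and grabs as much of the line as it can. Instead, use .+? which matches one or more characters but as few as possible.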

^hello.*$ 

This pattern will match "hello" or "hello world". Whatever follows "hello" can be anything of any length and it will still match.

+

The plus sign means 1 or more. It is similar to the asterisk except that whatever this is used with MUST be present for the pattern to match.

^hello.+$ 

This pattern will match "hello" followed by at least one character of anything. It will not match "hello" on its own.
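
The difference between * and +, and between the greedy .* and the lazy .+?, is easier to see with captures. This is a rough sketch using Python's re module as a stand-in for Mudlet's engine, and the line with two quoted words is invented for the demonstration.

```python
import re

# "*" allows nothing at all after hello, "+" demands at least one character.
star_alone = re.search(r"^hello.*$", "hello") is not None
plus_alone = re.search(r"^hello.+$", "hello") is not None
print(star_alone)  # True
print(plus_alone)  # False

line = 'Mosr says, "hi" then "bye"'  # invented line for illustration

# Greedy: runs to the LAST quote on the line.
greedy = re.search(r'"(.*)"', line).group(1)
# Lazy: stops at the FIRST closing quote it can.
lazy = re.search(r'"(.+?)"', line).group(1)

print(greedy)  # hi" then "bye
print(lazy)    # hi
```

The greedy capture swallowing both quotes is the usual reason people recommend the lazy form in triggers.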

?

A question mark is a special character. It may be used anywhere in the pattern without a wildcard. It means 0 or 1 of the preceding element (the thing before this may or may not exist).

^h?ello world$ 

This pattern will match only "hello world" and "ello world".

{m,n}

This means the preceding element must appear between m and n times (inclusive). Any numbers including 0 may be used.

^h{2,3}ello world$ 

This pattern will only match "hhello world" and "hhhello world".
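
Here is a quick sketch that runs the ? and {m,n} patterns above against a handful of inputs, again assuming Python's re module behaves like Mudlet's engine for these patterns.

```python
import re

optional_h = re.compile(r"^h?ello world$")       # zero or one "h"
repeated_h = re.compile(r"^h{2,3}ello world$")   # two or three "h"

inputs = ["ello world", "hello world", "hhello world",
          "hhhello world", "hhhhello world"]

# Map each input to (matches h?, matches h{2,3}).
results = {
    text: (optional_h.search(text) is not None,
           repeated_h.search(text) is not None)
    for text in inputs
}

for text, (opt, rep) in results.items():
    print(f"{text!r}: h? -> {opt}, h{{2,3}} -> {rep}")
```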

(A|B|C)

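This is a list of alternatives. It's like a wildcard except that it can only match what is in the list. Because it uses parentheses, whichever item matched is also saved to the matches table.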

^(h|y|m)ello world$ 

This pattern will match "hello world", "yello world", and "mello world". It will not match anything else.
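
A short sketch of the list in action, using Python's re module in place of Mudlet. Python's group(1) corresponds to what Mudlet would hand you as matches[2].

```python
import re

greeting = re.compile(r"^(h|y|m)ello world$")

def first_letter(text):
    """Return the letter the list matched, or None if there is no match."""
    m = greeting.search(text)
    return m.group(1) if m else None  # group(1) is matches[2] in Mudlet

print(first_letter("hello world"))  # h
print(first_letter("yello world"))  # y
print(first_letter("mello world"))  # m
print(first_letter("jello world"))  # None: "j" is not in the list
```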

Examples

Vadol has successfully inscribed the image of the Star on his Tarot card.

^(\w+) has successfully inscribed the image of the (.+?) on (?:his|her) Tarot card\.$

This would send "Vadol" and "Star" to the matches table as matches[2] and matches[3]. The (?:his|her) group is not saved, so "his" is disregarded, but the pattern still matches if the inscriber is female.
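
To check a pattern like this before putting it in a trigger, you can run it against sample lines. The sketch below uses Python's re module as a stand-in for Mudlet, and the second sample line (with "Moon" and "her") is invented to exercise the other branch of the group.

```python
import re

# The literal words in the pattern must match the game line exactly.
tarot = re.compile(
    r"^(\w+) has successfully inscribed the image of the (.+?) "
    r"on (?:his|her) Tarot card\.$"
)

line_his = "Vadol has successfully inscribed the image of the Star on his Tarot card."
line_her = "Vadol has successfully inscribed the image of the Moon on her Tarot card."  # invented

captured_his = tarot.search(line_his).groups()
captured_her = tarot.search(line_her).groups()

print(captured_his)  # ('Vadol', 'Star')
print(captured_her)  # ('Vadol', 'Moon')
```

Only two values come back because (?:his|her) groups without capturing.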

You set the bomb's timer for 18 seconds.

^You set the bomb's timer for (\d+) seconds\.$ 

This would send the number "18" to the matches table as matches[2].
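
One thing to keep in mind is that a captured number arrives as text, so it needs converting before you do arithmetic with it. Here is a minimal Python sketch of that idea. In a Mudlet script you would do the equivalent conversion in Lua, and the variable names here are made up.

```python
import re

bomb = re.compile(r"^You set the bomb's timer for (\d+) seconds\.$")

m = bomb.search("You set the bomb's timer for 18 seconds.")
raw = m.group(1)      # the capture is the string "18"
seconds = int(raw)    # convert before doing any math with it

print(repr(raw))      # '18'
print(seconds + 2)    # 20
```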

Mosr tells you, "Empress."
Mosr tells you, "Empress please."
Mosr tells you, "Emp?"
Mosr tells you, "Emp."

^(\w+) tells you, "Emp(?:ress)? ?(?:please)?(?:\.|\?)"$ 

Sometimes you need to capture multiple ways of somebody telling you something. This is an example of how to do it.
"Mosr" will be sent to matches[2]. Simple enough.

What the Emp(?:ress)? does is check for "Emp". That much HAS to be there. Putting "ress" in parentheses groups the letters together and would normally tell Mudlet to send "ress" to the matches table. However, placing ?: after the opening parenthesis tells Mudlet NOT to send it to the matches table, but to keep the letters grouped. The ? after the closing parenthesis means that the preceding group (the stuff in parentheses) may or may not be there, so the pattern can match "Emp" or "Empress". We do the same thing for the "please" portion. To cover the space between the words, we just stick a question mark after the space. Finally, (?:\.|\?) accepts either a period or a question mark before the closing quote. This way it will match the possibilities listed above.
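
As a sanity check, the pattern can be run against all four lines at once. The sketch below uses Python's re module as a stand-in for Mudlet's engine. The fifth line ("Emperor") is made up to show something that should not match.

```python
import re

emp = re.compile(r'^(\w+) tells you, "Emp(?:ress)? ?(?:please)?(?:\.|\?)"$')

lines = [
    'Mosr tells you, "Empress."',
    'Mosr tells you, "Empress please."',
    'Mosr tells you, "Emp?"',
    'Mosr tells you, "Emp."',
]

# Every line should match, and only the name should be captured.
captures = [emp.search(line).groups() for line in lines]
for line, groups in zip(lines, captures):
    print(line, "->", groups)

# Made-up line that should be rejected.
rejected = emp.search('Mosr tells you, "Emperor."')
print(rejected)  # None
```

Because every optional piece is independent, the pattern is a little more permissive than the four lines listed. It would also accept "Empplease." with no space, for example. For a tell trigger that is usually harmless, but it is worth knowing.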
