RegExpressed
Descriptive functions to create regular expressions.
Readability
Regular expressions are notorious for being hard to read.
Part of the problem, in my view, is that many features are written as single characters (+, *, ?, etc.).
It is hard to remember what each of them does, and whenever you need to match one of these special characters literally, it has to be escaped.
This library attempts to solve this issue by using variables, functions and tagged template literals to generate a RegEx. This means you can write your RegEx in vanilla JS without having to learn all these special characters. There is no need to escape special characters, except those that JavaScript itself requires.
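To make the idea concrete, here is a minimal sketch of how a tagged template could turn literal text and embedded patterns into a RegEx. It is not RegExpressed's actual source: the names escapeLiteral, toSource, sketchRegex and sketchOneOrMore are hypothetical and exist only for this illustration, and the escaping rules are an assumption.

```typescript
// Hypothetical helpers for illustration only; not RegExpressed's actual source.
type Part = string | RegExp;

// Escape characters that carry special meaning in a regular expression.
function escapeLiteral(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Strings are treated as literal text; RegExp values are embedded as-is.
function toSource(part: Part): string {
  return typeof part === "string" ? escapeLiteral(part) : part.source;
}

// Tagged template: escapes the literal chunks and splices in the values.
function sketchRegex(strings: TemplateStringsArray, ...values: Part[]): RegExp {
  const source = strings.reduce((acc, literal, i) => {
    const value = i < values.length ? toSource(values[i]) : "";
    return acc + escapeLiteral(literal) + value;
  }, "");
  return new RegExp(source);
}

// Quantifier: wraps the pattern in a non-capturing group before adding "+".
function sketchOneOrMore(strings: TemplateStringsArray, ...values: Part[]): RegExp {
  return new RegExp(`(?:${sketchRegex(strings, ...values).source})+`);
}

const wordChar = /\w/;
const word = sketchOneOrMore`${wordChar}`;
const site = sketchRegex`www.${word}.com`;

console.log(word.source); // (?:\w)+
console.log(site.source); // www\.(?:\w)+\.com
```

Note that the dots in the template are written without a backslash; the sketch escapes them so that they only match a literal dot.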
Examples
The comment represents the produced regex.
Word
// /(?:\w)+/
oneOrMore`${wordChar}`
Simplified email, reuses the word regex
// /(?:\w)+/
const word = oneOrMore`${wordChar}`
// /(?:\w)+@(?:\w)+(?:(?:\.nl)|(?:\.com))/
regex`${word}@${word}${or(".nl", ".com")}`
URL
const urlChar = charset`-@:%._+~#=${range("a","z")}${range("A","Z")}${range("0","9")}`;
const protocol = regex`http${optional`s`}://`;
const domain = between(2, 256)`${urlChar}`;
const domainExt = between(2, 6)`${charset`${range("a", "z")}`}`;
const host = regex`${optional`www.`}${domain}.${domainExt}`;
const pathChar = or(urlChar, charset`()?&/=`);
const path = zeroOrMore`${pathChar}`;
// (manually simplified)
// /https?:\/\/(?:www\.)?[\-@:%._+~#=a-zA-Z0-9]{2,256}\.[a-z]{2,6}[\-@:%._+~#=a-zA-Z0-9()?&/=]*/
regex`${protocol}${host}${path}`;
The complete example is shown in url-example.ts
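As a quick sanity check, the pattern from the comment in the URL example can be tried against a few inputs with plain JavaScript regex methods. The helper name firstUrl and the sample strings are made up for this sketch; the pattern itself is copied from the comment, not generated by the library here.

```typescript
// Pattern copied verbatim from the comment in the URL example.
const urlPattern =
  /https?:\/\/(?:www\.)?[\-@:%._+~#=a-zA-Z0-9]{2,256}\.[a-z]{2,6}[\-@:%._+~#=a-zA-Z0-9()?&/=]*/;

// Hypothetical helper: returns the first URL found in the text, or null.
function firstUrl(text: string): string | null {
  const match = urlPattern.exec(text);
  return match ? match[0] : null;
}

console.log(firstUrl("see https://www.example.com/docs?page=2 for details"));
// https://www.example.com/docs?page=2
console.log(firstUrl("http://example.org")); // http://example.org
console.log(firstUrl("no url here"));        // null
```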
Tradeoffs
In terms of functionality there is no tradeoff: all regex features have an equivalent in this library.
Most functions map directly to the same regex you normally would write by hand.
The exception is that all quantifiers wrap the expression in a nonCaptureGroup, (?:some-pattern).
This can generate a longer regex.
It would be possible to figure out which groups are not necessary and safely remove them.
For now this is not available though.
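The effect of the extra group can be checked with plain JavaScript regexes. This small sketch uses hand-written patterns rather than the library: for a single token the group is redundant, while for a multi-token pattern it changes what the quantifier repeats, which is why removing groups has to be done carefully.

```typescript
// With a single token, the group is redundant: both forms behave the same.
const wrapped = /^(?:\w)+$/;
const handWritten = /^\w+$/;
const samples = ["hello", "foo_bar42", "", "!!"];
const sameResults = samples.every((s) => wrapped.test(s) === handWritten.test(s));

// With several tokens, the group decides what the quantifier repeats.
const withoutGroup = /^ab+$/;  // "a" followed by one or more "b"
const withGroup = /^(?:ab)+$/; // one or more repetitions of "ab"

console.log(sameResults);               // true
console.log(withoutGroup.test("abab")); // false
console.log(withGroup.test("abab"));    // true
```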
Another tradeoff is that it takes more code, although I do not see this as purely negative. A RegEx is hard to read and is often left untouched for a long time, so you have probably forgotten how it works by the time you read it again months later.
This is where splitting the pattern into parts and using descriptive functions helps. The URL example above can probably be understood just by reading it, whereas I would have ended up copying the raw regex into a site like https://regexr.com/ to try to understand it.
Documentation
Each function has a small comment explaining what it does. Most descriptions are taken almost directly from https://regexr.com/. For more information, look at the MDN page.
License
RegExpressed is released under the MIT License.