JavaScript regex

Joshua 74 Published: 10/08/2024

What are Regular Expressions?

Regular expressions, often abbreviated as RegEx or regex, are sequences of characters that define a search pattern. They are used to test whether a string matches a pattern and to find or replace the parts that match. In JavaScript, you can create a regular expression object with a literal such as /hello/ or with the RegExp constructor, and then use methods like test() or exec() to find matches in a string.
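As a minimal sketch of the two ways to create a pattern and of test() and exec(), the snippet below builds the same expression twice. The variable names and the sample text are illustrative assumptions, not part of the original article.

```javascript
// Two equivalent ways to create the same pattern.
const literalRe = /wor(ld)/;                 // regex literal
const constructedRe = new RegExp("wor(ld)"); // RegExp constructor, pattern given as a string

const sampleText = "hello world";

// test() returns true or false.
const found = literalRe.test(sampleText);

// exec() returns a match array, or null when nothing matches.
const execResult = constructedRe.exec(sampleText);

console.log(found);            // true
console.log(execResult[0]);    // "world" (the whole match)
console.log(execResult[1]);    // "ld" (the first group)
console.log(execResult.index); // 6 (where the match starts)
```

The literal form is usually more readable for fixed patterns, while the constructor is useful when the pattern is only known at run time.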

Basic Concepts

Before diving into JavaScript RegEx, let's cover some basic concepts:

Pattern: A sequence of characters that defines the search criteria.

Match: A part of the original string that matches the pattern.

Group: A set of characters enclosed by parentheses () that can be treated as a single unit.
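To make the three terms concrete, here is a small sketch; the pattern, the sample sentence, and the variable names are assumptions chosen for illustration.

```javascript
// The pattern: four digits, a hyphen, two digits. Each pair of parentheses is a group.
const datePattern = /(\d{4})-(\d{2})/;

const dateText = "Released 2024-10 as planned";
const dateMatch = dateText.match(datePattern);

console.log(dateMatch[0]); // "2024-10" (the match: the part of the string that fits the pattern)
console.log(dateMatch[1]); // "2024"    (the text captured by the first group)
console.log(dateMatch[2]); // "10"      (the text captured by the second group)
```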

JavaScript RegEx Syntax

The syntax for JavaScript regular expressions is as follows:

Literal characters: Match themselves literally (e.g., /hello/ matches "hello").

Metacharacters: Characters with special meanings:

.: Matches any character except a newline.

\w, \W: Match a word character (letter, digit, or underscore) and a non-word character, respectively.

^: Matches the start of a string.

$: Matches the end of a string.

[ ] (square brackets): Defines a character class, matching one of the characters within the brackets.

{ } (curly braces): Defines a quantifier, specifying the minimum and maximum number of times to match.

| (pipe): Used for alternation (matching either the pattern before or after the pipe).
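The metacharacters above can be tried out one at a time with test(). This is only a sketch; the object name metaChecks and the sample strings are illustrative assumptions.

```javascript
// Each entry tests one metacharacter from the list above.
const metaChecks = {
  dot: /h.t/.test("hot"),                        // true: "." matches the "o"
  dotNewline: /h.t/.test("h\nt"),                // false: "." does not match a newline
  startAnchor: /^hello/.test("hello world"),     // true: the string starts with "hello"
  startAnchorMiss: /^world/.test("hello world"), // false: "world" is not at the start
  endAnchor: /world$/.test("hello world"),       // true: the string ends with "world"
  charClass: /gr[ae]y/.test("grey"),             // true: [ae] matches "a" or "e"
  quantifier: /^a{2,3}$/.test("aaaa"),           // false: four "a"s exceed the maximum of three
  alternation: /^(cat|dog)$/.test("dog"),        // true: either alternative is accepted
};

console.log(metaChecks);
```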

Examples

Let's explore some examples:

Simple Match:

The regular expression /hello/ matches the string "hello".

const re = /hello/;
const str = 'hello world';
console.log(str.match(re)); // Output: ["hello", index: 0, input: "hello world"]

Character Class:

The regular expression /[a-zA-Z]/ matches any single letter, uppercase or lowercase.

const re = /[a-zA-Z]/;
const str = 'Hello World';
console.log(str.match(re)); // Output: ["H", index: 0, input: "Hello World"]

Quantifier:

The regular expression /\w{2,}/ matches a run of two or more consecutive word characters (letters, digits, or underscores).

const re = /\w{2,}/;
const str = 'hello world';
console.log(str.match(re)); // Output: ["hello", index: 0, input: "hello world"]

Alternation:

The regular expression /hello|hi/ matches either the string "hello" or the string "hi".

const re = /hello|hi/;
const str = 'hello';
console.log(str.match(re)); // Output: ["hello", index: 0, input: "hello"]

Advanced Topics

For more advanced users, here are some additional topics:

Groups: Use parentheses () to group patterns and capture or retrieve parts of the matched text.

Backreferences: Use \1 (or any other group number) to reference the value captured by a previous group.

Lookarounds: Use (?=pattern) for positive lookahead or (?!pattern) for negative lookahead to check what follows the current position without including it in the match. The lookbehind forms (?<=pattern) and (?<!pattern) check what precedes it.
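A short sketch of groups, a backreference, and both lookahead forms follows. All variable names and sample strings are assumptions made for the example.

```javascript
// Groups: each pair of parentheses captures part of the match.
const phoneMatch = "call 555-1234".match(/(\d{3})-(\d{4})/);
console.log(phoneMatch[1]); // "555"
console.log(phoneMatch[2]); // "1234"

// Backreference: \1 must repeat exactly what group 1 captured,
// so this pattern finds a word that is immediately repeated.
const repeatedWord = /\b(\w+) \1\b/;
const hasRepeat = repeatedWord.test("this is is a test");
const noRepeat = repeatedWord.test("this is a test");
console.log(hasRepeat); // true
console.log(noRepeat);  // false

// Positive lookahead: the digits must be followed by " USD",
// but " USD" is not part of the match.
const amount = "100 USD".match(/\d+(?= USD)/)[0];
console.log(amount); // "100"

// Negative lookahead: accept a name only if it does not start with "admin".
const allowedName = /^(?!admin)\w+$/;
const guestOk = allowedName.test("guest");
const adminOk = allowedName.test("administrator");
console.log(guestOk); // true
console.log(adminOk); // false
```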

Conclusion

JavaScript regular expressions provide an efficient way to search and manipulate text. Mastering RegEx syntax, patterns, and concepts can help you write more effective code and improve your problem-solving skills. With this foundation, you're ready to tackle more advanced topics in JavaScript programming!

JavaScript regex special characters list

Here is the list of JavaScript regex special characters:

Non-alphanumeric characters

., dot (matches any character except a newline)

^, caret (start of string, or start of a line when the m flag is set)

$, dollar sign (end of string, or end of a line when the m flag is set)

[, left square bracket (starts a character class)

], right square bracket (ends a character class)

{, left curly brace (starts a quantifier)

}, right curly brace (ends a quantifier)

(, left parenthesis (starts a group)

), right parenthesis (ends a group)

*, asterisk (zero or more occurrences)

+, plus sign (one or more occurrences)

?, question mark (zero or one occurrence; after another quantifier, makes it non-greedy)

{n,m}, curly braces with a comma and no spaces (between n and m occurrences)
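The quantifiers in this list, including the greedy and non-greedy behaviour of ?, can be sketched as below. The names quantifierChecks, htmlText, greedyMatch, and lazyMatch are illustrative assumptions.

```javascript
const quantifierChecks = {
  star: /^ab*c$/.test("ac"),           // true: "b*" allows zero "b"s
  plus: /^ab+c$/.test("ac"),           // false: "b+" needs at least one "b"
  optional: /^colou?r$/.test("color"), // true: "u?" makes the "u" optional
  range: /^\d{2,4}$/.test("12345"),    // false: five digits exceed the maximum of four
};
console.log(quantifierChecks);

const htmlText = "<b>bold</b>";

// Greedy: ".+" takes as much as it can, so the match runs to the last ">".
const greedyMatch = htmlText.match(/<.+>/)[0];
console.log(greedyMatch); // "<b>bold</b>"

// Non-greedy: ".+?" takes as little as it can, so the match stops at the first ">".
const lazyMatch = htmlText.match(/<.+?>/)[0];
console.log(lazyMatch); // "<b>"
```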

Escape sequences

\, backslash (escapes a special character so that it matches literally, e.g. \. matches a dot; it also introduces the sequences below)

\b (word boundary)

\cX (control character, where X is a letter)

\d (digit)

\f (form feed)

\k<name> (backreference to a named group)

\n, \t (newline, tab)

Letters with no assigned meaning, such as \a, \e, \g, \h, \i, \j, and \l, match the letter itself in patterns without the u flag and are a syntax error with it.
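The effect of the backslash can be sketched as follows. The helper escapeRegExp is a hypothetical function written for this example, not a built-in, and the other names are likewise illustrative.

```javascript
// An unescaped dot matches any character; an escaped dot matches only ".".
const looseDot = /1.5/.test("105");    // true
const strictDot = /1\.5/.test("105");  // false
const literalDot = /1\.5/.test("1.5"); // true

// \b is a word boundary, so "cat" inside a longer word is rejected.
const insideWord = /\bcat\b/.test("concatenate"); // false
const wholeWord = /\bcat\b/.test("the cat sat");  // true

// In a string passed to the RegExp constructor, the backslash itself must be doubled.
const fromString = new RegExp("\\d+").test("42"); // true

// Hypothetical helper: escape every special character in user-supplied text
// so that it can be used as a literal pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literalPattern = new RegExp(escapeRegExp("a+b"));
const matchesLiteral = literalPattern.test("a+b"); // true: "+" is now a literal plus sign
const matchesRepeat = literalPattern.test("aab");  // false: "+" is no longer a quantifier

console.log(looseDot, strictDot, literalDot, insideWord, wholeWord, fromString);
console.log(matchesLiteral, matchesRepeat);
```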

Word characters and non-word characters

\w, word character (letter, digit, or underscore)

\W, non-word character (anything other than a letter, digit, or underscore)

\s, whitespace (spaces, tabs, newlines, etc.)

\S, non-whitespace (any character that is not whitespace)

\d, digit (0-9)

\D, non-digit (any character that is not 0-9)
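A quick sketch of these character classes on one sample string is shown below. The string and variable names are assumptions, and the g flag is used so that match() returns every match rather than only the first.

```javascript
const classText = "Room 42, floor_3!";

const digitRuns = classText.match(/\d+/g);   // runs of digits
const wordRuns = classText.match(/\w+/g);    // runs of letters, digits, and underscores
const nonWordRuns = classText.match(/\W+/g); // everything between the word runs
const pieces = classText.split(/\s+/);       // split on whitespace

console.log(digitRuns);   // ["42", "3"]
console.log(wordRuns);    // ["Room", "42", "floor_3"]
console.log(nonWordRuns); // [" ", ", ", "!"]
console.log(pieces);      // ["Room", "42,", "floor_3!"]
```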

Flags

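g, global (finds all matches instead of stopping after the first)

i, ignore case (case-insensitive match)

m, multiline (^ and $ match at the start and end of each line, not only of the whole string)

s, dot matches newline (includes newline in dot matches)

u, Unicode (treats the pattern as a sequence of Unicode code points and enables Unicode property escapes)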
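The effect of each flag can be sketched by running the same pattern with and without it. The variable names and sample strings here are illustrative assumptions.

```javascript
// i: case-insensitive matching.
const withoutI = /hello/.test("HELLO"); // false
const withI = /hello/i.test("HELLO");   // true

const multiText = "first\nsecond";

// m: ^ and $ also match at line breaks.
const withoutM = /^second$/.test(multiText); // false: "second" is not the whole string
const withM = /^second$/m.test(multiText);   // true: "second" is a whole line

// s: the dot also matches a newline.
const withoutS = /first.second/.test(multiText); // false
const withS = /first.second/s.test(multiText);   // true

// u: the dot matches a whole code point instead of a single UTF-16 code unit.
const emoji = "\u{1F600}"; // one code point, stored as two code units
const withoutU = /^.$/.test(emoji); // false
const withU = /^.$/u.test(emoji);   // true

console.log(withoutI, withI, withoutM, withM, withoutS, withS, withoutU, withU);
```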

These special characters can be used to create more complex regular expressions, allowing for matching and validation of various patterns in strings.