10.4. re — Regular expressions

Source code: src/text/re.h, src/text/re.c

Test code: tst/text/re/main.c

Test coverage: src/text/re.c


Defines

RE_IGNORECASE

Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too.

RE_DOTALL

Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.

RE_MULTILINE

When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, ‘^’ matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.

Functions

int re_module_init(void)

Initialize the re module. This function must be called before calling any other function in this module.

The module will only be initialized once even if this function is called multiple times.

Return
zero(0) or negative error code.

char *re_compile(char *compiled_p, const char *pattern_p, char flags, size_t size)

Compile given pattern.

Pattern syntax:

  • '.' - Any character.
  • '^' - Beginning of the string (not yet supported).
  • '$' - End of the string (not yet supported).
  • '?' - Zero or one repetitions (greedy).
  • '*' - Zero or more repetitions (greedy).
  • '+' - One or more repetitions (greedy).
  • '??' - Zero or one repetitions (non-greedy).
  • *? - Zero or more repetitions (non-greedy).
  • +? - One or more repetitions (non-greedy).
  • {m} - Exactly m repetitions.
  • \\ - Escape character.
  • [] - Set of characters.
  • '|' - Alternatives (not yet supported).
  • (...) - Groups (not yet supported).
  • \\d - Decimal digits [0-9].
  • \\w - Alphanumerical characters [a-ZA-Z0-9_].
  • \\s - Whitespace characters [ \t\r\n\f\v].

Return
Compiled patten, or NULL if the compilation failed.
Parameters
  • compiled_p: Compiled regular expression pattern.
  • pattern_p: Regular expression pattern.
  • flags: A combination of the flags RE_IGNORECASE, RE_DOTALL and RE_MULTILINE (RE_MULTILINE is not yet supported).
  • size: Size of the compiled buffer.

ssize_t re_match(const char *compiled_p, const char *buf_p, size_t size, struct re_group_t *groups_p, size_t *number_of_groups_p)

Apply given regular expression to the beginning of given string.

Return
Number of matched bytes or negative error code.
Parameters
  • compiled_p: Compiled regular expression pattern. Compile a pattern with re_compile().
  • buf_p: Buffer to apply the compiled pattern to.
  • size: Number of bytes in the buffer.
  • groups_p: Read groups or NULL.
  • number_of_groups_p: Number of read groups or NULL.

struct re_group_t

Public Members

const char *buf_p
ssize_t size