Help:Manipulating strings

The English Wikipedia has several templates and Lua modules which can format or manipulate strings. In this context a "string" is any piece of text forming part of a page. This help page covers a few useful techniques; look in the navbox below for the full catalogue of templates.

Substrings
The simplest operation is taking a substring, a snippet of the string taken at a certain offset (called an "index") from the start or end. There are a number of legacy templates offering this (see navbox), but new code should use the sub function of Module:String. The indices are one-based (meaning the first character is number one), inclusive (meaning the characters at the indices you specify are included), and may be negative to count from the other end. Not all the legacy substring templates use this numbering scheme, so check the documentation of unfamiliar templates.
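
The one-based, inclusive, negative-from-the-end numbering described above is also how Lua's own string.sub behaves, so it can be tried out in plain Lua. This is a minimal sketch, not the code of any particular template; the sample string is made up for illustration, and on-wiki modules would normally use the Unicode-aware mw.ustring library instead of the byte-based string library used here.

```lua
-- Sample text, chosen only for illustration.
local s = "abcdefgh"

-- One-based and inclusive: characters 2 through 4.
local middle = string.sub(s, 2, 4)   -- "bcd"

-- Negative indices count from the end: the last three characters.
local tail = string.sub(s, -3, -1)   -- "fgh"

-- Leaving out the end index runs to the end of the string.
local rest = string.sub(s, 6)        -- "fgh"

print(middle, tail, rest)
```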

Using existing templates
If you think that someone will have done what you want before, look in the navbox below and check. It is much easier to find and use an existing template than to write complex code to do it all in one place.

Look for a template that will do what you want all in one go. For example, rather than taking the final six characters of a string and checking whether they are equal to "navbox", use a template that tests directly whether the string ends with "navbox".
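
To show what such an all-in-one check does internally, here is a rough Lua sketch. The helper name ends_with and the sample strings are hypothetical, invented for this example; existing templates and modules may implement the test differently.

```lua
-- Hypothetical helper: true if s ends with suffix.
-- The comparison is literal; no pattern matching is involved.
local function ends_with(s, suffix)
    -- string.sub(s, -0) would return the whole string, so handle "" separately.
    return suffix == "" or string.sub(s, -#suffix) == suffix
end

print(ends_with("Example navbox", "navbox"))   -- true
print(ends_with("navbox footer", "navbox"))    -- false
print(ends_with("box", "navbox"))              -- false
```

Wrapping both steps in one helper means the caller never has to count characters by hand.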

Automatically trimmed whitespace
If you pass a string with spaces at the start or end to a template via a named or explicitly numbered parameter, the spaces on the outside will be trimmed off and will not be counted for anything the template does with that parameter. The template will see only the text between them.

If you use automatically numbered parameters the spaces on the outside do count, but some templates may still choose to remove them themselves.
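
The effect of trimming can be imitated in plain Lua. This is a sketch of the outcome only, not MediaWiki's own implementation; the helper name trim and the sample value are assumptions made for the example.

```lua
-- Hypothetical helper: strip whitespace from both ends of s.
-- The extra parentheses keep only the captured text.
local function trim(s)
    return (string.match(s, "^%s*(.-)%s*$"))
end

local raw = "  abc  "        -- what an automatically numbered parameter would carry
local trimmed = trim(raw)    -- what a named parameter would carry

print(#raw)      -- 7
print(#trimmed)  -- 3
```

A template that measures length would therefore report different results depending on how the parameter was passed.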

Lua patterns (regex)
Regular expressions (or regex) are a common and very versatile programming technique for manipulating strings. On Wikipedia you can use a limited version of regex called a Lua pattern to select and modify bits of text from a string. The pattern is a piece of code describing what you are looking for in the string. The symbols you can use in a pattern are:
 * "." means any individual character. "..." would mean any three characters, etc.
 * "*", "+", "?", and "-" are the quantifiers. They mean that the previous character can be repeated $n$ times, where for each symbol $n ≥ 0$, $n > 0$, $n$ is zero or one, and $n ≥ 0$ again respectively. (The difference with "-" is that it is "non-greedy": it matches as few symbols as possible given the rest of the pattern.)
 * "^" means the start of the string, and "$" means the end.
 * "[abc]" means any symbol out of a, b or c, and "[^abc]" means anything that isn't a, b or c.
 * Preceding any of the above with a "%" takes away their normal meaning and makes them mean "literally" the symbol they are. Preceding anything else with a "%" (such as "%a" for any letter or "%d" for any digit) has a special meaning which you can check in the manual.
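
The pattern symbols can be tried one at a time with string.match in plain Lua. The sample strings below are invented for illustration, and on-wiki code would typically go through mw.ustring or Module:String, so treat this as a sketch only.

```lua
-- "." is any single character, so "..." is any three.
local any3     = string.match("wikipedia", "...")   -- "wik"

-- Quantifiers apply to the character just before them.
local plus     = string.match("caaat", "ca+t")      -- "caaat" (one or more a)
local star     = string.match("ct", "ca*t")         -- "ct"    (zero or more a)
local optional = string.match("color", "colou?r")   -- "color" (u is optional)

-- "*" is greedy, "-" takes as little as possible.
local greedy   = string.match("<a><b>", "<.*>")     -- "<a><b>"
local lazy     = string.match("<a><b>", "<.->")     -- "<a>"

-- "^" and "$" pin the match to the start and end of the string.
local whole    = string.match("abc", "^abc$")       -- "abc"
local partial  = string.match("abcabc", "^abc$")    -- nil

-- Sets and negated sets.
local set      = string.match("xyzb", "[abc]")      -- "b"
local negset   = string.match("abcx", "[^abc]")     -- "x"

-- "%" escapes a symbol, or introduces a class such as %d (digit).
local literal  = string.match("3.14", "%.")         -- "."
local digits   = string.match("page 42", "%d+")     -- "42"

print(any3, plus, star, optional, greedy, lazy)
print(whole, partial, set, negset, literal, digits)
```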

Putting this all together,  matches the first six characters of "AaAabcccc".
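
As one illustration of combining the symbols, the sketch below uses the pattern ^[Aa]+bc. That pattern is an assumption chosen for this example; other patterns would match the same six characters equally well.

```lua
local s = "AaAabcccc"

-- "^" anchors at the start, "[Aa]+" takes the run of A/a, then a literal "bc".
local first, last = string.find(s, "^[Aa]+bc")
local matched = string.sub(s, first, last)

print(first, last)   -- 1   6
print(matched)       -- AaAabc
```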

By wrapping part of the pattern in round brackets, you can capture that part and refer to it in the replacement text with the code "%1" (then "%2", and so on, for further bracketed parts). Example:
 * The find-replace instruction gives
 * We can discard the XYZ by putting  at the end of the search string; this picks up anything after the rest of the pattern.  gives.
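
A capture and its reuse can be sketched with Lua's string.gsub. The source string and both patterns below are made up for illustration and are not the examples originally given in the list above; they only show the same two ideas, keeping a captured part with "%1" and swallowing trailing text with ".*".

```lua
-- Hypothetical input: letters, then digits, then a trailing "XYZ".
local source = "abc123XYZ"

-- Capture the digits and replace the matched part ("abc123") with them.
-- The unmatched "XYZ" is left in place.
local kept_tail = string.gsub(source, "^%a+(%d+)", "%1")

-- Adding ".*" makes the match swallow everything after the digits,
-- so the whole string is replaced by the capture alone.
local digits_only = string.gsub(source, "^%a+(%d+).*", "%1")

print(kept_tail)     -- 123XYZ
print(digits_only)   -- 123
```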

StringFunctions (from ParserFunctions)
Wikipedia does not have the "StringFunctions" series of parser functions (listed below), and is not going to get them (per T8455). Instead, templates use Lua (via Module:String or otherwise), alongside existing parser functions.

None of these functions will work, but they have alternatives:
 * #len – use
 * #pos – use
 * #rpos
 * #sub – use
 * #count – use
 * #replace – use
 * #explode – use string split
 * #urldecode – use
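
For readers writing a module instead of calling a template, most of these operations have close counterparts in Lua's standard string library. The following is a rough sketch in plain Lua, with an invented sample string; it does not reproduce the templates referred to in the list above, and details such as index base and handling of empty fields may differ from the parser functions.

```lua
local text = "one,two,three"

-- Length (compare #len).
local len = string.len(text)                    -- 13

-- First occurrence, plain search, one-based result (compare #pos).
local pos = string.find(text, ",", 1, true)     -- 4

-- Last occurrence: greedy ".*" runs to the final comma (compare #rpos).
local _, rpos = string.find(text, ".*,")        -- 8

-- Substring by one-based inclusive indices (compare #sub).
local part = string.sub(text, 5, 7)             -- "two"

-- Count occurrences: gsub's second result is the number of matches (compare #count).
local _, count = string.gsub(text, ",", ",")    -- 2

-- Replace every occurrence (compare #replace).
local replaced = string.gsub(text, ",", "; ")   -- "one; two; three"

-- Split on a separator (compare #explode). Empty fields are skipped by this pattern.
local pieces = {}
for piece in string.gmatch(text, "[^,]+") do
    pieces[#pieces + 1] = piece
end

print(len, pos, rpos, part, count, replaced, #pieces)
```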

Testing code
If you're not sure what some code is going to do, paste it into Special:ExpandTemplates, which will evaluate it for you to view.