User:Espeholt jr/sandbox

There are three types of strings used in programming languages:

- Fixed length strings

- Terminated strings

- Counted strings
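The three layouts listed above can be sketched in a few lines of Python. This is only an illustration: the 8-byte field width, the NUL padding and terminator, and the 1-byte count are assumptions chosen for the example, not properties of any particular language.

```python
# Illustrative sketch: one text stored in the three layouts.
text = "héllo"
data = text.encode("utf-8")          # 6 bytes, because "é" takes 2 bytes in UTF-8

# Fixed-length string: always occupies the full field (width of 8 is assumed).
FIELD_WIDTH = 8
fixed = data.ljust(FIELD_WIDTH, b"\x00")

# Terminated string: the payload is followed by a sentinel byte (NUL assumed).
terminated = data + b"\x00"

# Counted string: the payload is preceded by its length (1-byte count assumed).
counted = bytes([len(data)]) + data
```

Only the counted layout allows the length to be read without scanning the payload, and only the counted and fixed layouts can hold a NUL byte inside the text.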

This page describes the counted UTF-8 string which, as the name says, is a counted string that uses the UTF-8 character encoding.

Length prefix
Pascal strings (also called P-strings) use a 1-byte length prefix. The major disadvantage is that such a string cannot be more than 255 characters long. In Microsoft .NET MSIL there is a 2-byte length prefix, which allows a string to be up to 65535 ($$2^{16} - 1$$) characters long. Strings longer than 64 K are rare, but they do occur. A further disadvantage is that the prefix always takes 2 bytes, even if the string contains only a single character. The counted UTF-8 string addresses both problems with a variable-length prefix. If the string contains 127 ($$2^{7} - 1$$) characters or fewer, the length prefix is 1 byte. If the string contains 16383 ($$2^{14} - 1$$) characters or fewer, the length prefix is 2 bytes.
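The thresholds 127 and 16383 are consistent with a prefix in which each byte carries 7 bits of the length and one flag bit. The text does not specify the exact bit layout, so the following is a minimal sketch under stated assumptions: the high bit of each prefix byte marks "another prefix byte follows", the low 7 bits come first, and the prefix counts encoded bytes rather than characters (the two coincide for ASCII text). The function names are hypothetical.

```python
def encode_length(n: int) -> bytes:
    """Encode a non-negative length with 7 payload bits per prefix byte (assumed layout)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)   # high bit set: more prefix bytes follow
        else:
            out.append(low7)          # high bit clear: last prefix byte
            return bytes(out)


def encode_counted_utf8(text: str) -> bytes:
    """Prefix the UTF-8 payload with its variable-length byte count."""
    payload = text.encode("utf-8")
    return encode_length(len(payload)) + payload


def decode_counted_utf8(buf: bytes) -> str:
    """Read the variable-length prefix, then decode that many payload bytes."""
    length = 0
    shift = 0
    pos = 0
    while True:
        byte = buf[pos]
        pos += 1
        length |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return buf[pos:pos + length].decode("utf-8")
```

Under these assumptions a one-character ASCII string costs a single prefix byte, lengths up to 16383 cost two, and longer strings simply add further prefix bytes instead of hitting a hard limit.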