GSM 03.40

GSM 03.40 or 3GPP TS 23.040 is a mobile telephony standard describing the format of the Transfer Protocol Data Units (TPDU) of the Short Message Transfer Protocol (SM-TP) used in the GSM networks to carry Short Messages. This format is used throughout the whole transfer of the message in the GSM mobile network. In contrast, application servers use different protocols, like Short Message Peer-to-Peer or Universal Computer Protocol, to exchange messages between them and the Short Message Service Center (SMSC).

GSM 03.40 is the original name of the standard. Since 1999 it has been developed by the 3GPP under the name 3GPP TS 23.040. However, the original name is often still used to refer to the 3GPP document.

Usage
The GSM 03.40 TPDUs are used to carry messages between the Mobile Station (MS) and Mobile Switching Centre (MSC) using the Short Message Relay Protocol (SM-RP), while between MSC and Short Message Service Centre (SMSC) the TPDUs are carried as a parameter of a Mobile Application Part (MAP) package.

In emerging networks which use the IP Multimedia Subsystem (IMS), Short Messages are carried in the MESSAGE command of the Session Initiation Protocol (SIP). Even in these IP-based networks an option exists which, for compatibility reasons, defines the transfer of Short Messages in the GSM 03.40 format embedded in 3GPP TS 24.011 messages with Content-Type: application/vnd.3gpp.sms.

TPDU Types
GSM 03.40 defines six types of messages between Mobile Station (MS) and SMS Center (SC), which are distinguished by the message direction and the two least significant bits in the first octet of SM-TP message (the TP-MTI field):

SMS-SUBMIT is used to submit a short message from a mobile phone (Mobile Station, MS) to a short message service centre (SMSC, SC).

SMS-SUBMIT-REPORT is an acknowledgement to the SMS-SUBMIT; a success means that the message was stored (buffered) in the SMSC, a failure means that the message was rejected by the SMSC.

SMS-COMMAND may be used to query for a message buffered in the SMSC, to modify its parameters or to delete it.

SMS-DELIVER is used to deliver a message from the SMSC to a mobile phone. The acknowledgement returned by the mobile phone may optionally contain an SMS-DELIVER-REPORT. When home routing applies, SMS-DELIVER is also used to submit messages from one SMSC to another.

SMS-STATUS-REPORT may be sent by the SMSC to inform the originating mobile phone about the final outcome of the message delivery or to reply to an SMS-COMMAND.
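The direction-dependent interpretation of TP-MTI can be sketched in Python as follows. The numeric codes are not listed in the text above; the ones used here are taken from 3GPP TS 23.040 and should be treated as an assumption to verify against the specification. The function name `tpdu_type` and the `from_sc` parameter are illustrative.

```python
# TP-MTI codes per direction (assumed from 3GPP TS 23.040, not from the text above).
MTI_SC_TO_MS = {0b00: "SMS-DELIVER", 0b01: "SMS-SUBMIT-REPORT", 0b10: "SMS-STATUS-REPORT"}
MTI_MS_TO_SC = {0b00: "SMS-DELIVER-REPORT", 0b01: "SMS-SUBMIT", 0b10: "SMS-COMMAND"}


def tpdu_type(first_octet: int, from_sc: bool) -> str:
    """Name the TPDU type from the first octet and the message direction."""
    mti = first_octet & 0b11  # two least significant bits
    table = MTI_SC_TO_MS if from_sc else MTI_MS_TO_SC
    return table.get(mti, "reserved")
```

The same two bits therefore name different TPDUs depending on who sent them, which is why a decoder needs the direction from the lower layer before it can parse anything else.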

TPDU Fields
The fields of SM-TP messages, including their order and size, are summarized in the following table, where M means a mandatory field, O an optional field, E is used for fields which are mandatory in negative responses (RP-ERR) and not present in positive responses (RP-ACK), x is a field present elsewhere:

The first octet of the TPDU contains various flags including the TP-MTI field described above:

By setting the TP-More-Messages-to-Send (TP-MMS) bit to 0 (reversed logic), the SMSC signals it has more messages for the recipient (often further segments of a concatenated message). The MSC usually does not close the connection to the mobile phone and does not end the MAP dialogue with the SMSC, which allows faster delivery of subsequent messages or message segments. If by coincidence the further messages vanish from the SMSC in the meantime (when they are for example deleted), the SMSC terminates the MAP dialogue with a MAP Abort message.

The TP-Loop-Prevention (TP-LP) bit is designed to prevent looping of SMS-DELIVER or SMS-STATUS-REPORT messages that are routed to an address other than their destination address or that are generated by an application. Such a message may be sent only if the original message had this flag cleared, and the new message must be sent with the flag set.

By setting the TP-Status-Report-Indication (TP-SRI) bit to 1, the SMSC requests a status report to be returned to the SME.

By setting the TP-Status-Report-Request (TP-SRR) bit to 1 in an SMS-SUBMIT or SMS-COMMAND, the mobile phone requests a status report to be returned by the SMSC.

When the TP-SRQ has value of 1 in an SMS-STATUS-REPORT message, the message is the result of an SMS-COMMAND; otherwise it is a result of an SMS-SUBMIT.

When TP-UDHI has value 1, the TP-UD field starts with User Data Header.

Setting the TP-RP bit turns on a feature which allows a reply to be sent along the same path as the original message. If the home networks of the originator and the recipient differ, the reply goes through a different SMSC than usual. The mobile operator must take special measures to charge for such messages.
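As an illustration of the flags described above, the following Python sketch decodes the first octet of an SMS-DELIVER or an SMS-SUBMIT. The bit positions are not given in the text above; they are taken from 3GPP TS 23.040 and are an assumption here. The function name and the dictionary keys are hypothetical.

```python
def decode_first_octet(octet: int, from_sc: bool) -> dict:
    """Decode the flags of an SMS-DELIVER (from_sc=True) or SMS-SUBMIT first octet.

    Bit positions are assumed from 3GPP TS 23.040.
    """
    flags = {
        "TP-MTI": octet & 0b11,          # bits 0-1
        "TP-UDHI": bool(octet & 0x40),   # bit 6: TP-UD starts with a header
        "TP-RP": bool(octet & 0x80),     # bit 7: reply path
    }
    if from_sc:  # SMS-DELIVER
        # Reversed logic: TP-MMS = 0 means the SMSC has more messages.
        flags["more_messages_waiting"] = not (octet & 0x04)
        flags["TP-LP"] = bool(octet & 0x08)
        flags["TP-SRI"] = bool(octet & 0x20)
    else:  # SMS-SUBMIT
        flags["TP-RD"] = bool(octet & 0x04)     # reject duplicates
        flags["TP-VPF"] = (octet >> 3) & 0b11   # validity period format
        flags["TP-SRR"] = bool(octet & 0x20)
    return flags
```

For example, `decode_first_octet(0x51, from_sc=False)` describes an SMS-SUBMIT with a User Data Header, a validity period format of 2, and no status report requested.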

Both SM-RP and MAP, which are used to transmit GSM 03.40 TPDUs, carry enough information to return an acknowledgement, that is, whether a request was successful or not. However, a GSM 03.40 TPDU may be included in the acknowledgement to carry even more information. GSM 03.40 has developed as follows:


 * Up to GSM 03.40 5.2.0, SMS-DELIVER-REPORT and SMS-SUBMIT-REPORT were sent only in the case of an error. Since 5.3.0 they are sent in the case of success as well. MO-ForwardSM-Res was introduced back in GSM 09.02 5.6.0 in August 1997.
 * Up to GSM 03.40 6.0.0, SMS-DELIVER-REPORT and SMS-SUBMIT-REPORT sent in the case of an error contained only the TP-MTI and TP-FCS fields, and the last field in SMS-STATUS-REPORT was TP-ST. Since version 6.1.0 these TPDUs have the format shown in the table above.

Although these changes are ancient (version 6.1.0 occurred in July 1998), old formats of MAP are frequently seen even in today's networks.

Message Content
The content of the message (its text when the message is not a binary one) is carried in the TP-UD field. Its size may be up to 160 × 7 = 140 × 8 = 1120 bits. Longer messages can be split into multiple parts and sent as a Concatenated SMS. The length of message content is given in the TP-UDL field. When the message encoding is GSM 7-bit default alphabet (depends on TP-DCS field), the TP-UDL gives length of TP-UD in 7-bit units; otherwise TP-UDL gives length of the TP-UD in octets.
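The relation between TP-UDL and the number of octets actually occupied by TP-UD can be written as a small Python helper. This is a minimal sketch; the function name and the `seven_bit` flag (standing in for what TP-DCS indicates) are illustrative.

```python
def tp_ud_octets(tp_udl: int, seven_bit: bool) -> int:
    """Number of octets occupied by TP-UD for a given TP-UDL.

    For the GSM 7-bit default alphabet TP-UDL counts septets, which are
    packed into ceil(7 * n / 8) octets; otherwise TP-UDL counts octets.
    """
    if seven_bit:
        return (tp_udl * 7 + 7) // 8
    return tp_udl
```

With 160 septets the helper returns 140 octets, matching the 1120-bit limit stated above.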

When TP-UDHI is 1, the TP-UD starts with a User Data Header (UDH); in this case the first octet of the TP-UD is the User Data Header Length (UDHL) octet, which contains the length of the UDH in octets, not counting the UDHL itself. The UDH takes up room in the TP-UD field. When the message encoding is the GSM 7-bit default alphabet and a UDH is present, fill bits are inserted so that the first character of the text after the UDH starts on a septet boundary. This behaviour was designed for older mobile phones which do not understand the UDH. Such phones might display the UDH as a jumble of strange characters; if the first character after the UDH was a Carriage Return (CR), the phone would overwrite that jumble with the rest of the message.
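The septet alignment described above determines how much text still fits next to a header. The following Python sketch computes the overhead for 7-bit text; the function name is hypothetical, and the example value UDHL = 5 (the header commonly used for concatenated messages with an 8-bit reference) is an assumption not stated in the text above.

```python
def udh_overhead(udhl: int) -> tuple[int, int]:
    """Return (septets consumed by the header, fill bits) for 7-bit text.

    The header occupies the UDHL octet plus `udhl` octets; the text must
    start on the next septet boundary.
    """
    header_bits = (udhl + 1) * 8
    header_septets = -(-header_bits // 7)        # ceiling division
    fill_bits = header_septets * 7 - header_bits
    return header_septets, fill_bits


# Example: a 5-octet header leaves 160 - 7 = 153 characters per segment.
septets, fill = udh_overhead(5)
remaining = 160 - septets
```

Here the header takes 48 bits, so one fill bit is needed to reach the 49-bit boundary of the seventh septet.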

Addresses
A GSM 03.40 message contains at most one address: destination address (TP-DA) in SMS-SUBMIT and SMS-COMMAND, originator address (TP-OA) in SMS-DELIVER and recipient address (TP-RA) in SMS-STATUS-REPORT. Other addresses are carried by lower layers.

The format of addresses in the GSM 03.40 is described in the following table:

Type of number (TON):

If a subscriber enters a telephone number with a "+" sign at its start, the "+" sign is removed and the address gets TON=1 (international number), NPI=1. The number itself must always start with a country code and must be formatted exactly according to the E.164 standard.

In contrast, for numbers written without a "+" sign the address gets TON=0 (unknown), NPI=1. In this case the number must adhere to the mobile operator's dial plan, which means that international numbers must have the international prefix (00 in most countries, but 011 in the USA) before the country code, and numbers for long-distance calls must start with the trunk prefix (0 in most countries, 1 in the USA) followed by a trunk code.

Numbering plan identification (NPI):

Telephone numbers should have NPI=1. Application servers may use alphanumeric addresses which have TON=5, NPI=0 combination.

The EXT bit is always 1 meaning "no extension".

Address examples
U.S. number +1 555 123 4567 would be encoded as 0B 91 51 55 21 43 65 F7 (the F in upper four bits of the last octet is a filler which is used when the number length is odd).
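The encoding of a numeric address can be reproduced with a short Python sketch. It assumes NPI=1 and derives TON from a leading "+" as described above; the function name `encode_address` is illustrative.

```python
def encode_address(number: str) -> bytes:
    """Encode a telephone number as a GSM 03.40 address field (sketch).

    Assumes NPI=1; TON is 1 (international) if the number starts with '+',
    otherwise 0 (unknown).
    """
    international = number.startswith("+")
    digits = "".join(ch for ch in number if ch.isdigit())
    ton = 1 if international else 0
    type_octet = 0x80 | (ton << 4) | 0x01            # EXT=1, TON, NPI=1
    padded = digits + "F" * (len(digits) % 2)        # filler for odd lengths
    # Each octet holds two digits with their positions swapped.
    swapped = "".join(padded[i + 1] + padded[i] for i in range(0, len(padded), 2))
    return bytes([len(digits), type_octet]) + bytes.fromhex(swapped)


print(encode_address("+1 555 123 4567").hex(" ").upper())
# 0B 91 51 55 21 43 65 F7
```

Note that the length octet counts digits (11 here), not octets.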

An alphanumeric address is first converted to the GSM 7-bit default alphabet, then encoded the same way as any message text in the TP-UD field (that is, 7-bit packed), and finally prefixed with the "number" length and the TON and NPI.

For example, a fictional alphanumeric address Design@Home is converted to the GSM 7-bit default alphabet, which yields 11 septets 44 65 73 69 67 6E 00 48 6F 6D 65 (hex). The 7-bit packing transforms them into 77 bits stored in 10 octets as C4 F2 3C 7D 76 03 90 EF 76 19; 77 bits occupy 20 nibbles (14 hex), which is the value of the first octet of the address. The second octet contains TON (5) and NPI (0), which yields D0 hex. The complete address in the GSM format is 14 D0 C4 F2 3C 7D 76 03 90 EF 76 19.
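The packing in this example can be checked with the Python sketch below. It uses a deliberately simplified character mapping, which is an assumption: ASCII letters and digits keep their code and "@" maps to 00, which is enough for this address but is not the full GSM 7-bit default alphabet. Both function names are illustrative.

```python
def pack_septets(septets: list[int]) -> bytes:
    """Pack 7-bit values into octets, least significant bits first."""
    acc = 0
    for i, septet in enumerate(septets):
        acc |= (septet & 0x7F) << (7 * i)
    n_octets = (7 * len(septets) + 7) // 8
    return acc.to_bytes(n_octets, "little")


def encode_alphanumeric_address(text: str) -> bytes:
    """Encode an alphanumeric address with TON=5, NPI=0 (simplified sketch)."""
    septets = []
    for ch in text:
        if ch == "@":
            septets.append(0x00)            # '@' is code 00 in the GSM alphabet
        elif ch.isascii() and ch.isalnum():
            septets.append(ord(ch))         # same code as ASCII
        else:
            raise ValueError(f"character {ch!r} not covered by this sketch")
    nibbles = (7 * len(septets) + 3) // 4   # useful semi-octets
    return bytes([nibbles, 0xD0]) + pack_septets(septets)


print(encode_alphanumeric_address("Design@Home").hex(" ").upper())
# 14 D0 C4 F2 3C 7D 76 03 90 EF 76 19
```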

Message Reference
The Message Reference field (TP-MR) is used in all messages on the submission side with exception of the SMS-SUBMIT-REPORT (that is in SMS-SUBMIT, SMS-COMMAND and SMS-STATUS-REPORT). It is a single-octet value which is incremented each time a new message is submitted or a new SMS-COMMAND is sent. If the message submission fails, the mobile phone should repeat the submission with the same TP-MR value and with the TP-RD bit set to 1.

Time Format
A date and time used in TP-SCTS, TP-DT and in Absolute format of TP-VP is stored in 7 octets:

In all octets the values are stored in binary coded decimal format with switched digits (number 35 is stored as 53 hex).

The time zone is given in quarters of an hour. If the time zone offset is negative (in the Western hemisphere), bit 3 of the last octet is set to 1.

23:01:56 Mar 25th 2013 PDT (GMT-7) would be encoded as 31 30 52 32 10 65 8A.

In this example the time zone octet 8A is binary 1000 1010. Bit 3 is 1, therefore the time zone offset is negative. Clearing that bit (bit-wise AND with 1111 0111) leaves 1000 0010, hexadecimal 82. This is read like every other octet in the sequence, with the digits swapped, so hex 82 represents the number 28. Finally, the time zone offset is 28 × 15 minutes = 420 minutes (7 hours).
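The decoding steps of this example can be sketched in Python. The two-digit year is mapped to 2000 + year, which is an assumption of this sketch and not something the format itself specifies; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def swapped_bcd(octet: int) -> int:
    """Read a BCD octet whose two digits are stored swapped (53 hex -> 35)."""
    return (octet & 0x0F) * 10 + (octet >> 4)


def decode_timestamp(data: bytes) -> datetime:
    """Decode a 7-octet GSM 03.40 timestamp (year assumed to be 20xx)."""
    if len(data) != 7:
        raise ValueError("a timestamp is exactly 7 octets long")
    year, month, day, hour, minute, second = (swapped_bcd(b) for b in data[:6])
    tz = data[6]
    negative = bool(tz & 0x08)             # bit 3 carries the sign
    quarters = swapped_bcd(tz & 0xF7)      # clear the sign bit, then read BCD
    offset = timedelta(minutes=15 * quarters)
    if negative:
        offset = -offset
    return datetime(2000 + year, month, day, hour, minute, second,
                    tzinfo=timezone(offset))


print(decode_timestamp(bytes.fromhex("3130523210658A")).isoformat())
# 2013-03-25T23:01:56-07:00
```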

Validity Period
An SMS-SUBMIT TPDU may contain a TP-VP parameter which limits the time period for which the SMSC attempts to deliver the message. However, the validity period is usually also limited globally by an SMSC configuration parameter, often to 48 or 72 hours. The Validity Period format is defined by the Validity Period Format field:

Absolute format
The absolute format is identical to the other time formats in GSM 03.40.

Enhanced format
The enhanced format of the TP-VP field is seldom used. It is always 7 octets long, although some of the octets may be unused. The first octet is the TP-VP Functionality Indicator. Its 3 least significant bits have the following meaning:

The value of 1 in the bit 6 of the first octet means that the message is Single-shot. The value of 1 in the bit 7 of the first octet indicates that TP-VP functionality indicator extends to another octet. However, no such extensions are defined.

Protocol Identifier
TP-PID (Protocol identifier) either refers to the higher layer protocol being used, indicates interworking with a certain type of telematic device (like fax, telex, pager, teletex, e-mail), specifies replace type of the message or allows download of configuration parameters to the SIM card. Plain MO-MT messages have PID=0.

For TP-PID = 63 the SC converts the SM from the received TP Data Coding Scheme to any data coding scheme supported by that MS (e.g. the default).

Short Message Type 0 is known as a silent SMS. Any handset must be able to receive such a short message irrespective of whether there is memory available in the (U)SIM or ME, must acknowledge receipt of the message, but must not indicate its receipt to the user and must discard its contents, so the message is not stored in the (U)SIM or ME.

Data Coding Scheme
A special 7-bit encoding called GSM 7 bit default alphabet was designed for Short Message System in GSM. The alphabet contains the most-often used symbols from most Western-European languages (and some Greek uppercase letters). Some ASCII characters and the Euro sign did not fit into the GSM 7-bit default alphabet and must be encoded using two septets. These characters form GSM 7-bit default alphabet extension table. Support of the GSM 7-bit alphabet is mandatory for GSM handsets and network elements.

Languages which use Latin script but need characters that are not present in the GSM 7-bit default alphabet often replace the missing accented characters with the corresponding characters without diacritics. This gives a not entirely satisfactory user experience, but is often accepted. For the best appearance the 16-bit UTF-16 encoding (called UCS-2 in GSM) may be used, at the price of reducing the length of a non-segmented message from 160 to 70 characters.

The messages in Chinese, Korean or Japanese languages must be encoded using the UTF-16 character encoding. The same was also true for other languages using non-Latin scripts like Russian, Arabic, Hebrew and various Indian languages. In 3GPP TS 23.038 8.0.0 published in 2008 a new feature, an extended National language shift table was introduced, which in the version 11.0.0 published in 2012 covers Turkish, Spanish, Portuguese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Punjabi, Tamil, Telugu and Urdu languages. The mechanism replaces GSM 7-bit default alphabet code table and/or extended table with a national table(s) according to special information elements in User Data Header. The non-segmented message using national language shift table(s) may carry up to 155 (or 153) 7-bit characters.

The Data Coding Scheme (TP-DCS) field contains primarily information about message encoding. GSM recognizes only 2 encodings for text messages and 1 encoding for binary messages:


 * GSM 7-bit default alphabet (which includes using of National language shift tables as well)
 * UCS-2
 * 8-bit data

The TP-DCS octet has a complex syntax to allow carrying of other information; the most notable are message classes:

Flash messages are received by a mobile phone even when its memory is full. They are not stored in the phone; they are just displayed on the phone display.

Another feature available through TP-DCS is Automatic Deletion: after being read, the message is deleted from the phone.

The Message Waiting Indication group of DCS values can set or reset flags indicating the presence of unread voicemail, fax, e-mail or other messages.

Special DCS values also allow message compression, but it is probably not used by any operator.

The values of TP-DCS are defined in GSM recommendation 03.38 (now 3GPP TS 23.038). A message can be encoded in the GSM 7-bit default alphabet, as 8-bit data, or in the 16-bit UCS-2 alphabet.
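As a rough illustration of the TP-DCS syntax, the following Python sketch decodes only the general data coding group (octets of the form 00xx xxxx). The bit layout is not given in the text above; it is assumed from 3GPP TS 23.038, and the other groups (automatic deletion, message waiting indication and so on) are deliberately rejected. The function name and result keys are hypothetical.

```python
# Alphabet codes in bits 3-2 (assumed from 3GPP TS 23.038).
ALPHABETS = {0: "GSM 7-bit", 1: "8-bit data", 2: "UCS-2", 3: "reserved"}


def decode_general_dcs(dcs: int) -> dict:
    """Decode a TP-DCS octet of the general data coding group (00xx xxxx)."""
    if dcs & 0xC0:
        raise ValueError("not in the general data coding group")
    has_class = bool(dcs & 0x10)           # bit 4: class bits are meaningful
    return {
        "compressed": bool(dcs & 0x20),    # bit 5
        "alphabet": ALPHABETS[(dcs >> 2) & 0b11],
        # Class 0 is the flash message described above.
        "message_class": (dcs & 0b11) if has_class else None,
    }
```

Under these assumptions, `decode_general_dcs(0x18)` describes an uncompressed UCS-2 flash message.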

Discharge Time
The TP-DT field indicates the time and date associated with a particular TP-ST outcome:


 * if the message has been delivered or, more generally, other transaction completed (TP-ST is 0-31), the TP-DT is the time of the completion of the transaction
 * if the SMSC is still trying to deliver the message (TP-ST is 32-63), the TP-DT is the time of the last delivery attempt
 * if the SMSC is not making any more delivery attempts (TP-ST is 64-127), the TP-DT is either the time of the last delivery attempt or the time at which the SMSC disposed of the message
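The three cases above translate directly into a range check. A minimal Python sketch, with an illustrative function name and return strings:

```python
def discharge_time_meaning(tp_st: int) -> str:
    """Describe what TP-DT refers to for a given TP-ST value."""
    if 0 <= tp_st <= 31:
        return "time the transaction completed"
    if 32 <= tp_st <= 63:
        return "time of the last delivery attempt (SMSC still trying)"
    if 64 <= tp_st <= 127:
        return "time of the last delivery attempt or of message disposal"
    return "not covered by the ranges above"
```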

Parameter Indicator
The TP-PI field indicates the presence of further fields in the SMS-SUBMIT-REPORT, SMS-DELIVER-REPORT or SMS-STATUS-REPORT TPDU.

As currently there are still four free bits in TP-PI, it can be expected that the extension bit will be zero even in the future, which helps to distinguish TP-PI field from TP-FCS field when information whether TPDU is part of positive or negative response is not available: if the most significant bit of the second octet of TPDU is 1, the second octet is TP-FCS (in a negative response), otherwise it is TP-PI (in a positive response).
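The heuristic described above amounts to testing one bit. A minimal Python sketch follows; the function name is illustrative, and the check is only as reliable as the assumption that the TP-PI extension bit stays zero.

```python
def second_octet_kind(tpdu: bytes) -> str:
    """Guess whether the second octet of a report TPDU is TP-FCS or TP-PI.

    Relies on the TP-PI extension bit (most significant bit) being zero.
    """
    if len(tpdu) < 2:
        raise ValueError("TPDU too short to contain a second octet")
    return "TP-FCS" if tpdu[1] & 0x80 else "TP-PI"
```

A result of "TP-FCS" then implies the TPDU belongs to a negative response (RP-ERR), and "TP-PI" to a positive one (RP-ACK).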