2.1 Character Set


The only characters allowed outside of comments are the graphic_characters and format_effectors.

Syntax

character ::= graphic_character | format_effector | other_control_function

graphic_character ::= identifier_letter | digit | space_character | special_character

Static Semantics

The character repertoire for the text of an Ada program consists of the collection of characters called the Basic Multilingual Plane (BMP) of the ISO 10646 Universal Multiple-Octet Coded Character Set, plus a set of format_effectors and, in comments only, a set of other_control_functions; the coded representation for these characters is implementation defined (it need not be a representation defined within ISO-10646-1).

The description of the language definition in this International Standard uses the graphic symbols defined for Row 00: Basic Latin and Row 00: Latin-1 Supplement of the ISO 10646 BMP; these correspond to the graphic symbols of ISO 8859-1 (Latin-1); no graphic symbols are used in this International Standard for characters outside of Row 00 of the BMP. The actual set of graphic symbols used by an implementation for the visual representation of the text of an Ada program is not specified.

The categories of characters are defined as follows:

identifier_letter
    upper_case_identifier_letter | lower_case_identifier_letter
upper_case_identifier_letter
    Any character of Row 00 of ISO 10646 BMP whose name begins "Latin Capital Letter".
lower_case_identifier_letter
    Any character of Row 00 of ISO 10646 BMP whose name begins "Latin Small Letter".
digit
    One of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.
space_character
    The character of ISO 10646 BMP named "Space".
special_character
    Any character of the ISO 10646 BMP that is not reserved for a control function, and is not the space_character, an identifier_letter, or a digit.
format_effector
    The control functions of ISO 6429 called character tabulation (HT), line tabulation (VT), carriage return (CR), line feed (LF), and form feed (FF).
other_control_function
    Any control function, other than a format_effector, that is allowed in a comment; the set of other_control_functions allowed in comments is implementation defined.
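
The category definitions above can be sketched as a small classifier. This is an illustration only, not part of the standard: it is written in Python, the helper name `classify` is hypothetical, and it assumes that Python's `unicodedata` character names agree with the ISO 10646 names for Row 00. It also assumes that the code positions reserved for control functions are 0000 - 001F, 007F - 009F, and FFFE - FFFF. Because the set of other_control_functions is implementation defined, the sketch reports any control position that is not a format_effector simply as `non_graphic`.

```python
import unicodedata

# Format effectors named in the text: HT, VT, CR, LF, FF.
FORMAT_EFFECTORS = {0x09, 0x0B, 0x0D, 0x0A, 0x0C}


def classify(ch: str) -> str:
    """Return the category of one BMP character (hypothetical helper)."""
    cp = ord(ch)
    if cp > 0xFFFF:
        raise ValueError("character is outside the Basic Multilingual Plane")
    if cp in FORMAT_EFFECTORS:
        return "format_effector"
    # Assumption: these are the positions reserved for control functions.
    # Whether one of them is an allowed other_control_function is
    # implementation defined, so it is only reported as non-graphic here.
    if cp <= 0x1F or 0x7F <= cp <= 0x9F or cp >= 0xFFFE:
        return "non_graphic"
    if cp <= 0xFF:
        # Row 00: identifier letters are selected by character name.
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN CAPITAL LETTER"):
            return "upper_case_identifier_letter"
        if name.startswith("LATIN SMALL LETTER"):
            return "lower_case_identifier_letter"
    if "0" <= ch <= "9":
        return "digit"
    if ch == " ":
        return "space_character"
    # Every remaining graphic character is a special_character.
    return "special_character"


if __name__ == "__main__":
    for ch in "X := 1;":
        print(repr(ch), classify(ch))
```

Under these assumptions, characters of Row 00 such as the multiplication sign (00D7) fall through to special_character, because their names do not begin with "Latin Capital Letter" or "Latin Small Letter".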

The following names are used when referring to certain special_characters:

symbol  name                      symbol  name

  "     quotation mark              :     colon
  #     number sign                 ;     semicolon
  &     ampersand                   <     less-than sign
  '     apostrophe, tick            =     equals sign
  (     left parenthesis            >     greater-than sign
  )     right parenthesis           _     low line, underline
  *     asterisk, multiply          |     vertical line
  +     plus sign                   [     left square bracket
  ,     comma                       ]     right square bracket
  -     hyphen-minus, minus         {     left curly bracket
  .     full stop, dot, point       }     right curly bracket
  /     solidus, divide
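
As an aside for tool writers, the names in this table can be kept as a lookup table. The sketch below is in Python and is not part of the standard; the names `SPECIAL_CHARACTER_NAMES` and `iso_name_matches` are hypothetical. It assumes that Python's `unicodedata` module reports the ISO 10646 character names, and it checks that the first name listed for each symbol is that character name, ignoring case.

```python
import unicodedata

# Symbol -> names used for it in the table (first name listed first).
SPECIAL_CHARACTER_NAMES = {
    '"': ["quotation mark"],
    "#": ["number sign"],
    "&": ["ampersand"],
    "'": ["apostrophe", "tick"],
    "(": ["left parenthesis"],
    ")": ["right parenthesis"],
    "*": ["asterisk", "multiply"],
    "+": ["plus sign"],
    ",": ["comma"],
    "-": ["hyphen-minus", "minus"],
    ".": ["full stop", "dot", "point"],
    "/": ["solidus", "divide"],
    ":": ["colon"],
    ";": ["semicolon"],
    "<": ["less-than sign"],
    "=": ["equals sign"],
    ">": ["greater-than sign"],
    "_": ["low line", "underline"],
    "|": ["vertical line"],
    "[": ["left square bracket"],
    "]": ["right square bracket"],
    "{": ["left curly bracket"],
    "}": ["right curly bracket"],
}


def iso_name_matches(symbol: str) -> bool:
    """True if the first listed name equals the character name, ignoring case."""
    return unicodedata.name(symbol).lower() == SPECIAL_CHARACTER_NAMES[symbol][0]


if __name__ == "__main__":
    for symbol, names in SPECIAL_CHARACTER_NAMES.items():
        print(symbol, ", ".join(names), iso_name_matches(symbol))
```

The additional names (tick, multiply, minus, dot, point, divide, underline) are the informal alternatives listed in the table and are not expected to match a character name.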

Implementation Permissions

In a nonstandard mode, the implementation may support a different character repertoire; in particular, the set of characters that are considered identifier_letters can be extended or changed to conform to local conventions.

Notes

1  Every code position of ISO 10646 BMP that is not reserved for a control function is defined to be a graphic_character by this International Standard. This includes all code positions other than 0000 - 001F, 007F - 009F, and FFFE - FFFF.

2  The language does not specify the source representation of programs.
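
The code position ranges given in Note 1 can be expressed as a simple range test. The following is a sketch in Python, not part of the standard: the names `NON_GRAPHIC_RANGES` and `is_graphic_code_position` are hypothetical, and the ranges are read as hexadecimal, inclusive code positions. It follows the note literally and makes no other exclusions.

```python
# Inclusive code position ranges that Note 1 excludes from graphic_character.
NON_GRAPHIC_RANGES = [(0x0000, 0x001F), (0x007F, 0x009F), (0xFFFE, 0xFFFF)]


def is_graphic_code_position(cp: int) -> bool:
    """True if BMP code position cp is a graphic_character per Note 1."""
    if not 0 <= cp <= 0xFFFF:
        raise ValueError("code position is outside the Basic Multilingual Plane")
    return not any(low <= cp <= high for low, high in NON_GRAPHIC_RANGES)


def count_graphic_code_positions() -> int:
    """Count the BMP code positions that Note 1 treats as graphic."""
    return sum(1 for cp in range(0x10000) if is_graphic_code_position(cp))


if __name__ == "__main__":
    print(is_graphic_code_position(0x0041))  # Latin capital letter A
    print(is_graphic_code_position(0x0009))  # character tabulation (HT)
    print(count_graphic_code_positions())
```

The three excluded ranges contain 32, 33, and 2 code positions respectively, so under this reading 65536 - 67 = 65469 code positions of the BMP are graphic_characters.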

Copyright © 1992,1993,1994,1995 Intermetrics, Inc.
Copyright © 2000 The MITRE Corporation, Inc. Ada Reference Manual