
5.5. Strings

Syntax


StringExpression = (* variant: *) StringLiteral (* Deprecated in 0.5.14 *)
| (* variant: *) StringLiterals (* Introduced in 0.5.14 *)
| (* variant: *) HexStringLiteral (* Deprecated in 0.5.14 *)
| (* variant: *) HexStringLiterals (* Introduced in 0.5.14 *)
| (* variant: *) UnicodeStringLiterals; (* Introduced in 0.7.0 *)

(* Introduced in 0.5.14 *)
StringLiterals = (* item: *) StringLiteral+;

StringLiteral = (* variant: *) SINGLE_QUOTED_STRING_LITERAL
| (* variant: *) DOUBLE_QUOTED_STRING_LITERAL;

(* Deprecated in 0.4.25 *)
SINGLE_QUOTED_STRING_LITERAL = "'" («ESCAPE_SEQUENCE_ARBITRARY» | !("'" "\\" "\r" "\n"))* "'";

(* Introduced in 0.4.25 and deprecated in 0.7.0. *)
SINGLE_QUOTED_STRING_LITERAL = "'" («ESCAPE_SEQUENCE» | !("'" "\\" "\r" "\n"))* "'";

SINGLE_QUOTED_STRING_LITERAL = "'" («ESCAPE_SEQUENCE» | (" "…"&") | ("("…"[") | ("]"…"~"))* "'";

(* Deprecated in 0.4.25 *)
DOUBLE_QUOTED_STRING_LITERAL = '"' («ESCAPE_SEQUENCE_ARBITRARY» | !('"' "\\" "\r" "\n"))* '"';

(* Introduced in 0.4.25 and deprecated in 0.7.0. *)
DOUBLE_QUOTED_STRING_LITERAL = '"' («ESCAPE_SEQUENCE» | !('"' "\\" "\r" "\n"))* '"';

DOUBLE_QUOTED_STRING_LITERAL = '"' («ESCAPE_SEQUENCE» | (" "…"!") | ("#"…"[") | ("]"…"~"))* '"';

(* Introduced in 0.5.14 *)
HexStringLiterals = (* item: *) HexStringLiteral+;

HexStringLiteral = (* variant: *) SINGLE_QUOTED_HEX_STRING_LITERAL
| (* variant: *) DOUBLE_QUOTED_HEX_STRING_LITERAL;

SINGLE_QUOTED_HEX_STRING_LITERAL = "hex'" «HEX_STRING_CONTENTS»? "'";

DOUBLE_QUOTED_HEX_STRING_LITERAL = 'hex"' «HEX_STRING_CONTENTS»? '"';

«HEX_STRING_CONTENTS» = «HEX_CHARACTER» «HEX_CHARACTER» ("_"? «HEX_CHARACTER» «HEX_CHARACTER»)*;

«HEX_CHARACTER» = ("0"…"9") | ("a"…"f") | ("A"…"F");

(* Introduced in 0.7.0 *)
UnicodeStringLiterals = (* item: *) UnicodeStringLiteral+;

(* Introduced in 0.7.0 *)
UnicodeStringLiteral = (* variant: *) SINGLE_QUOTED_UNICODE_STRING_LITERAL
| (* variant: *) DOUBLE_QUOTED_UNICODE_STRING_LITERAL;

(* Introduced in 0.7.0 *)
SINGLE_QUOTED_UNICODE_STRING_LITERAL = "unicode'" («ESCAPE_SEQUENCE» | !("'" "\\" "\r" "\n"))* "'";

(* Introduced in 0.7.0 *)
DOUBLE_QUOTED_UNICODE_STRING_LITERAL = 'unicode"' («ESCAPE_SEQUENCE» | !('"' "\\" "\r" "\n"))* '"';

«ESCAPE_SEQUENCE» = "\\" («ASCII_ESCAPE» | «HEX_BYTE_ESCAPE» | «UNICODE_ESCAPE»);

(* Deprecated in 0.4.25 *)
«ESCAPE_SEQUENCE_ARBITRARY» = "\\" (!("x" "u") | «HEX_BYTE_ESCAPE» | «UNICODE_ESCAPE»);

«ASCII_ESCAPE» = "n" | "r" | "t" | "'" | '"' | "\\" | "\r\n" | "\r" | "\n";

«HEX_BYTE_ESCAPE» = "x" «HEX_CHARACTER» «HEX_CHARACTER»;

«UNICODE_ESCAPE» = "u" «HEX_CHARACTER» «HEX_CHARACTER» «HEX_CHARACTER» «HEX_CHARACTER»;

String Literals

String literals are written with either double or single quotes ("foo" or 'bar'). They can also be split into multiple consecutive parts ("foo" "bar" is equivalent to "foobar"), which can be helpful when dealing with long strings. Unlike in C, they do not imply a trailing zero byte: "foo" represents three bytes, not four. As with integer literals, their type can vary, but they are implicitly convertible to bytes1, ..., bytes32 if they fit.
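
The rules above can be sketched as follows. This is a minimal illustration, assuming a compiler version of 0.8.0 or later; the contract and member names are hypothetical and not part of the specification.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract name, used only for illustration.
contract StringLiteralSketch {
    // "foo" is exactly three bytes, so it fits bytes3 with no padding.
    bytes3 public constant EXACT = "foo";

    // In a wider type the literal is left-aligned and zero-padded on the right.
    bytes32 public constant PADDED = "foo";

    function joinedLength() external pure returns (uint256) {
        // Consecutive parts form a single literal, here "foobar".
        string memory joined = "foo" "bar";
        return bytes(joined).length; // 6
    }
}
```

A literal longer than the target type, such as assigning "foobar" to a bytes3, would be rejected at compile time because it does not fit.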

String literals can only contain printable ASCII characters, that is, the characters between 0x20 and 0x7E inclusive.

Unicode Literals

While regular string literals can only contain ASCII, unicode literals (prefixed with the keyword unicode) can contain any valid UTF-8 sequence. They support the same escape sequences as regular string literals.

string memory a = unicode"Hello 😃";
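
As a rough sketch of what the declaration above produces, the following hypothetical contract (names are illustrative assumptions, and a 0.8.0 or later compiler is assumed) returns the byte length of the same literal. The length counts UTF-8 bytes, not characters.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract name, used only for illustration.
contract UnicodeLiteralSketch {
    function greetingLength() external pure returns (uint256) {
        string memory a = unicode"Hello 😃";
        // "Hello " occupies 6 bytes and the emoji (U+1F603) occupies 4 bytes in UTF-8.
        return bytes(a).length; // 10
    }
}
```

Writing the same text without the unicode prefix would not compile, since the emoji lies outside the printable ASCII range allowed in regular string literals.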

Hexadecimal Literals

Hexadecimal literals are prefixed with the keyword hex and are enclosed in double or single quotes (hex"001122FF", hex'0011_22_FF'). Their content must be hexadecimal digits, which can optionally be separated by a single underscore at byte boundaries. The value of the literal is the binary representation of the hexadecimal sequence.

Hexadecimal literals behave like string literals and have the same convertibility restrictions. Additionally, multiple hexadecimal literals separated by whitespace are concatenated into a single literal: hex"00112233" hex"44556677" is equivalent to hex"0011223344556677".
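
A short sketch of these rules, assuming a 0.8.0 or later compiler; the contract and function names are hypothetical. It compares three spellings of the same four bytes and shows the implicit conversion to a fixed-size type.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract name, used only for illustration.
contract HexLiteralSketch {
    function allEqual() external pure returns (bool) {
        bytes memory plain = hex"001122FF";
        // Underscores are allowed only between whole bytes.
        bytes memory separated = hex'0011_22_FF';
        // Adjacent hex literals are concatenated into one literal.
        bytes memory joined = hex"0011" hex"22FF";
        return keccak256(plain) == keccak256(separated)
            && keccak256(plain) == keccak256(joined); // true
    }

    function asBytes4() external pure returns (bytes4) {
        // Four bytes fit exactly into bytes4.
        return hex"001122FF";
    }
}
```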

Escape Sequences

String literals also support the following escape sequences:

  • \<newline> (escapes an actual newline)
  • \\ (backslash)
  • \' (single quote)
  • \" (double quote)
  • \n (newline)
  • \r (carriage return)
  • \t (tab)
  • \xNN (hex escape, takes a hex value and inserts the appropriate byte)
  • \uNNNN (unicode escape, takes a Unicode code point and inserts a UTF-8 sequence)
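
A few of the escapes listed above can be sketched as follows, assuming a 0.8.0 or later compiler; the contract and function names are hypothetical. The example shows that \xNN inserts a single byte, while \uNNNN inserts the UTF-8 encoding of a code point, which may span several bytes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract name, used only for illustration.
contract EscapeSequenceSketch {
    function hexAndUnicodeAgree() external pure returns (bool) {
        bytes memory plain = "A";
        bytes memory viaHex = "\x41";       // byte 0x41
        bytes memory viaUnicode = "\u0041"; // code point U+0041, one byte in UTF-8
        return keccak256(plain) == keccak256(viaHex)
            && keccak256(plain) == keccak256(viaUnicode); // true
    }

    function lengths() external pure returns (uint256, uint256, uint256) {
        bytes memory quote = "\"";      // a single double-quote character
        bytes memory newline = "\n";    // a single line feed byte
        bytes memory eAcute = "\u00e9"; // U+00E9 encodes as 0xC3 0xA9 in UTF-8
        return (quote.length, newline.length, eAcute.length); // (1, 1, 2)
    }
}
```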

Any Unicode line terminator which is not a newline (i.e. LF, VT, FF, CR, NEL, LS, PS) is considered to terminate the string literal. Newline only terminates the string literal if it is not preceded by a \.

Note

This section is under construction. You are more than welcome to contribute suggestions to our GitHub repository.