Unified Hardware Design, Specification, and Verification Language
6. Data types
6.16 String data type
The string data type is an ordered collection of characters. The length of a string variable is the number of characters in the collection. Variables of type string are dynamic as their length may vary during simulation. A single character of a string variable may be selected for reading or writing by indexing the variable. A single character of a string variable is of type byte.
SystemVerilog also includes a number of special methods to work with strings, which are defined in this subclause.
A string variable does not represent a string in the same way as a string literal (see 5.9). String literals behave like packed arrays of a width that is a multiple of 8 bits. A string literal assigned to a packed array of an integral variable of a different size is either truncated to the size of the variable or padded with zeros to the left as necessary. When using the string data type instead of an integral variable, strings can be of arbitrary length and no truncation occurs. String literals are implicitly converted to the string type when assigned to a string type or used in an expression involving string type operands.
The indices of string variables shall be numbered from 0 to N–1 (where N is the length of the string) so that index 0 corresponds to the first (leftmost) character of the string and index N–1 corresponds to the last (rightmost) character of the string. The string variables can take on the special value "", which is the empty string. Indexing an empty string variable shall be an out-of-bounds access.
A string variable shall not contain the special character "\0". Assigning the value 0 to a string character shall be ignored.
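The indexing rules above can be illustrated with a minimal sketch. It is not part of the normative text; the module name string_index_demo and the variable names are assumptions chosen for this example.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_index_demo;
  string s = "abc";
  byte first, last;

  initial begin
    first = s[0];            // index 0 is the leftmost character, "a"
    last  = s[s.len() - 1];  // index N-1 is the rightmost character, "c"
    s[1]  = "X";             // writes a single character: s becomes "aXc"
    s[2]  = 0;               // assigning 0 to a string character is ignored
    $display("%s has %0d characters", s, s.len());
  end
endmodule
```

Under the rules stated above, s still holds three characters after the last assignment, because the write of 0 has no effect.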
The syntax to declare a string variable is as follows:
string variable_name [= initial_value];
where variable_name is a valid identifier and the optional initial_value can be a string literal, the value "" for an empty string, or a string data type expression. For example:
parameter string default_name = "John Smith";
string myName = default_name;
If an initial value is not specified in the declaration, the variable is initialized to "", the empty string. An empty string has zero length.
SystemVerilog provides a set of operators that can be used to manipulate combinations of string variables and string literals. The basic operators defined on the string data type are listed in Table 6-9.
A string literal can be assigned to a variable of a string or an integral data type. When assigning to a variable of integral data type, if the number of bits of the data object is not equal to the number of characters in the string literal multiplied by 8, the literal is right justified and either truncated on the left or zero-filled on the left, as necessary. For example:
byte c = "A"; // assigns to c "A"
bit [10:0] b = "\x41"; // assigns to b 'b000_0100_0001
bit [1:4][7:0] h = "hello"; // assigns to h "ello"
A string literal or an expression of string type can be assigned directly to a variable of string type (a string variable). Values of integral type can be assigned to a string variable, but require a cast. When casting an integral value to a string variable, that variable shall grow or shrink to accommodate the integral value. If the size of the integral value is not a multiple of 8 bits, then the value shall be zero-filled on the left so that its size is a multiple of 8 bits.
A string literal assigned to a string variable is converted according to the following steps:
— All "\0" characters in the string literal are ignored (i.e., removed from the string).
— If the result of the first step is an empty string literal, the string is assigned the empty string.
— Otherwise, the string is assigned the remaining characters in the string literal.
Casting an integral value to a string variable proceeds in the following steps:
— If the size (in bits) of the integral value is not a multiple of 8, the integral value is left extended and filled with zeros until its bit size is a multiple of 8. The extended value is then treated the same as a string literal, where each successive 8 bits represent a character.
— The steps described above for string literal conversion are then applied to the extended value.
For example:
string s0 = "String literal assign";// sets s0 to "String literal assign"
string s1 = "hello\0world"; // sets s1 to "helloworld"
bit [11:0] b = 12'ha41;
string s2 = string'(b); // sets s2 to 16'h0a41
As a second example:
typedef logic [15:0] r_t;
r_t r;
integer i = 1;
string b = "";
string a = {"Hi", b};
a = {i{"Hi"}}; // OK (non-constant replication)
r = {i{"Hi"}}; // invalid (non-constant replication)
a = {i{b}}; // OK
a = {a,b}; // OK
a = {"Hi",b}; // OK
r = {"H",""}; // yields "H\0". "" is converted to 8'b0
b = {"H",""}; // yields "H". "" is the empty string
a[0] = "h"; // OK, same as a[0] = "cough"
a[0] = b; // invalid, requires a cast
a[1] = "\0"; // ignored, a is unchanged
Table 6-9—String operators
| Operator | Semantics |
|---|---|
| Str1 == Str2 | Equality. Checks whether the two string operands are equal. Result is 1 if they are equal and 0 if they are not. Both operands can be expressions of string type, or one can be an expression of string type and the other can be a string literal, which shall be implicitly converted to string type for the comparison. If both operands are string literals, the operator is the same equality operator as for integral types. |
| Str1 != Str2 | Inequality. Logical negation of == |
| Str1 < Str2, Str1 <= Str2, Str1 > Str2, Str1 >= Str2 | Comparison: Relational operators return 1 if the corresponding condition is true using the lexicographic ordering of the two strings Str1 and Str2. The comparison uses the compare string method. Both operands can be expressions of string type, or one can be an expression of string type and the other can be a string literal, which shall be implicitly converted to string type for the comparison. If both operands are string literals, the operator is the same comparison operator as for integral types. |
| {Str1,Str2,...,Strn} | Concatenation: Each operand can be a string literal or an expression of string type. If all the operands are string literals the expression shall behave as a concatenation of integral values; if the result of such a concatenation is used in an expression involving string types then it shall be implicitly converted to string type. If at least one operand is an expression of string type, then any operands that are string literals shall be converted to string type before the concatenation is performed, and the result of the concatenation shall be of string type. |
SystemVerilog also includes a number of special methods to work with strings, which use the built-in method notation. These methods are described in 6.16.1 through 6.16.15.
6.16.1 Len()
function int len();
— str.len() returns the length of the string, i.e., the number of characters in the string.
— If str is "", then str.len() returns 0.
6.16.2 Putc()
function void putc(int i, byte c);
— str.putc(i, c) replaces the ith character in str with the given integral value.
— putc does not change the size of str: If i < 0 or i >= str.len(), then str is unchanged.
— If the second argument to putc is zero, the string is unaffected.
The putc method assignment str.putc(j, x) is semantically equivalent to str[j] = x.
6.16.3 Getc()
function byte getc(int i);
— str.getc(i) returns the ASCII code of the ith character in str.
— If i < 0 or i >= str.len(), then str.getc(i) returns 0.
The getc method assignment x = str.getc(j) is semantically equivalent to x = str[j].
6.16.4 Toupper()
function string toupper();
— str.toupper() returns a string with characters in str converted to uppercase.
— str is unchanged.
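As an illustration of len(), putc(), getc(), and toupper() as described in 6.16.1 through 6.16.4, consider the following sketch. It is not part of the normative text; the module name string_char_methods_demo and the variable names are assumptions for this example.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_char_methods_demo;
  string s = "hello";
  string u;
  byte ch;

  initial begin
    $display("%0d", s.len());  // s holds 5 characters
    s.putc(0, "J");            // replaces character 0: s becomes "Jello"
    s.putc(10, "x");           // index >= s.len(): s is unchanged
    ch = s.getc(1);            // ASCII code of "e"
    u  = s.toupper();          // u is "JELLO"; s itself is still "Jello"
    $display("%s %s", s, u);
    if (s.getc(99) == 0)       // out-of-range getc returns 0
      $display("out-of-range getc returned 0");
  end
endmodule
```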
Table 6-9—String operators (continued)
| Operator | Semantics |
|---|---|
| {multiplier{Str}} | Replication: Str can be a string literal or an expression of string type. multiplier shall be an expression of integral type and is not required to be a constant expression. If multiplier is non-constant or Str is an expression of string type, the result is a string containing N concatenated copies of Str, where N is specified by the multiplier. If Str is a literal and the multiplier is constant, the expression behaves like numeric replication (if the result is used in another expression involving string types, it is implicitly converted to the string type). |
| Str[index] | Indexing. Returns a byte, the ASCII code at the given index. Indices range from 0 to N–1, where N is the number of characters in the string. If given an index out of range, returns 0. Semantically equivalent to Str.getc(index) in 6.16.3. |
| Str.method(...) | The dot (.) operator is used to invoke a specified method on strings. |
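The operators in Table 6-9 can be illustrated with a short sketch. It is not part of the normative text; the module name string_operator_demo and the variable names are assumptions for this example.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_operator_demo;
  string a = "abc";
  string b = "abd";
  string c;
  int n = 3;

  initial begin
    if (a != b)     $display("a and b differ");
    if (a < b)      $display("a sorts before b");  // lexicographic ordering
    if (a == "abc") $display("literal is converted to string type");
    c = {a, "-", b};   // one operand is of string type: c is "abc-abd"
    $display("%s", c);
    c = {n{"ab"}};     // non-constant multiplier: c is "ababab"
    $display("%s (length %0d)", c, c.len());
  end
endmodule
```

Because n is a variable rather than a constant, the replication in the last assignment produces a value of string type, which is why it may only be assigned to a string variable.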
6.16.5 Tolower()
function string tolower();
— str.tolower() returns a string with characters in str converted to lowercase.
— str is unchanged.
6.16.6 Compare()
function int compare(string s);
— str.compare(s) compares str and s, as in the ANSI C strcmp function with regard to lexical ordering and return value.
See the relational string operators in Table 6-9.
6.16.7 Icompare()
function int icompare(string s);
— str.icompare(s) compares str and s, like the ANSI C strcmp function with regard to lexical ordering and return value, but the comparison is case insensitive.
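The difference between compare() and icompare(), together with tolower() from 6.16.5, can be sketched as follows. The example is not part of the normative text; the module name string_compare_demo and the variable names are assumptions.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_compare_demo;
  string a = "Apple";
  string b = "apple";

  initial begin
    // compare() is case sensitive: "A" (8'h41) orders before "a" (8'h61),
    // so the result is negative, as with strcmp.
    if (a.compare(b) < 0)   $display("a sorts before b");
    // icompare() ignores case, so the two strings compare as equal.
    if (a.icompare(b) == 0) $display("a equals b, ignoring case");
    // A similar check using tolower(); a and b themselves are unchanged.
    if (a.tolower() == b.tolower()) $display("lowercase forms match");
  end
endmodule
```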
6.16.8 Substr()
function string substr(int i, int j);
— str.substr(i, j) returns a new string that is a substring formed by characters in position i through j of str.
— If i < 0, j < i, or j >= str.len(), substr() returns "" (the empty string).
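The bounds behavior of substr() can be sketched as follows. The example is not part of the normative text; the module name string_substr_demo and the variable names are assumptions.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_substr_demo;
  string s = "SystemVerilog";  // 13 characters, indices 0 through 12
  string t;

  initial begin
    t = s.substr(0, 5);            // positions 0 through 5: "System"
    $display("%s", t);
    t = s.substr(6, s.len() - 1);  // position 6 through the end: "Verilog"
    $display("%s", t);
    t = s.substr(6, 99);           // j >= s.len(): returns the empty string
    $display("%0d", t.len());      // length of the empty string is 0
  end
endmodule
```

Note that both i and j are inclusive positions, so the returned string has j - i + 1 characters when the arguments are in range.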
6.16.9 Atoi(), atohex(), atooct(), atobin()
function integer atoi();
function integer atohex();
function integer atooct();
function integer atobin();
— str.atoi() returns the integer corresponding to the ASCII decimal representation in str. For example:
string str = "123";
int i = str.atoi(); // assigns 123 to i.
The conversion scans all leading digits and underscore characters ( _ ) and stops as soon as it encounters any other character or the end of the string. It returns zero if no digits were encountered. It does not parse the full syntax for integer literals (sign, size, apostrophe, base).
— atohex interprets the string as hexadecimal.
— atooct interprets the string as octal.
— atobin interprets the string as binary.
NOTE—These ASCII conversion functions return a 32-bit integer value. Truncation is possible without warning. For converting integer values greater than 32 bits, see $sscanf in 21.3.4.
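The scanning rules for the ASCII conversion functions can be sketched as follows. The example is not part of the normative text; the module name string_to_int_demo and the variable names are assumptions.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module string_to_int_demo;
  string s;
  integer v;

  initial begin
    s = "1_000abc";
    v = s.atoi();    // scans "1_000" and stops at "a": v is 1000
    $display("%0d", v);
    s = "ff";
    v = s.atohex();  // interpreted as hexadecimal: v is 255
    $display("%0d", v);
    s = "17";
    v = s.atooct();  // interpreted as octal: v is 15
    $display("%0d", v);
    s = "1010";
    v = s.atobin();  // interpreted as binary: v is 10
    $display("%0d", v);
    s = "abc";
    v = s.atoi();    // no digits encountered: v is 0
    $display("%0d", v);
  end
endmodule
```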
6.16.10 Atoreal()
function real atoreal();
— str.atoreal() returns the real number corresponding to the ASCII decimal representation in str.
The conversion parses for real constants. The scan stops as soon as it encounters any character that does not conform to this syntax or the end of the string. It returns zero if no digits were encountered.
6.16.11 Itoa()
function void itoa(integer i);
— str.itoa(i) stores the ASCII decimal representation of i into str (inverse of atoi).
6.16.12 Hextoa()
function void hextoa(integer i);
— str.hextoa(i) stores the ASCII hexadecimal representation of i into str (inverse of atohex).
6.16.13 Octtoa()
function void octtoa(integer i);
— str.octtoa(i) stores the ASCII octal representation of i into str (inverse of atooct).
6.16.14 Bintoa()
function void bintoa(integer i);
— str.bintoa(i) stores the ASCII binary representation of i into str (inverse of atobin).
6.16.15 Realtoa()
function void realtoa(real r);
— str.realtoa(r) stores the ASCII real representation of r into str (inverse of atoreal).
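The conversion methods in 6.16.11 through 6.16.15 and their inverses can be exercised together, as in the following sketch. It is not part of the normative text; the module name int_to_string_demo and the variable names are assumptions. The exact text produced by hextoa() (letter case) and realtoa() (digit formatting) is not stated above, so the sketch checks those results by converting back rather than by comparing text.

```systemverilog
// Illustrative sketch only; all names here are hypothetical.
module int_to_string_demo;
  string s;
  real r;

  initial begin
    s.itoa(42);      // decimal representation: s is "42"
    $display("%s", s);
    s.octtoa(8);     // octal representation: s is "10"
    $display("%s", s);
    s.bintoa(5);     // binary representation: s is "101"
    $display("%s", s);
    s.hextoa(255);   // hexadecimal representation of 255
    if (s.atohex() == 255) $display("hextoa/atohex round trip ok");
    s.realtoa(2.5);  // real representation of 2.5
    r = s.atoreal();
    if (r == 2.5) $display("realtoa/atoreal round trip ok");
  end
endmodule
```

Each of these methods overwrites the string on which it is called and returns no value, unlike toupper() and tolower(), which return a new string and leave the original unchanged.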