4.8 REAL NUMBERS AND FLOATING-POINT FORMATS

4.8.3 Real Number and Non-number Encodings

A variety of real numbers and special values can be encoded in the IEEE Standard 754 floating-point format. These numbers and values are generally divided into the following classes:

• Signed zeros
• Denormalized finite numbers
• Normalized finite numbers
• Signed infinities
• NaNs
• Indefinite numbers

(The term NaN stands for “Not a Number.”)

Table 4-5. Real and Floating-Point Number Notation

Notation                               Value
Ordinary Decimal                       178.125
Scientific Decimal                     1.78125E₁₀2
Scientific Binary                      1.0110010001E₂111
Scientific Binary (Biased Exponent)    1.0110010001E₂10000110
IEEE Single-Precision Format           Sign: 0
                                       Biased Exponent: 10000110
                                       Normalized Significand: 1. (implied) 01100100010000000000000

4-16 Vol. 1 DATA TYPES

Figure 4-12 shows how the encodings for these numbers and non-numbers fit into the real number continuum. The encodings shown here are for the IEEE single-precision floating-point format. The term “S” indicates the sign bit, “E” the biased exponent, and “Sig” the significand.

The exponent values are given in decimal. The integer bit is shown for the significands, even though the integer bit is implied in single-precision floating-point format.
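The S, E, and Sig fields can be inspected directly. The sketch below is a minimal illustration, assuming Python's standard struct module is used to round a value to single precision; the helper name fields is hypothetical. It splits 178.125, the value used in Table 4-5, into its three fields.

```python
import struct

def fields(x):
    """Round x to single precision and return (sign, biased exponent, fraction)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                 # 1 bit
    biased_exp = (bits >> 23) & 0xFF  # 8 bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 bits, integer bit is implied
    return sign, biased_exp, fraction

s, e, f = fields(178.125)
print(s, format(e, "08b"), format(f, "023b"))
# 0 10000110 01100100010000000000000
print(e - 127)  # unbiased exponent: 7
```

The printed fields match the last row of Table 4-5; the leading "1." of the significand does not appear because it is implied.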

An IA-32 processor can operate on and/or return any of these values, depending on the type of computation being performed. The following sections describe these number and non-number classes.

Figure 4-12. Real Numbers and NaNs

Real Number and NaN Encodings For 32-Bit Floating-Point Format

Class                     S     E         Sig¹
−∞                        1     255       1.000...
−Normalized Finite        1     1...254   1.XXX...
−Denormalized Finite      1     0         0.XXX...²
−0                        1     0         0.000...
+0                        0     0         0.000...
+Denormalized Finite      0     0         0.XXX...²
+Normalized Finite        0     1...254   1.XXX...
+∞                        0     255       1.000...
SNaN                      X³    255       1.0XX...²
QNaN                      X³    255       1.1XX...

NOTES:
1. Integer bit of fraction implied for single-precision floating-point format.
2. Fraction must be non-zero.
3. Sign bit ignored.
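The encodings in Figure 4-12 translate into a small decision procedure. The following sketch assumes a 32-bit single-precision bit pattern passed as an integer; the function name classify and its return strings are illustrative only.

```python
def classify(bits):
    """Name the class of a 32-bit single-precision bit pattern."""
    sign = bits >> 31
    biased_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    prefix = "-" if sign else "+"
    if biased_exp == 0:
        # Integer bit is 0: either a zero or a denormalized number.
        return prefix + ("0" if fraction == 0 else "denormalized finite")
    if biased_exp == 255:
        if fraction == 0:
            return prefix + "infinity"
        # Sign is ignored for NaNs; the top fraction bit selects the kind.
        return "QNaN" if fraction & 0x400000 else "SNaN"
    return prefix + "normalized finite"

for pattern in (0x80000000, 0x00000001, 0x3F800000, 0xFF800000,
                0x7F800001, 0x7FC00000):
    print(format(pattern, "08X"), classify(pattern))
```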


4.8.3.1 Signed Zeros

Zero can be represented as a +0 or a −0 depending on the sign bit. Both encodings are equal in value. The sign of a zero result depends on the operation being performed and the rounding mode being used. Signed zeros have been provided to aid in implementing interval arithmetic.

The sign of a zero may indicate the direction from which underflow occurred, or it may indicate the sign of an ∞ that has been reciprocated.
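These properties can be observed from a high-level language. The sketch below uses Python floats, which are IEEE double-precision on common platforms (an assumption about the host, not something this section specifies); the single-precision case behaves the same way.

```python
import math

pos_zero, neg_zero = 0.0, -0.0

# Both encodings compare equal in value.
print(pos_zero == neg_zero)                    # True

# The sign bit is still there and can be read back.
print(math.copysign(1.0, neg_zero))            # -1.0

# Reciprocal of negative infinity gives a negative zero.
print(math.copysign(1.0, 1.0 / -math.inf))     # -1.0

# Underflow from the negative side also leaves a negative zero.
print(math.copysign(1.0, -1e-200 * 1e-200))    # -1.0
```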

4.8.3.2 Normalized and Denormalized Finite Numbers

Non-zero, finite numbers are divided into two classes: normalized and denormalized. The normalized finite numbers comprise all the non-zero finite values that can be encoded in a normalized real number format between zero and ∞. In the single-precision floating-point format shown in Figure 4-12, this group of numbers includes all the numbers with biased exponents ranging from 1 to 254₁₀ (unbiased, the exponent range is from −126₁₀ to +127₁₀).

When floating-point numbers become very close to zero, the normalized-number format can no longer be used to represent the numbers. This is because the range of the exponent is not large enough to compensate for shifting the binary point to the right to eliminate leading zeros.

When the biased exponent is zero, smaller numbers can only be represented by making the integer bit (and perhaps other leading bits) of the significand zero. The numbers in this range are called denormalized (or tiny) numbers. The use of leading zeros with denormalized numbers allows smaller numbers to be represented. However, this denormalization causes a loss of precision (the number of significant bits in the fraction is reduced by the leading zeros).

When performing normalized floating-point computations, an IA-32 processor normally operates on normalized numbers and produces normalized numbers as results. Denormalized numbers represent an underflow condition. The exact conditions are specified in Section 4.9.1.5, “Numeric Underflow Exception (#U)”.

A denormalized number is computed through a technique called gradual underflow. Table 4-6 gives an example of gradual underflow in the denormalization process. Here the single-precision format is being used, so the minimum exponent (unbiased) is −126₁₀. The true result in this example requires an exponent of −129₁₀ in order to have a normalized number. Since −129₁₀ is beyond the allowable exponent range, the result is denormalized by inserting leading zeros until the minimum exponent of −126₁₀ is reached.

Table 4-6. Denormalization Process

Operation          Sign   Exponent*   Significand
True Result        0      −129        1.01011100000...00
Denormalize        0      −128        0.10101110000...00
Denormalize        0      −127        0.01010111000...00
Denormalize        0      −126        0.00101011100...00
Denormal Result    0      −126        0.00101011100...00

* Expressed as an unbiased, decimal number.


In the extreme case, all the significant bits are shifted out to the right by leading zeros, creating a zero result.
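The example in Table 4-6 can be reproduced by rounding the true result to single precision and reading back the stored fields. This is a sketch under the assumption that struct.pack with the "f" format performs the usual double-to-single conversion on the host; the helper single_bits is a hypothetical name.

```python
import struct

def single_bits(x):
    """Return the 32-bit pattern of x after rounding to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

# True result from Table 4-6: 1.010111 (binary) x 2^-129,
# written as the integer 1010111 (binary) scaled by 2^-135.
true_result = 0b1010111 * 2.0 ** -135

bits = single_bits(true_result)
biased_exp = (bits >> 23) & 0xFF
fraction = bits & 0x7FFFFF
print(biased_exp, format(fraction, "023b"))
# 0 00101011100000000000000

# Extreme case: every significant bit is shifted out and the result is zero.
print(single_bits(2.0 ** -160))  # 0
```

A biased exponent of 0 marks the result as denormalized, and the stored fraction shows the leading zeros inserted by the three denormalization steps.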

The IA-32 architecture deals with denormal values in the following ways:

• It avoids creating denormals by normalizing numbers whenever possible.

• It provides the floating-point underflow exception to permit programmers to detect cases when denormals are created.

• It provides the floating-point denormal-operand exception to permit procedures or programs to detect when denormals are being used as source operands for computations.

4.8.3.3 Signed Infinities

The two infinities, +∞ and −∞, represent the maximum positive and negative real numbers, respectively, that can be represented in the floating-point format. Infinity is always represented by a significand of 1.00...00 (the integer bit may be implied) and the maximum biased exponent allowed in the specified format (for example, 255₁₀ for the single-precision format).

The signs of infinities are observed, and comparisons are possible. Infinities are always interpreted in the affine sense; that is, −∞ is less than any finite number and +∞ is greater than any finite number. Arithmetic on infinities is always exact. Exceptions are generated only when the use of an infinity as a source operand constitutes an invalid operation.

Whereas denormalized numbers may represent an underflow condition, the two ∞ numbers may represent the result of an overflow condition. Here, the normalized result of a computation has a biased exponent greater than the largest allowable exponent for the selected result format.
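The encoding and the affine ordering of the infinities can be checked with a short sketch. It assumes Python floats follow IEEE arithmetic with exceptions masked, which is the usual behavior but is a property of the host environment rather than of this section; single_bits is a hypothetical helper.

```python
import math
import struct

def single_bits(x):
    """Return the 32-bit pattern of x after rounding to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

# Maximum biased exponent (255) with a zero fraction, either sign.
print(format(single_bits(math.inf), "08X"))    # 7F800000
print(format(single_bits(-math.inf), "08X"))   # FF800000

# Affine interpretation: the infinities bracket every finite number.
print(-math.inf < -1e308 < 1e308 < math.inf)   # True

# Arithmetic on infinities is exact.
print(math.inf + 1e308 == math.inf)            # True

# An invalid use of infinity as an operand yields a NaN when masked.
print(math.isnan(math.inf - math.inf))         # True
```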

4.8.3.4 NaNs

Since NaNs are non-numbers, they are not part of the real number line. In Figure 4-12, the encoding space for NaNs in the floating-point formats is shown above the ends of the real number line. This space includes any value with the maximum allowable biased exponent and a non-zero fraction (the sign bit is ignored for NaNs).

The IA-32 architecture defines two classes of NaNs: quiet NaNs (QNaNs) and signaling NaNs (SNaNs). A QNaN is a NaN with the most significant fraction bit set; an SNaN is a NaN with the most significant fraction bit clear. QNaNs are allowed to propagate through most arithmetic operations without signaling an exception. SNaNs generally signal a floating-point invalid-operation exception whenever they appear as operands in arithmetic operations.

SNaNs are typically used to trap or invoke an exception handler. They must be inserted by software; that is, the processor never generates an SNaN as a result of a floating-point operation.
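Quiet propagation is easy to observe from a high-level language. The sketch below uses Python's math.nan, which is a quiet NaN; Python floats are double-precision, so this illustrates the behavior rather than the single-precision encoding.

```python
import math

q = math.nan                 # a quiet NaN
result = (q + 1.0) * 2.0     # propagates through arithmetic, no exception

print(math.isnan(result))    # True
print(result == result)      # False: a NaN compares unequal even to itself
```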


4.8.3.5 Operating on SNaNs and QNaNs

When a floating-point operation is performed on an SNaN and/or a QNaN, the result of the operation is either a QNaN delivered to the destination operand or the generation of a floating-point invalid-operation exception, depending on the following rules:

• If one of the source operands is an SNaN and the floating-point invalid-operation exception is not masked (see Section 4.9.1.1, “Invalid Operation Exception (#I)”), then a floating-point invalid-operation exception is signaled and no result is stored in the destination operand.

• If either or both of the source operands are NaNs and the floating-point invalid-operation exception is masked, the result is as shown in Table 4-7. When an SNaN is converted to a QNaN, the conversion is handled by setting the most-significant fraction bit of the SNaN to 1. Also, when one of the source operands is an SNaN, the floating-point invalid-operation exception flag is set. Note that for some combinations of source operands, the result is different for x87 FPU operations and for SSE/SSE2/SSE3 operations.

• When neither of the source operands is a NaN, but the operation generates a floating-point invalid-operation exception (see Tables 8-10 and 11-1), the result is commonly an SNaN source operand converted to a QNaN or the QNaN floating-point indefinite value.

Any exceptions to the behavior described in Table 4-7 are described in Section 8.5.1.2, “Invalid Arithmetic Operand Exception (#IA)” and Section 11.5.2.1, “Invalid Operation Exception (#I)”.

Table 4-7. Rules for Handling NaNs

Source Operands                          Result¹

SNaN and QNaN                            x87 FPU: QNaN source operand.
                                         SSE/SSE2/SSE3: First operand (if this operand is an SNaN, it is converted to a QNaN)

Two SNaNs                                x87 FPU: SNaN source operand with the larger significand, converted into a QNaN
                                         SSE/SSE2/SSE3: First operand converted to a QNaN

Two QNaNs                                x87 FPU: QNaN source operand with the larger significand
                                         SSE/SSE2/SSE3: First operand

SNaN and a floating-point value          SNaN source operand, converted into a QNaN

QNaN and a floating-point value          QNaN source operand

SNaN (for instructions that take         SNaN source operand, converted into a QNaN
only one operand)

QNaN (for instructions that take         QNaN source operand
only one operand)

NOTE:

1. For SSE/SSE2/SSE3 instructions, the first operand is generally a source operand that becomes the destination operand. Within the Result column, the x87 FPU notation also applies to the FISTTP instruction in SSE3; the SSE3 notation applies to the SIMD floating-point instructions.
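The two-operand rows of Table 4-7 can be modeled as a selection function over bit patterns. This is a simplified sketch, not a description of the hardware: it assumes single-precision patterns, compares only the 23-bit fraction field when the table says "larger significand", and uses hypothetical names (masked_nan_result, the unit strings "x87" and "sse"). It covers only the masked-exception case.

```python
QUIET_BIT = 0x400000       # most significant fraction bit
FRACTION_MASK = 0x7FFFFF

def is_nan(bits):
    return (bits >> 23) & 0xFF == 0xFF and (bits & FRACTION_MASK) != 0

def is_snan(bits):
    return is_nan(bits) and not (bits & QUIET_BIT)

def quiet(bits):
    """Convert an SNaN to a QNaN by setting the top fraction bit."""
    return bits | QUIET_BIT

def masked_nan_result(first, second, unit):
    """Pick the NaN result for two operands, following Table 4-7."""
    first_nan, second_nan = is_nan(first), is_nan(second)
    if not (first_nan or second_nan):
        raise ValueError("at least one operand must be a NaN")
    if first_nan != second_nan:
        # NaN and a floating-point value: the NaN operand, quieted.
        return quiet(first if first_nan else second)
    if unit == "sse":
        # SSE/SSE2/SSE3: first operand, quieted if it is an SNaN.
        return quiet(first)
    if is_snan(first) != is_snan(second):
        # x87, SNaN and QNaN: the QNaN source operand.
        return second if is_snan(first) else first
    # x87, two SNaNs or two QNaNs: larger significand, quieted.
    return quiet(max(first, second, key=lambda b: b & FRACTION_MASK))

print(format(masked_nan_result(0x7F800001, 0x7FC00002, "x87"), "08X"))  # 7FC00002
print(format(masked_nan_result(0x7F800001, 0x7FC00002, "sse"), "08X"))  # 7FC00001
```

The two printed results differ for the same operands, which is the x87 versus SSE distinction noted in the table.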


4.8.3.6 Using SNaNs and QNaNs in Applications

Except for the rules given at the beginning of Section 4.8.3.4, “NaNs” for encoding SNaNs and QNaNs, software is free to use the bits in the significand of a NaN for any purpose. Both SNaNs and QNaNs can be encoded to carry and store data, such as diagnostic information.

By unmasking the invalid operation exception, the programmer can use signaling NaNs to trap to the exception handler. The generality of this approach and the large number of NaN values that are available provide the sophisticated programmer with a tool that can be applied to a variety of special situations.

For example, a compiler can use signaling NaNs as references to uninitialized (real) array elements. The compiler can preinitialize each array element with a signaling NaN whose significand contains the index (relative position) of the element. Then, if an application program attempts to access an element that it has not initialized, it will use the NaN placed there by the compiler. If the invalid operation exception is unmasked, an interrupt will occur, and the exception handler will be invoked. The exception handler can determine which element has been accessed, since the operand address field of the exception pointer will point to the NaN, and the NaN will contain the index number of the array element.
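The index-carrying NaN described above can be sketched at the bit level. The layout here is an assumption made for illustration: single precision, the index stored in the low 22 fraction bits with an offset of 1 so that index 0 does not produce a zero fraction (which would encode ∞ rather than a NaN). The function names are hypothetical.

```python
EXPONENT_ALL_ONES = 0x7F800000   # biased exponent 255
QUIET_BIT = 0x400000             # must stay clear for an SNaN
PAYLOAD_MASK = 0x3FFFFF          # remaining 22 fraction bits

def snan_for_index(index):
    """Build a single-precision SNaN pattern that carries an array index."""
    if not 0 <= index < PAYLOAD_MASK:
        raise ValueError("index does not fit in the NaN payload")
    # Offset by 1 so the fraction is never zero.
    return EXPONENT_ALL_ONES | (index + 1)

def index_from_snan(bits):
    """Recover the array index from a pattern built by snan_for_index."""
    return (bits & PAYLOAD_MASK) - 1

# Preinitialize a four-element array with index-carrying SNaNs.
array_bits = [snan_for_index(i) for i in range(4)]
print([format(b, "08X") for b in array_bits])
# ['7F800001', '7F800002', '7F800003', '7F800004']
print(index_from_snan(array_bits[2]))  # 2
```

An exception handler that locates the faulting operand could apply index_from_snan to its bit pattern to identify the uninitialized element.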

Quiet NaNs are often used to speed up debugging. In its early testing phase, a program often contains multiple errors. An exception handler can be written to save diagnostic information in memory whenever it is invoked. After storing the diagnostic data, it can supply a quiet NaN as the result of the erroneous instruction, and that NaN can point to its associated diagnostic area in memory. The program will then continue, creating a different NaN for each error. When the program ends, the NaN results can be used to access the diagnostic data saved at the time the errors occurred. Many errors can thus be diagnosed and corrected in one test run.

In embedded applications that use computed results in further computations, an undetected QNaN can invalidate all subsequent results. Such applications should therefore periodically check for QNaNs and provide a recovery mechanism to be used if a QNaN result is detected.
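One possible shape for such a check is sketched below. The recovery policy (reject the offending sample and keep the last good state) and the function name accumulate are assumptions chosen for illustration; a real application would pick a policy suited to its data.

```python
import math

def accumulate(samples):
    """Sum samples, refusing any update that would turn the total into a NaN."""
    total = 0.0
    rejected = 0
    for sample in samples:
        candidate = total + sample
        if math.isnan(candidate):
            # Recovery: keep the last good total and count the rejection.
            rejected += 1
            continue
        total = candidate
    return total, rejected

print(accumulate([1.0, math.nan, 2.5]))  # (3.5, 1)
```

Without the check, the single NaN sample would have made the final total a NaN as well.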

4.8.3.7 QNaN Floating-Point Indefinite

For the floating-point data type encodings (single-precision, double-precision, and double-extended-precision), one unique encoding (a QNaN) is reserved for representing the special value QNaN floating-point indefinite. The x87 FPU and the SSE/SSE2/SSE3 extensions return these indefinite values as responses to some masked floating-point exceptions. Table 4-3 shows the encoding used for the QNaN floating-point indefinite.
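As an illustration for the single-precision case, the sketch below decodes the pattern 0xFFC00000, which is the value commonly documented for the single-precision QNaN floating-point indefinite. Table 4-3 is not reproduced in this section, so the constant should be treated as an assumption to verify against that table.

```python
# Assumed single-precision indefinite pattern; verify against Table 4-3.
SINGLE_INDEFINITE = 0xFFC00000

sign = SINGLE_INDEFINITE >> 31
biased_exp = (SINGLE_INDEFINITE >> 23) & 0xFF
fraction = SINGLE_INDEFINITE & 0x7FFFFF

print(sign, biased_exp, format(fraction, "023b"))
# 1 255 10000000000000000000000
```

The decoded fields show a NaN (maximum exponent, non-zero fraction) with only the most significant fraction bit set, which makes it a QNaN.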

Source: IA-32 Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture (pages 103-108)