Multiplicative operators - Rationale for American National Standard for Information Systems

3.3 Expressions

3.3.5 Multiplicative operators

There was considerable sentiment for giving more portable semantics to division (and hence remainder) by specifying some way of giving less machine-dependent results for negative operands. Few Committee members wanted to require this by default, lest existing fast code be gravely slowed. One suggestion was to make signed int a type distinct from plain int, and require better-defined semantics for signed int division and remainder. This suggestion was opposed on the grounds that effectively adding several types would have consequences out of proportion to the benefit to be obtained; the Committee twice rejected this approach. Instead the Committee has adopted new library functions div and ldiv which produce integral quotient and remainder with well-defined sign semantics. (See §4.10.6.2, §4.10.6.3.) The Committee rejected extending the % operator to work on floating types; such usage would duplicate the facility provided by fmod. (See §4.5.6.5.)

3.3.6 Additive operators

As with the sizeof operator, implementations have taken different approaches in defining a type for the difference between two pointers (see §3.3.3.4). It is important that this type be signed, in order to obtain proper algebraic ordering when dealing with pointers within the same array. However, the magnitude of a pointer difference can be as large as the size of the largest object that can be declared. (And since that is an unsigned type, the difference between two pointers may cause an overflow.)

The type of pointer minus pointer is defined to be int in K&R. The Standard defines the result of this operation to be a signed integer, the size of which is implementation-defined. The type is published as ptrdiff_t, in the standard header <stddef.h>. Old code recompiled by a conforming compiler may no longer work if the implementation defines the result of such an operation to be a type other than int and if the program depended on the result being of type int. This behavior was considered by the Committee to be correctable. Overflow was considered not to break old code since it was undefined by K&R. Mismatch of types between actual and formal argument declarations is correctable by including a properly defined function prototype in the scope of the function invocation.
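As an illustration of the point about ptrdiff_t, the following minimal sketch stores a pointer difference in the type published by <stddef.h> rather than assuming int. The names element_distance and check_element_distance are hypothetical and do not come from the Standard.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */

/* Hypothetical helper: signed element count from first to last.
   Both pointers must point into (or just past) the same array. */
ptrdiff_t element_distance(const long *first, const long *last)
{
    return last - first;   /* the subtraction itself has type ptrdiff_t */
}

void check_element_distance(void)
{
    long table[10];

    assert(element_distance(&table[2], &table[7]) == 5);
    assert(element_distance(&table[7], &table[2]) == -5);   /* sign is kept */
}
```

Because the prototype is in scope at each call, the result is converted correctly even on an implementation where ptrdiff_t is wider than int.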

An important endorsement of widespread practice is the requirement that a pointer can always be incremented to just past the end of an array, with no fear of overflow or wraparound:

SOMETYPE array[SPAN];

/* ... */

for (p = &array[0]; p < &array[SPAN]; p++)

This stipulation merely requires that every object be followed by one byte whose address is representable. That byte can be the first byte of the next object declared for all but the last object located in a contiguous segment of memory. (In the example, the address &array[SPAN] must address a byte following the highest element of array.) Since the pointer expression p+1 need not (and should not) be dereferenced, it is unnecessary to leave room for a complete object of size sizeof(*p).

In the case of p-1, on the other hand, an entire object would have to be allocated prior to the array of objects that p traverses, so decrement loops that run off the bottom of an array may fail. This restriction allows segmented architectures, for instance, to place objects at the start of a range of addressable memory.
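The asymmetry between p+1 and p-1 can be sketched as follows. The forward loop relies on the one-past-the-end guarantee; the backward loop is written so that the pointer is tested before it is decremented and therefore never takes the value array-1. SPAN and the function names are assumptions made for the illustration.

```c
#include <assert.h>

#define SPAN 4   /* hypothetical array length */

/* Forward: p may legitimately reach &array[SPAN], one past the end. */
long sum_forward(const int *array)
{
    const int *p;
    long total = 0;

    for (p = &array[0]; p < &array[SPAN]; p++)
        total += *p;
    return total;
}

/* Backward: compare first, then decrement, so p never runs off the bottom. */
long sum_backward(const int *array)
{
    const int *p = &array[SPAN];
    long total = 0;

    while (p != &array[0]) {
        --p;
        total += *p;
    }
    return total;
}

void check_traversals(void)
{
    int data[SPAN] = { 1, 2, 3, 4 };

    assert(sum_forward(data) == 10);
    assert(sum_backward(data) == 10);
}
```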

3.3.7 Bitwise shift operators

See §3.3.3.3 for a discussion of the arithmetic definition of these operators.

The description of shift operators in K&R suggests that shifting by a long count should force the left operand to be widened to long before being shifted. A more intuitive practice, endorsed by the Committee, is that the type of the shift count has no bearing on the type of the result.

QUIET CHANGE

Shifting by a long count no longer coerces the shifted operand to long.
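A small sketch of this quiet change, with hypothetical names: the result of the shift has the promoted type of the left operand, whatever the type of the count. On an implementation where int and long have the same size, the sizeof check holds but cannot distinguish the two rules.

```c
#include <assert.h>

void check_shift_result_type(void)
{
    int value = 1;
    long count = 3L;

    /* The long count does not widen the int left operand. */
    assert(sizeof(value << count) == sizeof(int));
    assert((value << count) == 8);
}
```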

The Committee has affirmed the freedom in implementation granted by the Base Document in not requiring the signed right shift operation to sign extend, since such a requirement might slow down fast code and since the usefulness of sign extended shifts is marginal. (Shifting a negative twos-complement integer arithmetically right one place is not the same as dividing by two!)
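The parenthetical remark can be checked with a short sketch. It uses div (see §3.3.5) for a quotient with well-defined sign semantics and compares it with a right shift of the same negative value. The sketch assumes a twos-complement implementation; the value of the shift is implementation-defined, so only the inequality is asserted. The function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>   /* div, div_t */

void check_div_versus_shift(void)
{
    div_t r = div(-7, 2);

    /* div truncates toward zero: -7 == 2 * (-3) + (-1). */
    assert(r.quot == -3);
    assert(r.rem == -1);

    /* Assumption: twos-complement. A sign-extending shift yields -4 and a
       zero-filling shift yields a large positive value; neither is -3. */
    assert((-7 >> 1) != r.quot);
}
```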

3.3.8 Relational operators

For an explanation of why the pointer comparison of the object pointer P with the pointer expression P+1 is always safe, see Rationale §3.3.6.

3.3.9 Equality operators

The Committee considered, on more than one occasion, permitting comparison of structures for equality. Such proposals foundered on the problem of holes in structures. A byte-wise comparison of two structures would require that the holes assuredly be set to zero so that all holes would compare equal, a difficult task for automatic or dynamically allocated variables. (The possibility of union-type elements in a structure raises insuperable problems with this approach.) Otherwise the implementation would have to be prepared to break a structure comparison into an arbitrary number of member comparisons; a seemingly simple expression could thus expand into a substantial stretch of code, which is contrary to the spirit of C.
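Since the language provides no structure comparison, a program that needs one writes the member comparisons itself. The following is a minimal sketch with a hypothetical structure and function names; the two objects are deliberately filled with different byte patterns first, so that any holes may differ while the members agree.

```c
#include <assert.h>
#include <string.h>   /* memset */

struct record {       /* hypothetical; a hole may follow tag */
    char tag;
    long value;
};

/* Member-by-member equality: holes never take part in the comparison. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag && a->value == b->value;
}

void check_record_equal(void)
{
    struct record a, b;

    memset(&a, 0x00, sizeof a);   /* any holes start out as 0x00 */
    memset(&b, 0xFF, sizeof b);   /* any holes start out as 0xFF */
    a.tag = b.tag = 'x';
    a.value = b.value = 42L;

    assert(record_equal(&a, &b));
    b.value = 43L;
    assert(!record_equal(&a, &b));
}
```

A byte-wise comparison with memcmp offers no such assurance here, since the contents of the holes are not under the program's control.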

In pointer comparisons, one of the operands may be of type void *. In particular, this allows NULL, which can be defined as (void *)0, to be compared to any object pointer.

3.3.10 Bitwise AND operator

See §3.3.3.3 for a discussion of the arithmetic definition of the bitwise operators.

3.3.11 Bitwise exclusive OR operator

See §3.3.3.3.

3.3.12 Bitwise inclusive OR operator

See §3.3.3.3.

3.3.13 Logical AND operator

3.3.14 Logical OR operator

3.3.15 Conditional operator

The syntactic restrictions on the middle operand of the conditional operator have been relaxed to include more than just logical-OR-expression: several extant implementations have adopted this practice.

The type of a conditional operator expression can be void, a structure, or a union; most other operators do not deal with such types. The rules for balancing type between pointer and integer have, however, been tightened, since now only the constant 0 can portably be coerced to pointer.

The Standard allows one of the second or third operands to be of type void *, if the other is a pointer type. Since the result of such a conditional expression is void *, an appropriate cast must be used.
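A minimal sketch of the void * case, with hypothetical names: one operand is an int *, the other a void *, so the conditional expression has type void * and is cast before being dereferenced.

```c
#include <assert.h>

int pick_value(int use_typed, int *typed, void *untyped)
{
    /* The conditional has type void *; cast it before dereferencing. */
    return *(int *)(use_typed ? typed : untyped);
}

void check_pick_value(void)
{
    int a = 1, b = 2;

    assert(pick_value(1, &a, &b) == 1);
    assert(pick_value(0, &a, &b) == 2);
}
```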

3.3.16 Assignment operators

Certain syntactic forms of assignment operators have been discontinued, and others tightened up (see §3.1.5).

The storage assignment need not take place until the next sequence point. (A restriction in earlier drafts that the storage take place before the value of the expression is used has been removed.) As a consequence, a straightforward syntactic test for ambiguous expressions can be stated. Some definitions: A side effect is a storage to any data object, or a read of a volatile object. An ambiguous expression is one whose value depends upon the order in which side effects are evaluated. A pure function is one with no side effects; an impure function is any other. A sequenced expression is one whose major operator defines a sequence point: comma, &&, ||, or conditional operator; an unsequenced expression is any other. We can then say that an unsequenced expression is ambiguous if more than one operand invokes any impure function, or if more than one operand contains an lvalue referencing the same object and one or more operands specify a side-effect to that object. Further, any expression containing an ambiguous expression is ambiguous.
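The two tests for ambiguity can be illustrated by showing an ambiguous form in a comment and an unambiguous rewriting in code. This is only a sketch; counter, next_id, and the other names are hypothetical.

```c
#include <assert.h>

static int counter;                       /* hypothetical shared state */

int next_id(void) { return ++counter; }   /* impure: stores to counter */

/* Ambiguous:  return next_id() - next_id();
   Both operands of the unsequenced '-' invoke an impure function. */
int ordered_difference(void)
{
    int first, second;

    first = next_id();
    second = next_id();
    return first - second;
}

/* Ambiguous:  a[*index] = (*index)++;
   Both operands reference *index and one of them stores to it. */
void store_then_advance(int *a, int *index)
{
    a[*index] = *index;
    ++*index;
}

void check_unambiguous_forms(void)
{
    int a[4] = { 0, 0, 0, 0 };
    int i = 2;

    counter = 0;
    assert(ordered_difference() == -1);   /* 1 - 2 */
    store_then_advance(a, &i);
    assert(a[2] == 2 && i == 3);
}
```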

The optimization rules for factoring out assignments can also be stated. Let X(i,S) be an expression which contains no impure functions or sequenced operators, and suppose that X contains a storage S(i) to i. The storage expressions, and related expressions, are

S(i):      Sval(i):   Snew(i):
++i        i+1        i+1
i++        i          i+1
--i        i-1        i-1
i--        i          i-1
i = y      y          y
i op= y    i op y     i op y

Then X(i,S) can be replaced by either

(T = i, i = Snew(i), X(T,Sval)) or

(T = X(i,Sval), i = Snew(i), T)

provided that neither i nor y have side effects themselves.
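As a worked instance of these rules, take X(i,S) to be a[i++] * 2, so that S(i) is i++, Sval(i) is i, and Snew(i) is i+1. The sketch below writes out the original expression and both replacements; the array contents and function names are assumptions chosen for the illustration.

```c
#include <assert.h>

static const int a[4] = { 10, 20, 30, 40 };   /* hypothetical data */

/* X(i,S) as written, with S(i) = i++. */
int as_written(int *ip)
{
    int i = *ip, r;

    r = a[i++] * 2;
    *ip = i;
    return r;
}

/* (T = i, i = Snew(i), X(T,Sval)) */
int store_first(int *ip)
{
    int i = *ip, T, r;

    r = (T = i, i = i + 1, a[T] * 2);
    *ip = i;
    return r;
}

/* (T = X(i,Sval), i = Snew(i), T) */
int store_last(int *ip)
{
    int i = *ip, T, r;

    r = (T = a[i] * 2, i = i + 1, T);
    *ip = i;
    return r;
}

void check_factoring(void)
{
    int i1 = 1, i2 = 1, i3 = 1;

    assert(as_written(&i1) == 40 && i1 == 2);
    assert(store_first(&i2) == 40 && i2 == 2);
    assert(store_last(&i3) == 40 && i3 == 2);
}
```

Note that T holds a value of i in the first replacement and a value of X in the second, so its type differs between the two forms in general.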

3.3.16.1 Simple assignment

Structure assignment has been added: its use was foreshadowed even in K&R, and many existing implementations already support it.

The rules for type compatibility in assignment also apply to argument compatibility between actual argument expressions and their corresponding argument types in a function prototype.

An implementation need not correctly perform an assignment between overlapping operands. Overlapping operands occur most naturally in a union, where assigning one field to another is often desirable to effect a type conversion in place; the assignment may well work properly in all simple cases, but it is not maximally portable. Maximally portable code should use a temporary variable as an intermediate in such an assignment.
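A minimal sketch of the maximally portable form, using a hypothetical union and function name: the value is read into a temporary, converted there, and only then stored into the overlapping member.

```c
#include <assert.h>

union number {        /* hypothetical union */
    int i;
    double d;
};

/* Convert the int member to double in place, via a temporary.
   The direct form  u->d = u->i;  assigns between overlapping operands. */
void int_to_double_in_place(union number *u)
{
    double temp = u->i;   /* read and convert outside the union */

    u->d = temp;          /* store no longer overlaps the value read */
}

void check_in_place_conversion(void)
{
    union number u;

    u.i = 42;
    int_to_double_in_place(&u);
    assert(u.d == 42.0);
}
```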

3.3.16.2 Compound assignment

The importance of requiring that the left operand lvalue be evaluated only once is not a question of efficiency, although that is one compelling reason for using the compound assignment operators. Rather, it is to assure that any side effects of evaluating the left operand are predictable.
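The following sketch shows why single evaluation matters when the left operand has a side effect. The names calls, slots, and next_slot are hypothetical.

```c
#include <assert.h>

static int calls;        /* counts evaluations of the left operand */
static int slots[3];

int *next_slot(void)
{
    return &slots[calls++];   /* side effect: advances calls */
}

void check_single_evaluation(void)
{
    calls = 0;
    slots[0] = slots[1] = slots[2] = 0;

    *next_slot() += 5;        /* next_slot() is called exactly once */

    assert(calls == 1);
    assert(slots[0] == 5 && slots[1] == 0);
}
```

Written out as *next_slot() = *next_slot() + 5, the function would be called twice, in an unspecified order, and the store and the fetch would refer to different elements.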

3.3.17 Comma operator

The left operand of a comma operator may be void, since only the right-hand operand is relevant to the type of the expression.

The example in the Standard clarifies that commas separating arguments “bind” tighter than the comma operator in expressions.
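A sketch modeled on that example: the call below has three arguments, and the parenthesized second argument is a comma expression whose value is 5. The function name second_of_three is hypothetical.

```c
#include <assert.h>

int second_of_three(int a, int b, int c)
{
    (void)a;
    (void)c;
    return b;
}

void check_comma_in_argument(void)
{
    int t;
    int a = 1, c = 9;

    /* Without the parentheses this would be a call with four arguments. */
    assert(second_of_three(a, (t = 3, t + 2), c) == 5);
    assert(t == 3);
}
```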

3.4 Constant Expressions

To clarify existing practice, several varieties of constant expression have been identified:

The expression following #if (§3.8.1) must expand to integer constants, character constants, the special operator defined, and operators with no side effects.

No environmental inquiries can be made, since all arithmetic is done as translate-time (signed or unsigned) long integers, and casts are disallowed. The restriction to translate-time arithmetic frees an implementation from having to perform execution-environment arithmetic in the host environment. It does not preclude an implementation from doing so: the implementation may simply define “translate-time arithmetic” to be that of the target.

Unsigned arithmetic is performed in these expressions (according to the default widening rules) when unsigned operands are involved; this rule allows for unsurprising arithmetic involving very large constants (i.e., those whose type is unsigned long because they cannot be represented as long) or constants explicitly marked as unsigned.
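The effect of the unsigned rule in #if expressions can be seen in a short sketch. The macro and function names are hypothetical. Because 0u is unsigned, the comparison is carried out in unsigned arithmetic, where -1 becomes the largest unsigned value and is therefore not less than zero.

```c
#include <assert.h>

/* The presence of the unsigned operand 0u makes the comparison unsigned. */
#if -1 < 0u
#define MINUS_ONE_BELOW_ZERO_U 1
#else
#define MINUS_ONE_BELOW_ZERO_U 0
#endif

/* With only signed operands the ordinary result is obtained. */
#if -1 < 0
#define MINUS_ONE_BELOW_ZERO 1
#else
#define MINUS_ONE_BELOW_ZERO 0
#endif

void check_if_arithmetic(void)
{
    assert(MINUS_ONE_BELOW_ZERO_U == 0);
    assert(MINUS_ONE_BELOW_ZERO == 1);
}
```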

Character constants, when evaluated in #if expressions, may be interpreted in the source character set, the execution character set, or some other implementation-defined character set. This latitude reflects the diversity of existing practice, especially in cross-compilers.

An integral constant expression must involve only numbers knowable at translate time, and operators with no side effects. Casts and the sizeof operator may be used to interrogate the execution environment.
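By contrast with #if expressions, an integral constant expression may use casts and sizeof. The sketch below uses them in an enumeration constant and an array bound, two contexts that require an integral constant expression; the names are hypothetical.

```c
#include <assert.h>

/* A cast and sizeof interrogate the execution environment at translate time. */
enum { LONG_BYTES = (int)sizeof(long) };

/* An array bound must be an integral constant expression. */
static char scratch[sizeof(long) * 2];

void check_integral_constants(void)
{
    assert(LONG_BYTES == (int)sizeof(long));
    assert(sizeof scratch == 2 * sizeof(long));
}
```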

Static initializers include integral constant expressions, along with floating constants and simple addressing expressions. An implementation must accept arbitrary expressions involving floating and integral numbers and side-effect-free operators in arithmetic initializers, but it is at liberty to turn such initializers into executable code which is invoked prior to program startup (see §2.1.2.2); this scheme might impose some requirements on linkers or runtime library code in some implementations.

The translation environment must not produce a less accurate value for a floating-point initializer than the execution environment, but it is at liberty to do better. Thus a static initializer may well be slightly different than the same expression computed at execution time. However, while implementations are certainly permitted to produce exactly the same result in translation and execution environments, requiring this was deemed to be an intolerable burden on many cross-compilers.

QUIET CHANGE

A program that uses #if expressions to determine properties of the execution environment may now get different answers.

3.5 Declarations

The Committee decided that empty declarations are invalid (except for a special case with tags, see §3.5.2.3, and the case of enumerations such as enum {zero,one};, see §3.5.2.2). While many seemingly silly constructs are tolerated in other parts of the language in the interest of facilitating the machine generation of C, empty declarations were considered sufficiently easy to avoid.

The practice of placing the storage class specifier other than first in a declaration has been branded as obsolescent. (See §3.9.3.) The Committee feels it desirable to rule out such constructs as

enum { aaa, aab, /* etc */

zzy, zzz } typedef a2z;

in some future standard.

3.5.1 Storage-class specifiers

Because the address of a register variable cannot be taken, objects of storage class register effectively exist in a space distinct from other objects. (Functions occupy yet a third address space.) This makes them candidates for optimal placement, the usual reason for declaring registers, but it also makes them candidates for more aggressive optimization.

The practice of representing register variables as wider types (as when register char is quietly changed to register int) is no longer acceptable.

3.5.2 Type specifiers

Several new type specifiers have been added: signed, enum, and void. long float has been retired and long double has been added, along with a plethora of integer types. The Committee’s reasons for each of these additions, and the one deletion, are given in §3.1.2.5 of this document.

3.5.2.1 Structure and union specifiers

Three types of bit fields are now defined: “plain” int calls for implementation-defined signedness (as in the Base Document), signed int calls for assuredly signed fields, and unsigned int calls for unsigned fields. The old constraints on bit fields crossing word boundaries have been relaxed, since so many properties of bit fields are implementation dependent anyway.
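The three kinds of bit field can be declared side by side, as in this sketch with a hypothetical structure. Only values representable under either signedness are stored in the plain int field, since its signedness is implementation-defined.

```c
#include <assert.h>

struct flags {                /* hypothetical structure */
    int plain : 3;            /* signedness is implementation-defined */
    signed int sfield : 3;    /* assuredly signed */
    unsigned int ufield : 3;  /* unsigned: holds 0 through 7 */
};

void check_bit_fields(void)
{
    struct flags f;

    f.plain = 1;              /* representable whether signed or not */
    f.sfield = -1;
    f.ufield = 7;

    assert(f.plain == 1);
    assert(f.sfield == -1);
    assert(f.ufield == 7);
}
```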

The layout of structures is determined only to a limited extent:

• no hole may occur at the beginning;

• members occupy increasing storage addresses; and

• if necessary, a hole is placed on the end to make the structure big enough to pack tightly into arrays and maintain proper alignment.

Since some existing implementations, in the interest of enhanced access time, leave internal holes larger than absolutely necessary, it is not clear that a portable deterministic method can be given for traversing a structure field by field.
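The layout guarantees listed above can be checked with offsetof from <stddef.h>. This is a sketch with a hypothetical structure; it asserts only what is guaranteed and makes no claim about the size of any internal holes.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

struct sample {       /* hypothetical structure */
    char c;
    long l;
    char d;
};

void check_layout(void)
{
    struct sample pair[2];

    /* No hole at the beginning. */
    assert(offsetof(struct sample, c) == 0);

    /* Members occupy increasing storage addresses. */
    assert(offsetof(struct sample, c) < offsetof(struct sample, l));
    assert(offsetof(struct sample, l) < offsetof(struct sample, d));

    /* Any trailing hole is part of the structure, so arrays pack tightly. */
    assert(sizeof pair == 2 * sizeof(struct sample));
    assert((char *)&pair[1] - (char *)&pair[0] == (ptrdiff_t)sizeof(struct sample));
}
```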

To clarify what is meant by the notion that “all the fields of a union occupy the same storage,” the Standard specifies that a pointer to a union, when suitably cast, points to each member (or, in the case of a bit-field member, to the storage unit containing the bit field).
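A sketch of this guarantee, using a hypothetical union: the address of the union, suitably cast, compares equal to the address of each member, and a store through the cast pointer is visible through the member.

```c
#include <assert.h>

union value {         /* hypothetical union */
    int i;
    double d;
    char text[8];
};

void check_union_addresses(void)
{
    union value v;

    assert((void *)&v == (void *)&v.i);
    assert((void *)&v == (void *)&v.d);
    assert((void *)&v == (void *)v.text);

    *(int *)&v = 7;   /* pointer to the union, cast to pointer to member */
    assert(v.i == 7);
}
```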

3.5.2.2 Enumeration specifiers

3.5.2.3 Tags

As with all block structured languages that also permit forward references, C has a problem with structure and union tags. If one wants to declare, within a block, two mutually referencing structures, one must write something like:

struct x { struct y *p; /*...*/ };

struct y { struct x *q; /*...*/ };

But if struct y is already defined in a containing block, the first field of struct x will refer to the older declaration.

Thus special semantics has been given to the form:

struct y;

It now hides the outer declaration of y, and “opens” a new instance in the current block.
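The effect of the struct y; form can be sketched as follows. The member names are hypothetical. The inner struct y has a member id that the outer one lacks, so the expression a.p->id compiles only because p refers to the inner declaration.

```c
#include <assert.h>

struct y { int outer_only; };              /* outer declaration of tag y */

int check_tag_hiding(void)
{
    struct y;                              /* hides the outer y in this block */
    struct x { struct y *p; int id; };     /* p refers to the inner y */
    struct y { struct x *q; int id; };     /* completes the inner y */
    struct x a;
    struct y b;

    a.p = &b;
    a.id = 1;
    b.q = &a;
    b.id = 2;

    /* a.p->q is &a and a.p is &b, so this is 1 * 10 + 2. */
    return a.p->q->id * 10 + a.p->id;
}
```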

QUIET CHANGE

The empty declaration struct x; is no longer innocuous.

3.5.3 Type qualifiers

The Committee has added to C two type qualifiers: const and volatile. Individually and in combination they specify the assumptions a compiler can and must make when accessing an object through an lvalue.

The syntax and semantics of const were adapted from C++; the concept itself has appeared in other languages. volatile is an invention of the Committee; it follows the syntactic model of const.

Type qualifiers were introduced in part to provide greater control over optimization. Several important optimization techniques are based on the principle of “cacheing”: under certain circumstances the compiler can remember the last value accessed (read or written) from a location, and use this retained value the next time that location is read. (The memory, or “cache”, is typically a hardware register.) If this memory is a machine register, for instance, the code can be smaller and faster using the register rather than accessing external memory.

The basic qualifiers can be characterized by the restrictions they impose on access and cacheing:

const No writes through this lvalue. In the absence of this qualifier, writes may occur through this lvalue.

volatile No cacheing through this lvalue: each operation in the abstract semantics must be performed. (That is, no cacheing assumptions may be made, since the location is not guaranteed to contain any previous value.) In the absence of this qualifier, the contents of the designated location may be assumed to be unchanged (except for possible aliasing.)

A translator design with no cacheing optimizations can effectively ignore the type qualifiers, except insofar as they affect assignment compatibility.

It would have been possible, of course, to specify a nonconst keyword instead of const, or nonvolatile instead of volatile. The senses of these concepts in the Standard were chosen to assure that the default, unqualified, case was the most common, and that it corresponded most clearly to traditional practice in the use of lvalue expressions.

Four combinations of the two qualifiers are possible; each defines a useful set of lvalue properties. The next several paragraphs describe typical uses of these qualifiers.

The translator may assume, for an unqualified lvalue, that it may read or write the referenced object, that the value of this object cannot be changed except by explicitly programmed actions in the current thread of control, but that other lvalue expressions could reference the same object.

const is specified in such a way that an implementation is at liberty to put const objects in read-only storage, and is encouraged to diagnose obvious attempts to modify them, but is not required to track down all the subtle ways that such checking can be subverted. If a function parameter is declared const, then the referenced object is not changed (through that lvalue) in the body of the function; the parameter is read-only.

A static volatile object is an appropriate model for a memory-mapped I/O register. Implementors of C translators should take into account relevant hardware details on the target systems when implementing accesses to volatile objects. For instance, the hardware logic of a system may require that a two-byte memory-mapped register not be accessed with byte operations; a compiler for such a system would have to assure that no such instructions were generated, even if the source code only accesses one byte of the register. Whether read-modify-write instructions can be used on such device registers must also be considered. Whatever decisions are adopted on such issues must be documented, as volatile access is implementation-defined. A volatile object is an appropriate model for a variable shared among multiple processes.

A static const volatile object appropriately models a memory-mapped input port, such as a real-time clock. Similarly, a const volatile object models a variable which can be altered by another process but not by this one.
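The two models can be sketched with ordinary objects standing in for hardware. This is only a simulation under stated assumptions: on a real target the registers would live at fixed addresses supplied by the implementation, and the names control_reg, clock_reg, start_device, and read_clock_twice are hypothetical.

```c
#include <assert.h>

/* Assumption: plain static objects stand in for device registers. */
static volatile unsigned int control_reg;          /* read/write register */
static const volatile unsigned int clock_reg = 0;  /* input port: the program
                                                      may read but not write */

void start_device(void)
{
    /* Two stores in the source are two stores in the abstract semantics;
       the first may not be discarded as redundant. */
    control_reg = 1u;
    control_reg = 1u;
}

unsigned int read_clock_twice(void)
{
    /* Each read must be performed; the second may not reuse a cached copy. */
    unsigned int first = clock_reg;
    unsigned int second = clock_reg;

    return first + second;
}

void check_register_model(void)
{
    start_device();
    assert(control_reg == 1u);
    assert(read_clock_twice() == 0u);   /* nothing alters the stand-in clock */
}
```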

Although the type qualifiers are formally treated as defining new types, they actually serve as modifiers of declarators. Thus the declarations

const struct s {int a,b;} x;

struct s y;

declare x as a const object, but not y. The const property can be associated with the aggregate type by means of a type definition:

typedef const struct s {int a,b;} stype;

stype x;

stype y;

In these declarations the const property is associated with the declarator stype, so x and y are both const objects.

The Committee considered making const and volatile storage classes, but this would have ruled out any number of desirable constructs, such as const members of structures and variable pointers to const types.
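Two of the constructs mentioned, a const member of a structure and a variable pointer to a const type, can be sketched as follows. The structure and names are hypothetical, and the assignments that an implementation is expected to diagnose are shown only in comments.

```c
#include <assert.h>

struct limits {           /* hypothetical structure */
    const int max;        /* const member: set by initialization only */
    int current;
};

void check_const_constructs(void)
{
    struct limits lim = { 10, 0 };
    const char text[] = "ab";
    const char *p = text;     /* variable pointer to const char */

    lim.current = 3;          /* permitted: current is not const */
    /* lim.max = 11;             would be diagnosed */

    p++;                      /* permitted: the pointer itself may change */
    /* *p = 'x';                 would be diagnosed */

    assert(lim.max == 10 && lim.current == 3);
    assert(*p == 'b');
}
```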

A cast of a value to a qualified type has no effect; the qualification (volatile, say) can have no effect on the access since it has occurred prior to the cast. If it is
