[tz] Minor (unimportant really) technical UB bug in strftime() ?
Clive D.W. Feather
clive at davros.org
Wed Nov 9 07:02:18 UTC 2022
Paul Eggert said:
> > You could replace the assignment by a memcpy. Assignment via unsigned
> > chars (which is what memcpy does) are exempt from the undefined behaviour.
>
> As I understand it, that exemption is for using memcpy to copy a trap
> representation. There's no similar exemption for using memcpy to copy
> uninitialized data; for example, the following function has undefined
> behavior:
>
> int y;
> void f(void) { int x; memcpy(&y, &x, sizeof y); }
>
> If I'm right, we can't get by simply by replacing the assignment with
> memcpy.
You're wrong, pure and simple. There's no such thing as uninitialized data
in that sense.
The following quotes are from a late draft because that's all I have to
hand right this second, but the final C99 wording was either the same or
effectively so.
[6.2.6.1]
    Values stored in unsigned bit-fields and objects of
    type unsigned char shall be represented using a pure binary
    notation.
Elsewhere we say that unsigned char doesn't have any padding bits, so it
holds values in the range 0 to (1<<CHAR_BIT) - 1 inclusive.
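
As a small sketch of that range claim (the function name uchar_uses_all_bits is made up for illustration), the check below tests that the top bit of UCHAR_MAX is bit CHAR_BIT - 1, which is another way of saying that UCHAR_MAX equals (1<<CHAR_BIT) - 1:

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if UCHAR_MAX occupies exactly CHAR_BIT
   bits, i.e. unsigned char has no padding bits and its maximum value
   is (1 << CHAR_BIT) - 1. */
int uchar_uses_all_bits(void)
{
    /* Shifting right by CHAR_BIT - 1 leaves only the highest bit.
       This avoids computing 1 << CHAR_BIT, which could overflow on
       an implementation where CHAR_BIT equals the width of int. */
    return (UCHAR_MAX >> (CHAR_BIT - 1)) == 1;
}
```

A conforming implementation should always return 1 here; the function only spells the rule out.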
    Values stored in non-bit-field objects of any other
    object type consist of n x CHAR_BIT bits, where n is the size
    of an object of that type, in bytes. The value may be
    copied into an object of type unsigned char [n] (e.g., by
    memcpy); the resulting set of bytes is called the object
    representation of the value.
I've omitted the bit about bit-fields.
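
To make "object representation" concrete, here is a minimal sketch (the helper names int_to_bytes and bytes_to_int are assumptions, not anything from the standard) that copies an int into an unsigned char [n] array and back again:

```c
#include <string.h>

/* Hypothetical helper: store the object representation of *src
   in bytes[0 .. sizeof(int) - 1]. */
void int_to_bytes(const int *src, unsigned char bytes[sizeof(int)])
{
    memcpy(bytes, src, sizeof *src);
}

/* Hypothetical helper: rebuild an int from a previously saved
   object representation. */
void bytes_to_int(const unsigned char bytes[sizeof(int)], int *dst)
{
    memcpy(dst, bytes, sizeof *dst);
}
```

Round-tripping a value through the byte array gives back the same value, because the bytes are exactly the object representation that was copied out.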
    Certain object representations need not represent a
    value of the object type. If the stored value of an object
    has such a representation and is read by an lvalue
    expression that does not have character type, the behavior
    is undefined. If such a representation is produced by a
    side effect that modifies all or any part of the object by
    an lvalue expression that does not have character type, the
    behavior is undefined. Such a representation is called a
    trap representation.
So, for any type T other than character types, some byte sequences can be
trap representations. Reading or writing a trap representation using type T
is undefined behaviour. But reading or writing it using a character type
isn't, though in the case of writing the result could be a trap
representation. That means that memcpy always has defined behaviour.
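
A rough sketch of why this works: a hand-written stand-in for memcpy (the name copy_bytes is hypothetical) touches both objects only through unsigned char lvalues, so no lvalue of non-character type ever reads or writes a possible trap representation.

```c
#include <stddef.h>

/* Hypothetical stand-in for memcpy: copies n bytes from src to dst.
   Every read and every write goes through an lvalue of type
   unsigned char, which is the case the trap-representation wording
   exempts. The regions are assumed not to overlap. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
}
```

For instance, copy_bytes(&b, &a, sizeof a) with two double objects a and b leaves b holding the same object representation as a, whatever that representation is.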
[3.17.2]
    indeterminate value
    either an unspecified value or a trap representation
[6.7.8]
    If an object that has automatic storage duration is
    not initialized explicitly, its value is indeterminate.
So, given
    int x;
    int y = x;
within a block, x holds either some unspecified (valid) value or a trap
representation. If it's an unspecified value, y is set to the same value.
If it's a trap representation, you hit undefined behaviour.
In particular, if int has N bits and can hold 1<<N different values, then
all possible object representations are valid and therefore x can't hold a
trap representation, so the assignment is safe.
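
As an illustration only, the condition "N bits and 1<<N different values" can be approximated from <limits.h>. The function names below are made up, and the check assumes the macros describe the implementation accurately:

```c
#include <limits.h>

/* Hypothetical helper: number of bits needed to represent max. */
static int bits_in(unsigned int max)
{
    int n = 0;

    while (max != 0) {
        n++;
        max >>= 1;
    }
    return n;
}

/* Hypothetical check: returns 1 if int has no padding bits and its
   range covers every bit pattern, so int cannot hold a trap
   representation. */
int int_has_no_trap_representations(void)
{
    int width = (int)(sizeof(int) * CHAR_BIT);

    /* INT_MAX must use width - 1 value bits, leaving one sign bit
       and no padding bits. */
    int no_padding = bits_in((unsigned int)INT_MAX) == width - 1;

    /* The range must include the extra negative value, so that all
       1 << width patterns correspond to distinct values. */
    int full_range = (INT_MIN == -INT_MAX - 1);

    return no_padding && full_range;
}
```

On common two's complement platforms with no padding bits this returns 1, which is the case where the plain assignment is safe as well.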
We very explicitly wanted memcpy to be safe with uninitialized values.
That's why it's worded this way.
--
Clive D.W. Feather | If you lie to the compiler,
Email: clive at davros.org | it will get its revenge.
Web: http://www.davros.org | - Henry Spencer
Mobile: +44 7973 377646