mirror of https://github.com/python/cpython
Document the new extensions.
This commit is contained in:
parent
3aa27fd315
commit
125434665b
@@ -45,23 +45,81 @@ and Python values should be obvious given their types:
\lineiii{x}{pad byte}{no value}
\lineiii{c}{char}{string of length 1}
\lineiii{b}{signed char}{integer}
\lineiii{B}{unsigned char}{integer}
\lineiii{h}{short}{integer}
\lineiii{H}{unsigned short}{integer}
\lineiii{i}{int}{integer}
\lineiii{I}{unsigned int}{integer}
\lineiii{l}{long}{integer}
\lineiii{L}{unsigned long}{integer}
\lineiii{f}{float}{float}
\lineiii{d}{double}{float}
\lineiii{s}{char[]}{string}
\end{tableiii}

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.
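
A minimal sketch of the repeat-count rule, assuming a current Python 3
interpreter (the variable names are illustrative only): the two
spellings should give identical packed data and the same size.

```python
import struct

# '4h' is shorthand for 'hhhh': four shorts in a row.
same_bytes = struct.pack('4h', 1, 2, 3, 4) == struct.pack('hhhh', 1, 2, 3, 4)
same_size = struct.calcsize('4h') == struct.calcsize('hhhh')
```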

C numbers are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary
(according to the rules used by the C compiler).
For the \code{'s'} format character, the count is interpreted as the
size of the string, not a repeat count as for the other format
characters; e.g.\ \code{'10s'} means a single 10-byte string, while
\code{'10c'} means 10 characters. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit.
For unpacking, the resulting string always has exactly the specified
number of bytes. As a special case, \code{'0s'} means a single, empty
string (while \code{'0c'} means 0 characters).
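
The difference between \code{'s'} and \code{'c'} can be sketched as
follows. This assumes a current Python 3 interpreter, where packed
data and string fields are bytes objects; the variable names are
illustrative only.

```python
import struct

# Packing: the value is padded with null bytes or truncated to the given size.
padded = struct.pack('10s', b'abc')
truncated = struct.pack('3s', b'abcdef')

# Unpacking: '10s' yields one 10-byte string, '10c' yields ten 1-byte values.
one_string = struct.unpack('10s', b'0123456789')
ten_chars = struct.unpack('10c', b'0123456789')

# Special case: '0s' yields a single empty string, '0c' yields no values.
empty_string = struct.unpack('0s', b'')
no_chars = struct.unpack('0c', b'')
```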

Examples (all on a big-endian machine):
For the \code{'I'} and \code{'L'} format characters, the return
value is a Python long integer if a Python plain integer can't
represent the required range (note: this depends only on the size of
the relevant C types, not on the sign of the actual value).
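
A hedged sketch of why the unsigned codes need the wider range,
assuming a current Python 3 interpreter (which has a single integer
type, so the plain/long distinction is not visible there) and using
the \code{'='} prefix described further on to fix the size at 4 bytes:

```python
import struct

raw = b'\xff\xff\xff\xff'

# The same four bytes read as unsigned need a range beyond a signed 32-bit int.
(as_unsigned,) = struct.unpack('=I', raw)
(as_signed,) = struct.unpack('=i', raw)
```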

By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if
necessary (according to the rules used by the C compiler).

Alternatively, the first character of the format string can be used to
indicate the byte order, size and alignment of the packed data,
according to the following table:

\begin{tableiii}{|c|l|l|}{samp}{Character}{Byte order}{Size and alignment}
\lineiii{@}{native}{native}
\lineiii{=}{native}{standard}
\lineiii{<}{little-endian}{standard}
\lineiii{>}{big-endian}{standard}
\lineiii{!}{network (= big-endian)}{standard}
\end{tableiii}

If the first character is not one of these, \code{'@'} is assumed.
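
The table can be exercised with a short sketch, assuming a current
Python 3 interpreter (the variable names are illustrative only):

```python
import struct

little = struct.pack('<h', 1)    # least significant byte first
big = struct.pack('>h', 1)       # most significant byte first
network = struct.pack('!h', 1)   # network order, same as big-endian

# With no prefix character, '@' (native order, size and alignment) is assumed.
default = struct.pack('h', 1)
explicit_native = struct.pack('@h', 1)
```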

Native byte order is big-endian or little-endian, depending on the
host system (e.g.\ Motorola and Sun are big-endian; Intel and DEC are
little-endian).
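
One way to check which case applies on a given host is sketched
below. It assumes a current Python 3 interpreter and uses
\code{sys.byteorder}, which is not part of this module.

```python
import struct
import sys

# Pack with native byte order but standard size, so the result is 2 bytes.
native = struct.pack('=h', 1)

# Build the expected bytes from what the interpreter reports about the host.
if sys.byteorder == 'little':
    expected = struct.pack('<h', 1)
else:
    expected = struct.pack('>h', 1)
```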

Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required
for any type (so you have to insert pad bytes yourself); short is
2 bytes; int and long are 4 bytes. In this mode, there is no support
for float and double (\code{'f'} and \code{'d'}).
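
The standard sizes and the absence of implicit padding can be
sketched as follows, assuming a current Python 3 interpreter (the
variable names are illustrative only; the native result depends on
the platform and compiler):

```python
import struct

# Standard sizes quoted in the text: short is 2 bytes, int and long are 4.
standard_sizes = {code: struct.calcsize('=' + code) for code in 'hil'}

# Standard mode adds no padding, so a char followed by a long is 1 + 4 bytes.
unaligned_size = struct.calcsize('=bl')

# Alignment has to be requested explicitly with 'x' pad bytes.
hand_padded_size = struct.calcsize('=b3xl')

# Native mode may insert padding before the long.
native_size = struct.calcsize('@bl')
```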

Note the difference between \code{'@'} and \code{'='}: both use native
byte order, but the size and alignment of the latter are standardized.

The form \code{'!'} is available for those poor souls who claim they
can't remember whether network byte order is big-endian or
little-endian.

There is no way to indicate non-native byte order (i.e.\ force
byte-swapping); use the appropriate choice of \code{'<'} or
\code{'>'}.

Examples (all using native byte order, size and alignment, on a
big-endian machine):

\bcode\begin{verbatim}
from struct import *
pack('hhl', 1, 2, 3) == '\000\001\000\002\000\000\000\003'
unpack('hhl', '\000\001\000\002\000\000\000\003') == (1, 2, 3)
calcsize('hhl') == 8

@@ -71,8 +129,5 @@ Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.

(More format characters are planned, e.g.\ \code{'s'} for character
arrays, upper case for unsigned variants, and a way to specify the
byte order, which is useful for [de]constructing network packets and
reading/writing portable binary file formats like TIFF and AIFF.)
(This only works when native size and alignment are in effect;
standard size and alignment do not enforce any alignment.)
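
The zero-count hint can be checked with a short sketch, assuming a
current Python 3 interpreter (the variable names are illustrative
only; the native sizes depend on the platform and compiler):

```python
import struct

# Native mode: the trailing '0l' pads the end up to the alignment of a long.
without_tail = struct.calcsize('@llh')
with_tail = struct.calcsize('@llh0l')

# Standard mode enforces no alignment, so the zero-count code adds nothing.
standard_without = struct.calcsize('=llh')
standard_with = struct.calcsize('=llh0l')
```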