mirror of https://github.com/python/cpython
gh-125063: marshal: Add version 5, improve documentation (GH-126829)
* Document that slices can be marshalled
* Deduplicate and organize the list of supported types in docs
* Organize the type code list in marshal.c, to make it more obvious that this is a versioned format
* Back-fill some historical info

Co-authored-by: Michael Droettboom <mdboom@gmail.com>
parent e17486982c
commit d00f7b1b9d
|
@ -13,11 +13,12 @@ binary mode.
|
||||||
|
|
||||||
Numeric values are stored with the least significant byte first.
|
Numeric values are stored with the least significant byte first.
|
||||||
|
|
||||||
The module supports two versions of the data format: version 0 is the
|
The module supports several versions of the data format; see
|
||||||
historical version, version 1 shares interned strings in the file, and upon
|
the :py:mod:`Python module documentation <marshal>` for details.
|
||||||
unmarshalling. Version 2 uses a binary format for floating-point numbers.
|
|
||||||
``Py_MARSHAL_VERSION`` indicates the current file format (currently 2).
|
|
||||||
|
|
||||||
|
.. c:macro:: Py_MARSHAL_VERSION
|
||||||
|
|
||||||
|
The current format version. See :py:data:`marshal.version`.
|
||||||
|
|
||||||
.. c:function:: void PyMarshal_WriteLongToFile(long value, FILE *file, int version)
|
.. c:function:: void PyMarshal_WriteLongToFile(long value, FILE *file, int version)
|
||||||
|
|
||||||
|
|
|
@ -38,23 +38,39 @@ supports a substantially wider range of objects than marshal.
|
||||||
maliciously constructed data. Never unmarshal data received from an
|
maliciously constructed data. Never unmarshal data received from an
|
||||||
untrusted or unauthenticated source.
|
untrusted or unauthenticated source.
|
||||||
|
|
||||||
|
There are functions that read/write files as well as functions operating on
|
||||||
|
bytes-like objects.
|
||||||
|
|
||||||
.. index:: object; code, code object
|
.. index:: object; code, code object
|
||||||
|
|
||||||
Not all Python object types are supported; in general, only objects whose value
|
Not all Python object types are supported; in general, only objects whose value
|
||||||
is independent from a particular invocation of Python can be written and read by
|
is independent from a particular invocation of Python can be written and read by
|
||||||
this module. The following types are supported: booleans, integers, floating-point
|
this module. The following types are supported:
|
||||||
numbers, complex numbers, strings, bytes, bytearrays, tuples, lists, sets,
|
|
||||||
frozensets, dictionaries, and code objects (if *allow_code* is true),
|
* Numeric types: :class:`int`, :class:`bool`, :class:`float`, :class:`complex`.
|
||||||
where it should be understood that
|
* Strings (:class:`str`) and :class:`bytes`.
|
||||||
tuples, lists, sets, frozensets and dictionaries are only supported as long as
|
:term:`Bytes-like objects <bytes-like object>` like :class:`bytearray` are
|
||||||
the values contained therein are themselves supported. The
|
marshalled as :class:`!bytes`.
|
||||||
singletons :const:`None`, :const:`Ellipsis` and :exc:`StopIteration` can also be
|
* Containers: :class:`tuple`, :class:`list`, :class:`set`, :class:`frozenset`,
|
||||||
marshalled and unmarshalled.
|
and (since :data:`version` 5), :class:`slice`.
|
||||||
For format *version* lower than 3, recursive lists, sets and dictionaries cannot
|
It should be understood that these are supported only if the values contained
|
||||||
be written (see below).
|
therein are themselves supported.
|
||||||
|
Recursive containers are supported since :data:`version` 3.
|
||||||
|
* The singletons :const:`None`, :const:`Ellipsis` and :exc:`StopIteration`.
|
||||||
|
* :class:`code` objects, if *allow_code* is true. See note above about
|
||||||
|
version dependence.
|
||||||
|
|
||||||
|
.. versionchanged:: 3.4
|
||||||
|
|
||||||
|
* Added format version 3, which supports marshalling recursive lists, sets
|
||||||
|
and dictionaries.
|
||||||
|
* Added format version 4, which supports efficient representations
|
||||||
|
of short strings.
|
||||||
|
|
||||||
|
.. versionchanged:: next
|
||||||
|
|
||||||
|
Added format version 5, which allows marshalling slices.
|
||||||
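As a rough illustration of the supported-types list and the version notes in this hunk, the sketch below round-trips a few values through `marshal.dumps` and `marshal.loads`. The helper names `roundtrip` and `dumps_or_none` are made up for this example. The final slice check assumes an interpreter that provides format version 5 (`marshal.version >= 5`); older interpreters are expected to reject slices with `ValueError`.

```python
import marshal

def roundtrip(obj, version=marshal.version):
    """Serialize obj with the given format version and load it back."""
    return marshal.loads(marshal.dumps(obj, version))

def dumps_or_none(obj, version):
    """Return the marshalled bytes, or None if obj cannot be marshalled."""
    try:
        return marshal.dumps(obj, version)
    except ValueError:
        return None

# Containers are supported as long as their contents are.
value = {"ints": [1, 2, 3], "pair": (1.5, 2j), "flags": frozenset({True, None})}
assert roundtrip(value) == value

# Bytes-like objects such as bytearray come back as bytes.
assert roundtrip(bytearray(b"abc")) == b"abc"

# Recursive containers need format version 3 or later.
nested = []
nested.append(nested)
loaded = roundtrip(nested, 3)
assert loaded[0] is loaded

# Slices are only written by format version 5 and later.
assert dumps_or_none(slice(1, 2, 3), 3) is None
if marshal.version >= 5:
    assert roundtrip(slice(1, 2, 3)) == slice(1, 2, 3)
```

Passing an explicit *version* to `marshal.dumps` is what lets the same interpreter produce output in an older format; the reader side takes no version argument.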
|
|
||||||
There are functions that read/write files as well as functions operating on
|
|
||||||
bytes-like objects.
|
|
||||||
|
|
||||||
The module defines these functions:
|
The module defines these functions:
|
||||||
|
|
||||||
|
@ -140,11 +156,24 @@ In addition, the following constants are defined:
|
||||||
|
|
||||||
.. data:: version
|
.. data:: version
|
||||||
|
|
||||||
Indicates the format that the module uses. Version 0 is the historical
|
Indicates the format that the module uses.
|
||||||
format, version 1 shares interned strings and version 2 uses a binary format
|
Version 0 is the historical first version; subsequent versions
|
||||||
for floating-point numbers.
|
add new features.
|
||||||
Version 3 adds support for object instancing and recursion.
|
Generally, a new version becomes the default when it is introduced.
|
||||||
The current version is 4.
|
|
||||||
|
======= =============== ====================================================
Version Available since New features
======= =============== ====================================================
1       Python 2.4      Sharing interned strings
------- --------------- ----------------------------------------------------
2       Python 2.5      Binary representation of floats
------- --------------- ----------------------------------------------------
3       Python 3.4      Support for object instancing and recursion
------- --------------- ----------------------------------------------------
4       Python 3.4      Efficient representation of short strings
------- --------------- ----------------------------------------------------
5       Python 3.14     Support for :class:`slice` objects
======= =============== ====================================================
|
||||||
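To make one row of the version table concrete, the following sketch compares how the same float is written under format versions 1 and 2. It uses only `marshal.dumps` with its optional *version* argument, plus `struct` to build the expected bytes; the variable names are illustrative.

```python
import marshal
import struct

x = 1.5

# Version 1 predates the binary float encoding: the value is written as
# type code 'f', a one-byte length, and a decimal string.
text_form = marshal.dumps(x, 1)
assert text_form[:1] == b"f"
assert float(text_form[2:].decode("ascii")) == x

# Version 2 writes type code 'g' followed by an 8-byte little-endian double.
binary_form = marshal.dumps(x, 2)
assert binary_form == b"g" + struct.pack("<d", x)

# Either form loads back to the same value; the reader accepts old formats.
assert marshal.loads(text_form) == marshal.loads(binary_form) == x

# The module-level default is exposed as marshal.version.
print("default format version:", marshal.version)
```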
|
|
||||||
|
|
||||||
.. rubric:: Footnotes
|
.. rubric:: Footnotes
|
||||||
|
@ -154,4 +183,3 @@ In addition, the following constants are defined:
|
||||||
around in a self-contained form. Strictly speaking, "to marshal" means to
|
around in a self-contained form. Strictly speaking, "to marshal" means to
|
||||||
convert some data from internal to external form (in an RPC buffer for instance)
|
convert some data from internal to external form (in an RPC buffer for instance)
|
||||||
and "unmarshalling" for the reverse process.
|
and "unmarshalling" for the reverse process.
|
||||||
|
|
||||||
|
|
|
@ -13,7 +13,7 @@ PyAPI_FUNC(PyObject *) PyMarshal_ReadObjectFromString(const char *,
|
||||||
Py_ssize_t);
|
Py_ssize_t);
|
||||||
PyAPI_FUNC(PyObject *) PyMarshal_WriteObjectToString(PyObject *, int);
|
PyAPI_FUNC(PyObject *) PyMarshal_WriteObjectToString(PyObject *, int);
|
||||||
|
|
||||||
#define Py_MARSHAL_VERSION 4
|
#define Py_MARSHAL_VERSION 5
|
||||||
|
|
||||||
PyAPI_FUNC(long) PyMarshal_ReadLongFromFile(FILE *);
|
PyAPI_FUNC(long) PyMarshal_ReadLongFromFile(FILE *);
|
||||||
PyAPI_FUNC(int) PyMarshal_ReadShortFromFile(FILE *);
|
PyAPI_FUNC(int) PyMarshal_ReadShortFromFile(FILE *);
|
||||||
|
|
|
@ -28,6 +28,13 @@ class HelperMixin:
|
||||||
finally:
|
finally:
|
||||||
os_helper.unlink(os_helper.TESTFN)
|
os_helper.unlink(os_helper.TESTFN)
|
||||||
|
|
||||||
|
def omit_last_byte(data):
|
||||||
|
"""return data[:-1]"""
|
||||||
|
# This file's code is used in CompatibilityTestCase,
|
||||||
|
# but slices need marshal version 5.
|
||||||
|
# Avoid the slice literal.
|
||||||
|
return data[slice(0, -1)]
|
||||||
|
|
||||||
class IntTestCase(unittest.TestCase, HelperMixin):
|
class IntTestCase(unittest.TestCase, HelperMixin):
|
||||||
def test_ints(self):
|
def test_ints(self):
|
||||||
# Test a range of Python ints larger than the machine word size.
|
# Test a range of Python ints larger than the machine word size.
|
||||||
|
@ -241,7 +248,8 @@ class BugsTestCase(unittest.TestCase):
|
||||||
def test_patch_873224(self):
|
def test_patch_873224(self):
|
||||||
self.assertRaises(Exception, marshal.loads, b'0')
|
self.assertRaises(Exception, marshal.loads, b'0')
|
||||||
self.assertRaises(Exception, marshal.loads, b'f')
|
self.assertRaises(Exception, marshal.loads, b'f')
|
||||||
self.assertRaises(Exception, marshal.loads, marshal.dumps(2**65)[:-1])
|
self.assertRaises(Exception, marshal.loads,
|
||||||
|
omit_last_byte(marshal.dumps(2**65)))
|
||||||
|
|
||||||
def test_version_argument(self):
|
def test_version_argument(self):
|
||||||
# Python 2.4.0 crashes for any call to marshal.dumps(x, y)
|
# Python 2.4.0 crashes for any call to marshal.dumps(x, y)
|
||||||
|
@ -594,6 +602,19 @@ class InterningTestCase(unittest.TestCase, HelperMixin):
|
||||||
s2 = sys.intern(s)
|
s2 = sys.intern(s)
|
||||||
self.assertNotEqual(id(s2), id(s))
|
self.assertNotEqual(id(s2), id(s))
|
||||||
|
|
||||||
|
class SliceTestCase(unittest.TestCase, HelperMixin):
|
||||||
|
def test_slice(self):
|
||||||
|
for obj in (
|
||||||
|
slice(None), slice(1), slice(1, 2), slice(1, 2, 3),
|
||||||
|
slice({'set'}, ('tuple', {'with': 'dict'}, ), self.helper.__code__)
|
||||||
|
):
|
||||||
|
with self.subTest(obj=str(obj)):
|
||||||
|
self.helper(obj)
|
||||||
|
|
||||||
|
for version in range(4):
|
||||||
|
with self.assertRaises(ValueError):
|
||||||
|
marshal.dumps(obj, version)
|
||||||
|
|
||||||
@support.cpython_only
|
@support.cpython_only
|
||||||
@unittest.skipUnless(_testcapi, 'requires _testcapi')
|
@unittest.skipUnless(_testcapi, 'requires _testcapi')
|
||||||
class CAPI_TestCase(unittest.TestCase, HelperMixin):
|
class CAPI_TestCase(unittest.TestCase, HelperMixin):
|
||||||
|
@ -654,7 +675,7 @@ class CAPI_TestCase(unittest.TestCase, HelperMixin):
|
||||||
self.assertEqual(r, obj)
|
self.assertEqual(r, obj)
|
||||||
|
|
||||||
with open(os_helper.TESTFN, 'wb') as f:
|
with open(os_helper.TESTFN, 'wb') as f:
|
||||||
f.write(data[:-1])
|
f.write(omit_last_byte(data))
|
||||||
with self.assertRaises(EOFError):
|
with self.assertRaises(EOFError):
|
||||||
_testcapi.pymarshal_read_last_object_from_file(os_helper.TESTFN)
|
_testcapi.pymarshal_read_last_object_from_file(os_helper.TESTFN)
|
||||||
os_helper.unlink(os_helper.TESTFN)
|
os_helper.unlink(os_helper.TESTFN)
|
||||||
|
@ -671,7 +692,7 @@ class CAPI_TestCase(unittest.TestCase, HelperMixin):
|
||||||
self.assertEqual(p, len(data))
|
self.assertEqual(p, len(data))
|
||||||
|
|
||||||
with open(os_helper.TESTFN, 'wb') as f:
|
with open(os_helper.TESTFN, 'wb') as f:
|
||||||
f.write(data[:-1])
|
f.write(omit_last_byte(data))
|
||||||
with self.assertRaises(EOFError):
|
with self.assertRaises(EOFError):
|
||||||
_testcapi.pymarshal_read_object_from_file(os_helper.TESTFN)
|
_testcapi.pymarshal_read_object_from_file(os_helper.TESTFN)
|
||||||
os_helper.unlink(os_helper.TESTFN)
|
os_helper.unlink(os_helper.TESTFN)
|
||||||
|
|
|
@ -0,0 +1,2 @@
|
||||||
|
:mod:`marshal` now supports :class:`slice` objects. The marshal format
|
||||||
|
version was increased to 5.
|
|
@ -121,6 +121,7 @@ compile_and_marshal(const char *name, const char *text)
|
||||||
return NULL;
|
return NULL;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
assert(Py_MARSHAL_VERSION >= 5);
|
||||||
PyObject *marshalled = PyMarshal_WriteObjectToString(code, Py_MARSHAL_VERSION);
|
PyObject *marshalled = PyMarshal_WriteObjectToString(code, Py_MARSHAL_VERSION);
|
||||||
Py_CLEAR(code);
|
Py_CLEAR(code);
|
||||||
if (marshalled == NULL) {
|
if (marshalled == NULL) {
|
||||||
|
|
|
@ -50,41 +50,52 @@ module marshal
|
||||||
# define MAX_MARSHAL_STACK_DEPTH 2000
|
# define MAX_MARSHAL_STACK_DEPTH 2000
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
/* Supported types */
|
||||||
#define TYPE_NULL '0'
|
#define TYPE_NULL '0'
|
||||||
#define TYPE_NONE 'N'
|
#define TYPE_NONE 'N'
|
||||||
#define TYPE_FALSE 'F'
|
#define TYPE_FALSE 'F'
|
||||||
#define TYPE_TRUE 'T'
|
#define TYPE_TRUE 'T'
|
||||||
#define TYPE_STOPITER 'S'
|
#define TYPE_STOPITER 'S'
|
||||||
#define TYPE_ELLIPSIS '.'
|
#define TYPE_ELLIPSIS '.'
|
||||||
#define TYPE_INT 'i'
|
#define TYPE_BINARY_FLOAT 'g' // Version 0 uses TYPE_FLOAT instead.
|
||||||
/* TYPE_INT64 is not generated anymore.
|
#define TYPE_BINARY_COMPLEX 'y' // Version 0 uses TYPE_COMPLEX instead.
|
||||||
Supported for backward compatibility only. */
|
#define TYPE_LONG 'l' // See also TYPE_INT.
|
||||||
#define TYPE_INT64 'I'
|
#define TYPE_STRING 's' // Bytes. (Name comes from Python 2.)
|
||||||
#define TYPE_FLOAT 'f'
|
#define TYPE_TUPLE '(' // See also TYPE_SMALL_TUPLE.
|
||||||
#define TYPE_BINARY_FLOAT 'g'
|
|
||||||
#define TYPE_COMPLEX 'x'
|
|
||||||
#define TYPE_BINARY_COMPLEX 'y'
|
|
||||||
#define TYPE_LONG 'l'
|
|
||||||
#define TYPE_STRING 's'
|
|
||||||
#define TYPE_INTERNED 't'
|
|
||||||
#define TYPE_REF 'r'
|
|
||||||
#define TYPE_TUPLE '('
|
|
||||||
#define TYPE_LIST '['
|
#define TYPE_LIST '['
|
||||||
#define TYPE_DICT '{'
|
#define TYPE_DICT '{'
|
||||||
#define TYPE_CODE 'c'
|
#define TYPE_CODE 'c'
|
||||||
#define TYPE_UNICODE 'u'
|
#define TYPE_UNICODE 'u'
|
||||||
#define TYPE_UNKNOWN '?'
|
#define TYPE_UNKNOWN '?'
|
||||||
|
// added in version 2:
|
||||||
#define TYPE_SET '<'
|
#define TYPE_SET '<'
|
||||||
#define TYPE_FROZENSET '>'
|
#define TYPE_FROZENSET '>'
|
||||||
|
// added in version 5:
|
||||||
#define TYPE_SLICE ':'
|
#define TYPE_SLICE ':'
|
||||||
#define FLAG_REF '\x80' /* with a type, add obj to index */
|
// Remember to update the version and documentation when adding new types.
|
||||||
|
|
||||||
|
/* Special cases for unicode strings (added in version 4) */
|
||||||
|
#define TYPE_INTERNED 't' // Version 1+
|
||||||
#define TYPE_ASCII 'a'
|
#define TYPE_ASCII 'a'
|
||||||
#define TYPE_ASCII_INTERNED 'A'
|
#define TYPE_ASCII_INTERNED 'A'
|
||||||
#define TYPE_SMALL_TUPLE ')'
|
|
||||||
#define TYPE_SHORT_ASCII 'z'
|
#define TYPE_SHORT_ASCII 'z'
|
||||||
#define TYPE_SHORT_ASCII_INTERNED 'Z'
|
#define TYPE_SHORT_ASCII_INTERNED 'Z'
|
||||||
|
|
||||||
|
/* Special cases for small objects */
|
||||||
|
#define TYPE_INT 'i' // All versions. 32-bit encoding.
|
||||||
|
#define TYPE_SMALL_TUPLE ')' // Version 4+
|
||||||
|
|
||||||
|
/* Supported for backwards compatibility */
|
||||||
|
#define TYPE_COMPLEX 'x' // Generated for version 0 only.
|
||||||
|
#define TYPE_FLOAT 'f' // Generated for version 0 only.
|
||||||
|
#define TYPE_INT64 'I' // Not generated any more.
|
||||||
|
|
||||||
|
/* References (added in version 3) */
|
||||||
|
#define TYPE_REF 'r'
|
||||||
|
#define FLAG_REF '\x80' /* with a type, add obj to index */
|
||||||
|
|
||||||
|
|
||||||
|
// Error codes:
|
||||||
#define WFERR_OK 0
|
#define WFERR_OK 0
|
||||||
#define WFERR_UNMARSHALLABLE 1
|
#define WFERR_UNMARSHALLABLE 1
|
||||||
#define WFERR_NESTEDTOODEEP 2
|
#define WFERR_NESTEDTOODEEP 2
|
||||||
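The type codes in this hunk are the first byte of each marshalled object, so they can be observed from Python. The sketch below is illustrative only: `type_code` is a hypothetical helper, and the `0x7F` mask clears the `FLAG_REF` bit (`'\x80'`) that may be set on the type byte from version 3 onward.

```python
import marshal

def type_code(obj, version=marshal.version):
    """Return the one-character type code that marshal writes for obj."""
    first = marshal.dumps(obj, version)[0]
    return chr(first & 0x7F)  # drop FLAG_REF if it is set

assert type_code(None) == "N"            # TYPE_NONE
assert type_code(True) == "T"            # TYPE_TRUE
assert type_code(Ellipsis) == "."        # TYPE_ELLIPSIS
assert type_code(StopIteration) == "S"   # TYPE_STOPITER
assert type_code(7) == "i"               # TYPE_INT, 32-bit encoding
assert type_code(2**65) == "l"           # TYPE_LONG
assert type_code(b"raw") == "s"          # TYPE_STRING holds bytes
assert type_code([1, 2]) == "["          # TYPE_LIST
assert type_code({1: 2}) == "{"          # TYPE_DICT
assert type_code({1}) == "<"             # TYPE_SET
assert type_code((1, 2), 3) == "("       # TYPE_TUPLE before version 4
assert type_code((1, 2), 4) == ")"       # TYPE_SMALL_TUPLE from version 4
```

Masking instead of comparing raw bytes keeps the checks independent of whether a given object happens to be recorded for later back-references.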
|
@ -615,6 +626,11 @@ w_complex_object(PyObject *v, char flag, WFILE *p)
|
||||||
PyBuffer_Release(&view);
|
PyBuffer_Release(&view);
|
||||||
}
|
}
|
||||||
else if (PySlice_Check(v)) {
|
else if (PySlice_Check(v)) {
|
||||||
|
if (p->version < 5) {
|
||||||
|
w_byte(TYPE_UNKNOWN, p);
|
||||||
|
p->error = WFERR_UNMARSHALLABLE;
|
||||||
|
return;
|
||||||
|
}
|
||||||
PySliceObject *slice = (PySliceObject *)v;
|
PySliceObject *slice = (PySliceObject *)v;
|
||||||
W_TYPE(TYPE_SLICE, p);
|
W_TYPE(TYPE_SLICE, p);
|
||||||
w_object(slice->start, p);
|
w_object(slice->start, p);
|
||||||
|
|