diff --git a/Doc/library/plistlib.rst b/Doc/library/plistlib.rst index a2673688481..92de8600ea8 100644 --- a/Doc/library/plistlib.rst +++ b/Doc/library/plistlib.rst @@ -16,26 +16,21 @@ -------------- This module provides an interface for reading and writing the "property list" -XML files used mainly by Mac OS X. +files used mainly by Mac OS X and supports both binary and XML plist files. -The property list (``.plist``) file format is a simple XML pickle supporting +The property list (``.plist``) file format is a simple serialization supporting basic object types, like dictionaries, lists, numbers and strings. Usually the top level object is a dictionary. -To write out and to parse a plist file, use the :func:`writePlist` and -:func:`readPlist` functions. +To write out and to parse a plist file, use the :func:`dump` and +:func:`load` functions. -To work with plist data in bytes objects, use :func:`writePlistToBytes` -and :func:`readPlistFromBytes`. +To work with plist data in bytes objects, use :func:`dumps` +and :func:`loads`. Values can be strings, integers, floats, booleans, tuples, lists, dictionaries -(but only with string keys), :class:`Data` or :class:`datetime.datetime` -objects. String values (including dictionary keys) have to be unicode strings -- -they will be written out as UTF-8. - -The ```` plist type is supported through the :class:`Data` class. This is -a thin wrapper around a Python bytes object. Use :class:`Data` if your strings -contain control characters. +(but only with string keys), :class:`Data`, :class:`bytes`, :class:`bytesarray` +or :class:`datetime.datetime` objects. .. seealso:: @@ -45,37 +40,145 @@ contain control characters. This module defines the following functions: -.. function:: readPlist(pathOrFile) +.. function:: load(fp, \*, fmt=None, use_builtin_types=True, dict_type=dict) - Read a plist file. *pathOrFile* may either be a file name or a (readable and - binary) file object. Return the unpacked root object (which usually is a + Read a plist file. *fp* should be a readable and binary file object. + Return the unpacked root object (which usually is a dictionary). - The XML data is parsed using the Expat parser from :mod:`xml.parsers.expat` - -- see its documentation for possible exceptions on ill-formed XML. - Unknown elements will simply be ignored by the plist parser. + The *fmt* is the format of the file and the following values are valid: + + * :data:`None`: Autodetect the file format + + * :data:`FMT_XML`: XML file format + + * :data:`FMT_BINARY`: Binary plist format + + If *use_builtin_types* is True (the default) binary data will be returned + as instances of :class:`bytes`, otherwise it is returned as instances of + :class:`Data`. + + The *dict_type* is the type used for dictionaries that are read from the + plist file. The exact structure of the plist can be recovered by using + :class:`collections.OrderedDict` (although the order of keys shouldn't be + important in plist files). + + XML data for the :data:`FMT_XML` format is parsed using the Expat parser + from :mod:`xml.parsers.expat` -- see its documentation for possible + exceptions on ill-formed XML. Unknown elements will simply be ignored + by the plist parser. + + The parser for the binary format raises :exc:`InvalidFileException` + when the file cannot be parsed. + + .. versionadded:: 3.4 + + +.. function:: loads(data, \*, fmt=None, use_builtin_types=True, dict_type=dict) + + Load a plist from a bytes object. See :func:`load` for an explanation of + the keyword arguments. + + +.. function:: dump(value, fp, \*, fmt=FMT_XML, sort_keys=True, skipkeys=False) + + Write *value* to a plist file. *Fp* should be a writable, binary + file object. + + The *fmt* argument specifies the format of the plist file and can be + one of the following values: + + * :data:`FMT_XML`: XML formatted plist file + + * :data:`FMT_BINARY`: Binary formatted plist file + + When *sort_keys* is true (the default) the keys for dictionaries will be + written to the plist in sorted order, otherwise they will be written in + the iteration order of the dictionary. + + When *skipkeys* is false (the default) the function raises :exc:`TypeError` + when a key of a dictionary is not a string, otherwise such keys are skipped. + + A :exc:`TypeError` will be raised if the object is of an unsupported type or + a container that contains objects of unsupported types. + + .. versionchanged:: 3.4 + Added the *fmt*, *sort_keys* and *skipkeys* arguments. + + +.. function:: dumps(value, \*, fmt=FMT_XML, sort_keys=True, skipkeys=False) + + Return *value* as a plist-formatted bytes object. See + the documentation for :func:`dump` for an explanation of the keyword + arguments of this function. + + +The following functions are deprecated: + +.. function:: readPlist(pathOrFile) + + Read a plist file. *pathOrFile* may be either a file name or a (readable + and binary) file object. Returns the unpacked root object (which usually + is a dictionary). + + This function calls :func:`load` to do the actual work, the the documentation + of :func:`that function ` for an explanation of the keyword arguments. + + .. note:: + + Dict values in the result have a ``__getattr__`` method that defers + to ``__getitem_``. This means that you can use attribute access to + access items of these dictionaries. + + .. deprecated: 3.4 Use :func:`load` instead. .. function:: writePlist(rootObject, pathOrFile) - Write *rootObject* to a plist file. *pathOrFile* may either be a file name - or a (writable and binary) file object. + Write *rootObject* to an XML plist file. *pathOrFile* may be either a file name + or a (writable and binary) file object - A :exc:`TypeError` will be raised if the object is of an unsupported type or - a container that contains objects of unsupported types. + .. deprecated: 3.4 Use :func:`dump` instead. .. function:: readPlistFromBytes(data) Read a plist data from a bytes object. Return the root object. + See :func:`load` for a description of the keyword arguments. + + .. note:: + + Dict values in the result have a ``__getattr__`` method that defers + to ``__getitem_``. This means that you can use attribute access to + access items of these dictionaries. + + .. deprecated:: 3.4 Use :func:`loads` instead. + .. function:: writePlistToBytes(rootObject) - Return *rootObject* as a plist-formatted bytes object. + Return *rootObject* as an XML plist-formatted bytes object. + + .. deprecated:: 3.4 Use :func:`dumps` instead. + + .. versionchanged:: 3.4 + Added the *fmt*, *sort_keys* and *skipkeys* arguments. -The following class is available: +The following classes are available: + +.. class:: Dict([dict]): + + Return an extended mapping object with the same value as dictionary + *dict*. + + This class is a subclass of :class:`dict` where attribute access can + be used to access items. That is, ``aDict.key`` is the same as + ``aDict['key']`` for getting, setting and deleting items in the mapping. + + .. deprecated:: 3.0 + .. class:: Data(data) @@ -86,6 +189,24 @@ The following class is available: It has one attribute, :attr:`data`, that can be used to retrieve the Python bytes object stored in it. + .. deprecated:: 3.4 Use a :class:`bytes` object instead + + +The following constants are avaiable: + +.. data:: FMT_XML + + The XML format for plist files. + + .. versionadded:: 3.4 + + +.. data:: FMT_BINARY + + The binary format for plist files + + .. versionadded:: 3.4 + Examples -------- @@ -103,13 +224,15 @@ Generating a plist:: aTrueValue = True, aFalseValue = False, ), - someData = Data(b""), - someMoreData = Data(b"" * 10), + someData = b"", + someMoreData = b"" * 10, aDate = datetime.datetime.fromtimestamp(time.mktime(time.gmtime())), ) - writePlist(pl, fileName) + with open(fileName, 'wb') as fp: + dump(pl, fp) Parsing a plist:: - pl = readPlist(pathOrFile) + with open(fileName, 'rb') as fp: + pl = load(fp) print(pl["aKey"]) diff --git a/Lib/plistlib.py b/Lib/plistlib.py index 2b0b6341007..bdb8ab352e9 100644 --- a/Lib/plistlib.py +++ b/Lib/plistlib.py @@ -4,25 +4,20 @@ The property list (.plist) file format is a simple XML pickle supporting basic object types, like dictionaries, lists, numbers and strings. Usually the top level object is a dictionary. -To write out a plist file, use the writePlist(rootObject, pathOrFile) -function. 'rootObject' is the top level object, 'pathOrFile' is a -filename or a (writable) file object. +To write out a plist file, use the dump(value, file) +function. 'value' is the top level object, 'file' is +a (writable) file object. -To parse a plist from a file, use the readPlist(pathOrFile) function, -with a file name or a (readable) file object as the only argument. It +To parse a plist from a file, use the load(file) function, +with a (readable) file object as the only argument. It returns the top level object (again, usually a dictionary). -To work with plist data in bytes objects, you can use readPlistFromBytes() -and writePlistToBytes(). +To work with plist data in bytes objects, you can use loads() +and dumps(). Values can be strings, integers, floats, booleans, tuples, lists, -dictionaries (but only with string keys), Data or datetime.datetime objects. -String values (including dictionary keys) have to be unicode strings -- they -will be written out as UTF-8. - -The plist type is supported through the Data class. This is a -thin wrapper around a Python bytes object. Use 'Data' if your strings -contain control characters. +dictionaries (but only with string keys), Data, bytes, bytearray, or +datetime.datetime objects. Generate Plist example: @@ -37,226 +32,48 @@ Generate Plist example: aTrueValue = True, aFalseValue = False, ), - someData = Data(b""), - someMoreData = Data(b"" * 10), + someData = b"", + someMoreData = b"" * 10, aDate = datetime.datetime.fromtimestamp(time.mktime(time.gmtime())), ) - writePlist(pl, fileName) + with open(fileName, 'wb') as fp: + dump(pl, fp) Parse Plist example: - pl = readPlist(pathOrFile) - print pl["aKey"] + with open(fileName, 'rb') as fp: + pl = load(fp) + print(pl["aKey"]) """ - - __all__ = [ "readPlist", "writePlist", "readPlistFromBytes", "writePlistToBytes", - "Plist", "Data", "Dict" + "Plist", "Data", "Dict", "FMT_XML", "FMT_BINARY", + "load", "dump", "loads", "dumps" ] -# Note: the Plist and Dict classes have been deprecated. import binascii +import codecs +import contextlib import datetime +import enum from io import BytesIO +import itertools +import os import re +import struct +from warnings import warn +from xml.parsers.expat import ParserCreate -def readPlist(pathOrFile): - """Read a .plist file. 'pathOrFile' may either be a file name or a - (readable) file object. Return the unpacked root object (which - usually is a dictionary). - """ - didOpen = False - try: - if isinstance(pathOrFile, str): - pathOrFile = open(pathOrFile, 'rb') - didOpen = True - p = PlistParser() - rootObject = p.parse(pathOrFile) - finally: - if didOpen: - pathOrFile.close() - return rootObject +PlistFormat = enum.Enum('PlistFormat', 'FMT_XML FMT_BINARY', module=__name__) +globals().update(PlistFormat.__members__) -def writePlist(rootObject, pathOrFile): - """Write 'rootObject' to a .plist file. 'pathOrFile' may either be a - file name or a (writable) file object. - """ - didOpen = False - try: - if isinstance(pathOrFile, str): - pathOrFile = open(pathOrFile, 'wb') - didOpen = True - writer = PlistWriter(pathOrFile) - writer.writeln("") - writer.writeValue(rootObject) - writer.writeln("") - finally: - if didOpen: - pathOrFile.close() - - -def readPlistFromBytes(data): - """Read a plist data from a bytes object. Return the root object. - """ - return readPlist(BytesIO(data)) - - -def writePlistToBytes(rootObject): - """Return 'rootObject' as a plist-formatted bytes object. - """ - f = BytesIO() - writePlist(rootObject, f) - return f.getvalue() - - -class DumbXMLWriter: - def __init__(self, file, indentLevel=0, indent="\t"): - self.file = file - self.stack = [] - self.indentLevel = indentLevel - self.indent = indent - - def beginElement(self, element): - self.stack.append(element) - self.writeln("<%s>" % element) - self.indentLevel += 1 - - def endElement(self, element): - assert self.indentLevel > 0 - assert self.stack.pop() == element - self.indentLevel -= 1 - self.writeln("" % element) - - def simpleElement(self, element, value=None): - if value is not None: - value = _escape(value) - self.writeln("<%s>%s" % (element, value, element)) - else: - self.writeln("<%s/>" % element) - - def writeln(self, line): - if line: - # plist has fixed encoding of utf-8 - if isinstance(line, str): - line = line.encode('utf-8') - self.file.write(self.indentLevel * self.indent) - self.file.write(line) - self.file.write(b'\n') - - -# Contents should conform to a subset of ISO 8601 -# (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'. Smaller units may be omitted with -# a loss of precision) -_dateParser = re.compile(r"(?P\d\d\d\d)(?:-(?P\d\d)(?:-(?P\d\d)(?:T(?P\d\d)(?::(?P\d\d)(?::(?P\d\d))?)?)?)?)?Z", re.ASCII) - -def _dateFromString(s): - order = ('year', 'month', 'day', 'hour', 'minute', 'second') - gd = _dateParser.match(s).groupdict() - lst = [] - for key in order: - val = gd[key] - if val is None: - break - lst.append(int(val)) - return datetime.datetime(*lst) - -def _dateToString(d): - return '%04d-%02d-%02dT%02d:%02d:%02dZ' % ( - d.year, d.month, d.day, - d.hour, d.minute, d.second - ) - - -# Regex to find any control chars, except for \t \n and \r -_controlCharPat = re.compile( - r"[\x00\x01\x02\x03\x04\x05\x06\x07\x08\x0b\x0c\x0e\x0f" - r"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f]") - -def _escape(text): - m = _controlCharPat.search(text) - if m is not None: - raise ValueError("strings can't contains control characters; " - "use plistlib.Data instead") - text = text.replace("\r\n", "\n") # convert DOS line endings - text = text.replace("\r", "\n") # convert Mac line endings - text = text.replace("&", "&") # escape '&' - text = text.replace("<", "<") # escape '<' - text = text.replace(">", ">") # escape '>' - return text - - -PLISTHEADER = b"""\ - - -""" - -class PlistWriter(DumbXMLWriter): - - def __init__(self, file, indentLevel=0, indent=b"\t", writeHeader=1): - if writeHeader: - file.write(PLISTHEADER) - DumbXMLWriter.__init__(self, file, indentLevel, indent) - - def writeValue(self, value): - if isinstance(value, str): - self.simpleElement("string", value) - elif isinstance(value, bool): - # must switch for bool before int, as bool is a - # subclass of int... - if value: - self.simpleElement("true") - else: - self.simpleElement("false") - elif isinstance(value, int): - self.simpleElement("integer", "%d" % value) - elif isinstance(value, float): - self.simpleElement("real", repr(value)) - elif isinstance(value, dict): - self.writeDict(value) - elif isinstance(value, Data): - self.writeData(value) - elif isinstance(value, datetime.datetime): - self.simpleElement("date", _dateToString(value)) - elif isinstance(value, (tuple, list)): - self.writeArray(value) - else: - raise TypeError("unsupported type: %s" % type(value)) - - def writeData(self, data): - self.beginElement("data") - self.indentLevel -= 1 - maxlinelength = max(16, 76 - len(self.indent.replace(b"\t", b" " * 8) * - self.indentLevel)) - for line in data.asBase64(maxlinelength).split(b"\n"): - if line: - self.writeln(line) - self.indentLevel += 1 - self.endElement("data") - - def writeDict(self, d): - if d: - self.beginElement("dict") - items = sorted(d.items()) - for key, value in items: - if not isinstance(key, str): - raise TypeError("keys must be strings") - self.simpleElement("key", key) - self.writeValue(value) - self.endElement("dict") - else: - self.simpleElement("dict") - - def writeArray(self, array): - if array: - self.beginElement("array") - for value in array: - self.writeValue(value) - self.endElement("array") - else: - self.simpleElement("array") +# +# +# Deprecated functionality +# +# class _InternalDict(dict): @@ -264,19 +81,18 @@ class _InternalDict(dict): # This class is needed while Dict is scheduled for deprecation: # we only need to warn when a *user* instantiates Dict or when # the "attribute notation for dict keys" is used. + __slots__ = () def __getattr__(self, attr): try: value = self[attr] except KeyError: raise AttributeError(attr) - from warnings import warn warn("Attribute access from plist dicts is deprecated, use d[key] " "notation instead", DeprecationWarning, 2) return value def __setattr__(self, attr, value): - from warnings import warn warn("Attribute access from plist dicts is deprecated, use d[key] " "notation instead", DeprecationWarning, 2) self[attr] = value @@ -286,56 +102,111 @@ class _InternalDict(dict): del self[attr] except KeyError: raise AttributeError(attr) - from warnings import warn warn("Attribute access from plist dicts is deprecated, use d[key] " "notation instead", DeprecationWarning, 2) + class Dict(_InternalDict): def __init__(self, **kwargs): - from warnings import warn warn("The plistlib.Dict class is deprecated, use builtin dict instead", DeprecationWarning, 2) super().__init__(**kwargs) -class Plist(_InternalDict): +@contextlib.contextmanager +def _maybe_open(pathOrFile, mode): + if isinstance(pathOrFile, str): + with open(pathOrFile, mode) as fp: + yield fp - """This class has been deprecated. Use readPlist() and writePlist() + else: + yield pathOrFile + + +class Plist(_InternalDict): + """This class has been deprecated. Use dump() and load() functions instead, together with regular dict objects. """ def __init__(self, **kwargs): - from warnings import warn - warn("The Plist class is deprecated, use the readPlist() and " - "writePlist() functions instead", DeprecationWarning, 2) + warn("The Plist class is deprecated, use the load() and " + "dump() functions instead", DeprecationWarning, 2) super().__init__(**kwargs) + @classmethod def fromFile(cls, pathOrFile): - """Deprecated. Use the readPlist() function instead.""" - rootObject = readPlist(pathOrFile) + """Deprecated. Use the load() function instead.""" + with maybe_open(pathOrFile, 'rb') as fp: + value = load(fp) plist = cls() - plist.update(rootObject) + plist.update(value) return plist - fromFile = classmethod(fromFile) def write(self, pathOrFile): - """Deprecated. Use the writePlist() function instead.""" - writePlist(self, pathOrFile) + """Deprecated. Use the dump() function instead.""" + with _maybe_open(pathOrFile, 'wb') as fp: + dump(self, fp) -def _encodeBase64(s, maxlinelength=76): - # copied from base64.encodebytes(), with added maxlinelength argument - maxbinsize = (maxlinelength//4)*3 - pieces = [] - for i in range(0, len(s), maxbinsize): - chunk = s[i : i + maxbinsize] - pieces.append(binascii.b2a_base64(chunk)) - return b''.join(pieces) +def readPlist(pathOrFile): + """ + Read a .plist from a path or file. pathOrFile should either + be a file name, or a readable binary file object. + + This function is deprecated, use load instead. + """ + warn("The readPlist function is deprecated, use load() instead", + DeprecationWarning, 2) + + with _maybe_open(pathOrFile, 'rb') as fp: + return load(fp, fmt=None, use_builtin_types=False, + dict_type=_InternalDict) + +def writePlist(value, pathOrFile): + """ + Write 'value' to a .plist file. 'pathOrFile' may either be a + file name or a (writable) file object. + + This function is deprecated, use dump instead. + """ + warn("The writePlist function is deprecated, use dump() instead", + DeprecationWarning, 2) + with _maybe_open(pathOrFile, 'wb') as fp: + dump(value, fp, fmt=FMT_XML, sort_keys=True, skipkeys=False) + + +def readPlistFromBytes(data): + """ + Read a plist data from a bytes object. Return the root object. + + This function is deprecated, use loads instead. + """ + warn("The readPlistFromBytes function is deprecated, use loads() instead", + DeprecationWarning, 2) + return load(BytesIO(data), fmt=None, use_builtin_types=False, + dict_type=_InternalDict) + + +def writePlistToBytes(value): + """ + Return 'value' as a plist-formatted bytes object. + + This function is deprecated, use dumps instead. + """ + warn("The writePlistToBytes function is deprecated, use dumps() instead", + DeprecationWarning, 2) + f = BytesIO() + dump(value, f, fmt=FMT_XML, sort_keys=True, skipkeys=False) + return f.getvalue() + class Data: + """ + Wrapper for binary data. - """Wrapper for binary data.""" + This class is deprecated, use a bytes object instead. + """ def __init__(self, data): if not isinstance(data, bytes): @@ -346,10 +217,10 @@ class Data: def fromBase64(cls, data): # base64.decodebytes just calls binascii.a2b_base64; # it seems overkill to use both base64 and binascii. - return cls(binascii.a2b_base64(data)) + return cls(_decode_base64(data)) def asBase64(self, maxlinelength=76): - return _encodeBase64(self.data, maxlinelength) + return _encode_base64(self.data, maxlinelength) def __eq__(self, other): if isinstance(other, self.__class__): @@ -362,43 +233,119 @@ class Data: def __repr__(self): return "%s(%s)" % (self.__class__.__name__, repr(self.data)) -class PlistParser: +# +# +# End of deprecated functionality +# +# - def __init__(self): + +# +# XML support +# + + +# XML 'header' +PLISTHEADER = b"""\ + + +""" + + +# Regex to find any control chars, except for \t \n and \r +_controlCharPat = re.compile( + r"[\x00\x01\x02\x03\x04\x05\x06\x07\x08\x0b\x0c\x0e\x0f" + r"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f]") + +def _encode_base64(s, maxlinelength=76): + # copied from base64.encodebytes(), with added maxlinelength argument + maxbinsize = (maxlinelength//4)*3 + pieces = [] + for i in range(0, len(s), maxbinsize): + chunk = s[i : i + maxbinsize] + pieces.append(binascii.b2a_base64(chunk)) + return b''.join(pieces) + +def _decode_base64(s): + if isinstance(s, str): + return binascii.a2b_base64(s.encode("utf-8")) + + else: + return binascii.a2b_base64(s) + +# Contents should conform to a subset of ISO 8601 +# (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'. Smaller units +# may be omitted with # a loss of precision) +_dateParser = re.compile(r"(?P\d\d\d\d)(?:-(?P\d\d)(?:-(?P\d\d)(?:T(?P\d\d)(?::(?P\d\d)(?::(?P\d\d))?)?)?)?)?Z", re.ASCII) + + +def _date_from_string(s): + order = ('year', 'month', 'day', 'hour', 'minute', 'second') + gd = _dateParser.match(s).groupdict() + lst = [] + for key in order: + val = gd[key] + if val is None: + break + lst.append(int(val)) + return datetime.datetime(*lst) + + +def _date_to_string(d): + return '%04d-%02d-%02dT%02d:%02d:%02dZ' % ( + d.year, d.month, d.day, + d.hour, d.minute, d.second + ) + +def _escape(text): + m = _controlCharPat.search(text) + if m is not None: + raise ValueError("strings can't contains control characters; " + "use bytes instead") + text = text.replace("\r\n", "\n") # convert DOS line endings + text = text.replace("\r", "\n") # convert Mac line endings + text = text.replace("&", "&") # escape '&' + text = text.replace("<", "<") # escape '<' + text = text.replace(">", ">") # escape '>' + return text + +class _PlistParser: + def __init__(self, use_builtin_types, dict_type): self.stack = [] - self.currentKey = None + self.current_key = None self.root = None + self._use_builtin_types = use_builtin_types + self._dict_type = dict_type def parse(self, fileobj): - from xml.parsers.expat import ParserCreate self.parser = ParserCreate() - self.parser.StartElementHandler = self.handleBeginElement - self.parser.EndElementHandler = self.handleEndElement - self.parser.CharacterDataHandler = self.handleData + self.parser.StartElementHandler = self.handle_begin_element + self.parser.EndElementHandler = self.handle_end_element + self.parser.CharacterDataHandler = self.handle_data self.parser.ParseFile(fileobj) return self.root - def handleBeginElement(self, element, attrs): + def handle_begin_element(self, element, attrs): self.data = [] handler = getattr(self, "begin_" + element, None) if handler is not None: handler(attrs) - def handleEndElement(self, element): + def handle_end_element(self, element): handler = getattr(self, "end_" + element, None) if handler is not None: handler() - def handleData(self, data): + def handle_data(self, data): self.data.append(data) - def addObject(self, value): - if self.currentKey is not None: + def add_object(self, value): + if self.current_key is not None: if not isinstance(self.stack[-1], type({})): raise ValueError("unexpected element at line %d" % self.parser.CurrentLineNumber) - self.stack[-1][self.currentKey] = value - self.currentKey = None + self.stack[-1][self.current_key] = value + self.current_key = None elif not self.stack: # this is the root object self.root = value @@ -408,7 +355,7 @@ class PlistParser: self.parser.CurrentLineNumber) self.stack[-1].append(value) - def getData(self): + def get_data(self): data = ''.join(self.data) self.data = [] return data @@ -416,39 +363,648 @@ class PlistParser: # element handlers def begin_dict(self, attrs): - d = _InternalDict() - self.addObject(d) + d = self._dict_type() + self.add_object(d) self.stack.append(d) + def end_dict(self): - if self.currentKey: + if self.current_key: raise ValueError("missing value for key '%s' at line %d" % - (self.currentKey,self.parser.CurrentLineNumber)) + (self.current_key,self.parser.CurrentLineNumber)) self.stack.pop() def end_key(self): - if self.currentKey or not isinstance(self.stack[-1], type({})): + if self.current_key or not isinstance(self.stack[-1], type({})): raise ValueError("unexpected key at line %d" % self.parser.CurrentLineNumber) - self.currentKey = self.getData() + self.current_key = self.get_data() def begin_array(self, attrs): a = [] - self.addObject(a) + self.add_object(a) self.stack.append(a) + def end_array(self): self.stack.pop() def end_true(self): - self.addObject(True) + self.add_object(True) + def end_false(self): - self.addObject(False) + self.add_object(False) + def end_integer(self): - self.addObject(int(self.getData())) + self.add_object(int(self.get_data())) + def end_real(self): - self.addObject(float(self.getData())) + self.add_object(float(self.get_data())) + def end_string(self): - self.addObject(self.getData()) + self.add_object(self.get_data()) + def end_data(self): - self.addObject(Data.fromBase64(self.getData().encode("utf-8"))) + if self._use_builtin_types: + self.add_object(_decode_base64(self.get_data())) + + else: + self.add_object(Data.fromBase64(self.get_data())) + def end_date(self): - self.addObject(_dateFromString(self.getData())) + self.add_object(_date_from_string(self.get_data())) + + +class _DumbXMLWriter: + def __init__(self, file, indent_level=0, indent="\t"): + self.file = file + self.stack = [] + self._indent_level = indent_level + self.indent = indent + + def begin_element(self, element): + self.stack.append(element) + self.writeln("<%s>" % element) + self._indent_level += 1 + + def end_element(self, element): + assert self._indent_level > 0 + assert self.stack.pop() == element + self._indent_level -= 1 + self.writeln("" % element) + + def simple_element(self, element, value=None): + if value is not None: + value = _escape(value) + self.writeln("<%s>%s" % (element, value, element)) + + else: + self.writeln("<%s/>" % element) + + def writeln(self, line): + if line: + # plist has fixed encoding of utf-8 + + # XXX: is this test needed? + if isinstance(line, str): + line = line.encode('utf-8') + self.file.write(self._indent_level * self.indent) + self.file.write(line) + self.file.write(b'\n') + + +class _PlistWriter(_DumbXMLWriter): + def __init__( + self, file, indent_level=0, indent=b"\t", writeHeader=1, + sort_keys=True, skipkeys=False): + + if writeHeader: + file.write(PLISTHEADER) + _DumbXMLWriter.__init__(self, file, indent_level, indent) + self._sort_keys = sort_keys + self._skipkeys = skipkeys + + def write(self, value): + self.writeln("") + self.write_value(value) + self.writeln("") + + def write_value(self, value): + if isinstance(value, str): + self.simple_element("string", value) + + elif value is True: + self.simple_element("true") + + elif value is False: + self.simple_element("false") + + elif isinstance(value, int): + self.simple_element("integer", "%d" % value) + + elif isinstance(value, float): + self.simple_element("real", repr(value)) + + elif isinstance(value, dict): + self.write_dict(value) + + elif isinstance(value, Data): + self.write_data(value) + + elif isinstance(value, (bytes, bytearray)): + self.write_bytes(value) + + elif isinstance(value, datetime.datetime): + self.simple_element("date", _date_to_string(value)) + + elif isinstance(value, (tuple, list)): + self.write_array(value) + + else: + raise TypeError("unsupported type: %s" % type(value)) + + def write_data(self, data): + self.write_bytes(data.data) + + def write_bytes(self, data): + self.begin_element("data") + self._indent_level -= 1 + maxlinelength = max( + 16, + 76 - len(self.indent.replace(b"\t", b" " * 8) * self._indent_level)) + + for line in _encode_base64(data, maxlinelength).split(b"\n"): + if line: + self.writeln(line) + self._indent_level += 1 + self.end_element("data") + + def write_dict(self, d): + if d: + self.begin_element("dict") + if self._sort_keys: + items = sorted(d.items()) + else: + items = d.items() + + for key, value in items: + if not isinstance(key, str): + if self._skipkeys: + continue + raise TypeError("keys must be strings") + self.simple_element("key", key) + self.write_value(value) + self.end_element("dict") + + else: + self.simple_element("dict") + + def write_array(self, array): + if array: + self.begin_element("array") + for value in array: + self.write_value(value) + self.end_element("array") + + else: + self.simple_element("array") + + +def _is_fmt_xml(header): + prefixes = (b'offset... + # TRAILER + self._fp = fp + self._fp.seek(-32, os.SEEK_END) + trailer = self._fp.read(32) + if len(trailer) != 32: + raise InvalidFileException() + ( + offset_size, self._ref_size, num_objects, top_object, + offset_table_offset + ) = struct.unpack('>6xBBQQQ', trailer) + self._fp.seek(offset_table_offset) + offset_format = '>' + _BINARY_FORMAT[offset_size] * num_objects + self._ref_format = _BINARY_FORMAT[self._ref_size] + self._object_offsets = struct.unpack( + offset_format, self._fp.read(offset_size * num_objects)) + return self._read_object(self._object_offsets[top_object]) + + except (OSError, IndexError, struct.error): + raise InvalidFileException() + + def _get_size(self, tokenL): + """ return the size of the next object.""" + if tokenL == 0xF: + m = self._fp.read(1)[0] & 0x3 + s = 1 << m + f = '>' + _BINARY_FORMAT[s] + return struct.unpack(f, self._fp.read(s))[0] + + return tokenL + + def _read_refs(self, n): + return struct.unpack( + '>' + self._ref_format * n, self._fp.read(n * self._ref_size)) + + def _read_object(self, offset): + """ + read the object at offset. + + May recursively read sub-objects (content of an array/dict/set) + """ + self._fp.seek(offset) + token = self._fp.read(1)[0] + tokenH, tokenL = token & 0xF0, token & 0x0F + + if token == 0x00: + return None + + elif token == 0x08: + return False + + elif token == 0x09: + return True + + # The referenced source code also mentions URL (0x0c, 0x0d) and + # UUID (0x0e), but neither can be generated using the Cocoa libraries. + + elif token == 0x0f: + return b'' + + elif tokenH == 0x10: # int + return int.from_bytes(self._fp.read(1 << tokenL), 'big') + + elif token == 0x22: # real + return struct.unpack('>f', self._fp.read(4))[0] + + elif token == 0x23: # real + return struct.unpack('>d', self._fp.read(8))[0] + + elif token == 0x33: # date + f = struct.unpack('>d', self._fp.read(8))[0] + # timestamp 0 of binary plists corresponds to 1/1/2001 + # (year of Mac OS X 10.0), instead of 1/1/1970. + return datetime.datetime.utcfromtimestamp(f + (31 * 365 + 8) * 86400) + + elif tokenH == 0x40: # data + s = self._get_size(tokenL) + if self._use_builtin_types: + return self._fp.read(s) + else: + return Data(self._fp.read(s)) + + elif tokenH == 0x50: # ascii string + s = self._get_size(tokenL) + result = self._fp.read(s).decode('ascii') + return result + + elif tokenH == 0x60: # unicode string + s = self._get_size(tokenL) + return self._fp.read(s * 2).decode('utf-16be') + + # tokenH == 0x80 is documented as 'UID' and appears to be used for + # keyed-archiving, not in plists. + + elif tokenH == 0xA0: # array + s = self._get_size(tokenL) + obj_refs = self._read_refs(s) + return [self._read_object(self._object_offsets[x]) + for x in obj_refs] + + # tokenH == 0xB0 is documented as 'ordset', but is not actually + # implemented in the Apple reference code. + + # tokenH == 0xC0 is documented as 'set', but sets cannot be used in + # plists. + + elif tokenH == 0xD0: # dict + s = self._get_size(tokenL) + key_refs = self._read_refs(s) + obj_refs = self._read_refs(s) + result = self._dict_type() + for k, o in zip(key_refs, obj_refs): + result[self._read_object(self._object_offsets[k]) + ] = self._read_object(self._object_offsets[o]) + return result + + raise InvalidFileException() + +def _count_to_size(count): + if count < 1 << 8: + return 1 + + elif count < 1 << 16: + return 2 + + elif count << 1 << 32: + return 4 + + else: + return 8 + +class _BinaryPlistWriter (object): + def __init__(self, fp, sort_keys, skipkeys): + self._fp = fp + self._sort_keys = sort_keys + self._skipkeys = skipkeys + + def write(self, value): + + # Flattened object list: + self._objlist = [] + + # Mappings from object->objectid + # First dict has (type(object), object) as the key, + # second dict is used when object is not hashable and + # has id(object) as the key. + self._objtable = {} + self._objidtable = {} + + # Create list of all objects in the plist + self._flatten(value) + + # Size of object references in serialized containers + # depends on the number of objects in the plist. + num_objects = len(self._objlist) + self._object_offsets = [0]*num_objects + self._ref_size = _count_to_size(num_objects) + + self._ref_format = _BINARY_FORMAT[self._ref_size] + + # Write file header + self._fp.write(b'bplist00') + + # Write object list + for obj in self._objlist: + self._write_object(obj) + + # Write refnum->object offset table + top_object = self._getrefnum(value) + offset_table_offset = self._fp.tell() + offset_size = _count_to_size(offset_table_offset) + offset_format = '>' + _BINARY_FORMAT[offset_size] * num_objects + self._fp.write(struct.pack(offset_format, *self._object_offsets)) + + # Write trailer + sort_version = 0 + trailer = ( + sort_version, offset_size, self._ref_size, num_objects, + top_object, offset_table_offset + ) + self._fp.write(struct.pack('>5xBBBQQQ', *trailer)) + + def _flatten(self, value): + # First check if the object is in the object table, not used for + # containers to ensure that two subcontainers with the same contents + # will be serialized as distinct values. + if isinstance(value, ( + str, int, float, datetime.datetime, bytes, bytearray)): + if (type(value), value) in self._objtable: + return + + elif isinstance(value, Data): + if (type(value.data), value.data) in self._objtable: + return + + # Add to objectreference map + refnum = len(self._objlist) + self._objlist.append(value) + try: + if isinstance(value, Data): + self._objtable[(type(value.data), value.data)] = refnum + else: + self._objtable[(type(value), value)] = refnum + except TypeError: + self._objidtable[id(value)] = refnum + + # And finally recurse into containers + if isinstance(value, dict): + keys = [] + values = [] + items = value.items() + if self._sort_keys: + items = sorted(items) + + for k, v in items: + if not isinstance(k, str): + if self._skipkeys: + continue + raise TypeError("keys must be strings") + keys.append(k) + values.append(v) + + for o in itertools.chain(keys, values): + self._flatten(o) + + elif isinstance(value, (list, tuple)): + for o in value: + self._flatten(o) + + def _getrefnum(self, value): + try: + if isinstance(value, Data): + return self._objtable[(type(value.data), value.data)] + else: + return self._objtable[(type(value), value)] + except TypeError: + return self._objidtable[id(value)] + + def _write_size(self, token, size): + if size < 15: + self._fp.write(struct.pack('>B', token | size)) + + elif size < 1 << 8: + self._fp.write(struct.pack('>BBB', token | 0xF, 0x10, size)) + + elif size < 1 << 16: + self._fp.write(struct.pack('>BBH', token | 0xF, 0x11, size)) + + elif size < 1 << 32: + self._fp.write(struct.pack('>BBL', token | 0xF, 0x12, size)) + + else: + self._fp.write(struct.pack('>BBQ', token | 0xF, 0x13, size)) + + def _write_object(self, value): + ref = self._getrefnum(value) + self._object_offsets[ref] = self._fp.tell() + if value is None: + self._fp.write(b'\x00') + + elif value is False: + self._fp.write(b'\x08') + + elif value is True: + self._fp.write(b'\x09') + + elif isinstance(value, int): + if value < 1 << 8: + self._fp.write(struct.pack('>BB', 0x10, value)) + elif value < 1 << 16: + self._fp.write(struct.pack('>BH', 0x11, value)) + elif value < 1 << 32: + self._fp.write(struct.pack('>BL', 0x12, value)) + else: + self._fp.write(struct.pack('>BQ', 0x13, value)) + + elif isinstance(value, float): + self._fp.write(struct.pack('>Bd', 0x23, value)) + + elif isinstance(value, datetime.datetime): + f = (value - datetime.datetime(2001, 1, 1)).total_seconds() + self._fp.write(struct.pack('>Bd', 0x33, f)) + + elif isinstance(value, Data): + self._write_size(0x40, len(value.data)) + self._fp.write(value.data) + + elif isinstance(value, (bytes, bytearray)): + self._write_size(0x40, len(value)) + self._fp.write(value) + + elif isinstance(value, str): + try: + t = value.encode('ascii') + self._write_size(0x50, len(value)) + except UnicodeEncodeError: + t = value.encode('utf-16be') + self._write_size(0x60, len(value)) + + self._fp.write(t) + + elif isinstance(value, (list, tuple)): + refs = [self._getrefnum(o) for o in value] + s = len(refs) + self._write_size(0xA0, s) + self._fp.write(struct.pack('>' + self._ref_format * s, *refs)) + + elif isinstance(value, dict): + keyRefs, valRefs = [], [] + + if self._sort_keys: + rootItems = sorted(value.items()) + else: + rootItems = value.items() + + for k, v in rootItems: + if not isinstance(k, str): + if self._skipkeys: + continue + raise TypeError("keys must be strings") + keyRefs.append(self._getrefnum(k)) + valRefs.append(self._getrefnum(v)) + + s = len(keyRefs) + self._write_size(0xD0, s) + self._fp.write(struct.pack('>' + self._ref_format * s, *keyRefs)) + self._fp.write(struct.pack('>' + self._ref_format * s, *valRefs)) + + else: + raise InvalidFileException() + + +def _is_fmt_binary(header): + return header[:8] == b'bplist00' + + +# +# Generic bits +# + +_FORMATS={ + FMT_XML: dict( + detect=_is_fmt_xml, + parser=_PlistParser, + writer=_PlistWriter, + ), + FMT_BINARY: dict( + detect=_is_fmt_binary, + parser=_BinaryPlistParser, + writer=_BinaryPlistWriter, + ) +} + + +def load(fp, *, fmt=None, use_builtin_types=True, dict_type=dict): + """Read a .plist file. 'fp' should be (readable) file object. + Return the unpacked root object (which usually is a dictionary). + """ + if fmt is None: + header = fp.read(32) + fp.seek(0) + for info in _FORMATS.values(): + if info['detect'](header): + p = info['parser']( + use_builtin_types=use_builtin_types, + dict_type=dict_type, + ) + break + + else: + raise InvalidFileException() + + else: + p = _FORMATS[fmt]['parser'](use_builtin_types=use_builtin_types) + + return p.parse(fp) + + +def loads(value, *, fmt=None, use_builtin_types=True, dict_type=dict): + """Read a .plist file from a bytes object. + Return the unpacked root object (which usually is a dictionary). + """ + fp = BytesIO(value) + return load( + fp, fmt=fmt, use_builtin_types=use_builtin_types, dict_type=dict_type) + + +def dump(value, fp, *, fmt=FMT_XML, sort_keys=True, skipkeys=False): + """Write 'value' to a .plist file. 'fp' should be a (writable) + file object. + """ + if fmt not in _FORMATS: + raise ValueError("Unsupported format: %r"%(fmt,)) + + writer = _FORMATS[fmt]["writer"](fp, sort_keys=sort_keys, skipkeys=skipkeys) + writer.write(value) + + +def dumps(value, *, fmt=FMT_XML, skipkeys=False, sort_keys=True): + """Return a bytes object with the contents for a .plist file. + """ + fp = BytesIO() + dump(value, fp, fmt=fmt, skipkeys=skipkeys, sort_keys=sort_keys) + return fp.getvalue() diff --git a/Lib/test/test_plistlib.py b/Lib/test/test_plistlib.py index 9e86b3de646..abb1d19e63a 100644 --- a/Lib/test/test_plistlib.py +++ b/Lib/test/test_plistlib.py @@ -1,94 +1,87 @@ -# Copyright (C) 2003 Python Software Foundation +# Copyright (C) 2003-2013 Python Software Foundation import unittest import plistlib import os import datetime +import codecs +import binascii +import collections from test import support +from io import BytesIO +ALL_FORMATS=(plistlib.FMT_XML, plistlib.FMT_BINARY) -# This test data was generated through Cocoa's NSDictionary class -TESTDATA = b""" - - - - aDate - 2004-10-26T10:33:33Z - aDict - - aFalseValue - - aTrueValue - - aUnicodeValue - M\xc3\xa4ssig, Ma\xc3\x9f - anotherString - <hello & 'hi' there!> - deeperDict - - a - 17 - b - 32.5 - c - - 1 - 2 - text - - - - aFloat - 0.5 - aList - - A - B - 12 - 32.5 - - 1 - 2 - 3 - - - aString - Doodah - anEmptyDict - - anEmptyList - - anInt - 728 - nestedData - - - PGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAzxsb3RzIG9mIGJpbmFyeSBndW5r - PgABAgM8bG90cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2YgYmluYXJ5 - IGd1bms+AAECAzxsb3RzIG9mIGJpbmFyeSBndW5rPgABAgM8bG90cyBvZiBi - aW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAzxsb3Rz - IG9mIGJpbmFyeSBndW5rPgABAgM8bG90cyBvZiBiaW5hcnkgZ3Vuaz4AAQID - PGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAw== - - - someData - - PGJpbmFyeSBndW5rPg== - - someMoreData - - PGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAzxsb3RzIG9mIGJpbmFyeSBndW5rPgABAgM8 - bG90cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAzxs - b3RzIG9mIGJpbmFyeSBndW5rPgABAgM8bG90cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxv - dHMgb2YgYmluYXJ5IGd1bms+AAECAzxsb3RzIG9mIGJpbmFyeSBndW5rPgABAgM8bG90 - cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAw== - - \xc3\x85benraa - That was a unicode key. - - -""".replace(b" " * 8, b"\t") # Apple as well as plistlib.py output hard tabs +# The testdata is generated using Mac/Tools/plistlib_generate_testdata.py +# (which using PyObjC to control the Cocoa classes for generating plists) +TESTDATA={ + plistlib.FMT_XML: binascii.a2b_base64(b''' + PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NU + WVBFIHBsaXN0IFBVQkxJQyAiLS8vQXBwbGUvL0RURCBQTElTVCAxLjAvL0VO + IiAiaHR0cDovL3d3dy5hcHBsZS5jb20vRFREcy9Qcm9wZXJ0eUxpc3QtMS4w + LmR0ZCI+CjxwbGlzdCB2ZXJzaW9uPSIxLjAiPgo8ZGljdD4KCTxrZXk+YURh + dGU8L2tleT4KCTxkYXRlPjIwMDQtMTAtMjZUMTA6MzM6MzNaPC9kYXRlPgoJ + PGtleT5hRGljdDwva2V5PgoJPGRpY3Q+CgkJPGtleT5hRmFsc2VWYWx1ZTwv + a2V5PgoJCTxmYWxzZS8+CgkJPGtleT5hVHJ1ZVZhbHVlPC9rZXk+CgkJPHRy + dWUvPgoJCTxrZXk+YVVuaWNvZGVWYWx1ZTwva2V5PgoJCTxzdHJpbmc+TcOk + c3NpZywgTWHDnzwvc3RyaW5nPgoJCTxrZXk+YW5vdGhlclN0cmluZzwva2V5 + PgoJCTxzdHJpbmc+Jmx0O2hlbGxvICZhbXA7ICdoaScgdGhlcmUhJmd0Ozwv + c3RyaW5nPgoJCTxrZXk+ZGVlcGVyRGljdDwva2V5PgoJCTxkaWN0PgoJCQk8 + a2V5PmE8L2tleT4KCQkJPGludGVnZXI+MTc8L2ludGVnZXI+CgkJCTxrZXk+ + Yjwva2V5PgoJCQk8cmVhbD4zMi41PC9yZWFsPgoJCQk8a2V5PmM8L2tleT4K + CQkJPGFycmF5PgoJCQkJPGludGVnZXI+MTwvaW50ZWdlcj4KCQkJCTxpbnRl + Z2VyPjI8L2ludGVnZXI+CgkJCQk8c3RyaW5nPnRleHQ8L3N0cmluZz4KCQkJ + PC9hcnJheT4KCQk8L2RpY3Q+Cgk8L2RpY3Q+Cgk8a2V5PmFGbG9hdDwva2V5 + PgoJPHJlYWw+MC41PC9yZWFsPgoJPGtleT5hTGlzdDwva2V5PgoJPGFycmF5 + PgoJCTxzdHJpbmc+QTwvc3RyaW5nPgoJCTxzdHJpbmc+Qjwvc3RyaW5nPgoJ + CTxpbnRlZ2VyPjEyPC9pbnRlZ2VyPgoJCTxyZWFsPjMyLjU8L3JlYWw+CgkJ + PGFycmF5PgoJCQk8aW50ZWdlcj4xPC9pbnRlZ2VyPgoJCQk8aW50ZWdlcj4y + PC9pbnRlZ2VyPgoJCQk8aW50ZWdlcj4zPC9pbnRlZ2VyPgoJCTwvYXJyYXk+ + Cgk8L2FycmF5PgoJPGtleT5hU3RyaW5nPC9rZXk+Cgk8c3RyaW5nPkRvb2Rh + aDwvc3RyaW5nPgoJPGtleT5hbkVtcHR5RGljdDwva2V5PgoJPGRpY3QvPgoJ + PGtleT5hbkVtcHR5TGlzdDwva2V5PgoJPGFycmF5Lz4KCTxrZXk+YW5JbnQ8 + L2tleT4KCTxpbnRlZ2VyPjcyODwvaW50ZWdlcj4KCTxrZXk+bmVzdGVkRGF0 + YTwva2V5PgoJPGFycmF5PgoJCTxkYXRhPgoJCVBHeHZkSE1nYjJZZ1ltbHVZ + WEo1SUdkMWJtcytBQUVDQXp4c2IzUnpJRzltSUdKcGJtRnllU0JuZFc1cgoJ + CVBnQUJBZ004Ykc5MGN5QnZaaUJpYVc1aGNua2daM1Z1YXo0QUFRSURQR3h2 + ZEhNZ2IyWWdZbWx1WVhKNQoJCUlHZDFibXMrQUFFQ0F6eHNiM1J6SUc5bUlH + SnBibUZ5ZVNCbmRXNXJQZ0FCQWdNOGJHOTBjeUJ2WmlCaQoJCWFXNWhjbmtn + WjNWdWF6NEFBUUlEUEd4dmRITWdiMllnWW1sdVlYSjVJR2QxYm1zK0FBRUNB + enhzYjNSegoJCUlHOW1JR0pwYm1GeWVTQm5kVzVyUGdBQkFnTThiRzkwY3lC + dlppQmlhVzVoY25rZ1ozVnVhejRBQVFJRAoJCVBHeHZkSE1nYjJZZ1ltbHVZ + WEo1SUdkMWJtcytBQUVDQXc9PQoJCTwvZGF0YT4KCTwvYXJyYXk+Cgk8a2V5 + PnNvbWVEYXRhPC9rZXk+Cgk8ZGF0YT4KCVBHSnBibUZ5ZVNCbmRXNXJQZz09 + Cgk8L2RhdGE+Cgk8a2V5PnNvbWVNb3JlRGF0YTwva2V5PgoJPGRhdGE+CglQ + R3h2ZEhNZ2IyWWdZbWx1WVhKNUlHZDFibXMrQUFFQ0F6eHNiM1J6SUc5bUlH + SnBibUZ5ZVNCbmRXNXJQZ0FCQWdNOAoJYkc5MGN5QnZaaUJpYVc1aGNua2da + M1Z1YXo0QUFRSURQR3h2ZEhNZ2IyWWdZbWx1WVhKNUlHZDFibXMrQUFFQ0F6 + eHMKCWIzUnpJRzltSUdKcGJtRnllU0JuZFc1clBnQUJBZ004Ykc5MGN5QnZa + aUJpYVc1aGNua2daM1Z1YXo0QUFRSURQR3h2CglkSE1nYjJZZ1ltbHVZWEo1 + SUdkMWJtcytBQUVDQXp4c2IzUnpJRzltSUdKcGJtRnllU0JuZFc1clBnQUJB + Z004Ykc5MAoJY3lCdlppQmlhVzVoY25rZ1ozVnVhejRBQVFJRFBHeHZkSE1n + YjJZZ1ltbHVZWEo1SUdkMWJtcytBQUVDQXc9PQoJPC9kYXRhPgoJPGtleT7D + hWJlbnJhYTwva2V5PgoJPHN0cmluZz5UaGF0IHdhcyBhIHVuaWNvZGUga2V5 + Ljwvc3RyaW5nPgo8L2RpY3Q+CjwvcGxpc3Q+Cg=='''), + plistlib.FMT_BINARY: binascii.a2b_base64(b''' + YnBsaXN0MDDcAQIDBAUGBwgJCgsMDQ4iIykqKywtLy4wVWFEYXRlVWFEaWN0 + VmFGbG9hdFVhTGlzdFdhU3RyaW5nW2FuRW1wdHlEaWN0W2FuRW1wdHlMaXN0 + VWFuSW50Wm5lc3RlZERhdGFYc29tZURhdGFcc29tZU1vcmVEYXRhZwDFAGIA + ZQBuAHIAYQBhM0GcuX30AAAA1Q8QERITFBUWFxhbYUZhbHNlVmFsdWVaYVRy + dWVWYWx1ZV1hVW5pY29kZVZhbHVlXWFub3RoZXJTdHJpbmdaZGVlcGVyRGlj + dAgJawBNAOQAcwBzAGkAZwAsACAATQBhAN9fEBU8aGVsbG8gJiAnaGknIHRo + ZXJlIT7TGRobHB0eUWFRYlFjEBEjQEBAAAAAAACjHyAhEAEQAlR0ZXh0Iz/g + AAAAAAAApSQlJh0nUUFRQhAMox8gKBADVkRvb2RhaNCgEQLYoS5PEPo8bG90 + cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2YgYmluYXJ5IGd1bms+AAEC + Azxsb3RzIG9mIGJpbmFyeSBndW5rPgABAgM8bG90cyBvZiBiaW5hcnkgZ3Vu + az4AAQIDPGxvdHMgb2YgYmluYXJ5IGd1bms+AAECAzxsb3RzIG9mIGJpbmFy + eSBndW5rPgABAgM8bG90cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDPGxvdHMgb2Yg + YmluYXJ5IGd1bms+AAECAzxsb3RzIG9mIGJpbmFyeSBndW5rPgABAgM8bG90 + cyBvZiBiaW5hcnkgZ3Vuaz4AAQIDTTxiaW5hcnkgZ3Vuaz5fEBdUaGF0IHdh + cyBhIHVuaWNvZGUga2V5LgAIACEAJwAtADQAOgBCAE4AWgBgAGsAdACBAJAA + mQCkALAAuwDJANcA4gDjAOQA+wETARoBHAEeASABIgErAS8BMQEzATgBQQFH + AUkBSwFNAVEBUwFaAVsBXAFfAWECXgJsAAAAAAAAAgEAAAAAAAAAMQAAAAAA + AAAAAAAAAAAAAoY='''), +} class TestPlistlib(unittest.TestCase): @@ -99,7 +92,7 @@ class TestPlistlib(unittest.TestCase): except: pass - def _create(self): + def _create(self, fmt=None): pl = dict( aString="Doodah", aList=["A", "B", 12, 32.5, [1, 2, 3]], @@ -112,9 +105,9 @@ class TestPlistlib(unittest.TestCase): aFalseValue=False, deeperDict=dict(a=17, b=32.5, c=[1, 2, "text"]), ), - someData = plistlib.Data(b""), - someMoreData = plistlib.Data(b"\0\1\2\3" * 10), - nestedData = [plistlib.Data(b"\0\1\2\3" * 10)], + someData = b"", + someMoreData = b"\0\1\2\3" * 10, + nestedData = [b"\0\1\2\3" * 10], aDate = datetime.datetime(2004, 10, 26, 10, 33, 33), anEmptyDict = dict(), anEmptyList = list() @@ -129,49 +122,191 @@ class TestPlistlib(unittest.TestCase): def test_io(self): pl = self._create() - plistlib.writePlist(pl, support.TESTFN) - pl2 = plistlib.readPlist(support.TESTFN) + with open(support.TESTFN, 'wb') as fp: + plistlib.dump(pl, fp) + + with open(support.TESTFN, 'rb') as fp: + pl2 = plistlib.load(fp) + self.assertEqual(dict(pl), dict(pl2)) + self.assertRaises(AttributeError, plistlib.dump, pl, 'filename') + self.assertRaises(AttributeError, plistlib.load, 'filename') + + def test_bytes(self): pl = self._create() - data = plistlib.writePlistToBytes(pl) - pl2 = plistlib.readPlistFromBytes(data) + data = plistlib.dumps(pl) + pl2 = plistlib.loads(data) + self.assertNotIsInstance(pl, plistlib._InternalDict) self.assertEqual(dict(pl), dict(pl2)) - data2 = plistlib.writePlistToBytes(pl2) + data2 = plistlib.dumps(pl2) self.assertEqual(data, data2) def test_indentation_array(self): - data = [[[[[[[[{'test': plistlib.Data(b'aaaaaa')}]]]]]]]] - self.assertEqual(plistlib.readPlistFromBytes(plistlib.writePlistToBytes(data)), data) + data = [[[[[[[[{'test': b'aaaaaa'}]]]]]]]] + self.assertEqual(plistlib.loads(plistlib.dumps(data)), data) def test_indentation_dict(self): - data = {'1': {'2': {'3': {'4': {'5': {'6': {'7': {'8': {'9': plistlib.Data(b'aaaaaa')}}}}}}}}} - self.assertEqual(plistlib.readPlistFromBytes(plistlib.writePlistToBytes(data)), data) + data = {'1': {'2': {'3': {'4': {'5': {'6': {'7': {'8': {'9': b'aaaaaa'}}}}}}}}} + self.assertEqual(plistlib.loads(plistlib.dumps(data)), data) def test_indentation_dict_mix(self): - data = {'1': {'2': [{'3': [[[[[{'test': plistlib.Data(b'aaaaaa')}]]]]]}]}} - self.assertEqual(plistlib.readPlistFromBytes(plistlib.writePlistToBytes(data)), data) + data = {'1': {'2': [{'3': [[[[[{'test': b'aaaaaa'}]]]]]}]}} + self.assertEqual(plistlib.loads(plistlib.dumps(data)), data) def test_appleformatting(self): - pl = plistlib.readPlistFromBytes(TESTDATA) - data = plistlib.writePlistToBytes(pl) - self.assertEqual(data, TESTDATA, - "generated data was not identical to Apple's output") + for use_builtin_types in (True, False): + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt, use_builtin_types=use_builtin_types): + pl = plistlib.loads(TESTDATA[fmt], + use_builtin_types=use_builtin_types) + data = plistlib.dumps(pl, fmt=fmt) + self.assertEqual(data, TESTDATA[fmt], + "generated data was not identical to Apple's output") + def test_appleformattingfromliteral(self): - pl = self._create() - pl2 = plistlib.readPlistFromBytes(TESTDATA) - self.assertEqual(dict(pl), dict(pl2), - "generated data was not identical to Apple's output") + self.maxDiff = None + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + pl = self._create(fmt=fmt) + pl2 = plistlib.loads(TESTDATA[fmt]) + self.assertEqual(dict(pl), dict(pl2), + "generated data was not identical to Apple's output") def test_bytesio(self): - from io import BytesIO - b = BytesIO() - pl = self._create() - plistlib.writePlist(pl, b) - pl2 = plistlib.readPlist(BytesIO(b.getvalue())) - self.assertEqual(dict(pl), dict(pl2)) + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + b = BytesIO() + pl = self._create(fmt=fmt) + plistlib.dump(pl, b, fmt=fmt) + pl2 = plistlib.load(BytesIO(b.getvalue())) + self.assertEqual(dict(pl), dict(pl2)) + + def test_keysort_bytesio(self): + pl = collections.OrderedDict() + pl['b'] = 1 + pl['a'] = 2 + pl['c'] = 3 + + for fmt in ALL_FORMATS: + for sort_keys in (False, True): + with self.subTest(fmt=fmt, sort_keys=sort_keys): + b = BytesIO() + + plistlib.dump(pl, b, fmt=fmt, sort_keys=sort_keys) + pl2 = plistlib.load(BytesIO(b.getvalue()), + dict_type=collections.OrderedDict) + + self.assertEqual(dict(pl), dict(pl2)) + if sort_keys: + self.assertEqual(list(pl2.keys()), ['a', 'b', 'c']) + else: + self.assertEqual(list(pl2.keys()), ['b', 'a', 'c']) + + def test_keysort(self): + pl = collections.OrderedDict() + pl['b'] = 1 + pl['a'] = 2 + pl['c'] = 3 + + for fmt in ALL_FORMATS: + for sort_keys in (False, True): + with self.subTest(fmt=fmt, sort_keys=sort_keys): + data = plistlib.dumps(pl, fmt=fmt, sort_keys=sort_keys) + pl2 = plistlib.loads(data, dict_type=collections.OrderedDict) + + self.assertEqual(dict(pl), dict(pl2)) + if sort_keys: + self.assertEqual(list(pl2.keys()), ['a', 'b', 'c']) + else: + self.assertEqual(list(pl2.keys()), ['b', 'a', 'c']) + + def test_keys_no_string(self): + pl = { 42: 'aNumber' } + + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + self.assertRaises(TypeError, plistlib.dumps, pl, fmt=fmt) + + b = BytesIO() + self.assertRaises(TypeError, plistlib.dump, pl, b, fmt=fmt) + + def test_skipkeys(self): + pl = { + 42: 'aNumber', + 'snake': 'aWord', + } + + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + data = plistlib.dumps( + pl, fmt=fmt, skipkeys=True, sort_keys=False) + + pl2 = plistlib.loads(data) + self.assertEqual(pl2, {'snake': 'aWord'}) + + fp = BytesIO() + plistlib.dump( + pl, fp, fmt=fmt, skipkeys=True, sort_keys=False) + data = fp.getvalue() + pl2 = plistlib.loads(fp.getvalue()) + self.assertEqual(pl2, {'snake': 'aWord'}) + + def test_tuple_members(self): + pl = { + 'first': (1, 2), + 'second': (1, 2), + 'third': (3, 4), + } + + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + data = plistlib.dumps(pl, fmt=fmt) + pl2 = plistlib.loads(data) + self.assertEqual(pl2, { + 'first': [1, 2], + 'second': [1, 2], + 'third': [3, 4], + }) + self.assertIsNot(pl2['first'], pl2['second']) + + def test_list_members(self): + pl = { + 'first': [1, 2], + 'second': [1, 2], + 'third': [3, 4], + } + + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + data = plistlib.dumps(pl, fmt=fmt) + pl2 = plistlib.loads(data) + self.assertEqual(pl2, { + 'first': [1, 2], + 'second': [1, 2], + 'third': [3, 4], + }) + self.assertIsNot(pl2['first'], pl2['second']) + + def test_dict_members(self): + pl = { + 'first': {'a': 1}, + 'second': {'a': 1}, + 'third': {'b': 2 }, + } + + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + data = plistlib.dumps(pl, fmt=fmt) + pl2 = plistlib.loads(data) + self.assertEqual(pl2, { + 'first': {'a': 1}, + 'second': {'a': 1}, + 'third': {'b': 2 }, + }) + self.assertIsNot(pl2['first'], pl2['second']) def test_controlcharacters(self): for i in range(128): @@ -179,25 +314,27 @@ class TestPlistlib(unittest.TestCase): testString = "string containing %s" % c if i >= 32 or c in "\r\n\t": # \r, \n and \t are the only legal control chars in XML - plistlib.writePlistToBytes(testString) + plistlib.dumps(testString, fmt=plistlib.FMT_XML) else: self.assertRaises(ValueError, - plistlib.writePlistToBytes, + plistlib.dumps, testString) def test_nondictroot(self): - test1 = "abc" - test2 = [1, 2, 3, "abc"] - result1 = plistlib.readPlistFromBytes(plistlib.writePlistToBytes(test1)) - result2 = plistlib.readPlistFromBytes(plistlib.writePlistToBytes(test2)) - self.assertEqual(test1, result1) - self.assertEqual(test2, result2) + for fmt in ALL_FORMATS: + with self.subTest(fmt=fmt): + test1 = "abc" + test2 = [1, 2, 3, "abc"] + result1 = plistlib.loads(plistlib.dumps(test1, fmt=fmt)) + result2 = plistlib.loads(plistlib.dumps(test2, fmt=fmt)) + self.assertEqual(test1, result1) + self.assertEqual(test2, result2) def test_invalidarray(self): for i in ["key inside an array", "key inside an array23", "key inside an array3"]: - self.assertRaises(ValueError, plistlib.readPlistFromBytes, + self.assertRaises(ValueError, plistlib.loads, ("%s"%i).encode()) def test_invaliddict(self): @@ -206,22 +343,130 @@ class TestPlistlib(unittest.TestCase): "missing key", "k1v15.3" "k1k2double key"]: - self.assertRaises(ValueError, plistlib.readPlistFromBytes, + self.assertRaises(ValueError, plistlib.loads, ("%s"%i).encode()) - self.assertRaises(ValueError, plistlib.readPlistFromBytes, + self.assertRaises(ValueError, plistlib.loads, ("%s"%i).encode()) def test_invalidinteger(self): - self.assertRaises(ValueError, plistlib.readPlistFromBytes, + self.assertRaises(ValueError, plistlib.loads, b"not integer") def test_invalidreal(self): - self.assertRaises(ValueError, plistlib.readPlistFromBytes, + self.assertRaises(ValueError, plistlib.loads, b"not real") + def test_xml_encodings(self): + base = TESTDATA[plistlib.FMT_XML] + + for xml_encoding, encoding, bom in [ + (b'utf-8', 'utf-8', codecs.BOM_UTF8), + (b'utf-16', 'utf-16-le', codecs.BOM_UTF16_LE), + (b'utf-16', 'utf-16-be', codecs.BOM_UTF16_BE), + # Expat does not support UTF-32 + #(b'utf-32', 'utf-32-le', codecs.BOM_UTF32_LE), + #(b'utf-32', 'utf-32-be', codecs.BOM_UTF32_BE), + ]: + + pl = self._create(fmt=plistlib.FMT_XML) + with self.subTest(encoding=encoding): + data = base.replace(b'UTF-8', xml_encoding) + data = bom + data.decode('utf-8').encode(encoding) + pl2 = plistlib.loads(data) + self.assertEqual(dict(pl), dict(pl2)) + + +class TestPlistlibDeprecated(unittest.TestCase): + def test_io_deprecated(self): + pl_in = { + 'key': 42, + 'sub': { + 'key': 9, + 'alt': 'value', + 'data': b'buffer', + } + } + pl_out = plistlib._InternalDict({ + 'key': 42, + 'sub': plistlib._InternalDict({ + 'key': 9, + 'alt': 'value', + 'data': plistlib.Data(b'buffer'), + }) + }) + + self.addCleanup(support.unlink, support.TESTFN) + with self.assertWarns(DeprecationWarning): + plistlib.writePlist(pl_in, support.TESTFN) + + with self.assertWarns(DeprecationWarning): + pl2 = plistlib.readPlist(support.TESTFN) + + self.assertEqual(pl_out, pl2) + + os.unlink(support.TESTFN) + + with open(support.TESTFN, 'wb') as fp: + with self.assertWarns(DeprecationWarning): + plistlib.writePlist(pl_in, fp) + + with open(support.TESTFN, 'rb') as fp: + with self.assertWarns(DeprecationWarning): + pl2 = plistlib.readPlist(fp) + + self.assertEqual(pl_out, pl2) + + def test_bytes_deprecated(self): + pl = { + 'key': 42, + 'sub': { + 'key': 9, + 'alt': 'value', + 'data': b'buffer', + } + } + with self.assertWarns(DeprecationWarning): + data = plistlib.writePlistToBytes(pl) + + with self.assertWarns(DeprecationWarning): + pl2 = plistlib.readPlistFromBytes(data) + + self.assertIsInstance(pl2, plistlib._InternalDict) + self.assertEqual(pl2, plistlib._InternalDict( + key=42, + sub=plistlib._InternalDict( + key=9, + alt='value', + data=plistlib.Data(b'buffer'), + ) + )) + + with self.assertWarns(DeprecationWarning): + data2 = plistlib.writePlistToBytes(pl2) + self.assertEqual(data, data2) + + def test_dataobject_deprecated(self): + in_data = { 'key': plistlib.Data(b'hello') } + out_data = { 'key': b'hello' } + + buf = plistlib.dumps(in_data) + + cur = plistlib.loads(buf) + self.assertEqual(cur, out_data) + self.assertNotEqual(cur, in_data) + + cur = plistlib.loads(buf, use_builtin_types=False) + self.assertNotEqual(cur, out_data) + self.assertEqual(cur, in_data) + + with self.assertWarns(DeprecationWarning): + cur = plistlib.readPlistFromBytes(buf) + self.assertNotEqual(cur, out_data) + self.assertEqual(cur, in_data) + def test_main(): - support.run_unittest(TestPlistlib) + support.run_unittest(TestPlistlib, TestPlistlibDeprecated) if __name__ == '__main__': diff --git a/Mac/Tools/plistlib_generate_testdata.py b/Mac/Tools/plistlib_generate_testdata.py new file mode 100644 index 00000000000..da1d97645d4 --- /dev/null +++ b/Mac/Tools/plistlib_generate_testdata.py @@ -0,0 +1,94 @@ +#!/usr/bin/env python3 + +from Cocoa import NSMutableDictionary, NSMutableArray, NSString, NSDate +from Cocoa import NSPropertyListSerialization, NSPropertyListOpenStepFormat +from Cocoa import NSPropertyListXMLFormat_v1_0, NSPropertyListBinaryFormat_v1_0 +from Cocoa import CFUUIDCreateFromString, NSNull, NSUUID, CFPropertyListCreateData +from Cocoa import NSURL + +import datetime +from collections import OrderedDict +import binascii + +FORMATS=[ +# ('openstep', NSPropertyListOpenStepFormat), + ('plistlib.FMT_XML', NSPropertyListXMLFormat_v1_0), + ('plistlib.FMT_BINARY', NSPropertyListBinaryFormat_v1_0), +] + +def nsstr(value): + return NSString.alloc().initWithString_(value) + + +def main(): + pl = OrderedDict() + + seconds = datetime.datetime(2004, 10, 26, 10, 33, 33, tzinfo=datetime.timezone(datetime.timedelta(0))).timestamp() + pl[nsstr('aDate')] = NSDate.dateWithTimeIntervalSince1970_(seconds) + + pl[nsstr('aDict')] = d = OrderedDict() + d[nsstr('aFalseValue')] = False + d[nsstr('aTrueValue')] = True + d[nsstr('aUnicodeValue')] = "M\xe4ssig, Ma\xdf" + d[nsstr('anotherString')] = "" + d[nsstr('deeperDict')] = dd = OrderedDict() + dd[nsstr('a')] = 17 + dd[nsstr('b')] = 32.5 + dd[nsstr('c')] = a = NSMutableArray.alloc().init() + a.append(1) + a.append(2) + a.append(nsstr('text')) + + pl[nsstr('aFloat')] = 0.5 + + pl[nsstr('aList')] = a = NSMutableArray.alloc().init() + a.append(nsstr('A')) + a.append(nsstr('B')) + a.append(12) + a.append(32.5) + aa = NSMutableArray.alloc().init() + a.append(aa) + aa.append(1) + aa.append(2) + aa.append(3) + + pl[nsstr('aString')] = nsstr('Doodah') + + pl[nsstr('anEmptyDict')] = NSMutableDictionary.alloc().init() + + pl[nsstr('anEmptyList')] = NSMutableArray.alloc().init() + + pl[nsstr('anInt')] = 728 + + pl[nsstr('nestedData')] = a = NSMutableArray.alloc().init() + a.append(b'''\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03''') + + + pl[nsstr('someData')] = b'' + + pl[nsstr('someMoreData')] = b'''\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03\x00\x01\x02\x03''' + + pl[nsstr('\xc5benraa')] = nsstr("That was a unicode key.") + + print("TESTDATA={") + for fmt_name, fmt_key in FORMATS: + data, error = NSPropertyListSerialization.dataWithPropertyList_format_options_error_( + pl, fmt_key, 0, None) + if data is None: + print("Cannot serialize", fmt_name, error) + + else: + print(" %s: binascii.a2b_base64(b'''\n %s'''),"%(fmt_name, _encode_base64(bytes(data)).decode('ascii')[:-1])) + + print("}") + print() + +def _encode_base64(s, maxlinelength=60): + maxbinsize = (maxlinelength//4)*3 + pieces = [] + for i in range(0, len(s), maxbinsize): + chunk = s[i : i + maxbinsize] + pieces.append(binascii.b2a_base64(chunk)) + return b' '.join(pieces) + +main() diff --git a/Misc/NEWS b/Misc/NEWS index 01986309498..60adc580325 100644 --- a/Misc/NEWS +++ b/Misc/NEWS @@ -59,6 +59,8 @@ Core and Builtins Library ------- +- Issue #14455: plistlib now supports binary plists and has an updated API. + - Issue #19633: Fixed writing not compressed 16- and 32-bit wave files on big-endian platforms.