cpython/Tools/scripts/pdeps.py

#! /usr/bin/env python3

# pdeps
#
# Find dependencies between a bunch of Python modules.
#
# Usage:
#       pdeps file1.py file2.py ...
#
# Output:
# Four tables separated by lines like '--- Closure of Uses ---':
# 1) Direct dependencies, listing which module imports which other modules
# 2) The inverse of (1)
# 3) Indirect dependencies, or the closure of the above
# 4) The inverse of (3)
#
# To do:
# - command line options to select output type
# - option to automatically scan the Python library for referenced modules
# - option to limit output to particular modules


import sys
import re
import os


# Main program
#
def main():
    args = sys.argv[1:]
    if not args:
        print('usage: pdeps file.py file.py ...')
        return 2
    #
    table = {}
    for arg in args:
        process(arg, table)
    #
    print('--- Uses ---')
    printresults(table)
    #
    print('--- Used By ---')
    inv = inverse(table)
    printresults(inv)
    #
    print('--- Closure of Uses ---')
    reach = closure(table)
    printresults(reach)
    #
    print('--- Closure of Used By ---')
    invreach = inverse(reach)
    printresults(invreach)
    #
    return 0


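# Compiled regular expressions to search for import statements.
# Note the naming: m_import matches "from X import ..." lines and
# captures X, while m_from matches "import X, Y" lines and captures
# everything up to a comment.
#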
m_import = re.compile('^[ \t]*from[ \t]+([^ \t]+)[ \t]+')
m_from = re.compile('^[ \t]*import[ \t]+([^#]+)')


# Collect data from one file
#
def process(filename, table):
    with open(filename) as fp:
        mod = os.path.basename(filename)
        if mod[-3:] == '.py':
            mod = mod[:-3]
        table[mod] = deps = []
        while 1:
            line = fp.readline()
            if not line: break
            # readline() keeps the trailing newline, so a continued
            # line ends in backslash-newline; join it with the next one.
            while line[-2:] == '\\\n':
                nextline = fp.readline()
                if not nextline: break
                line = line[:-2] + nextline
            m_found = m_import.match(line) or m_from.match(line)
            if m_found:
                (a, b), (a1, b1) = m_found.regs[:2]
            else: continue
            words = line[a1:b1].split(',')
            # print('#', line, words)
            for word in words:
                word = word.strip()
                if word not in deps:
                    deps.append(word)


# Compute closure (this is in fact totally general)
#
def closure(table):
    modules = list(table.keys())
    #
    # Initialize reach with a copy of table
    #
    reach = {}
    for mod in modules:
        reach[mod] = table[mod][:]
    #
    # Iterate until no more change
    #
    change = 1
    while change:
        change = 0
        for mod in modules:
            for mo in reach[mod]:
                if mo in modules:
                    for m in reach[mo]:
                        if m not in reach[mod]:
                            reach[mod].append(m)
                            change = 1
    #
    return reach


# Invert a table (this is again totally general).
# All keys of the original table are made keys of the inverse,
# so there may be empty lists in the inverse.
#
def inverse(table):
    inv = {}
    for key in table.keys():
        if key not in inv:
            inv[key] = []
        for item in table[key]:
            store(inv, item, key)
    return inv


# Store "item" in "table" under "key".
# The dictionary maps keys to lists of items.
# If there is no list for the key yet, it is created.
#
def store(table, key, item):
    table.setdefault(key, []).append(item)


# Tabulate results neatly
#
def printresults(table):
    modules = sorted(table.keys())
    maxlen = max((len(mod) for mod in modules), default=0)
    for mod in modules:
        refs = sorted(table[mod])
        print(mod.ljust(maxlen), ':', end=' ')
        # '(*)' flags a module that appears in its own list,
        # i.e. one that (indirectly) depends on itself.
        if mod in refs:
            print('(*)', end=' ')
        for ref in refs:
            print(ref, end=' ')
        print()


# Call main and honor exit status
if __name__ == '__main__':
    try:
        sys.exit(main())
    except KeyboardInterrupt:
        sys.exit(1)
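
The closure and inverse steps above can be sketched on a toy table as follows. This is only an illustration under stated assumptions, not part of pdeps.py: `closure_sets`, `inverse_sets`, and the `toy` table are hypothetical names, and sets stand in for the script's lists.

```python
# Illustrative sketch only; these names are hypothetical and not in pdeps.py.

def closure_sets(table):
    """Return the transitive closure of a {module: [deps]} table as sets."""
    reach = {mod: set(deps) for mod, deps in table.items()}
    changed = True
    while changed:
        changed = False
        for mod, deps in reach.items():
            # Gather what is reachable through dependencies that are
            # themselves keys of the table; unknown modules add nothing.
            extra = set()
            for dep in deps:
                extra |= reach.get(dep, set())
            if not extra <= deps:
                deps |= extra
                changed = True
    return reach


def inverse_sets(table):
    """Invert the table; every original key is kept, possibly with an empty set."""
    inv = {mod: set() for mod in table}
    for mod, deps in table.items():
        for dep in deps:
            inv.setdefault(dep, set()).add(mod)
    return inv


# 'a' -> 'b' -> 'c' -> 'a' forms a cycle; 'os' is referenced but not scanned.
toy = {'a': ['b'], 'b': ['c', 'os'], 'c': ['a']}
reach = closure_sets(toy)
used_by = inverse_sets(reach)
```

Because the three toy modules form a cycle, each one ends up in its own closure, which is the situation the script marks with `(*)` in its output.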