Looking for Python 3 builtins

Monday 11 February 2013
This is nearly 12 years old. Be careful.

I discovered Floyd’s follow-up to my “Eval really is dangerous” post. He catalogs a few interesting variations. At the end, though, he mentions the difficulty of finding the original builtins on Python 3.

If you remember, in Python 2, we did it like this:

[
    c for c in ().__class__.__base__.__subclasses__() 
    if c.__name__ == 'catch_warnings'
][0]()._module.__builtins__

This relies on two facts: warnings.catch_warnings is already defined, so we can find it among object’s subclasses, and an instance of it has a _module attribute that refers to a module, from which we can read __builtins__.
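As a rough illustration of that chain on Python 3, here is a minimal sketch. It imports warnings explicitly so that catch_warnings is sure to be defined, which the Python 2 trick did not need to do. The helper name builtins_via_catch_warnings is made up for this example, and _module is a private attribute that could change between versions.

```python
import warnings  # imported explicitly so the class is guaranteed to be defined


def builtins_via_catch_warnings():
    """Follow the catch_warnings chain; return the builtins or None.

    Hypothetical helper for illustration only. It relies on the private
    `_module` attribute, which is an implementation detail.
    """
    candidates = [
        c for c in ().__class__.__base__.__subclasses__()
        if c.__name__ == "catch_warnings"
    ]
    if not candidates:
        return None
    # Instantiate the class and read the module it keeps a reference to.
    module = getattr(candidates[0](), "_module", None)
    return getattr(module, "__builtins__", None)


found = builtins_via_catch_warnings()

# __builtins__ may be a module or a dict, depending on where it is read from.
if found is None:
    names = {}
elif isinstance(found, dict):
    names = found
else:
    names = vars(found)

print("open" in names and "__import__" in names)
```

The helper returns None rather than raising if the class or the attribute is missing, since both depend on what has been imported and on the Python version.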

Python 3 doesn’t seem to have that class defined right off the bat, so we can’t count on it for finding the builtins. But, I figured, there must be some other class that would serve the same purpose?

To find out, I tried searching for one. Here’s the code I used:

import types

def is_builtins(v):
    """Does v seem to be the builtins?"""
    if hasattr(v, "open") and hasattr(v, "__import__"):
        return True
    if isinstance(v, dict):
        return "open" in v and "__import__" in v
    return False

def construct_some(cl):
    """Construct objects from class `cl`.

    Yields (obj, description) tuples.

    """
    made = False
    for args in [
        (), ("x",), ("x", "y"), ("x", "y", "z"),
        ("utf8",), ("os",), (1, 2, 3),
        (0,0,0,0,0,b"KABOOM",(),(),(),"","",0,b""),
        # Maybe there are other useful constructor args?
    ]:
        try:
            obj = cl(*args)
        except Exception:
            continue
        desc = "{}.{}{}".format(cl.__module__, cl.__name__, args)
        yield obj, desc
        made = True
    
    if not made:
        print("Couldn't make a {}.{}".format(cl.__module__, cl.__name__))

def examine_attrs(obj, chain, seen, depth):
    """Examine the attributes on `obj`, looking for builtins."""
    if depth > 10:
        return
    if id(obj) in seen:
        return
    if isinstance(obj, (type(""), type(b""), type(u""))):
        return
    seen.add(id(obj))
    for n in dir(obj):
        try:
            v = getattr(obj, n)
        except Exception:
            continue
        name = chain+"."+n
        if is_builtins(v):
            print("Looks like {} might be builtins".format(name))
        else:
            examine_attrs(v, name, seen, depth+1)

examined = 0
for cl in object.__subclasses__():
    seen = set()
    for obj, desc in construct_some(cl):
        print("Constructed {}".format(desc))
        examine_attrs(obj, desc, seen, 0)
    examined += len(seen)

print("Examined {} objects".format(examined))

This code iterates over all the direct subclasses of object, trying a handful of different constructor arguments to make an instance of each. When it succeeds, it recursively examines the attributes reachable from that instance, looking for an object or dict that has “open” and “__import__”.
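The reason is_builtins checks both attributes and dict keys is that __builtins__ can show up either as the builtins module or as that module’s dict, depending on where it is read from. A small sketch of the check against both forms, using the same function as in the script:

```python
import builtins


def is_builtins(v):
    """Same check as in the search script above."""
    if hasattr(v, "open") and hasattr(v, "__import__"):
        return True
    if isinstance(v, dict):
        return "open" in v and "__import__" in v
    return False


print(is_builtins(builtins))        # the module form
print(is_builtins(vars(builtins)))  # the dict form
print(is_builtins({"open": None}))  # a dict missing __import__
print(is_builtins("open"))          # a plain string
```

The first two calls match and the last two do not, so the search reports either form of the builtins without flagging arbitrary dicts or strings.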

Sure enough, running this on Python 3.3 finds nothing like builtins, even after examining 20k objects. And running it on Python 2.7 finds only the catch_warnings object we had before.

I wouldn’t have guessed it was so unusual for an object to hold a reference to a module. Am I overlooking an important principle, or is this just not something people do?

Comments

[gravatar]
My immediate reaction is that being able to get to the core builtins (including open, for example) from an arbitrary object is precisely the sort of security hole that makes writing secure sandboxes in Python so difficult. Even though Python doesn't offer a secure mode, it doesn't surprise me that it's getting harder to do this type of thing, rather than easier...
[gravatar]
You can recover __builtins__ from a function's __globals__:
f = [t for t in ().__class__.__base__.__subclasses__() 
     if t.__name__ == 'Sized'][0].__len__
__builtins__ = f.__globals__['__builtins__']
I commented this on your answer about 3 months ago:
http://stackoverflow.com/a/13307417/205580
[gravatar]
@eryksun, thanks! You've pointed out the error of my ways, and I've updated the code on today's post: http://nedbatchelder.com/blog/201302/finding_python_3_builtins.html
[gravatar]
It's a general principle - most module references are held by modules (which is why going through a function's __globals__ attribute is likely to be more fruitful than going through other class attributes).

When an instance holds a reference to a module, it's likely to only be through a constructor argument.
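
The __globals__ route described in these comments can be sketched as a small helper. This is only an illustration: the name builtins_via_globals and its parameters are invented here, and collections.abc is imported explicitly so that Sized is sure to be defined.

```python
import collections.abc  # imported explicitly so Sized is guaranteed to be defined


def builtins_via_globals(class_name="Sized", method_name="__len__"):
    """Return __builtins__ from the globals of a method on a named class.

    Hypothetical helper for illustration only. Returns None if no
    matching class with a pure-Python method is found.
    """
    for t in object.__subclasses__():
        if t.__name__ != class_name:
            continue
        func = getattr(t, method_name, None)
        # Only functions written in Python carry a __globals__ dict.
        func_globals = getattr(func, "__globals__", None)
        if func_globals is not None:
            return func_globals.get("__builtins__")
    return None


found = builtins_via_globals()

# __builtins__ may be a module or a dict, depending on where it is read from.
if found is None:
    names = {}
elif isinstance(found, dict):
    names = found
else:
    names = vars(found)

print("open" in names and "__import__" in names)
```

Because the function's globals are the defining module's namespace, this works without any instance holding a reference to a module.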
