War is peace
Ned Batchelder, February 2013
The Rails community has had a few high-profile security issues this week. They are well-summarized, with an alarming list of what follow-ons to expect, by Patrick McKenzie: What the Rails Security Issue Means for Your Startup.
The Python community is in a slightly better position. True, we have pickle in the standard library, which has exactly the same problem, but it's rare to find applications that accept pickles from untrusted sources.
Don't ever unpickle data you don't trust!
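To make the risk concrete, here is a minimal sketch of how a pickle can call a function of its author's choosing the moment it is loaded. The `Sneaky` class is a hypothetical name for illustration, and it uses the harmless built-in `sorted` where an attacker would reach for something like `os.system`.

```python
import pickle


class Sneaky:
    """Hypothetical class: its pickle tells the loader what to call."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments".
        # An attacker would put os.system and a shell command here.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())

# Loading the bytes runs the stored call. We do not get a Sneaky back,
# we get whatever the chosen callable returned.
result = pickle.loads(payload)
print(type(result).__name__, result)  # list [1, 2, 3]
```

The loader never asked whether calling `sorted` was a good idea, and it would not have asked about anything else either.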
The 3rd-party YAML parser PyYAML has the same issue as Ruby's YAML parser. By default, it will let you create arbitrary Python objects, which means it can run arbitrary Python code. YAML isn't nearly as pervasive in the Python world, and we don't usually parse JSON with the YAML parser, but this can still create security holes.
PyYAML has a .load() method and a .safe_load() method. Why do serialization implementers do this? If you must extend the format with dangerous features, provide them in the non-obvious method. Provide a .load() method and a .dangerous_load() method instead. At least that way people would have to decide to do the dangerous thing. I would advocate for PyYAML to make this change now. Who cares if backward compatibility breaks? Most people using .load() never intended to deserialize arbitrary Python objects anyway, so they'll never notice.
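The naming I have in mind could be sketched as a thin wrapper like this. It is only an illustration, not anything PyYAML ships: the `load` and `dangerous_load` functions here are hypothetical, and the sketch assumes a PyYAML release that provides `yaml.UnsafeLoader` (5.1 or later, as far as I know).

```python
import yaml


def load(stream):
    """The obvious name does the safe thing: plain YAML types only."""
    return yaml.safe_load(stream)


def dangerous_load(stream):
    """The scary name is the only way to build arbitrary Python objects.

    Assumes yaml.UnsafeLoader is available in the installed PyYAML.
    """
    return yaml.load(stream, Loader=yaml.UnsafeLoader)


plain = load("name: demo\nports: [80, 443]")
print(plain)  # {'name': 'demo', 'ports': [80, 443]}

# A Python-specific tag only works through the explicitly dangerous call.
pair = dangerous_load("!!python/tuple [1, 2]")
print(pair)  # (1, 2)
```

With names like these, nobody ends up constructing Python objects from a config file by accident.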
If you use the PyYAML library in your code, check now that you are using the .safe_load() method.
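As a quick sanity check, this sketch shows what `safe_load` does with ordinary data and with a document that tries to call into Python. The YAML strings are made-up examples, and the call named in the tag (`os.getcwd`) is deliberately harmless.

```python
import yaml

# Ordinary data loads into plain dicts, lists, strings, and numbers.
config = yaml.safe_load("name: demo\nports: [80, 443]")
print(config)  # {'name': 'demo', 'ports': [80, 443]}

# A document that asks the parser to call a Python function.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor for python/* tags, so it refuses.
    rejected = True
    print("rejected:", type(exc).__name__)

print(rejected)  # True
```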
If you want automatic serialization of your user-defined classes, take a look at Cerealizer, which works similarly to pickle, but is built to be secure from the start. I've never used it, but it looks promising.
BTW, this whole circus reminded me of Allen Short's excellent lightning talk from PyCon 2010: Big Brother's Design Rules (skip to 17:30). To summarize Allen's pithy maxims:
Allen in particular mentions that adding "conveniences" to your interface can make your life harder later on. In Ruby's case, there were two unneeded conveniences that combined to make things really bad: parse JSON with the YAML parser, and let the YAML parser construct arbitrary Ruby objects. Neither of these is actually needed by 99.999% of programs reading JSON, but now all of them are compromisable.
Think hard about what your program does. Stay safe.