Thursday 2 December 2004 — This is nearly 20 years old. Be careful.
We had a head-scratcher bug the other day. Our system is written in C++, and has a custom reference counting system for managing memory (see Greenspun’s Law). We were seeing an assertion that we had thought was long debugged and solid. Here’s what we found. (Warning: lots of arcane C++ stuff ahead.)
Our reference counting implementation looks kind of like this:
1/**
2 * Reference-counted object. When its reference count goes to
3 * zero, it destroys itself.
4 */
5class CRefCountedObject
6{
7public:
8 CRefCountedObject()
9 {
10 m_nRefCount = 0;
11 }
12
13 virtual ~CRefCountedObject()
14 {
15 // CRefCountedObject's can be created on the stack,
16 // where they will be destroyed by falling out of scope.
17 // This assert checks the implicit assumption that no one
18 // is still interested in the object when it is destroyed.
19 ASSERT(m_nRefCount == 0);
20 }
21
22 /** Declare your interest in this object: ups the refcount.
23 */
24 void Retain()
25 {
26 m_nRefCount++;
27 }
28
29 /** Declare your lack of interest in this object: if you were
30 * the last interested party, the object is destroyed.
31 */
32 void Release()
33 {
34 m_nRefCount--;
35 if (m_nRefCount== 0) {
36 delete this;
37 }
38 }
39
40private:
41 int m_nRefCount;
42};
43
44/**
45 * Smart pointer, templated to a particular object class.
46 */
47template <class TObject>
48class CPtr
49{
50public:
51 // .. lots of stuff omitted ..
52
53 /** Assignment operator. Manage the refcounts on the old and
54 * new objects.
55 */
56 TObject *operator=(TObject *p)
57 {
58 if (m_p != p) {
59 if (p != NULL) {
60 p->Retain();
61 }
62 if (m_p != NULL) {
63 m_p->Release();
64 }
65 m_p = p;
66 }
67 return m_p;
68 }
69
70 //.. lots of stuff omitted ..
71};
(Note: this is a simplified version of our actual code, and some of the simplifications mean that it will not work properly, but not in ways that affect the narrative here. For example, don’t try this code with multiple threads, or with multiple inheritance. I’m taking some expository license. Don’t bug me about it!)
The assert that fired is on line 19: it’s designed to protect against allocating reference counted objects on the stack. The problem with a stack-allocated refcounted object is that the object will be destroyed when it falls out of scope, regardless of the reference count. For our reference counting to work right, the only thing that should destroy an object is the delete on line 36. With a stack allocated object, the compiler destroys the object for us, so smart pointers can exist which point to freed memory:
1{
2 CPtr<CRefCountedObject> pObj;
3
4 //.. blah blah ..
5
6 if (blah blah) {
7 CRefCountedObject stackobj;
8 pObj = stackobj;
9 }
10
11 // pObj is invalid at this point.
12}
The object stackobj is created at line 106, and then destroyed at line 108 (when it falls out of scope), but pObj still has a pointer to it. When stackobj is destroyed, its reference count is 1, so the assert in the CRefCountedObject destructor will fire. All is well.
So when that assertion fired the other day, we thought we understood the problem. Just look up the stack, find the stack allocated refcounted object, and change it to a heap allocated object. But when we looked into it, there was no stack allocated object. So who was destroying the refcounted object before its time? How does a heap allocated object get destroyed other than using delete on it?
Digging deeper, we rediscovered a C++ detail that we had forgotten. Turns out we had some code that boiled down to this:
1class CBadObject: public CRefCountedObject
2{
3 //.. blah blah ..
4};
5
6CPtr<CBadObject> pGlobalBad;
7
8CBadObject::CBadObject()
9{
10 pGlobalBad = this;
11
12 DoSomethingWhichThrowsAnException();
13}
This object’s constructor assigned itself to a smart pointer outside itself, then threw an exception. C++ semantics state that if a constructor of a heap allocated object throws an exception, that the object’s destructor is called. Since CBadObject derives from CRefCountedObject, the CRefCountedObject destructor is called. It checks the reference count, sees that it is not zero (it’s one, because pGlobalBad has called Retain, but hasn’t called Release), and fires the assertion.
Updated May 2005: Actually, the correct statement of C++ semantics is that if a constructor throws an exception, a destructor is called for all the base classes that had been successfully constructed. In this case, the CBadObject destructor is not called (because the CBadObject constructor didn’t finish), but the CRefCountedObject destructor is called, because its constructor completed. It’s the CRefCountedObject destructor which causes the trouble here. More about this at More C++ constructor trivia.
So the lesson is: try not to throw exceptions from constructors, or at least understand that they will destroy the object for you.
Updated May 2005: Again, this is not correct. The real lesson is that throwing exceptions from constructors is perfectly legal, and C++ is precisely engineered to specify exactly what will happen. But you may be surprised by what happens. In particular, you may have objects in “impossible” states.
It just goes to show you:
You learn something new every day, no matter how hard you try.
By the way: there were two things we did to fix the problem. Do you know what they were? The first was to move most of the code out of the constructor, so that hard stuff happens in regular methods. The second was to add to the comment in CRefCountedObject’s destructor:
1virtual ~CRefCountedObject()
2 {
3 // CRefCountedObject's can be created on the stack,
4 // where they will be destroyed by falling out of scope.
5 // This assert checks the implicit assumption that no one
6 // is still interested in the object when it is destroyed.
7 // The other way this can happen is if a constructor
8 // fails, but it has already assigned itself to a CPtr
9 // someplace. When a C++ constructor fails, the compiler
10 // automatically destroys the object.
11 ASSERT(m_nRefCount == 0);
12 }
That way, future generations might be spared the head-scratching.
Comments
Also, your global variable will contain the last CBadObject created. Only an implicit constraint on the system (only one location in the code where a CBadObject is created) will ensure that pGlobalBad continues pointing to the same instance.
You are right: singletons should be done differently.
This is off topic, but what do you use to format you C++ into HTML? It looks really nice.
l = Lock()
class ScopedLock:
"Lock released with it goes out of scope"
def __init__:
l.acquire()
def __del__:
l.release()
def accessLockedResource():
locallock = ScopedLock()
.... lots of complicated stuff here ...
The problem we encountered was when accessLockedResource threw an exception, the lock was never released. It wasn't evident until we realized that sys.traceback holds all the value of each frame, and each frame includes all the local variables. Therefore, the lock wasn't getting released because the locallock hasn't been destroyed.
Where did you find this fact? Stroustrup, other sources (e.g. this), and my own experience (VC++ under Windows) indicates just the opposite: If the constructor throws an exception, the destructor is not called.
http://www.georgepsblogs.blogspot.com
Add a comment: