2011-07-20

debugging "clever" code

""Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

— Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.

This came up in a comment about an LWN article on how subtle and tricky some of [the Linux kernel] core code has become.  Some of the more interesting quotes from that article and its links:

(in the original email from Hugh Dickins):
That -ENOENT in walk_component: isn't it assuming we found a negative dentry, before reaching the read_seqcount_retry which complete_walk (or nameidata_drop_rcu_last before 3.0) would use to confirm a successful lookup?  And can't memory pressure prune a dentry, coming to dentry_kill which __d_drops to unhash before dentry_iput resets d_inode to NULL, but the dentry_rcuwalk_barrier between those is ineffective if the other end ignores the seqcount?
Let's call this "establishing the baseline" -- anyone who did not understand at least 75% of this will be lost as far as the real problem is concerned.  But what about the people who *did* understand it (or at least, had the best chance to):
There is a sobering conclusion to be drawn from this episode, though. The behavior of the dentry cache is, at this point, so subtle that even the combined brainpower of developers like Linus, Al, and Hugh has a hard time figuring out what is going on. These same developers are visibly nervous about making changes in that part of the kernel. Our once approachable and hackable kernel has, over time, become more complex and difficult to understand. Much of that is unavoidable; the environment the kernel runs in has, itself, become much more complex over the last 20 years. But if we reach a point where almost nobody can understand, review, or fix some of our core code, we may be headed for long-term trouble.
uh oh...!

No comments: