Discussions with open source leaders from the scan.coverity.com community about testing, analysis, and security in open source software.

Would you like to know about 0day defects months in advance?

There’s a lot of discussion today about a 0day Local Linux Root exploit (http://isc.sans.org/diary.html?storyid=6820). For readers who aren’t security-savvy, that means a user who is logged into a Linux system with shell access can bypass system security mechanisms and elevate their access to be equivalent to that of the system administrator (or ‘root’ user).

It’s called a 0day because the exploit was released with no advance warning that people should patch their systems. Even though the code change to close the hole was committed twelve days ago, on July 5th, many people are still running vulnerable versions of the code.

As part of the Scan Project, we’ve been analyzing a wide range of open source projects. We generally don’t talk about the specifics of the issues we find, and when we do, we wait until long after patches have been issued and security advisories published. 0day defects are very disruptive for system administrators, since they demand immediate attention. Ignoring one can mean spending far more time later, completely reinstalling a system after someone uses the 0day to install a rootkit (persistent backdoor access).

Generally, creating a specific exploit involves three things:

  1. Knowledge of a particular technique for bypassing the protective mechanisms that are built into a system.
  2. Finding source fragments with examples of the code construct required by the technique.
  3. Locating a point on the attack surface which lets you provide the needed trigger, and from which you can reach the targeted vulnerable code.

You can think of this like inventing a Phillips screwdriver, locating some screws with a compatible head design, and having the screws be somewhere you can reach with the screwdriver. If any one of the three factors isn’t there, you can’t accomplish your goal. Of course, there’s more than one type of screwdriver, and there’s more than one type of exploit technique. All you need to find is one matched set of three.


In the case of today’s drivers/net/tun.c 0day, the relevant code fragment is this:


static unsigned int tun_chr_poll(struct file *file, poll_table * wait)
{
	struct tun_file *tfile = file->private_data;
	struct tun_struct *tun = __tun_get(tfile);
	struct sock *sk = tun->sk;
	unsigned int mask = 0;

	if (!tun)
		return POLLERR;


The actual exploit is complex. It involves a compiler optimization that prevents this code from working as written, and a mechanism for mapping the zero-page of memory, so that a NULL pointer references memory under the control of the attacker. Finally, the attacker needs execution to continue past the safety check, which would normally return the POLLERR error code. Others have done a good job of covering the details, so I’ll provide URLs to their write-ups below.

The code was added to the Linux kernel on Feb 5th, 2009. Specifically, the ‘struct sock *sk’ line was added. Some of the comments at the URL below include remarks such as “a source code audit of the vulnerable code would never find this vulnerability”, and “That’s still perfectly valid C code”.

http://www.reddit.com/r/programming/comments/921sg/root_hole_in_linux_2630_including_a_creative_new/

On the face of it, this appears to be a simple null pointer defect. If the ‘tun’ pointer returned from __tun_get(tfile) is NULL, then the expression tun->sk should cause a null pointer dereference. For programmers used to working outside of the kernel, the usual result of dereferencing a null pointer is a segmentation violation, and the operating system terminating the application process. When programming in kernel space the result is specific to your platform and environment.

The unusual situation in this particular defect is that the misbehavior happened at compile-time. gcc looked at the assignment to sk, which used tun as if it was guaranteed to be a valid non-NULL pointer, and decided that there was no point in performing the if (!tun) test below. The if() block was optimized out. It’s in the source code, but it is NOT in the machine code that the compiler output.

In addition to fixing the code block in question, changes were also committed to Linux on July 16th to disable this particular compiler optimization during the build process, by passing gcc the -fno-delete-null-pointer-checks option.

So, this is clearly not expected behavior on the part of the compiler. The programmers included a test to check whether tun was valid or not, and return an error if it was not. The compiler removed that test, resulting in binaries that would allow processing to continue past the intended checkpoint, even when tun’s value was invalid.
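To make the ordering problem concrete, here is a minimal user-space sketch in C. It is not the kernel code and not the committed patch: the struct names, the POLLIN_SKETCH and POLLERR_SKETCH values, and all three function names are hypothetical stand-ins chosen for illustration. The first function mirrors the flawed ordering from the fragment above, and the second shows one way to reorder it so the test comes before the dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the real kernel definitions. */
struct sock_sketch { int id; };
struct tun_sketch  { struct sock_sketch *sk; };

#define POLLIN_SKETCH  0x0001u  /* illustrative values only */
#define POLLERR_SKETCH 0x0008u

/* Flawed ordering, mirroring the fragment: tun is dereferenced first.
 * Passing NULL here is undefined behavior in C, which is what permits
 * an optimizing compiler to drop the later test. Never call with NULL. */
unsigned int poll_unchecked(struct tun_sketch *tun)
{
	struct sock_sketch *sk = tun->sk;  /* dereference happens here */
	unsigned int mask = 0;

	if (!tun)                          /* too late: may be optimized out */
		return POLLERR_SKETCH;
	if (sk)
		mask |= POLLIN_SKETCH;
	return mask;
}

/* Reordered: test the pointer first, dereference only afterwards. */
unsigned int poll_checked(struct tun_sketch *tun)
{
	struct sock_sketch *sk;
	unsigned int mask = 0;

	if (!tun)
		return POLLERR_SKETCH;
	sk = tun->sk;
	if (sk)
		mask |= POLLIN_SKETCH;
	return mask;
}

/* Exercises the defined paths only; the flawed variant never sees NULL. */
void self_check_order(void)
{
	struct sock_sketch s = { 7 };
	struct tun_sketch t = { &s };

	assert(poll_checked(NULL) == POLLERR_SKETCH);
	assert(poll_checked(&t) == POLLIN_SKETCH);
	assert(poll_unchecked(&t) == POLLIN_SKETCH);
}
```

With a valid pointer the two functions behave identically. They differ only when the pointer is NULL, where the reordered version returns the error code and the flawed version has no defined behavior at all.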

Now, let us go back to the comment that “a source code audit of the vulnerable code would never find this vulnerability”. It is certainly true that manual auditing would not be likely to predict that the compiler would interpret the code in such a way that the explicit test of tun’s validity would be bypassed. However, code auditors might very well have noticed the contradiction in dereferencing a pointer before checking to see if it is NULL.


There is a further challenge though. Even if the code auditors in question are well versed in every aspect of the programming language they’re reviewing, keeping every best practice in mind and comparing it against a set of lines of code is very difficult. It’s difficult even when the lines are close together, as in this example, and it approaches impossibility when the lines are further apart, or separated by indirection such as macro definitions and function calls. It also gets harder when there are multiple paths through the code, and every possible combination has to be considered.
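As a small illustration of how indirection hides the same mistake, consider the following C sketch. Everything in it is hypothetical (the types, the CONN_FLAGS macro, and the function names are invented for this example and do not come from the kernel). The dereference no longer appears as an arrow on the line above the test; it is buried inside a macro, so a reviewer has to expand the macro mentally to see that the pointer is used before it is checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct dev_sketch  { int flags; };
struct conn_sketch { struct dev_sketch *dev; };

/* The macro hides two dereferences: of c, and of c->dev. */
#define CONN_FLAGS(c) ((c)->dev->flags)

/* Flawed: the macro dereferences c before the NULL test below it.
 * Passing NULL is undefined behavior, so never call this with NULL. */
int conn_flags_flawed(const struct conn_sketch *c)
{
	int flags = CONN_FLAGS(c);  /* hidden dereference of c */

	if (c == NULL)              /* the test arrives after the use */
		return -1;
	return flags;
}

/* Checked: validate every pointer the macro will follow, then use it. */
int conn_flags_checked(const struct conn_sketch *c)
{
	if (c == NULL || c->dev == NULL)
		return -1;
	return CONN_FLAGS(c);
}

/* Exercises the defined paths only. */
void self_check_indirection(void)
{
	struct dev_sketch d = { 5 };
	struct conn_sketch ok = { &d };
	struct conn_sketch nodev = { NULL };

	assert(conn_flags_checked(NULL) == -1);
	assert(conn_flags_checked(&nodev) == -1);
	assert(conn_flags_checked(&ok) == 5);
	assert(conn_flags_flawed(&ok) == 5);
}
```

The flawed function looks harmless at a glance, which is the point: the further the use and the test drift apart, the less likely a human reader is to connect them.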

Fortunately, there are tools that can identify code issues just like today’s vulnerability. In fact, there’s at least one tool that identified this particular issue specifically. The Scan Project analyzes the Linux kernel on an ongoing basis. This issue was identified months ago, and tracked up until the fix was committed. For developers with access to the Scan results, it is Coverity defect ID #13020. As it happens, our Linux builds were not running smoothly in February and March, so we didn’t analyze a version of code containing the change until March 31st. At that point, the analysis flagged the issue as an instance of ‘Use of NULL value before test’, and continued to find it until July 6th when it had been fixed.

Now, note that in the vulnerability revealed today, the questionable optimization behavior in gcc predated the introduction of the code in the Linux kernel. However, it could as easily have happened the other way around - Linux adding the code first, and then developing a kernel vulnerability when people started using newer versions of the compiler. That’s why it’s important to maintain good coding practices. Whether it’s a change in the compiler, or the platform, or a change introduced by another developer working in your code later, keeping the code clean of issues like this one will not only protect you from vulnerabilities today, but keep new ones from opening up in the future.

The Linux developers have been doing preventative maintenance using the Coverity Scan for some time. There are hundreds of issues that have been fixed in Linux as a result of being identified in the Scan, and some of those would have provided other avenues for today’s 0day exploit as well. We’re glad to see those fixed, and know that they aren’t still available to be used in exploits.

The author of the exploit (Brad Spengler of grsecurity.net) lists the timeline for his work. He starts by noting ‘Discovery time of bug in public: 7/6/09’. That time corresponds to the day after the bug had been patched in Linux, so we should be safe to assume that the discovery was made by reading the commit logs which included the reason for the fix.

The Scan analysis data is kept private, available only to the developers who are approved by our contacts within Linux’s leadership. Publishing the analysis results would give exploit authors the opportunity to identify target code like this one, and would be irresponsible of us. We thank the Linux developers for their ongoing efforts to harden the Linux codebase, striving for complete code integrity. We encourage the Linux developers who don’t yet take advantage of the Scan Project to help those who do, in their efforts to fix outstanding issues like the one behind today’s vulnerability, and to keep such issues out of the Linux codebase on a continuing basis.

http://isc.sans.org/diary.html?storyid=6820

http://lists.grok.org.uk/pipermail/full-disclosure/2009-July/069714.html

http://www.reddit.com/r/programming/comments/921sg/root_hole_in_linux_2630_including_a_creative_new/

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a3ca86aea507904148870946d599e07a340b39bf
