JavaScript makes it remarkably easy to evade signature-based detection. Because the language can dynamically execute instructions from strings, an attacker can use simple URL/HTML escaping, Base64 encoding, Unicode escaping, string concatenation and dozens of other techniques to hide exploits from pattern-based analysis. The de-obfuscated code can then be injected into the page through document.write("..."), eval("..."), new Function("...") or event handlers anywhere in the DOM.
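To make the obfuscation point concrete, here is a minimal sketch that encodes one harmless stand-in payload several ways and checks each against a naive literal signature. The payload, the signature and the variant names are illustrative assumptions; the strings are only decoded, never run, and the snippet assumes a browser or Node.js 16+ (for the global atob and btoa).

```javascript
// Benign stand-in for a payload; it is only decoded here, never executed.
const payload = 'alert(1)';

// A naive signature that looks for the literal payload text.
const signature = /alert\(1\)/;

// Helper: hex code of a character, zero-padded to the given width.
const hex = (c, width) => c.charCodeAt(0).toString(16).padStart(width, '0');
const chars = [...payload];

// Script source as a scanner would see it on the wire.
const variants = {
  plain: `"${payload}"`,
  urlEscaped: `decodeURIComponent("${chars.map(c => '%' + hex(c, 2)).join('')}")`,
  base64: `atob("${btoa(payload)}")`,
  unicodeEscaped: `"${chars.map(c => '\\u' + hex(c, 4)).join('')}"`,
  concatenated: payload.match(/.{1,2}/g).map(p => `"${p}"`).join(' + '),
};

// Evaluate each expression to recover the string the browser would end up with.
const report = Object.entries(variants).map(([name, source]) => ({
  name,
  flagged: signature.test(source),
  decodesToPayload: new Function(`return ${source};`)() === payload,
}));

console.table(report);
```

Only the plain variant trips the signature, yet every variant decodes to the same string, which is all eval or document.write needs.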
As DPI and anti-malware engines worked to factor out basic evasion methods, attackers stepped it up with JavaScript-based cryptographic algorithms, substitution functions, obscure options like Microsoft's Javascript encryption for IE and an ever-evolving library of fragmentation techniques. Over on the Fortinet blog, David Maciejak points out that scanners that take a file-based approach can be evaded simply by deploying portions of the malicious attack in different files, since no one file contains the full exploit. Brian Prince at eWeek covered a new trend of using AJAX to pull portions of the exploit low and slow before finally reassembling and running the exploit in code. This scary method would easily evade any form of network-based inspection by slowly passing fragments as small as a few bytes. To make matters worse, many of these techniques can be combined, and exploits can be randomly obfuscated for each request.
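The fragmentation problem can be sketched in a few lines. The file names, their contents and the signature below are hypothetical, and the payload is a harmless stand-in that is assembled but never executed.

```javascript
// Naive signature for the stand-in payload.
const signature = /alert\(1\)/;

// Hypothetical script files (or AJAX responses), each carrying one fragment.
const files = [
  { name: 'a.js', source: 'var a = "ale";' },
  { name: 'b.js', source: 'var b = "rt(";' },
  { name: 'c.js', source: 'var c = "1)"; var code = a + b + c;' },
];

// A file-based scanner inspects each response on its own.
const perFileHits = files.filter(f => signature.test(f.source)).map(f => f.name);

// Even the concatenated sources never contain the literal payload.
const combinedSource = files.map(f => f.source).join('\n');
const combinedHit = signature.test(combinedSource);

// Only the value held in memory after all three files have run matches.
const code = new Function(`${combinedSource}\nreturn code;`)();
const runtimeHit = signature.test(code);

console.log({ perFileHits, combinedHit, runtimeHit });
```

The signature matches only the run-time value, which a network scanner never sees.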
Ultimately, there is no way to reliably stop JavaScript exploits with signatures alone.
This problem has researchers and developers thinking laterally about innovative ways to solve the issue.
In 2005 a team at Microsoft began work on a technique they called BrowserShield. It essentially rewrote an incoming page and wrapped the scripts with a safety function that would, in essence, be its own mini-IPS, capable of evaluating the script for malicious intent. The same thing could likely be accomplished in Greasemonkey for Firefox. While this technique showed some merit, there are issues involved with rewriting the original content. The page itself may not function correctly if it uses reflection, dynamic code generation, or odd JavaScript quirks like arguments.callee. Latency is also an issue any time content is proxied and altered, and it's unclear how many of the vulnerabilities in a browser could really be mitigated by this technique; after all, the vulnerabilities could be in the JavaScript interpreter itself.
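The wrapping idea can be illustrated with a toy example. This is not BrowserShield's actual implementation; the policy rules, guardedEval, rewrite and run are all hypothetical names, and the rewrite is a deliberately naive text substitution. The snippet assumes a browser or Node.js 16+ for atob and btoa.

```javascript
// Hypothetical policy: patterns the wrapper refuses to evaluate.
const policy = [/\bActiveXObject\b/, /\bdocument\.write\s*\(/];

// Throws if the code about to be evaluated violates the policy.
function checkScript(source) {
  const hit = policy.find(rule => rule.test(source));
  if (hit) throw new Error(`blocked by policy: ${hit}`);
  return source;
}

// The function a rewriter would substitute for direct eval calls.
function guardedEval(source) {
  return (0, eval)(checkScript(source)); // indirect eval, global scope
}

// Naive rewrite: route every textual eval( call through the wrapper.
function rewrite(pageScript) {
  return pageScript.replace(/\beval\s*\(/g, 'guardedEval(');
}

// Runs a rewritten expression with the wrapper in scope.
const run = src => new Function('guardedEval', `return ${src};`)(guardedEval);

// The hostile string is Base64-encoded, so a static scan of the page misses it.
const hostile = `eval(atob("${btoa('new ActiveXObject("x")')}"))`;

console.log(run(rewrite('eval("1 + 2")')));
try {
  run(rewrite(hostile));
} catch (err) {
  console.log(err.message);
}
```

Because the check runs on the decoded string at the moment of evaluation, the encoding layer no longer hides anything. The weakness is equally visible: a textual rewrite like this misses calls reached through reflection, such as window["ev" + "al"], which is the kind of breakage and bypass described above.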
In 2007 researchers from Google automated an instance of Internet Explorer inside a virtual machine. They fingerprinted the file system and registry before and after visiting URLs from Google's database. The technique was successful in identifying URLs that loaded malicious code onto the operating system. They found that, among pages with possible indicators of malicious intent, ten percent engaged in drive-by downloads. The same technique is used in Capture-HPC (part of The Honeynet Project) to instrument a variety of client browsers, productivity applications and multimedia applications. Due to the size of the Internet, the only conceivable way to maintain this type of bad list is a central service. A service of this nature, like Google's Safe Browsing API, is a step in the right direction but has many caveats. Like all forms of bad-URL blocking (for example, phishing sites), the information is quickly out of date. Malicious sites are rotated in and out every minute, and legitimate sites can be compromised at any time. Besides lagging coverage of sites, this technique is also blind to authenticated or private content. It's also troubling that this technique only accounts for instances where the exploit made detectable system alterations. In the case of information leakage or rootkits, the changes may not be visible. Trend Micro's Web Threat Protection augments this by including factors like a site's age, any historical location changes, and other suspicious behavior.
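The before-and-after fingerprinting step boils down to a snapshot diff. The sketch below assumes each snapshot is a Map from a file path or registry key to a content hash; the paths, hashes and function names are made up for illustration and are not taken from Google's system or Capture-HPC.

```javascript
// Compares two snapshots (Map of path or registry key -> content hash).
function diffSnapshots(before, after) {
  const added = [];
  const changed = [];
  const removed = [];
  for (const [key, hash] of after) {
    if (!before.has(key)) added.push(key);
    else if (before.get(key) !== hash) changed.push(key);
  }
  for (const key of before.keys()) {
    if (!after.has(key)) removed.push(key);
  }
  return { added, changed, removed };
}

// Any alteration at all marks the visited URL as suspicious.
const isSuspicious = d => d.added.length + d.changed.length + d.removed.length > 0;

// Hypothetical snapshots taken around a single page visit.
const before = new Map([
  ['C:\\Windows\\system.ini', 'hash-1'],
  ['HKLM\\Software\\Example\\Run', 'hash-2'],
]);
const after = new Map([
  ['C:\\Windows\\system.ini', 'hash-1'],
  ['HKLM\\Software\\Example\\Run', 'hash-9'],
  ['C:\\Temp\\dropper.exe', 'hash-3'],
]);

const delta = diffSnapshots(before, after);
console.log(delta, isSuspicious(delta));
```

A diff like this sees only what was snapshotted, so an exploit that leaks data without touching disk or registry produces an empty result.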
If sanitizing script is too risky and blocking malicious URLs is spotty, there is always the big-stick method of NoScript. I swear by NoScript myself. Since its inception in 2006, NoScript has become a very comprehensive tool for protecting yourself when surfing. The problem with relying on it as a widespread mechanism for the non-technically inclined is the potential for desensitized click-through (like those pesky SSL warnings). Users could fall into a habit of automatically setting 'Allow All On This Page' every time the bar pops up or the page looks wrong. It also does nothing to prevent you from being infected by legitimate sites you choose to trust or sites you already trust that have since been compromised. Unfortunately, well-known sites being hacked to serve up exploits is a constant reality.
Another approach, one that can be implemented without user interaction, is to apply heuristics to the JavaScript using a flow-based rather than file-based approach. A DPI engine can look for suspicious characteristics of the content, such as specific code-injecting functions, apparent regions of obfuscated strings (by density, length and spacing characteristics), patterns of crypto functions, AJAX bootstrappers that execute code, and the like. This significantly adds to the ability to do real-time detection and prevention but may result in false positives in a Web 2.0 world that legitimately uses many of these techniques when creating the dynamic web. Even looking for regions of obfuscation isn't safe, because obfuscation is also used for IP protection and for bandwidth savings on increasingly script-heavy websites. As a detection mechanism this works well, but in prevention the chances of false positives are very high.
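A rough sketch of what such heuristics might look like follows. The feature set, the regular expressions and the thresholds (200 characters, 4.5 bits per character) are assumptions chosen for illustration, not values from any real DPI engine.

```javascript
// Shannon entropy in bits per character; dense encoded blobs score high.
function shannonEntropy(text) {
  const symbols = [...text];
  const counts = new Map();
  for (const ch of symbols) counts.set(ch, (counts.get(ch) || 0) + 1);
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / symbols.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Calls that turn strings into code or decode hidden strings.
const INJECTORS = /\b(?:eval|unescape|atob)\s*\(|\bdocument\.write(?:ln)?\s*\(|\bnew\s+Function\s*\(/g;
// Single- or double-quoted string literals, quotes included.
const STRING_LITERAL = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g;

function scoreScript(source) {
  const injectorCalls = (source.match(INJECTORS) || []).length;
  const literals = source.match(STRING_LITERAL) || [];
  const longest = literals.reduce((a, b) => (b.length > a.length ? b : a), '');
  const features = {
    injectorCalls,
    longestLiteral: longest.length,
    entropy: shannonEntropy(longest),
  };
  // Hypothetical thresholds: one point per suspicious trait.
  let score = 0;
  if (features.injectorCalls > 0) score += 1;
  if (features.longestLiteral > 200) score += 1;
  if (features.entropy > 4.5) score += 1;
  return { ...features, score };
}

const benign = 'document.getElementById("menu").hidden = false;';
const blob = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'.repeat(5);
const packed = `eval(atob("${blob}"));`;

console.log(scoreScript(benign).score, scoreScript(packed).score);
```

A legitimately minified or packed library would score just as high as the second sample, which is exactly the false-positive risk: a score like this is reasonable grounds for an alert but shaky grounds for blocking.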
While all of these methods help in the fight against JavaScript exploits, no single method is the answer.
I would like to hear from you: what other techniques are being tried to face this threat?
(Reposted to fix the formatting)
