
Malware creators and those who combat malware are engaged in an ongoing battle. The security industry, and the antimalware industry in particular, aims to detect, stop and remove malware. Malware authors, on the other hand, try to build their malware in ways that make detection harder.
One of the most common ways to spread malware is through infected web pages, a topic discussed in several of our security articles. See, for example, The ultimate surfing challenge: Avoiding web sites with malicious content from December last year.
A comprehensive report from Google studies the trends in this phenomenon and draws conclusions from them. We will look more closely at the study.
It is hard to imagine any organization better placed than Google to provide statistical analysis of web pages, malicious or not. Google's Safe Browsing infrastructure is used by more than 400 million users per week and relies on several technologies for classifying a site as malicious.
In its report Trends in Circumventing Web-Malware Detection, Google used data sets consisting of a huge number of web pages that were detected as malicious. One of the study's goals was to trace how malware's ability to avoid detection has evolved. Another goal was to see how effective the various detection techniques proved to be.
Google distinguishes four different detection technologies:
Virtual machine (VM) honeypots monitor and analyze changes to the system to detect potentially malicious behavior.
The advantage is that no prior knowledge of exploit techniques is needed. One of the main disadvantages is that it is difficult to precisely determine how the system was exploited.
Browser emulators emulate different browsers to extract web page features that are perceived as malicious.
This technology is well suited to determining the exact technique that is used to trigger the exploit. However, emulators are not able to detect exploits against unknown vulnerabilities. Emulators therefore need to be constantly refined as new browser vulnerabilities are disclosed (and new browser versions are released).
Traditional antivirus technology relies on signatures to detect malware.
Signature technology can identify a specific piece of malicious software exactly. However, the technology is reactive, as it relies on detecting known malware. Authors of malware use different types of packing techniques to avoid detection by antivirus software. Another problem with this technology is false positives (software is flagged as malware even though it is not malicious).
Signature-based detection needs a continuous supply of new signatures in order to remain effective.
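To make the reactive nature of signatures concrete, here is a minimal sketch in Python. It is not how any particular antivirus product works; the sample bytes, the signature sets and the function name matches_signature are all made up for illustration. It shows a known sample being matched, and a trivially "packed" copy of the same sample slipping past until a new signature is added.

```python
import hashlib

# Hypothetical signature database (illustrative values only):
# SHA-256 digests of known samples plus simple byte patterns.
KNOWN_BAD_SAMPLE = b"example-known-bad-payload"
HASH_SIGNATURES = {hashlib.sha256(KNOWN_BAD_SAMPLE).hexdigest()}
PATTERN_SIGNATURES = [b"known-bad"]


def matches_signature(content: bytes) -> bool:
    """Return True if content matches a stored hash or byte pattern."""
    if hashlib.sha256(content).hexdigest() in HASH_SIGNATURES:
        return True
    return any(pattern in content for pattern in PATTERN_SIGNATURES)


def toy_pack(content: bytes, key: int = 0x5A) -> bytes:
    """Stand-in for a packer: XOR every byte with a fixed key."""
    return bytes(b ^ key for b in content)


print(matches_signature(KNOWN_BAD_SAMPLE))            # known sample
print(matches_signature(toy_pack(KNOWN_BAD_SAMPLE)))  # packed variant
```

The packed variant carries the same payload but shares neither the hash nor the byte pattern, so it stays undetected until the signature database is updated.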
Reputation-based detection classifies web pages based on the hosting infrastructure.
The technique relies on analyzing historical data. Authors of malware attempt to circumvent detection of their malware by switching to new hosting domains very frequently.
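The dependence on historical data can be illustrated with a small sketch. The class ReputationTracker, its thresholds and the example domain names are assumptions made for this illustration and are not taken from the report. The point is only that a domain with no history cannot be scored, which is what frequent domain switching exploits.

```python
from collections import defaultdict


class ReputationTracker:
    """Toy reputation store keyed by hosting domain (hypothetical design)."""

    def __init__(self, min_observations: int = 5, bad_ratio: float = 0.5):
        self.min_observations = min_observations
        self.bad_ratio = bad_ratio
        # domain -> [malicious observations, total observations]
        self.history = defaultdict(lambda: [0, 0])

    def record(self, domain: str, was_malicious: bool) -> None:
        """Store one historical observation for a domain."""
        counts = self.history[domain]
        counts[0] += int(was_malicious)
        counts[1] += 1

    def classify(self, domain: str) -> str:
        """Classify from history; too little history means 'unknown'."""
        bad, total = self.history.get(domain, (0, 0))
        if total < self.min_observations:
            return "unknown"
        return "malicious" if bad / total >= self.bad_ratio else "benign"


tracker = ReputationTracker()
for _ in range(6):
    tracker.record("old-host.example", True)

print(tracker.classify("old-host.example"))    # enough bad history
print(tracker.classify("fresh-host.example"))  # newly registered, no history
```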
As mentioned above, the data set used in the study comprised a huge number of web pages and sites (domain names). The study is based on the period from 1 December 2006 to 12 October 2010.
The report describes trends in evasion techniques for each of the four detection techniques mentioned above.
Social engineering is emerging as a technique to avoid detection by (most) VM-based honeypots. One typical example mentioned is fake antimalware pages, which require user interaction (typically a mouse click) before the malicious file is sent to the user's browser.
Once a vulnerability is publicly disclosed, it is included in malware kits; some vulnerabilities even appear in malware kits prior to public disclosure. Thus, there is a window of opportunity for malware to avoid detection by emulated browsers before these are updated to handle the new vulnerability.
Malicious code is also obfuscated so that it fails to execute correctly in emulated browsers while still functioning correctly in real browsers.
Obfuscation of malicious code is also a technique frequently used to evade signature-based detection.
The report also mentions the fact that the authors of malware may test their code using antivirus products before launching. Thus there will be a window when the malicious code is not detected, before the antivirus signatures are updated.
The technique used to avoid reputation-based detection is to register new domains to distribute the malware, and to set up redirectors that send traffic to these domains.
The report mentions that the median lifespan of malicious sites was reduced from over one month between 2007 and 2009, to one week in July 2010, and to 2 hours in October 2010.
IP cloaking is an evasion technique that serves benign content to malware detection systems and malicious content to ordinary users. The simplest way to accomplish this is to withhold the malicious content from requests coming from certain IP addresses (those belonging to the detection systems).
According to the report, IP cloaking has increased steadily since the beginning of 2008. At the time the report was written, it was said to be a significant contributor to the total number of malicious web sites.
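The server-side decision behind IP cloaking is simple, as the following sketch suggests. The network range (a reserved documentation range), the function names and the string responses are placeholders chosen for this illustration. The second function, which compares responses seen from two vantage points, is one conceivable check and is an assumption of this sketch rather than a method described in the report.

```python
import ipaddress
from typing import Callable

# Hypothetical list of networks the attacker believes belong to scanners.
# 192.0.2.0/24 is a range reserved for documentation.
SCANNER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def cloaked_response(client_ip: str) -> str:
    """Return placeholder content depending on who is asking."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in SCANNER_NETWORKS):
        return "benign page"
    return "malicious page"


def looks_cloaked(fetch: Callable[[str], str],
                  scanner_ip: str, other_ip: str) -> bool:
    """Flag a site whose content differs between two source addresses."""
    return fetch(scanner_ip) != fetch(other_ip)


print(cloaked_response("192.0.2.10"))    # request from a listed scanner range
print(cloaked_response("198.51.100.7"))  # request from any other address
print(looks_cloaked(cloaked_response, "192.0.2.10", "198.51.100.7"))
```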
The conclusion of the report corresponds with the researchers' initial hypothesis:
(...) [m]alware authors continue to pursue delivery mechanisms that can confuse different malware detection systems.
The primary countermeasure that is recommended is:
(...) adopting a multi-pronged approach can improve detection rates.
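One way to picture a multi-pronged approach is to run all four kinds of detector and flag a page if any of them fires. The sketch below is only an illustration: the detector functions are stubs that read hypothetical, precomputed fields from a dictionary, and the simple "any detector fires" rule is an assumption, not the combination logic used by Google.

```python
from typing import Callable, Dict

Page = Dict[str, object]
Detector = Callable[[Page], bool]


# Stub detectors: each reads a hypothetical precomputed field of the page.
def honeypot_detector(page: Page) -> bool:
    return bool(page.get("changed_system_state"))


def emulator_detector(page: Page) -> bool:
    return bool(page.get("triggers_known_exploit"))


def antivirus_detector(page: Page) -> bool:
    return bool(page.get("matches_signature"))


def reputation_detector(page: Page) -> bool:
    return bool(page.get("host_has_bad_history"))


DETECTORS: Dict[str, Detector] = {
    "honeypot": honeypot_detector,
    "emulator": emulator_detector,
    "antivirus": antivirus_detector,
    "reputation": reputation_detector,
}


def verdicts(page: Page) -> Dict[str, bool]:
    """Run every detector and report each result by name."""
    return {name: detect(page) for name, detect in DETECTORS.items()}


def is_flagged(page: Page) -> bool:
    """Flag the page if at least one detector fires."""
    return any(verdicts(page).values())


# A page that evades signatures but is hosted on known-bad infrastructure.
page = {"matches_signature": False, "host_has_bad_history": True}
print(verdicts(page))
print(is_flagged(page))
```

A page that evades one detector can still be caught by another. The trade-off of this simple rule is that the false positives of every detector add up as well.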