Ever wondered how the good guys catch the bad guys? I mean, what's the process for catching the people who create viruses, malware, and crypto-lockers? It's something of a grey area, and the methods used to catch those who create and distribute viruses and malware are rarely discussed openly. In a cyber environment, live bait becomes honeypots: fake targets made to attract cyber criminals. Big data analysis replaces interrogations, as millions of data points are analyzed and visually represented to determine behavioral patterns. Forensic analysis is just as clinical: viruses are carefully dissected in post-mortem analyses. BTW, fair warning … TLDR
The end result? An average of 130,000 malicious files and more than 20,000 infected websites are detected and tracked every single day. Several million connections per minute are made to honeypot servers in order to push real-time countermeasures to our users’ computers and mobile devices.
Where Hackers Hang Out
Antivirus companies gather information about malware, and about the hackers who create it, by infiltrating hacker chat rooms, monitoring poisoned websites, and collecting forensic samples of malicious code to reverse engineer. How these anti-malware and antivirus teams infiltrate chat rooms is a well-guarded secret, but we can guess: they hire ex-cons and bait wannabe hackers into contributing code to real hackers, gaining trust and infiltrating hacking groups. In some cases, they hack the hackers’ infrastructure or simply reverse engineer their code to gain insider knowledge. The primary method of collecting code samples to reverse engineer, however, is well known: ‘honeypots.’
Honeypots, Honeynets & Honeyfarms
If you were a hacker trying to spread malware, you would scan a subnet looking for vulnerable computers. Once you’ve found one, voila… you’ve just got your first zombie. Remember rxbot? No? Ahem!! Anyway, honeypots are regular computers, left unprotected and connected to the Internet. Hackers regularly scan IP addresses across the Internet looking for vulnerable computers like these, and they have automated tools to penetrate these machines and look around for banking documents, passwords, address books, or any other information that could be sold en masse to spammers and other hackers. Often they leave behind a keylogger (which records keystrokes) and back doors (for remote access) so they can continue to harvest new passwords and credit card data, and turn the computer into a botnet node or spam relay.
Virus hunters monitor these hacker intrusions and learn as much as they can about criminals’ tools and intentions. To increase the likelihood of being discovered and hacked, they set up rows of these honeypots on virtual machines. To avoid the liability of infected honeypots being used to harm others (known as ‘downstream liability’), the honeypots are usually set up in a honeynet behind a firewall that allows all Internet traffic to come in, but none to go back out. The whole operation is called a honeyfarm. So, in short:
Honeypot <-> lives in a Honeynet (a separate, firewalled network) <-> any number of Honeypots and Honeynets together make up a Honeyfarm.
Of course, hackers monitor the forwarding of log files, keystroke captures, and outbound traffic. So virus hunters have developed a number of ways to mimic such activity, letting them gather as much information as possible from the honeypot without the hackers realizing the activity is fake.
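To make the honeypot idea a bit more concrete, here is a minimal sketch of a so-called low-interaction honeypot: a fake service that accepts a connection, shows a made-up banner, and records whatever the visitor sends. It is only an illustration under stated assumptions. The function names, the FTP-style banner, and the log format are all invented for this example, and a real honeyfarm would add the isolation and outbound firewalling described above.

```python
import socket
import threading
from datetime import datetime, timezone


def run_honeypot(server_sock, log, max_connections=1,
                 banner=b"220 FTP server ready\r\n"):
    """Accept connections and record what each visitor sends.

    The banner is a made-up greeting that only pretends to be a real service.
    """
    for _ in range(max_connections):
        conn, (ip, port) = server_sock.accept()
        with conn:
            conn.settimeout(2.0)
            conn.sendall(banner)
            try:
                data = conn.recv(4096)
            except socket.timeout:
                data = b""
            # Keep a record of who connected and what they tried.
            log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "source_ip": ip,
                "source_port": port,
                "payload": data,
            })


def demo():
    """Run the honeypot on a local port and probe it once, as a scanner would."""
    log = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen()
    port = server.getsockname()[1]

    worker = threading.Thread(target=run_honeypot, args=(server, log))
    worker.start()

    # Simulated intruder: read the banner, then attempt a login.
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as probe:
        probe.recv(1024)
        probe.sendall(b"USER admin\r\n")

    worker.join()
    server.close()
    return log
```

Calling `demo()` returns a list with one log entry describing the simulated intrusion attempt. The sketch only listens on the local loopback address on purpose, so nothing here is exposed to the Internet.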
Social Engineering
Social engineering is the Achilles’ heel of antivirus security: we can stop malware code from executing on your machine, but no one can stop you from clicking on something you shouldn’t. Nigerian banking scams, ransomware (like CryptoLocker and CryptoDefense), spear-phishing, and other successful scams all rely on social engineering in combination with malware technology.
Social engineering is the art of convincing a computer user to willingly divulge passwords, social security numbers, and other personal identity information that can be harvested and sold to others, or held as ransom for immediate payment. Social engineering can also be used to convince users to download fake anti-virus software (which does exactly the opposite of protecting you) or to visit fake banking websites where you unwittingly enter your account numbers and passwords.
Analyse, identify and neutralize a virus in seconds
Many of these hack attempts are merely variants of malware we’ve seen before, so we can quickly handle them. But the exploits known as “zero-day” attacks are much more complicated to handle since they exploit unknown security holes in browsers, Java, Adobe Flash and other common software to infect the victim’s computer. How does one analyze massive data streams, identify unknown malicious files, and reverse-engineer them to code an antidote in seconds?
Virus hunters use various types of ‘big data’ analysis, coupled with proprietary multivariate statistics and clustering, to predict which samples are likely to be malware. These predictions are based on automated analyses of the malware’s characteristics and statistical comparisons of those characteristics to other ‘families’ or clusters of malware with similar characteristics.
When a virus hunter encounters a potential ‘zero-day’ threat, they determine its key characteristics and check whether it is similar to a known family of malware, in which case a known set of countermeasures might work to stop it. Now of course the bad guys use many tricks to obfuscate their executables in order to evade detection by automated antivirus systems: they use clever algorithms to slightly alter key elements of their executable payloads so that they ‘look’ different each time. In some cases, they simply encrypt the payloads, making them completely opaque to AV scanners. Fortunately, predictive analytics can identify this behavior.
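The real statistics behind this are proprietary, but the basic idea of comparing a new sample to known families can be sketched with something much simpler. The example below uses Jaccard similarity (shared characteristics divided by all characteristics) as a stand-in technique. The family names, the feature names, and the 0.5 threshold are all hypothetical, chosen purely for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap / size of union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def closest_family(sample_features, families, threshold=0.5):
    """Return (family_name, score) for the most similar known family.

    If no family reaches the threshold, the name is None, meaning the
    sample does not clearly belong to any known cluster.
    """
    best_name, best_score = None, 0.0
    for name, features in families.items():
        score = jaccard(sample_features, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical families and characteristics, invented for this example.
KNOWN_FAMILIES = {
    "ransomware_like": {
        "encrypts_user_files", "deletes_shadow_copies",
        "drops_ransom_note", "contacts_c2",
    },
    "keylogger_like": {
        "hooks_keyboard", "writes_hidden_log", "contacts_c2",
    },
}

# A new, never-seen sample that has been slightly altered (it is packed).
unknown_sample = {
    "encrypts_user_files", "drops_ransom_note",
    "contacts_c2", "packed_executable",
}

family, score = closest_family(unknown_sample, KNOWN_FAMILIES)
```

Here the unknown sample shares three of five total characteristics with the hypothetical ransomware family, so it is matched to that family even though it is not identical to anything seen before. That is the point of family-based comparison: small mutations change the fingerprint but not the overall shape.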
Scanning engines use the following three basic techniques to identify and stop bad things from happening to your device:
- Scan the traffic looking for any signatures that match to a database of known viruses and Trojans;
- Watch for programs on your device that start to exhibit malicious-looking behavior, such as making changes to your Registry or unpacking additional coded malware;
- Analyze program code (sometimes by disassembling it) to look for malicious things. This analysis is often very complex and is usually done in the cloud on our high-powered servers, with the results sent back to your device almost instantaneously.
All of the above is happening on your computer in milliseconds, which is pretty amazing when you think of it.
I highlighted some key words because these are words that many people have heard of when reading about antivirus and security software. If you are interested, we’ll now discuss each of them in a bit more detail.
Signature-based detection is the most common antivirus technique. Every piece of malware has a unique fingerprint (it could be a particular series of bytes in the code, a cryptographic hash of the file, or any other identifiable element), and that fingerprint can be matched against a database of known viruses and Trojans. The advantage of signature-based detection is that it is fast and, for malware already in the database, essentially 100% effective. The downside is that it won’t stop any virus or malware that hasn’t been seen before, and the bad guys are good at mutating their exploits to evade detection while retaining their functionality. So signature-based detection is an efficient first pass, but it needs to be used alongside other detection methods.
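A minimal sketch of the two kinds of fingerprint mentioned above, a whole-file hash and a byte sequence, might look like the following. The signature names, the sample bytes, and the byte pattern are harmless placeholders invented for this example, not real malware signatures.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Cryptographic hash of the whole file contents."""
    return hashlib.sha256(data).hexdigest()


def scan(data: bytes, known_hashes: dict, byte_patterns: dict) -> list:
    """Return the names of all signatures that match the given bytes."""
    matches = []
    # 1) Exact match on the hash of the whole file.
    digest = sha256_of(data)
    if digest in known_hashes:
        matches.append(known_hashes[digest])
    # 2) Match on a characteristic byte sequence anywhere in the file.
    for name, pattern in byte_patterns.items():
        if pattern in data:
            matches.append(name)
    return matches


# Placeholder "database": these are made-up, harmless values.
known_sample = b"FAKE-MALWARE-SAMPLE-001"
KNOWN_HASHES = {sha256_of(known_sample): "Demo.Hash.A"}
BYTE_PATTERNS = {"Demo.Pattern.B": b"\xde\xad\xbe\xef"}

exact_hit = scan(known_sample, KNOWN_HASHES, BYTE_PATTERNS)
mutated_miss = scan(known_sample + b"!", KNOWN_HASHES, BYTE_PATTERNS)
pattern_hit = scan(b"header\xde\xad\xbe\xeftrailer", KNOWN_HASHES, BYTE_PATTERNS)
```

The `mutated_miss` case shows the weakness described above: appending a single byte changes the hash completely, so the mutated copy slips past the hash signature. The byte-pattern signature is a little more tolerant, since it still matches when the surrounding bytes change.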
Behavior-based detection (sometimes used interchangeably with the term ‘heuristics-based’ detection) identifies malware by watching for suspicious behaviors, like attempts to modify the hosts file or initiating connections to dubious IP addresses. Strictly speaking, heuristics-based detection means statically examining files for suspicious traits without requiring an exact signature match. Although no single behavior or observation might be enough to declare a file malware, taken together, behavior- and heuristics-based techniques can flag files and add up scores. By setting threshold scores and stopping any executable code that surpasses them, the antivirus tool can detect previously unseen malware and keep your system protected.
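The score-and-threshold idea can be sketched in a few lines. The behavior names, the weights, and the threshold of 60 below are hypothetical values picked for illustration. A real engine would tune them against large amounts of data.

```python
# Hypothetical behaviors and weights, chosen only for illustration.
BEHAVIOR_WEIGHTS = {
    "modifies_hosts_file": 40,
    "connects_to_suspicious_ip": 30,
    "edits_registry_autorun": 25,
    "unpacks_code_at_runtime": 20,
    "reads_user_documents": 5,
}

BLOCK_THRESHOLD = 60  # assumed cut-off for this sketch


def risk_score(observed_behaviors) -> int:
    """Add up the weights of every distinct behavior that was observed.

    Behaviors that are not in the table contribute nothing.
    """
    return sum(BEHAVIOR_WEIGHTS.get(b, 0) for b in set(observed_behaviors))


def should_block(observed_behaviors, threshold=BLOCK_THRESHOLD) -> bool:
    """Block the program only when its total score surpasses the threshold."""
    return risk_score(observed_behaviors) > threshold


single = ["modifies_hosts_file"]
combined = ["modifies_hosts_file", "connects_to_suspicious_ip"]
```

With these numbers, a program that only touches the hosts file scores 40 and is allowed to run, while one that also calls out to a suspicious IP address scores 70 and is stopped. No single behavior is damning on its own, but the combination is.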
Cloud-based detection collects potential malware samples from your computer and sends them to high-powered servers for analysis. This technique minimizes the load put on your computer. Cloud engines can observe patterns and correlate data across millions of other users. Another benefit of moving some analysis to the cloud is that it makes it more difficult for the bad guys to reverse-engineer our scanning engines and test their malware against them without identifying themselves.
Sandbox-based analysis uses a security mechanism that separates running programs, usually to keep system failures or software vulnerabilities from spreading. It is often used to execute untested or untrusted programs or code, possibly from unverified or untrusted third parties, suppliers, users, or websites, without risking harm to the host machine or operating system. A sandbox typically provides a tightly controlled set of resources for guest programs to run in, such as scratch space on disk and in memory. Network access, the ability to inspect the host system, and reading from input devices are usually disallowed or heavily restricted. This makes sandboxing well suited to testing unverified programs that may contain a virus or other malicious code.