VirusTotal.com Suggestion – Something of naught?
Many of you know that no antivirus program is guaranteed to pick up all viruses. While some programs tend to lead in terms of having the most complete antivirus definitions, no one program has 100% complete virus definitions. Even if you were to do your homework, citing studies done under the scientific method, not one program would have 100% detection based on just one antivirus company’s definitions.
Because of the lack of 100% detection in any antivirus client, there is always a chance that the current antivirus you’re using has not detected a virus in a file you’ve downloaded. Potentially you could have a virus, and just not know it yet.
This is where free web-based services like VirusTotal.com come into play. At VirusTotal.com, you can upload any file, and it gets scanned with (currently) about 30 different antivirus engines, all running the latest of their own virus definitions. You then get to see the results from each scanning engine, all cleanly laid out in a table displaying the details of the antivirus engine and the name of the virus, if it detected one. This gives you extra security beyond your own antivirus for files that you deem suspicious. It also lets you compare antivirus engines and definition providers, and form your own opinion based on your experience with their detection consistency.
The purpose of this article is not to compare the antivirus engines themselves, but to look at the potential VirusTotal.com has to drive more traffic and attention to their site.
So, equipped with a keyboard and (currently) ten fingers, I decided to write to VirusTotal.com with a suggestion. Keep in mind that my email makes it clear that I fully understand there is no such thing as 100% detection, and that VirusTotal.com can’t be a replacement for antivirus.
> Hello. I really like your free service you provide. I understand that
> every file that gets uploaded is hashed and the hash / checksum is
> stored. I have a suggestion on how you could use that hash database to
> really improve exposure to your services.
>
> If you could make a client for Microsoft Windows that would download
> regular updates from your hash database, and scan files in realtime
> comparing their hash with the ones in your database, this could provide
> an excellent Backup Antivirus Solution. While it would only catch things
> if their exact hash matched, I understand it wouldn’t be a complete
> replacement for antivirus with heuristics, as viruses can be
> polymorphic, and have a different hash in every computer they infect.
> However, at least it would be considered a very reliable secondary
> Antivirus program, So if the user’s main antivirus program didn’t catch
> it, one of your other 27 antivirus that had already scanned it and had
> the hash on file would be a reliable database to compare the hash against.
Hello Tanner,
VirusTotal is just a tool for checking files, it is by no means a
substitute of an antivirus, even as backup or secondary option.
–
Regards,
Julio Canto | VirusTotal.com | Hispasec Sistemas Lab | Tlf: +34.902.161.025
| Fax: +34.952.028.694 | PGP Key ID: EF618D2B | jcanto@hispasec.com
Does he just not see the potential I see? I completely understand that checking files against VirusTotal.com’s hash list only works if the antivirus engines it uses already have definitions for the given virus, and if VirusTotal.com already has the file’s hash stored…
Let’s say a given user has their primary antivirus, but they want a little more… why not have a system utility that checks files against the VirusTotal.com hash database and, if the hash doesn’t exist, automatically uploads the file (if it is smaller than X megabytes) for virus checking by VirusTotal.com?
So, here’s my thinking… If VirusTotal.com won’t write a utility, someone out there easily could, assuming they know a common programming language and can write a program that parses the HTML from VirusTotal.com’s results.
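To make the check-then-upload idea concrete, here is a minimal sketch in Python. It is not anything VirusTotal.com provides: the function names, the `known_results` dictionary standing in for a local copy of the hash database, the use of MD5, and the 20 MB value standing in for "X megabytes" are all my own assumptions for illustration. It only decides what should happen next; the actual upload and result parsing are left out.

```python
import hashlib
import os

# Hypothetical stand-in for the "X megabytes" upload limit.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def next_step(path, known_results, max_upload_bytes=MAX_UPLOAD_BYTES):
    """Decide what to do with a file.

    known_results is an assumed local mapping of hash -> earlier verdict.
    Returns ("report", verdict) when the hash is already known,
    ("upload", None) when it is unknown and small enough to submit,
    and ("skip", None) when it is unknown but over the size limit.
    """
    file_hash = md5_of_file(path)
    if file_hash in known_results:
        return ("report", known_results[file_hash])
    if os.path.getsize(path) <= max_upload_bytes:
        return ("upload", None)
    return ("skip", None)
```

A real utility would replace `known_results` with whatever lookup the service actually allows, and would need the service's permission before uploading files automatically.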
Does he just not see the potential I do?
- Increased service and company exposure
- Higher web traffic, increasing domain value, advertising potential
- With the increased exposure, he could begin to offer new services… and charge a decent fee for them
Look at how Google started, and their business model initially, and their business model now. Initially they started with the idea of collecting and analyzing information about how people go from searching the internet, to how they identify and sort through the results to get to what is relevant. Google stores all this information, and uses it in part to automatically improve search results for everyone based on previous A to B web browsing experiences from other Googlers.
In my eyes, what I see and what he didn’t see is the potential that his utility has. But, now some other willing soul will come along, see where he decided to drop the ball (my email and his response should give you a very good clue) and build upon the ground where VirusTotal.com is stagnant. After all, that is the nature of the law of progression.

http://www.tot-ltd.org/md5db/0000
http://www.tot-ltd.org/md5db/FFFF
I emailed him quite a while ago about the same thing, only to be rebuffed. The next database update will include just over 5 million md5 hashes. Since nobody else has the balls to step up to the plate and put their money where their mouth is, I figured I might as well be the first one to do it.
Adds an Explorer context menu item that checks the MD5 hash against Virus Total’s database.
@Andrew, thanks for that. Though, I think I’ve already taken a look at your utility. Unfortunately that is not quite what I’m looking for, as it is only a context-called function, and beyond that, all it does is initiate the uploading of the file for me if I want to check it. All it really does is save me from having to type in virustotal.com and press my browse button. What I’m looking for is something that actually hashes the file, then searches for the hash online, then returns the parsed results back to the program on my computer, all in as close to real time as possible. Support for database caching would be excellent, and even auto-rescanning if the previous scan was more than 7 days ago. I would like a program that runs as a background process, something like a real antivirus program, constantly hashing files and checking those hashes against the scan results in the VT database.
@idbeholda It looks like you have a good database there and growing, but I have a few questions for you.
How have you made this available for people looking to build up applications around it?
What software are you currently using to generate your database, and, what anti-virus programs are scanning files?
How does one submit a file for scanning, and do you have an API available?
The database is open to anyone who wants to use it. There is no “API” required, other than an internet connection of some kind that is able to access plain text files over standard HTTP.
The addresses run from http://www.tot-ltd.org/md5db/0000 through http://www.tot-ltd.org/md5db/FFFF, where the last four characters are the first 4 digits of the MD5 hash that you’re looking to verify/check for the presence of malware. Entries in the text files are separated by line feeds only, which keeps the total size down, since the database is uncompressed for ease of use and access. Even though the database is distributed across 65536 files, each section is only a few KB in size.
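For anyone wanting to build on that layout, a lookup could be sketched roughly like this in Python. The helper names are hypothetical, and two details are assumptions on my part rather than anything confirmed above: that the section names use uppercase hex (as in the 0000 and FFFF addresses), and that each line of a section holds one full 32-character MD5 hash.

```python
import hashlib
import urllib.request

BASE_URL = "http://www.tot-ltd.org/md5db/"


def md5_hex(data):
    """Return the MD5 of a bytes object as lowercase hex."""
    return hashlib.md5(data).hexdigest()


def shard_url(md5_hash):
    """Build the section address from the first 4 hex digits of the hash.

    Assumption: section names are uppercase hex, like 0000 and FFFF.
    """
    return BASE_URL + md5_hash[:4].upper()


def hash_in_shard(md5_hash, shard_text):
    """Check for the hash in a section's text.

    Assumption: one full MD5 hash per line, compared case-insensitively.
    """
    wanted = md5_hash.strip().lower()
    return any(line.strip().lower() == wanted for line in shard_text.splitlines())


def fetch_shard(md5_hash, timeout=10):
    """Download the section that would contain the given hash (needs network)."""
    with urllib.request.urlopen(shard_url(md5_hash), timeout=timeout) as response:
        return response.read().decode("ascii", errors="replace")
```

Usage would be along the lines of `hash_in_shard(h, fetch_shard(h))`, with the downloaded section cached locally so the same few KB are not fetched repeatedly.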
I currently have a system scanner up and available for download at http://www.tot-ltd.org/TT-Livescan.rar. Scan times are limited almost entirely by the hardware capabilities of the system the scanner is running on, and it can process just over 50GB/Min at its peak.
As for the source of the database, one of my previous projects was VTE Virus Scanner. Since its inception, I’ve obtained samples and lists from countless places across the internet. Most recently, I obtain/utilize google’s malware blacklist, clamav.net’s database, honeynet.cz, and a few other sites that publicly list md5 hashes. Personally, I prefer to use jotti to double check samples that I obtain.
Hope this explanation was of some help/use.
@P. Tanner Williamson the tool I mentioned DOES check the hash; it doesn’t facilitate uploads beyond opening VT’s website.
@Andrew
I suppose that it is at least a good start. I see that it only works when called in via the explorer context menu. I would really like to see a utility that can run as a system process monitoring all files accessed, and check their hashes automatically.
I understand that is a bit more advanced than the tool you’ve developed thus far. While I don’t expect to see any such utility any time in the near future, it would be something of worth.
Beyond the active bloodhound monitoring I mentioned above, I would like the program to search for the hash and, if the hash doesn’t exist or the previous scan is older than a user-specified threshold, (re)upload the file, run the scan, and return the results. If a positive result is found, a visual warning could/should be displayed to the user, whereas if the file is clean, the process monitoring and hashing utility would continue on without interrupting the user.
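Roughly, the decision I have in mind could be sketched like this in Python. The `ScanRecord` type, the `decide` function, the cache dictionary, and the action names are all hypothetical, and the 7-day default is just the example threshold I mentioned earlier in this thread; the uploading and the on-screen warning are left to whatever tool implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScanRecord:
    """Hypothetical cached result for one file hash."""
    detections: int       # number of engines that flagged the file
    scanned_at: datetime  # when that scan was performed


def decide(file_hash, cache, now, max_age=timedelta(days=7)):
    """Pick the next action for a hashed file.

    "upload"   - hash not seen before, submit the file
    "rescan"   - cached result is older than the user's threshold
    "warn"     - fresh result with at least one detection
    "continue" - fresh, clean result, so do not interrupt the user
    """
    record = cache.get(file_hash)
    if record is None:
        return "upload"
    if now - record.scanned_at > max_age:
        return "rescan"
    if record.detections > 0:
        return "warn"
    return "continue"
```

A stale result is rescanned before any warning is shown, so the user only ever sees a verdict that falls inside their chosen threshold.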
@P. Tanner Williamson
I wouldn’t attempt to create a third-party tool such as that; it wouldn’t be fair to VirusTotal to swamp their servers in that way.
@Andrew
It is true that such a utility, especially in widespread use, would put a substantial load on their servers. For something like this to succeed, it would require the proper infrastructure in place and adequate resources. Agreed.