Md5 Algorithm For Searching Viruses

md5

17 replies to this topic

#1 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 18 April 2012 - 02:22 PM

I have MD5 code that creates signatures from user input, but what I want is to scan a local disk or the whole computer for viruses by computing MD5 signatures and comparing them with the signatures in a .txt database. How can I do that search? Is there a tutorial, or can someone give me some code?
  • 0

Microsoft Student Partner, Microsoft Certified Professional


#2 WingedPanther73

WingedPanther73

    A spammer's worst nightmare

  • Moderator
  • 17757 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others

Posted 18 April 2012 - 02:28 PM

That sounds like a really bad idea. Let me explain my reasoning:

Let's say your signature is 12345. If you have a 2 MB file, you have to hash 5 characters around 2 million times, rather than just looking for the five characters. Which do you think will be more efficient?
  • 0

Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

My MineCraft server site: http://banishedwings.enjin.com/


#3 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 18 April 2012 - 03:07 PM

And what do you suggest? Which type of algorithm would be good?
  • 0



#4 Luthfi

Luthfi

    CC Leader

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1320 posts
  • Programming Language:PHP, Delphi/Object Pascal, Pascal, Transact-SQL
  • Learning:C, Java, PHP

Posted 18 April 2012 - 09:00 PM

I don't think you can use an MD5 hash to find virus signatures. Virus signatures are not the same as file signatures. A virus signature exists so that the virus code can avoid reinfecting an already infected file.

A virus can place itself at an arbitrary location in an infected file. Therefore, if you want to find viruses using an MD5 hash, you have to hash the file many times, perhaps as many times as the file size minus the size of the virus. That is very inefficient and time-consuming. Not to mention that the longer you hold a file open, the greater the chance that you will collide with another process or activity that needs access to the file.

For this, a good approach is to search for the virus signature directly in the examined file.
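
To make the cost difference concrete, here is a minimal Python sketch of both approaches. The function names, the sample signature, and the sample data are made up for illustration. The first function hashes every window of the data, which is what an MD5-based search for an embedded signature would have to do; the second simply looks for the bytes.

```python
import hashlib


def find_by_window_hash(data: bytes, sig_md5: str, sig_len: int) -> int:
    """Hash every sig_len-byte window; return the first matching offset or -1.

    This computes roughly len(data) - sig_len + 1 MD5 digests.
    """
    for offset in range(len(data) - sig_len + 1):
        window = data[offset:offset + sig_len]
        if hashlib.md5(window).hexdigest() == sig_md5:
            return offset
    return -1


def find_directly(data: bytes, signature: bytes) -> int:
    """Plain byte search; return the first matching offset or -1."""
    return data.find(signature)


# Hypothetical signature and file contents, for illustration only.
signature = b"12345"
data = b"A" * 1000 + signature + b"B" * 1000

by_hash = find_by_window_hash(data, hashlib.md5(signature).hexdigest(), len(signature))
by_search = find_directly(data, signature)
print(by_hash, by_search)
```

Both calls report the same offset, but the hashing version does one digest per byte position, while the direct search needs no hashing at all.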
  • 0

#5 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 18 April 2012 - 11:42 PM

And with which algorithm can I do that?
  • 0



#6 Luthfi

Luthfi

    CC Leader

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1320 posts
  • Programming Language:PHP, Delphi/Object Pascal, Pascal, Transact-SQL
  • Learning:C, Java, PHP

Posted 19 April 2012 - 12:03 AM

You don't need a fancy algorithm for this. Open the file as a stream of bytes, then inspect the bytes to see whether any virus signature is found within it. It is similar to how you would find a certain substring in a larger string of characters.
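
A rough Python sketch of that idea, assuming the signatures are available as raw byte strings (the names `scan_stream`, `signatures`, and `chunk_size` are invented for this example). The stream is read in chunks, and the last few bytes of each chunk are carried over so that a signature split across two chunks is still found.

```python
import io
from typing import BinaryIO, Dict, List, Tuple


def scan_stream(stream: BinaryIO, signatures: Dict[str, bytes],
                chunk_size: int = 64 * 1024) -> List[Tuple[str, int]]:
    """Return sorted (name, absolute offset) pairs for every signature hit."""
    sigs = {name: sig for name, sig in signatures.items() if sig}  # skip empty ones
    if not sigs:
        return []
    # Carry over enough bytes to catch a signature that straddles two chunks.
    overlap = max(len(sig) for sig in sigs.values()) - 1
    hits = set()      # a set, because the overlap region is searched twice
    tail = b""
    buf_start = 0     # absolute offset of buf[0] within the stream
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        for name, sig in sigs.items():
            pos = buf.find(sig)
            while pos != -1:
                hits.add((name, buf_start + pos))
                pos = buf.find(sig, pos + 1)
        tail = buf[-overlap:] if overlap > 0 else b""
        buf_start += len(buf) - len(tail)
    return sorted(hits)


# Hypothetical data and signatures; a tiny chunk size forces boundary cases.
sample = b"x" * 10 + b"EVIL" + b"y" * 10 + b"BAD"
found = scan_stream(io.BytesIO(sample), {"a": b"EVIL", "b": b"BAD"}, chunk_size=4)
print(found)
```

For a real file, the same function would be called with `open(path, "rb")` in place of the `io.BytesIO` object.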
  • 0

#7 papabear

papabear

    CC Devotee

  • Senior Member
  • PipPipPipPipPipPip
  • 472 posts
  • Location:DarkSide

Posted 19 April 2012 - 04:41 AM

Finding viruses using file signatures, or the MD5 of the file, is the old method that antivirus programs used to detect viruses. They call it "signature-based detection". You can use this method to create a simple, lightweight antivirus if you wish. However, there are new viruses that do not have a signature yet or are not yet known. Why not research heuristic analysis?
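
A minimal Python sketch of such a whole-file scanner follows. It assumes a text database with one entry per line in the form `<md5-hex> <name>`; that format, and the function names, are assumptions for this example, not something given in the thread. Note that this only detects files that are byte-for-byte identical to a known sample, since changing a single byte changes the MD5.

```python
import hashlib
import os
from typing import Dict, Iterator, Tuple


def load_signatures(path: str) -> Dict[str, str]:
    """Parse lines of the assumed form '<md5-hex> <name>' into {md5: name}."""
    db = {}
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            parts = line.split(None, 1)
            if not parts or parts[0].startswith("#"):
                continue  # blank line or comment
            name = parts[1].strip() if len(parts) > 1 else "unknown"
            db[parts[0].lower()] = name
    return db


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root: str, db: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield (file path, signature name) for every file whose MD5 is in db."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            full = os.path.join(dirpath, fname)
            try:
                name = db.get(md5_of_file(full))
            except OSError:
                continue  # unreadable or locked file: skip it
            if name is not None:
                yield full, name
```

Usage would look something like `for path, name in scan_tree("C:\\", load_signatures("signatures.txt")): print(path, name)`, where both paths are placeholders.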
  • 0
Life has no CTRL+Z
Never Forget To HIT "LIKE" If I Helped

#8 WingedPanther73

WingedPanther73

    A spammer's worst nightmare

  • Moderator
  • 17757 posts
  • Location:Upstate, South Carolina
  • Programming Language:C, C++, PL/SQL, Delphi/Object Pascal, Pascal, Transact-SQL, Others
  • Learning:Java, C#, PHP, JavaScript, Lisp, Fortran, Haskell, Others

Posted 19 April 2012 - 07:29 AM

Tonchi wrote: "And what do you suggest? Which type of algorithm would be good?"

http://msdn.microsoft.com/en-us/library/ms228630%28v=VS.80%29.aspx
  • 0



#9 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 19 April 2012 - 11:50 AM

papabear wrote: "Finding viruses using file signatures, or the MD5 of the file, is the old method that antivirus programs used to detect viruses. They call it "signature-based detection". You can use this method to create a simple, lightweight antivirus if you wish. However, there are new viruses that do not have a signature yet or are not yet known. Why not research heuristic analysis?"


I tried heuristic algorithms, but I was not able to find a code example in any language, neither C/C++ nor .NET. I have been searching for such an algorithm for about half a year :/
  • 0



#10 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 19 April 2012 - 01:52 PM

Is this book good for a beginner? http://www.cs.nott.a...pdf/cag-phd.pdf
  • 0



#11 papabear

papabear

    CC Devotee

  • Senior Member
  • PipPipPipPipPipPip
  • 472 posts
  • Location:DarkSide

Posted 19 April 2012 - 02:20 PM

Tonchi wrote: "Is this book good for a beginner? http://www.cs.nott.a...pdf/cag-phd.pdf"


Yes, try reading that e-book. From skimming it, I think it would be a good one. For heuristic analysis, you need something like a "virtual machine" to run a program and analyze what it does. :)
  • 0

#12 Tonchi

Tonchi

    Helping the world with programming

  • Expert Member
  • PipPipPipPipPipPipPip
  • 1249 posts
  • Location:Zagreb
  • Programming Language:C#, Others
  • Learning:C, C++, Python, JavaScript, Transact-SQL, Assembly

Posted 19 April 2012 - 02:36 PM

It fascinates me that whenever I want to find something on Google I can't find it, and when I search for something at random I find what I was trying to find before :D
  • 0






