Bug #59527 ssdeep fuzzy_compare only returns 0
Submitted: 2010-12-02 17:23 UTC Modified: 2010-12-02 18:10 UTC
From: islam dot sharabash at gmail dot com Assigned:
Status: Not a bug Package: ssdeep (PECL)
PHP Version: 5.2.13 OS: centos 5
Private report: No CVE-ID: None
 [2010-12-02 17:23 UTC] islam dot sharabash at gmail dot com
No matter which string hashes are passed in, ssdeep_fuzzy_compare only returns 0.

Reproduce code:

$string = "blah";
echo "blah";
var_dump($wee = ssdeep_fuzzy_hash($string));
var_dump(ssdeep_fuzzy_compare($wee, $wee));


Expected result:
string(7) "3:2n:2n"

//Match of 100 is expected

Actual result:
string(7) "3:2n:2n"

//Match of 0 returned, no matter the input


 [2010-12-02 18:10 UTC]
Thank you for taking the time to write to us, but this is not
a bug.

This is a property of the ssdeep algorithm and not a bug. The author 
of the upstream ssdeep package has previously noted that the 
algorithm becomes more accurate with content above 4KB in length.

It is suited to checking files or large strings against each other, 
not individual words such as "blah". A simple MD5 or SHA1 hash 
is designed for checking that two strings are identical. Context 
triggered piecewise hashing is designed to show approximately how 
similar two large strings or files are to each other.
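
As a minimal sketch of the exact-match case described above: comparing 
SHA1 digests answers "are these two strings identical?" even for very 
short input such as "blah". The helper name strings_identical() is 
hypothetical and not part of any extension; the code sticks to syntax 
that should also run on PHP 5.2.

```php
<?php
// Hypothetical helper (not part of ssdeep or PHP itself): reports
// whether two strings have the same SHA1 digest.
function strings_identical($a, $b)
{
    // Identical input always yields identical digests; different
    // input yields different digests except for hash collisions.
    return sha1($a) === sha1($b);
}

var_dump(strings_identical("blah", "blah"));  // bool(true)
var_dump(strings_identical("blah", "blah!")); // bool(false)
```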

Please try running the tests that come with the php_ssdeep package 
or increase the length of your sample text. Whilst investigating 
your report I found that the following would return the result you 
expected:

php > $hash = ssdeep_fuzzy_hash('
haasateytyuytkutdkusuht nmfbnzbnzbaerereyetywturyiuteutejbf 
najetjhr gtjidahoadfh aiohjda hipdj hhadphjfgpahjapeghut9euhiotejhi 
tjhe tjphjtejhgijdhkjhklghijst eih 
eapsjhpjtephjtpjhptjpihjtihjidasfjh dhj dpasiojh poeatojh ohj 
jhpoeatjhpotejhpoitejbjgji9rtsbiprpbjtaephetnhjetapihjpet eh peoaj 
hpejpteajhegbmzcklhkghjgdj hhj thj teabnpteanmpaeotnmp[');
php > var_dump(ssdeep_fuzzy_compare($hash, $hash));

Again, though, this extension is not intended to look for identical 
strings, and you should be using SHA1 or MD5 hashes if you need to 
ensure they are the same. If you want to get a similarity match 
then ssdeep is the right way to go.
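
One way to avoid misleading 0 scores on short input is to ask ssdeep 
for a similarity score only when both strings are reasonably long. 
The sketch below assumes a 4096-byte cut-off, taken loosely from the 
4KB remark above; it is not a limit defined by the extension. The 
constant MIN_FUZZY_LENGTH and the helper fuzzy_similarity() are 
hypothetical names. Only ssdeep_fuzzy_hash() and 
ssdeep_fuzzy_compare() come from the ssdeep extension.

```php
<?php
// Assumed threshold, not an official ssdeep limit.
define('MIN_FUZZY_LENGTH', 4096);

// Hypothetical helper: returns an integer score from
// ssdeep_fuzzy_compare(), or null when no meaningful score is
// available (input too short, or the ssdeep extension not loaded).
function fuzzy_similarity($a, $b)
{
    if (strlen($a) < MIN_FUZZY_LENGTH || strlen($b) < MIN_FUZZY_LENGTH) {
        return null;
    }
    if (!function_exists('ssdeep_fuzzy_compare')) {
        return null;
    }
    return ssdeep_fuzzy_compare(ssdeep_fuzzy_hash($a), ssdeep_fuzzy_hash($b));
}

// "blah" is far below the assumed threshold, so no score is computed.
var_dump(fuzzy_similarity("blah", "blah")); // NULL
```

Returning null rather than 0 keeps "no meaningful score" distinct 
from a genuine score of 0 between two long inputs.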