[2010-12-02 17:23 UTC] islam dot sharabash at gmail dot com
Description:
------------
No matter what string hashes are input, ssdeep_fuzzy_compare only returns 0.

Reproduce code:
---------------
<?php
$string = "blah";
echo "blah";
?><br><?
var_dump($wee = ssdeep_fuzzy_hash($string));
?><br><?
var_dump(ssdeep_fuzzy_compare($wee, $wee));
?>

Expected result:
----------------
blah
string(7) "3:2n:2n"
int(100) //Match of 100 is expected

Actual result:
--------------
blah
string(7) "3:2n:2n"
int(0) //Match of 0 returned, no matter the input
Copyright © 2001-2025 The PHP Group. All rights reserved.
Last updated: Fri Oct 24 07:00:01 2025 UTC
Thank you for taking the time to write to us, but this is not a bug. It is a property of the ssdeep algorithm. The author of the upstream ssdeep package has previously noted that the algorithm becomes more accurate with content above 4KB in length. It is suited to checking files or large strings against each other, not individual words such as "blah".

A simple MD5 or SHA1 hash is designed for checking that two strings are identical. Context triggered piecewise hashing is designed to show approximately how similar two large strings or files are to each other.

Please try running the tests that come with the php_ssdeep package, or increase the length of your sample text. Whilst investigating your report I found that the following returns the result you expect:

php > $hash = ssdeep_fuzzy_hash('blahblahblahblahblahblahblahblahblahblahblahblah blahblahfegfhgdhdghgdhgdshgsdhghgsdhgdshghsdhgdhgsjgdsjgdjgjgsgsghg haasateytyuytkutdkusuht nmfbnzbnzbaerereyetywturyiuteutejbf najetjhr gtjidahoadfh aiohjda hipdj hhadphjfgpahjapeghut9euhiotejhi tjhe tjphjtejhgijdhkjhklghijst eih eapsjhpjtephjtpjhptjpihjtihjidasfjh dhj dpasiojh poeatojh ohj tpeojhpoaetjhoptejhoteajhotad jhpoeatjhpotejhpoitejbjgji9rtsbiprpbjtaephetnhjetapihjpet eh peoaj hpejpteajhegbmzcklhkghjgdj hhj thj teabnpteanmpaeotnmp[');
php > var_dump(ssdeep_fuzzy_compare($hash, $hash));
int(100)

Again, though, this extension is not intended to look for identical strings; use SHA1 or MD5 hashes if you need to ensure they are the same. If you want a similarity match, then ssdeep is the right way to go.
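To make the distinction above concrete, here is a minimal sketch that separates the two use cases. The helper names strings_identical() and fuzzy_similarity() are hypothetical and are not part of the ssdeep extension. The fuzzy helper assumes the extension is loaded and otherwise returns null; no score is shown for it, since the result depends on the input size as described above.

```php
<?php
// Hypothetical helper: exact-equality check, the case the reply
// recommends SHA1 or MD5 for.
function strings_identical(string $a, string $b): bool
{
    // hash_equals() compares the two hex digests in constant time.
    return hash_equals(sha1($a), sha1($b));
}

// Hypothetical helper: similarity score between two inputs.
// Only meaningful for large strings or files; returns null when the
// ssdeep extension is not available.
function fuzzy_similarity(string $a, string $b): ?int
{
    if (!function_exists('ssdeep_fuzzy_hash')) {
        return null;
    }
    return ssdeep_fuzzy_compare(ssdeep_fuzzy_hash($a), ssdeep_fuzzy_hash($b));
}

var_dump(strings_identical('blah', 'blah'));  // bool(true)
var_dump(strings_identical('blah', 'blah!')); // bool(false)
```

For short inputs such as "blah", only the first helper gives a dependable answer; the second is better reserved for comparing long texts or file contents.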