Bug #48709 metaphone and 'wh'
Submitted: 2009-06-28 05:29 UTC Modified: 2009-06-29 03:35 UTC
From: brettz9 at yahoo dot com Assigned: felipe (profile)
Status: Closed Package: Unknown/Other Function
PHP Version: 5.2, 5.3, 6CVS-2009-06-28 (snap) OS: Windows
Private report: No CVE-ID: None
 [2009-06-28 05:29 UTC] brettz9 at yahoo dot com
Description:
------------
The comments in the source for metaphone() state that "WH becomes H" (line 227), and the code on line 233 phonizes Next_Letter when a 'W' is followed by an 'H'. Testing confirms this behaviour: 'whit' returns "HT".

However, according to the metaphone() algorithm (see http://aspell.net/metaphone/ and the original implementation at http://aspell.net/metaphone/metaphone.basic ), 'wh' should become 'w', not 'h'.

So, to fix it, you could change the comment on 233 to:

/* WH becomes W, 

and then change the 'if' on 231-234 to:

if (Next_Letter == 'H') {
    Phonize('W');
    w_idx += 2;
} else if (Next_Letter == 'R') {
    Phonize(Next_Letter);
    w_idx += 2;
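
For illustration only, the proposed rule for a word-initial 'W' can be sketched as a standalone C function. This is not the code from PHP's metaphone.c: the names initial_w_code and is_vowel are hypothetical, and the "W before a vowel" branch is an assumption based on the follow-up discussion in this report rather than on the patch itself.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from metaphone.c): plain A/E/I/O/U test. */
static int is_vowel(char c)
{
    switch (toupper((unsigned char)c)) {
    case 'A': case 'E': case 'I': case 'O': case 'U':
        return 1;
    default:
        return 0;
    }
}

/*
 * Hypothetical sketch of the proposed word-initial 'W' rule.
 * Returns the code letter emitted for the leading 'W', or '\0' when
 * nothing is emitted (including when the word does not start with 'W').
 */
static char initial_w_code(const char *word)
{
    char next;

    if (word == NULL || toupper((unsigned char)word[0]) != 'W')
        return '\0';

    next = (char)toupper((unsigned char)word[1]);

    if (next == 'H')        /* WH becomes W (the proposed change) */
        return 'W';
    if (next == 'R')        /* WR becomes R */
        return 'R';
    if (is_vowel(next))     /* assumed: W is kept before a vowel */
        return 'W';

    return '\0';            /* otherwise the W is ignored */
}
```

Under this sketch, initial_w_code("Whit") yields 'W', which matches the first letter of the expected result "WT" below.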

Reproduce code:
---------------
<?php

echo metaphone('Whit');

?>

Expected result:
----------------
WT

Actual result:
--------------
HT

 [2009-06-28 18:45 UTC] felipe@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.

Fixed in 5.2 and HEAD.
It will be in 5.3 soon.

Thanks.
 [2009-06-29 02:50 UTC] brettz9 at yahoo dot com
Although my patch was indeed better than the previous form, I was mistaken in saying that 'wh' should be pronounced as 'w' per the original algorithm (the original keeps 'w' only when it is followed by a vowel).

Although treating 'wh' as 'w' (as you have it now) may indeed be more in the spirit of how metaphone() generally works, if you wish to have fidelity with the original algorithm such as implemented in Basic at http://aspell.net/metaphone/metaphone.basic (and as implemented on other systems), I think you would need to drop this handling.

So, to follow the original algorithm, it seems to me that the entire case for 'W' should be removed (line 230 in the previous snapshot).

But if you don't care about consistency with other original-based implementations (the Perl one added its own rules, including this one), then my previous patch should be kept.

My apologies for the confusion.
 [2009-06-29 03:32 UTC] felipe@php.net
But the initial letter exceptions are treated in the Basic code too.
 [2009-06-29 03:35 UTC] brettz9 at yahoo dot com
Ahh, ok, sorry, I failed to check that this time. (I have a few more patches for metaphone() coming your way, though I'll open a new bug for them.)
 