Bug #4556: get_meta_tags (fix supplied in this bug report)
Submitted: 2000-05-23 01:00 UTC
Modified: 2001-02-10 21:40 UTC
From: c dot just at phoenixdigital dot com
Assigned:
Status: Closed
Package: Strings related
PHP Version: 4.0 Latest CVS (23/05/2000)
OS: ALL (Bug in source code)
Private report: No
CVE-ID: None

 [2000-05-23 01:00 UTC] c dot just at phoenixdigital dot com
Looking at the PHP4 distribution, it seems to contain the same code, so it would be affected by this problem as well.

I previously submitted this bug to the PHP3 bug list; this fix will apply there as well.
The PHP3 bug number is ???????. I just went to find it and it appears to have been deleted.
Please let me know if you feel this should not be fixed.

The problem arises when a metatag definition takes up more than 1 line.

The current function works for this definition:
<META NAME="DC.Language" SCHEME="freetext" CONTENT="english">

The current function fails for this definition:
<META NAME="DC.Coverage.PlaceName" SCHEME="TGN" 
CONTENT="Oceania ? Australia ? Queensland ? Brisbane" > 
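
To check a given build, the two definitions above can be fed to get_meta_tags() from a temporary file. This is only a sketch for a current PHP release (file_put_contents() and sys_get_temp_dir() did not exist in PHP 4.0); the surrounding <html><head> wrapper and the temporary file name are assumptions made for the test and are not part of the report.

```php
<?php
// Assumed fixture: the two META definitions from this report inside a minimal <head>.
$html = "<html><head>\n"
      . "<META NAME=\"DC.Language\" SCHEME=\"freetext\" CONTENT=\"english\">\n"
      . "<META NAME=\"DC.Coverage.PlaceName\" SCHEME=\"TGN\" \n"
      . "CONTENT=\"Oceania ? Australia ? Queensland ? Brisbane\" > \n"
      . "</head><body></body></html>\n";

// get_meta_tags() reads from a file name or URL, so write the fixture to disk first.
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($path, $html);

$tags = get_meta_tags($path);
unlink($path);

// On an affected build the multi-line tag would be missing from this array.
print_r($tags);
```

The single-line tag should appear under the key dc_language, since get_meta_tags() lower-cases names and replaces special characters with "_". Whether the second key is present shows whether the build handles tags that span lines.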

I am not yet adept at coding in C, so I have written the fix as a PHP function :)
Please feel free to ask me any questions regarding how it works.
-------------------------------------------------------------------------------
/************************************************
* GET_META_TAGS2
*************************************************
*
* IT IS A FIX FOR THE CURRENT VERSION OF THIS FUNCTION WITHIN PHP.
*
* THIS FUNCTION TAKES INTO ACCOUNT METATAGS WHICH SPAN MORE THAN 1 LINE.
*
* IT WILL EVEN ALLOW \r\n's WITHIN A name OR content DEFINITION.
*
*
*
*
*************************************************/
 
function get_meta_tags2($file){

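  # OPEN FILE; RETURN FALSE IF IT CANNOT BE OPENED
  $fp = fopen($file,"r");
  if (!$fp)
    return false;

  # START WITH AN EMPTY RESULT SO THE RETURN VALUE IS ALWAYS AN ARRAY
  $metatags = array();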

  # INITIALISE ALL FLAGS
  #
  # THE PURPOSE OF EACH OF THESE FLAGS IS TO TELL THE SCRIPT IF THE CURRENT $buf VALUE IS
  # A CONTINUATION OF A PREVIOUS metatag, name OR content VALUE.
  # IF THE FLAG IS ACTIVE IT MEANS THAT THIS NEW LINE CONTAINS MORE INFORMATION ABOUT THE
  # metatag, name OR content.
  # THE FLAGS ARE ONLY KEPT RAISED IF AN ITEM SPANS MORE THAN 1 LINE.
  #
  $inmeta = 0;
  $inname = 0;
  $incontent = 0;
  
  # LOOP THRU FILE
  while( ($buf = fgets($fp,8191)) && (!eregi("</head>",$buf,$results)) ){

    # CHECK FOR START OF METATAG
    if( ($tmp = stristr($buf,"<meta")) || $inmeta){

      # IF START OF METATAG DETECTED MAKE $buf = $tmp OTHERWISE $buf CONTAINS CONTINUING TEXT FOR A METATAG.
      if (!$inmeta)
        $buf = $tmp;

      # RAISE INSIDE METATAG FLAG
      $inmeta = 1;

      # TRY TO RETRIEVE THE BEGINNING OF NAME 
      $tmp = stristr($buf,"name=\"");
      
      # CHECK FOR BEGINNING OF NAME OR ARE WE ALREADY INSIDE NAME SECTION
      if($tmp || $inname){

        # IS THIS LINE A CONTINUATION OF THE NAME FROM THE PREVIOUS LINE
        if($inname){
          $tmp = $buf; # RESET $tmp TO $buf AS stristr() FUNCTION RETURNED A BLANK

        # THIS IS THE BEGINNING OF THE NAME RETRIEVAL
        }else{
          $inname = 1;      # RAISE $inname FLAG
          $metaname = "";   # CLEAR METANAME VAR
          $tmp = substr($tmp,6); # MOVE POINTER TO AFTER THE name="
        }

        # FIND THE CLOSE QUOTE FOR THE NAME FIELD
        if(($end = strpos($tmp,"\"")) !== false){ # COMPARE WITH false SO A QUOTE AT POSITION 0 STILL COUNTS
          $metaname .= substr($tmp,0,$end); # RETRIEVE NAME PART FROM $tmp AND APPEND TO PREVIOUS RETRIEVED VALUES.
          $metaname = strtolower(ereg_replace("\.|\\|+|\*|\?|\[|\]|\^|\(|\)|\\$","_",trim($metaname))); # REMOVE SPECIAL CHARACTERS
          $inname = 0;  # LOWER $inname FLAG 
        }else{
          $metaname .= $tmp; # NOT THE END OF THE NAME APPEND TO PREVIOUS VALUES AND PROCESS NEXT LINE.
        }

      }
      
      $tmp = stristr($buf,"content=\"");
      if($tmp || $incontent){

        # IS THIS LINE A CONTINUATION OF THE CONTENT FROM THE PREVIOUS LINE
        if($incontent){
          $tmp = $buf; # RESET $tmp TO $buf AS stristr() FUNCTION RETURNED A BLANK

        # THIS IS THE BEGINNING OF THE CONTENT RETRIEVAL
        }else{
          $incontent = 1;       # RAISE $incontent FLAG
          $metacontent = "";    # CLEAR METACONTENT VAR 
          $tmp = substr($tmp,9);  # MOVE POINTER TO AFTER THE content="
        }

        # FIND THE CLOSE QUOTE FOR THE CONTENT FIELD
        if(($end = strpos($tmp,"\"")) !== false){ # COMPARE WITH false SO A QUOTE AT POSITION 0 (E.G. content="") STILL COUNTS
          $metacontent .= substr($tmp,0,$end);  # RETRIEVE CONTENT PART FROM $tmp AND APPEND TO PREVIOUS RETRIEVED VALUES.
          $incontent = 0; # LOWER CONTENT FLAG
        }else{
          $metacontent .= $tmp; # NOT THE END OF THE CONTENT APPEND TO PREVIOUS VALUES AND PROCESS NEXT LINE.
        }
                  
      }

      # CHECK FOR THE END OF THE METATAG DEFINITION
      if( ($tmp = stristr($buf,">")) && !$inname && !$incontent){
        $metatags[$metaname] = $metacontent;
        $inmeta = 0; # LOWER METATAG FLAG
      }
    }
    
  }
  
  fclose($fp);
  
  # RETURN METATAGS ARRAY TO USER.
  return($metatags);
  
}

---------------------------------------------------------------
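
For comparison, the same idea (let a tag span line breaks) can be sketched by matching against the whole <head> at once instead of carrying flags from line to line. This is a swapped-in technique, not the submitter's method and not what the eventual C fix does. The helper name extract_meta_tags_sketch() is hypothetical, it assumes a current PHP release with PCRE, and like the function above it only handles double-quoted attribute values.

```php
<?php
// Hypothetical helper, not part of PHP: collects <meta name="..." content="...">
// pairs from an HTML string, tolerating line breaks inside a tag.
function extract_meta_tags_sketch($html)
{
    $tags = array();

    // Only look at the part before </head>, as the function above does.
    $headEnd = stripos($html, '</head>');
    if ($headEnd !== false) {
        $html = substr($html, 0, $headEnd);
    }

    // [^>]* also matches \r and \n, so a tag may span several lines.
    // Limitation: a ">" inside a quoted value ends the match too early.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $matches)) {
        return $tags;
    }

    foreach ($matches[0] as $tag) {
        // Assumption: attribute values are double-quoted.
        if (preg_match('/\bname\s*=\s*"([^"]*)"/i', $tag, $n)
            && preg_match('/\bcontent\s*=\s*"([^"]*)"/i', $tag, $c)) {
            // Lower-case the name and replace anything outside [a-z0-9_] with "_".
            $key = preg_replace('/[^a-z0-9_]/', '_', strtolower(trim($n[1])));
            $tags[$key] = $c[1];
        }
    }

    return $tags;
}

// The two definitions from this report, wrapped in an assumed minimal <head>.
$html = "<html><head>\n"
      . "<META NAME=\"DC.Language\" SCHEME=\"freetext\" CONTENT=\"english\">\n"
      . "<META NAME=\"DC.Coverage.PlaceName\" SCHEME=\"TGN\" \n"
      . "CONTENT=\"Oceania ? Australia ? Queensland ? Brisbane\" > \n"
      . "</head><body></body></html>";

$result = extract_meta_tags_sketch($html);
print_r($result);
```

Working on the whole buffer removes the need for the $inmeta, $inname and $incontent flags, at the cost of reading the document into memory first.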

 [2000-07-30 00:09 UTC] zak@php.net
Confirmed bug using win2k, apache 1.3.12, php 4.0.1rc2
This should be a trivial fix :)
 [2000-08-17 03:03 UTC] sniper@php.net
Just tried get_meta_tags() with the latest CVS, and it didn't work at all.

--Jani
 [2000-08-17 04:54 UTC] stas@php.net
What didn't work at all? Could you please give a reproducing example?
 [2000-08-17 11:06 UTC] sniper@php.net
Oops, forgot to add that example:

<?php
$metas = get_meta_tags("http://kolumbus.fi",1);
print_r($metas);
?>

And this prints out:

array(0) { } 

And there are metas on that page.

--Jani
 [2000-08-17 11:12 UTC] stas@php.net
That page doesn't have metatags. It has META HTTP-EQUIVs, which are an entirely different matter and, in fact, nobody needs them. But get_meta_tags should be rewritten anyway (just like url_scanner was), because it misses too many cases.
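
The distinction drawn here, META NAME versus META HTTP-EQUIV, can be illustrated with a small sketch that sorts tags into the two groups. The helper name classify_meta_tags_sketch() and the sample markup are made up for illustration; the actual content of the page under discussion is not known. It assumes a current PHP release with PCRE and double-quoted attribute values.

```php
<?php
// Hypothetical helper, not part of PHP: splits <meta> tags into those keyed
// by name= and those keyed by http-equiv=.
function classify_meta_tags_sketch($html)
{
    $result = array('name' => array(), 'http-equiv' => array());

    preg_match_all('/<meta\b[^>]*>/i', $html, $matches);

    foreach ($matches[0] as $tag) {
        // Skip tags without a double-quoted content attribute.
        if (!preg_match('/\bcontent\s*=\s*"([^"]*)"/i', $tag, $c)) {
            continue;
        }
        if (preg_match('/\bhttp-equiv\s*=\s*"([^"]*)"/i', $tag, $h)) {
            $result['http-equiv'][strtolower($h[1])] = $c[1];
        } elseif (preg_match('/\bname\s*=\s*"([^"]*)"/i', $tag, $n)) {
            $result['name'][strtolower($n[1])] = $c[1];
        }
    }

    return $result;
}

// Made-up sample markup with one tag of each kind.
$sample = '<head>'
        . '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
        . '<meta name="keywords" content="php, bugs">'
        . '</head>';

$groups = classify_meta_tags_sketch($sample);
print_r($groups);
```

Only the entries in the 'name' group correspond to what get_meta_tags() reports; a page whose tags all land in the 'http-equiv' group would yield an empty array.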
 [2000-08-17 16:22 UTC] sniper@php.net
Damn. Here's one which definitely has metatags:

<?php
$metas = get_meta_tags("http://www.zend.com",1);
print_r($metas);
?>

And it doesn't work.

--Jani
 [2001-02-10 21:40 UTC] elixer@php.net
Fixed in CVS, please give it a whirl.

Sean
 