Request #19640 [phpweb] Make mirrors not crawlable
Submitted: 2002-09-27 15:33 UTC
Modified: 2002-11-27 11:21 UTC
From: bsdpunk at hi-techredneck dot org
Assigned:
Status: Wont fix
Package: Feature/Change Request
PHP Version: 4.4.0-dev
OS: ALL
Private report: No
CVE-ID: None

 [2002-09-27 15:33 UTC] bsdpunk at hi-techredneck dot org
If I try to find others' examples of a particular function by searching Google for that function, I get HUNDREDS of pages that are nothing more than mirrors of the documentation, which I have ALREADY READ (it's the first place I go when I need help with a function).

This just plain sucks. I appreciate having faster places to read the documentation, but for Christ's sake, sometimes I just want to look at others' code because that's how I learn.

I know someone is going to get angry and say this is not a bug, but to me, it's an annoyance of such proportion that it should be considered a bug.

PLEASE PLEASE PLEASE ask doc-mirrors to not allow Googlebot to crawl their mirror, as it's extremely frustrating.

History

 [2002-09-28 04:54 UTC] goba@php.net
First of all, we cannot correct this. Second, think about PHP code. If you wanted to find HTML pages that use the JS length property, would you search for ":.length" on Google? No, because JS content is not indexed. The same goes for PHP code. As far as I can see, approximately 95% of the PHP code available on the net is distributed in compressed files or is only available after authentication (phpclasses, for example). So searching for a function name and counting on Google to return PHP source code is a rather poor approach.
 [2002-09-28 06:41 UTC] spic@php.net
It would be a good idea to visit one of the many php-tutorial sites/script download sites (e.g. HotScripts.com).

 [2002-11-26 20:22 UTC] bsdpunk at hi-techredneck dot org
My point was totally lost. I will try again: I cannot find good tutorials that may include a particular function because everyone and his mother is mirroring the documentation for that particular function. It would not be overly difficult to remind mirror maintainers that Googlebot need not crawl their mirror.
It's not like it really matters now, I've all but given up on PHP anyway.
 [2002-11-26 21:21 UTC] philip@php.net
How about adding something like -manual to your search query? In other words, the - operator can be useful too. Regarding this bug report, this is a valid point to consider. I'm reclassifying this as a phpweb feature request titled "Make mirrors not crawlable". So, a robots.txt similar to:

User-agent: *
Disallow: /manual/

But, keep in mind the benefit of having mirrors indexable as it means less traffic on the already overburdened www.php.net server.  Also keep in mind that there are a ton of 'uncontrollable' unofficial mirrors that simply download the manual and put it on their site.  In this case maybe a meta tag might be more appropriate, such as:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

Anyway, this is something to think about and is a feature request for the phpweb team.
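
To make the two options above concrete, here is a minimal sketch in Python (standard library only) of how a well-behaved crawler might interpret them. The mirror hostname, the sample page, and the helper names allowed_by_robots and robots_meta_directives are hypothetical and only for illustration; real crawlers such as Googlebot may apply additional rules of their own.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted above, as a mirror might serve them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /manual/
"""

# The meta tag quoted above, embedded in a hypothetical manual page.
MANUAL_PAGE = """\
<html><head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<title>strlen</title>
</head><body>...</body></html>
"""


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def robots_meta_directives(html):
    """Return the set of robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # mirror.example is a placeholder hostname, not a real mirror.
    manual_url = "http://mirror.example/manual/en/function.strlen.php"
    other_url = "http://mirror.example/downloads.php"
    print(allowed_by_robots(ROBOTS_TXT, "Googlebot", manual_url))
    print(allowed_by_robots(ROBOTS_TXT, "Googlebot", other_url))
    print(sorted(robots_meta_directives(MANUAL_PAGE)))
```

The robots.txt approach only works on hosts whose maintainers install the file, while the meta tag travels with the page itself, which is why it may suit the unofficial mirrors that simply copy the manual.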

Regarding giving up on PHP, that's too bad.  I hope it's not because there are too many php manuals indexed in google.
 [2002-11-27 11:21 UTC] imajes@php.net
just to add an extra nail here, there is absolutely no way we'd ask mirrors to do this -- php.net gets enough traffic as it is, even with the support of its mirrors.

as has been suggested, go to the links that have been offered to you -- hotscripts.com, phpbeginner.com, phpclasses.org etc...

it's not going to happen.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Apr 25 22:01:29 2024 UTC