Bug #59147 MoreLikeThis only parses one doc
Submitted: 2010-04-06 15:30 UTC
Modified: 2010-04-27 23:14 UTC
From: max at blubolt dot com
Assigned: iekpo
Status: Closed
Package: solr (PECL)
PHP Version: 5.2.12
OS: Ubuntu Karmic, Gentoo
Private report: No
CVE-ID: None
 [2010-04-06 15:30 UTC] max at blubolt dot com
Description:
------------
To confirm, this happens on both 0.9.9 and on SVN latest.

We're sending a request for MoreLikeThis data:

mlt=true&mlt.fl=metaSimilarity&mlt.boost=true&mlt.count=10&mlt.mindf=1&mlt.mintf=1&mlt.minwl=1&q=id:(pantone-keyring2)&fq=siteId:bloomsbury AND userType:regular_display&terms.sort=count&fl=id

This is being handled just fine by your library and by Solr, and is returning an XML document (see the attached file).
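
For readability, the request parameters above can also be written out one per line and encoded programmatically. The following is only an illustrative sketch in Python using the standard library; it is not how the solr extension builds its requests, and the variable names are made up for this example.

```python
from urllib.parse import urlencode, parse_qs

# Parameters copied from the request in this report.
params = [
    ("mlt", "true"),
    ("mlt.fl", "metaSimilarity"),
    ("mlt.boost", "true"),
    ("mlt.count", "10"),
    ("mlt.mindf", "1"),
    ("mlt.mintf", "1"),
    ("mlt.minwl", "1"),
    ("q", "id:(pantone-keyring2)"),
    ("fq", "siteId:bloomsbury AND userType:regular_display"),
    ("terms.sort", "count"),
    ("fl", "id"),
]

# URL-encode the pairs; spaces and punctuation in q/fq get escaped.
query_string = urlencode(params)

# Decode again to confirm the values survive the round trip.
decoded = parse_qs(query_string)
print(query_string)
```

Encoding the values this way avoids the raw spaces in the fq parameter, which are easy to lose when the query string is pasted into a report.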

Unfortunately, SolrQuery::getResponse then throws a warning-level error, "Error unserializing raw response.", which is also thrown when using SolrUtils::digestXMLResponse.

The issue appears to arise when more than one document is returned by a MoreLikeThis query: if the result below is stripped down to a single doc element within <result name="pantone-keyring2" numFound="2693" start="0">, it parses without issue.


Reproduce code:
---------------
Run a MoreLikeThis query, or just call SolrUtils::digestXMLResponse on:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">3</int>
 <lst name="params">
  <str name="mlt.minwl">1</str>
  <str name="mlt.boost">true</str>
  <str name="mlt.fl">metaSimilarity</str>
  <str name="indent">on</str>
  <str name="mlt.mintf">1</str>
  <str name="mlt">true</str>
  <str name="wt">xml</str>
  <str name="terms.sort">count</str>
  <str name="version">2.2</str>
  <str name="mlt.mindf">1</str>
  <str name="mlt.count">10</str>
  <str name="fl">id</str>
  <str name="q">id:(pantone-keyring2)</str>
  <str name="fq">siteId:bloomsbury AND userType:regular_display</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <str name="id">pantone-keyring2</str>
 </doc>
</result>
<lst name="moreLikeThis">
 <result name="pantone-keyring2" numFound="2693" start="0">
  <doc>
	<str name="id">pantone-keyring4</str>
  </doc>
  <doc>
	<str name="id">pantone-keyring3</str>
  </doc>
  <doc>
	<str name="id">pantone-keyring1</str>
  </doc>
  <doc>
	<str name="id">pantone-keyring45</str>
  </doc>
  <doc>
	<str name="id">pantone-keyring6</str>
  </doc>
  <doc>
	<str name="id">pantone-cufflink4</str>
  </doc>
  <doc>
	<str name="id">pantone-cufflink7</str>
  </doc>
  <doc>
	<str name="id">pantone-cufflink5</str>
  </doc>
  <doc>
	<str name="id">pantone-cl-cherry</str>
  </doc>
  <doc>
	<str name="id">pantone-cl-jet</str>
  </doc>
 </result>
</lst>
</response>


Expected result:
----------------
A SolrObject full of delicious data instead of an Error and an 
exception would be nice :)

Actual result:
--------------
As above, failure to unserialize occurs when more than one 
MoreLikeThis document is present.

 [2010-04-13 04:34 UTC] trevor at blubolt dot com
I've tracked this down to the handling of result 
serialization in the response.  The primary result doc count 
was being used for the moreLikeThis results array 
serialization.  I've amended the xpath location and context 
accordingly to retrieve the correct count:

--- solr_functions_helpers.orig	1970-01-01 10:14:37.000000000 +0100
+++ solr_functions_helpers.c	2010-04-12 16:47:06.000000000 +0100
@@ -737,7 +737,7 @@
 
 	xmlAttr *curr_prop = properties;
 	xmlXPathContext *xpathctxt = NULL;
-	const xmlChar *xpath_expression = (xmlChar *) "/response/result/doc";
+	const xmlChar *xpath_expression = (xmlChar *) "child::doc";
 	xmlXPathObject *xpathObj = NULL;
 	xmlNodeSet *result = NULL;
 	long int document_count = 0;
@@ -763,6 +763,7 @@
 	}
 
 	xpathctxt = xmlXPathNewContext(node->doc);
+	xpathctxt->node = node;
 	xpathObj = xmlXPathEval(xpath_expression, xpathctxt);
 	result = xpathObj->nodesetval;
 	document_count = result->nodeNr;
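
To illustrate why the absolute path yields the wrong count, here is a minimal sketch in Python using the standard library's ElementTree rather than libxml2. It is not the extension's code: the function names are hypothetical, and the XML is a trimmed version of the response from the report (one primary hit, three moreLikeThis hits).

```python
import xml.etree.ElementTree as ET

# Trimmed version of the response in this report.
XML = """<response>
<result name="response" numFound="1" start="0">
 <doc><str name="id">pantone-keyring2</str></doc>
</result>
<lst name="moreLikeThis">
 <result name="pantone-keyring2" numFound="2693" start="0">
  <doc><str name="id">pantone-keyring4</str></doc>
  <doc><str name="id">pantone-keyring3</str></doc>
  <doc><str name="id">pantone-keyring1</str></doc>
 </result>
</lst>
</response>"""

root = ET.fromstring(XML)  # root is the <response> element


def absolute_doc_count(response_root):
    # Analogue of "/response/result/doc": always counts the docs of the
    # primary result, no matter which <result> is being serialized.
    return len(response_root.findall("./result/doc"))


def relative_doc_count(result_node):
    # Analogue of "child::doc" evaluated with the context node set to
    # the <result> element currently being serialized.
    return len(result_node.findall("doc"))


mlt_result = root.find("./lst[@name='moreLikeThis']/result")

print(absolute_doc_count(root))        # count taken from the primary result
print(relative_doc_count(mlt_result))  # count taken from the moreLikeThis result
```

With this input the absolute lookup reports 1 document while the moreLikeThis result actually holds 3, which matches the mismatch described above: the serialized array length disagrees with the number of docs written whenever the two counts differ.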
 [2010-04-13 09:20 UTC] iekpo@php.net
Thanks a lot for the tip.

I will make sure to test that out and apply it to the next release.

Thanks for reporting this.
 [2010-04-27 23:14 UTC] iekpo@php.net
This bug has been fixed in SVN.

In case this was a documentation problem, the fix will show up at the
end of next Sunday (CET) on pecl.php.net.

In case this was a pecl.php.net website problem, the change will show
up on the website in short time.
 
Thank you for the report, and for helping us make PECL better.

This issue has been resolved in revision 298680.

Special thanks to max at blubolt dot com for reporting the issue and submitting the patch.
 