exception 'Apache_Solr_HttpTransportException' with message ''0' Status: Communication Error' in /var/www/html/SolrPhpClient/Apache/Solr/Service.php:338


kfmnla

Jun 8, 2012, 6:24:50 PM
to php-sol...@googlegroups.com
Could someone please help by elaborating as to what the "webapp path" refers to for the following statement in the example script:
 
$solr = new Apache_Solr_Service('localhost', 8180, '/solr/');
 
What does '/solr/' refer to? Is this a path to the Solr application that I have running locally (/opt/apache-solr-3.6.0/example/solr/conf/)?  If so, should the statement read as follows:
 
$solr = new Apache_Solr_Service('localhost', 8180, '/opt/apache-solr-3.6.0/example/solr/conf/');
 
I appear to be having trouble communicating with my local Solr service. I am on a Linux machine and I can see that my Solr service is working fine via localhost:8983/solr/admin
 
I am getting the following messages:
 
exception 'Apache_Solr_HttpTransportException' with message ''0' Status: Communication Error' in /var/www/html/SolrPhpClient/Apache/Solr/Service.php:338
Stack trace:
#0 /var/www/html/SolrPhpClient/Apache/Solr/Service.php(1170): Apache_Solr_Service->sendRawGet('http://localhos...')
#1 /var/www/html/SolrPhpClient/solr_test.php(33): Apache_Solr_Service->search('kabab', 0, 10)
#2 {main}
 
Thank you very much!
 

Donovan Jimenez

Jun 9, 2012, 2:00:37 PM
to php-sol...@googlegroups.com
It is the context path that your Solr server is running under. It is NOT the filesystem directory path.


so your host would be "localhost", your port would be 8983, and your path would be "/solr/"

Keep in mind that localhost is only reachable from the same machine. If your code is running from another location, you'll have to use an IP address or DNS host name, and make sure Tomcat is bound to that IP (and not just localhost). The host parameter accepts IPs or host names.
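
To make the distinction concrete, here is a minimal sketch of how the three constructor arguments combine into a URL. The helper name `solr_base_url` is hypothetical and is not part of SolrPhpClient; it only illustrates that the third argument is a URL path on the server, not a filesystem directory.

```php
<?php
// Hypothetical helper (not part of SolrPhpClient): shows how host, port and
// context path combine into the base URL that requests are sent to.
function solr_base_url($host, $port, $path)
{
    // Normalize the path so it starts and ends with exactly one slash.
    $path = '/' . trim($path, '/') . '/';
    if ($path === '//') {
        $path = '/';
    }
    return 'http://' . $host . ':' . (int) $port . $path;
}

// The values suggested above for a Solr instance reachable at
// localhost:8983/solr/admin.
echo solr_base_url('localhost', 8983, '/solr/') . "select?q=*:*\n";
```

If the printed URL opens in a browser on the machine that runs the PHP code, the same host, port, and path should be usable in the `Apache_Solr_Service` constructor.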

Donovan

 

--
You received this message because you are subscribed to the Google Groups "PHP Solr Client" group.
To view this discussion on the web visit https://groups.google.com/d/msg/php-solr-client/-/vFc4K5iMZpEJ.
To post to this group, send email to php-sol...@googlegroups.com.
To unsubscribe from this group, send email to php-solr-clie...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/php-solr-client?hl=en.

kfmnla

Jun 12, 2012, 11:52:48 AM
to php-sol...@googlegroups.com
Hi Donovan,
 
Thank you for taking the time to reply.
 
I should have mentioned in my post that my first try was using exactly what you had suggested and I received the same error messages.  I had just started trying other possibilities when that first try didn't work, but I still received the same messages.
 
Is there something else that you can think of that I may have setup incorrectly?
 
Thank you very much for your time.

Donovan Jimenez

Jun 12, 2012, 12:25:08 PM
to php-sol...@googlegroups.com
On another recent thread, a user talked about how he had to work around default SELinux restrictions:


That would fit your description if you were running on a system with SELinux enabled. I'm not sure of what else it could be, so I hope this helps at least get you on the right track to seeing the root cause.

Donovan


kfmnla

Jun 12, 2012, 2:59:01 PM
to php-sol...@googlegroups.com
Hi Donovan,
 
Thanks again for your assistance.
 
I had actually previously seen that post that you are referring to and had gone ahead and set SELinux to "Permissive" before posting my question.
 
The good news is that I have gotten past the error messages that I was receiving.  I looked closer at my Terminal (RHEL 6) messages when starting up and running Solr and noticed a SEVERE message having to do with "missing field text".  I went back to my schema.xml file, compared it to the original that I had saved and renamed, and then copied the definition for the catchall field named "text" over to my schema.xml file.  Problem solved!
 
However, now when I run the sample PHP script, instead of getting error messages, I keep getting "Results 0 - 0 of 0:" for whatever I type into the search field.  Just to compare results, I type the same search arguments into the Solr Admin for my running Solr and I get the same thing, so I believe the script to be good.
 
I guess my question has to do with my understanding of what I am searching for and how to search for it.  The simple Solr example that I have created used the DataImportHandler to index some simple data into Solr from my test MySQL database.  This was successful: when I run the "default" search argument from the Solr Admin (that is, using "*:*", without the quotes, for the query string), I get the desired results (I set the limit to 100 so I could see all of my documents):
 
<result name="response" numFound="33" start="0" maxScore="1.0">
...followed by the 33 documents and their field values
 
When I use "*:*" (again, without the quotes) in the search field generated by the example PHP script, I get the same results.  That's good.
 
So I am struggling with how to enter my search criteria to match the data that I have indexed into my Solr documents, so as not to keep receiving 0 results.
 
In the search field I enter field names that I have defined in the schema.xml as well as the actual values of those fields but with no success.
 
I'm almost there.  I just seem to be bungling the actual searching part.  Perhaps there is a specific search syntax that I am missing.
 
Thanks again!

kfmnla

Jun 12, 2012, 3:16:59 PM
to php-sol...@googlegroups.com
OK, I'm replying to my own post now so I know I'm going crazy.
 
So the syntax appears to be:  the field name, followed by a colon (:), followed by a value for the field
 
I suppose this means that I need to build more functionality into the PHP script such that the user does not need to know what the actual field names are.  Does this sound right to you?
 
Thanks.

Donovan Jimenez

Jun 12, 2012, 4:00:20 PM
to php-sol...@googlegroups.com
There are several options:
  1. Take the user input and make it into a "full" query with PHP, e.g. get "search text" and make it into "my_field:(search text)" before sending it to Solr. This is the simplest to understand, but it can be very error prone, since you might have to sanitize the user input or fully parse it to make sure it doesn't break the full query.
  2. You can set the default field in the search handler configuration in your solrconfig.xml. So any unprefixed terms will automatically be applied to this field, but prefixed terms can still be used by power users. You can get more information here: http://wiki.apache.org/solr/DisMaxQParserPlugin
  3. If you have multiple fields you want user input to be applied to, then use the dismax / edismax search handler. Similar to the default search handler, this takes a configuration for default field, but it can take multiple. edismax also has some extra features that are very attractive to most people - like enabling suffix wildcard queries. You can get more information here: http://wiki.apache.org/solr/ExtendedDisMax
You can do 3 by either using the qt parameter with a named query handler (in the default solrconfig.xml - last time i looked at least - there will already be a dismax handler you can play with), or using the defType parameter, or by making the dismax / edismax plugins part of your default search handler.  I encourage you to read the Solr wiki to understand all the options available to you in solrconfig.xml - especially in this case, regarding search handlers: http://wiki.apache.org/solr/SolrConfigXml and http://wiki.apache.org/solr/SolrRequestHandler

I, personally, use option 3; I have configured an edismax handler to my liking and tell solr to use it with the qt parameter.
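
Option 1 above could be sketched roughly as follows. The function names are hypothetical, and the list of escaped characters is an assumption based on the characters that are special in the Lucene query syntax; it treats all user input as plain terms, so power-user syntax is lost.

```php
<?php
// Hypothetical helpers for option 1; the names are illustrative only.

// Backslash-escape characters that have special meaning in the Lucene
// query syntax so that user input is treated as plain terms.
function escape_query_text($text)
{
    $special = array('\\', '+', '-', '!', '(', ')', ':', '^', '[', ']',
                     '"', '{', '}', '~', '*', '?', '|', '&', '/');
    $map = array();
    foreach ($special as $ch) {
        $map[$ch] = '\\' . $ch;
    }
    // strtr() with an array does not re-scan its own replacements, so the
    // backslashes added here are not escaped a second time.
    return strtr($text, $map);
}

// Wrap the escaped input in a fielded query such as my_field:(search text).
function build_field_query($field, $userInput)
{
    return $field . ':(' . escape_query_text(trim($userInput)) . ')';
}

echo build_field_query('my_field', 'search text'), "\n";
echo build_field_query('my_field', 'other_field:value'), "\n";
```

The second call shows why escaping matters: without it, the colon would let the user redirect the search to a different field.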

In addition to the wiki, lucid imagination has a solr guide that's pretty nice: http://lucidworks.lucidimagination.com/display/solr/About+This+Guide

There are also several solr books, I'm not familiar enough with them all to recommend a specific one though.

Donovan


Donovan Jimenez

Jun 12, 2012, 4:03:49 PM
to php-sol...@googlegroups.com
I meant that first link in 2 to be  http://wiki.apache.org/solr/SearchHandler 

Also, I forgot to mention that you can send the default field configuration along at query time with the df parameter, which is documented at that link. Similarly, query-time field configuration is possible with dismax / edismax through their qf parameter.
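
As a rough sketch of the query-time variants, the extra parameters could be assembled like this. The helper names and the field names (`store_name`, `item_title`) are assumptions for illustration; only `df`, `qf`, and `defType` are Solr parameter names.

```php
<?php
// Sketch: extra request parameters for query-time field configuration.
// Helper names and field names are hypothetical; adjust to your schema.

// Standard query parser: unprefixed terms go to one default field via df.
function default_field_params($field)
{
    return array('df' => $field);
}

// dismax / edismax: unprefixed terms go to several fields via qf,
// with a boost per field (for example store_name^2.0).
function edismax_params(array $boostsByField)
{
    $qf = array();
    foreach ($boostsByField as $field => $boost) {
        $qf[] = $field . '^' . $boost;
    }
    return array('defType' => 'edismax', 'qf' => implode(' ', $qf));
}

$params = edismax_params(array('store_name' => '2.0', 'item_title' => '1.0'));

// Show how the parameters would look when URL-encoded into a request.
echo http_build_query($params), "\n";
```

An array like `$params` would typically be handed to the client as the additional-parameters argument of `search()`.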

Donovan

kfmnla

Jun 12, 2012, 5:12:05 PM
to php-sol...@googlegroups.com
Thanks Donovan.
 
That's a lot to absorb.  I have some reading to do.
 
In the meantime, is there a straight-forward way to modify the sample PHP script such that I could display, possibly, more than one value for each field?
 
For example, each doc would have only one field/value for company_id and one field/value for company_name.  But then item_id, item_title, item_desc, and item_price could each have more than one value, depending on the number of items in my test database for each company.
 
When I use the query string "company_name:Best" in the script, I get one result for my "Best Buy" store in my database, but only the values for the last db entries for the item fields.  When I use the same query string in the Solr Admin, I get all of the item results for each store.
 
I tried the following, to no avail:
 
<?php
    // iterate document fields
    foreach ($doc as $field)
    {
?>
          <tr>
      <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8'); ?></th>
   
<?php
  // iterate field values
  foreach ($field as $value)
  {
?>   
            <td><?php echo htmlspecialchars($value, ENT_NOQUOTES, 'utf-8'); ?></td>
<?php
  }
?>
    </tr>
<?php
    }
?>
 
Thanks.

Donovan Jimenez

Jun 12, 2012, 5:45:26 PM
to php-sol...@googlegroups.com
field values will come back multivalued (as an array) if they're indexed as multivalued.

First, in your schema.xml for solr, make sure the fields that will have more than one value are configured as multiValued="true"

Next, when indexing, I expect your code to look something like this:

<?php
...
  $doc = new Apache_Solr_Document();

  $doc->company_id = 12345;
  $doc->company_name = "Some Company";

  foreach ($items as $item) {
    // notice I'm using addField here, because its multivalued (will be set more than once)
    // If I had just used $doc->item_id, then only the last set would go through (i think this is what you're seeing)
    $doc->addField('item_id', $item['item_id']);
    $doc->addField('item_title', $item['title']);
    $doc->addField('item_desc', $item['desc']);
    $doc->addField('item_price', $item['price']);

    // alternatively, you can collect all your field values and do 
    // one set with an array like $doc->some_field = array("one", "two", "three");
  }

...
?>

Then, when iterating results, the multivalued fields should come back as arrays. Of interest, there is a client setting to collapse a single valued array in the results: 

Apache_Solr_Service instance method setCollapseSingleValueArrays(...)

It is true by default. So, you can set it false and always know which fields are multivalued, or you can leave it as default and check all result field values with is_array() or is_string() as you iterate.
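
The second approach (leave the default and check each value) could look roughly like this. `field_values` is a hypothetical helper, and the plain array below only simulates a result document so the sketch runs without a Solr server.

```php
<?php
// Hypothetical helper: always hand back an array of values, whether the
// client returned a scalar (a single value, or a collapsed one-element
// array) or a real array (a multivalued field).
function field_values($value)
{
    return is_array($value) ? $value : array($value);
}

// Simulated result fields, shaped like the ones discussed above.
$doc = array(
    'company_name' => 'Some Company',
    'item_title'   => array('First item', 'Second item'),
);

foreach ($doc as $field => $value) {
    foreach (field_values($value) as $element) {
        echo $field, ': ', $element, "\n";
    }
}
```

Normalizing up front keeps the display loop free of separate single-value and multi-value branches.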

Donovan



kfmnla

Jun 14, 2012, 4:45:26 PM
to php-sol...@googlegroups.com
Hi Donovan,

Thank you for your patience.  Going back to the previous post, I seem to be having difficulty with certain fields being labeled as type "string" instead of "array".

I am using the Data Import Handler to index some, very simple for now, test data from my local mySQL database.  After doing a full import, it appears from the Solr Admin that it is working as expected.

That is, when I use the Full Interface to run an abbreviated /select query (Start Row = 3, Max Rows Returned = 2), I see the following results:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="explainOther"/>
      <str name="fl">*,score</str>
      <str name="indent">on</str>
      <str name="start">3</str>
      <str name="q">*:*</str>
      <str name="hl.fl"/>
      <str name="wt"/>
      <str name="fq"/>
      <str name="version">2.2</str>
      <str name="rows">2</str>
    </lst>
  </lst>
  <result name="response" numFound="33" start="3" maxScore="1.0">
    <doc>
      <float name="score">1.0</float>
      <arr name="item_desc">
        <str>ipod touch description</str>
        <str>Nerf Gun toy</str>
        <str>Nurf Gun</str>
      </arr>
      <arr name="item_id">
        <str>13</str>
        <str>14</str>
        <str>18</str>
      </arr>
      <arr name="item_price">
        <float>199.0</float>
        <float>19.99</float>
        <float>19.95</float>
      </arr>
      <arr name="item_title">
        <str>iPod Touch</str>
        <str>Nerf Gun</str>
        <str>Nerf Gun</str>
      </arr>
      <str name="store_id">4</str>
      <str name="store_name">Target Corp.</str>
    </doc>
    <doc>
      <float name="score">1.0</float>
      <arr name="item_desc">
        <str>Good for you</str>
        <str>Telephone system</str>
      </arr>
      <arr name="item_id">
        <str>16</str>
        <str>17</str>
      </arr>
      <arr name="item_price">
        <float>34.0</float>
        <float>728.0</float>
      </arr>
      <arr name="item_title">
        <str>asparagus</str>
        <str>Office Telepone</str>
      </arr>
      <str name="store_id">5</str>
      <str name="store_name">Vons Grocery store</str>
    </doc>
  </result>
</response>

All of my "item_" fields have been defined in my schema.xml file as multiValued="true" and they display above as "arr".  Actually, previously they also showed as "arr" before I updated the schema.xml file with the multiValued parameter.  I did follow up and do another full import after modifying the schema.xml.

The problem comes in when I try using the example PHP script.  In the script, when I execute gettype($field), the result is always "string" for all the fields.  Here's the output:

Results 1 - 2 of 33:
  1. store_id  string  4
    store_name  string  Target Corp.
    item_price  string  
    item_id  string  
    item_title  string  
    item_desc  string  
  2. store_id  string  5
    store_name  string  Vons Grocery store
    item_price  string  
    item_id  string  
    item_title  string  
    item_desc  string  

Here's the script:

<?php
    // iterate document fields / values
    foreach ($doc as $field => $value)
    {
        if (!(is_array($field)))

        {
?>
            <tr>
                <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8'); ?></th>
                <td><?php echo "&nbsp" . gettype($field) . "&nbsp&nbsp" . htmlspecialchars($value, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
        }
        if (is_array($field))
        {           
            foreach ($field as $name => $element)
            {
?>
            <tr>
                <th><?php echo htmlspecialchars($name, ENT_NOQUOTES, 'utf-8'); ?></th>
                <td><?php echo htmlspecialchars($element, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
            }           
        }
    }
?>


Thank you for your time

Donovan Jimenez

Jun 14, 2012, 8:43:50 PM
to php-sol...@googlegroups.com
Just a small mistake in your script: you're doing all your operations on $field (which is the name of the field, and is always a string) rather than on $value (which is the single value or the array of values).

<?php
    // iterate document fields / values
    foreach ($doc as $field => $value)
    {
        if (!(is_array($value)))

        {
?>
            <tr>
                <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8'); ?></th>
                <td><?php echo "&nbsp;" . gettype($value) . "&nbsp;&nbsp;" . htmlspecialchars($value, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
        }
        if (is_array($value))
        {            
            foreach ($value as $element)
            {
?>
            <tr>

                <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8'); ?></th>
                <td><?php echo htmlspecialchars($element, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
            }            
        }
    }
?> 

should work.

Donovan


kfmnla

Jun 14, 2012, 9:41:34 PM
to php-sol...@googlegroups.com
Donovan,

Thank you!  That was my problem.  Can't say how much I appreciate your time and patience.

kfmnla

Jun 18, 2012, 3:21:41 PM
to php-sol...@googlegroups.com
Hi Donovan,

I hate to bother you with this but this was working when I last left it on Friday and now it all of a sudden is not.

I'm getting the following error, using both the test example PHP script as well as the Solr Admin interface /select query:

SEVERE: java.lang.StringIndexOutOfBoundsException: String index out of range: 4

I'm on a Red Hat Enterprise Linux 6 system.  The only thing that has changed since Friday that I can think of is that there were some system updates that were applied this morning.  I'm trying to look into whether any of those might have affected Java in some way.

This is really maddening.  If you have any other suggestions, I would love to hear them.  I am stuck.

Thanks again.

Donovan Jimenez

Jun 18, 2012, 5:37:43 PM
to php-sol...@googlegroups.com
I've not seen that before. Things I'd try:

restart your servlet container (tomcat / jetty / etc) if you haven't already. see if the problem goes away.

make a backup of your solr data directory, and then restart solr with an empty data directory (nuke your documents) and reindex a few - see if the problem goes away or persists.

If it goes away, it might be easiest to just reindex everything - depending on your index size. If the problem doesn't go away, then something odd has changed. To help in that case, we'd need more information: full stack trace, java version, solr version, query you're running that causes the exception.  It might also be beneficial at that point to try the solr user mailing list - since the solr developers themselves monitor that list.

Hope you find the problem,
Donovan


kfmnla

Jun 18, 2012, 6:53:39 PM
to php-sol...@googlegroups.com
Donovan,

Thank you again.

Simply restarting Java didn't help.  However, after backing up my solr/data directory and starting fresh with an empty one, then re-importing/indexing, all seems to be fine again.

I don't know what could have caused that to be the case but so far, so good!

Thank you!

kfmnla

Jun 18, 2012, 8:51:28 PM
to php-sol...@googlegroups.com
Hi Donovan,

I've read through the information that you referred me to regarding the ExtendedDisMax.  There is much to absorb and I don't mind saying that I am a bit confused.

As a starting point, I read somewhere that "...Extended DisMax is already configured in the example configuration, with the name edismax. Thus, to select the parser, use defType=edismax in your query, or use the local-param syntax {!edismax}".  I take that to mean that I do not need to add a RequestHandler to my solrconfig.xml file.  If I am wrong and I do need to add one, I'm at a bit of a loss as to how that should look.

Therefore, I marched forward and tried using the following within my version of the example PHP script:

$start = 0;        // relative to 0, so use 0 to start at the beginning
$limit = 100;    // default is 10, can adjust lower or higher
$query = isset($_REQUEST['q']) ? $_REQUEST['q'] : false;
$results = false;

$additionalParameters = array(
    'defType' => 'edismax',
);

if ($query)
{
  require_once('Apache/Solr/Service.php');
  $solr = new Apache_Solr_Service('localhost', 8983, '/solr/');

  if (get_magic_quotes_gpc() == 1)
  {
    $query = stripslashes($query);
  }

  try
  {
    $results = $solr->search($query, $start, $limit, $additionalParameters);
  }
  catch (Exception $e)
  {
    die("<html><head><title>SEARCH EXCEPTION</title></head><body><pre>{$e->__toString()}</pre></body></html>");
  }
}

Am I way off base here?  I feel like I might just be missing a bunch.

Thanks.



On Tuesday, June 12, 2012 1:00:20 PM UTC-7, Donovan Jimenez wrote:

kfmnla

Jun 20, 2012, 8:42:12 PM
to php-sol...@googlegroups.com
OK, I'm going to reply to my own post (again) with a different question as I seem to have made some progress.  Sorry for any confusion that I may have created previously.

The edismax seems to be working now, at least to some extent.  Of course, I am ecstatic about that.  However, my question(s) is/are:

1)  I don't seem to be able to get highlighting to work, at least the way that I understand it to function.

2)  When my query returns results, they look to be good so far.  However, if I get a hit on a particular search query word that matches, say, one item in one store, the results show the entire document where the match occurred, which includes every item in the array for that store.  How do I list only the item that matched and not all of them for the store in question?

I'm wondering if my question number 2 above has something to do with the way the Solr document is created, in this case via the data import handler.

Here's my sample script:

<?php

// make sure browsers see this page as utf-8 encoded HTML
header('Content-Type: text/html; charset=utf-8');


$start = 0;        // relative to 0, so use 0 to start at the beginning
$limit = 100;    // default is 10, can adjust lower or higher
$query = isset($_REQUEST['q']) ? $_REQUEST['q'] : false;
$results = false;

$additionalParameters = array(
    'defType' => 'edismax',
    'qf' => 'store_name^2.0 item_title^1.0 item_desc',
    'mm' => '2',
    'hl' => 'true',
    'hl.fl' => 'store_name item_title item_desc',
    'hl.snippets' => '3',
);

if ($query)
{
  // The Apache Solr Client library should be on the include path
  // which is usually most easily accomplished by placing in the
  // same directory as this script ( . or current directory is a default
  // php include path entry in the php.ini)
  require_once('Apache/Solr/Service.php');

  // create a new solr service instance - host, port, and webapp
  // path (all defaults in this example)

  $solr = new Apache_Solr_Service('localhost', 8983, '/solr/');

  // if magic quotes is enabled then stripslashes will be needed

  if (get_magic_quotes_gpc() == 1)
  {
    $query = stripslashes($query);
  }

  // in production code you'll always want to use a try /catch for any
  // possible exceptions emitted  by searching (i.e. connection
  // problems or a query parsing error)

  try
  {
    $results = $solr->search($query, $start, $limit, $additionalParameters);
  }
  catch (Exception $e)
  {
    // in production you'd probably log or email this error to an admin
    // and then show a special message to the user but for this example
    // we're going to show the full exception

    die("<html><head><title>SEARCH EXCEPTION</title></head><body><pre>{$e->__toString()}</pre></body></html>");
  }
}

?>

<html>
  <head>
    <title>PHP Solr Client Example</title>
  </head>
  <body>
    <form  accept-charset="utf-8" method="get">
      <label for="q">Search:&nbsp;</label>
      <input id="q" name="q" type="text" value="<?php echo htmlspecialchars($query, ENT_QUOTES, 'utf-8'); ?>"/>
      &nbsp;&nbsp;
      <input type="submit"/>
    </form>
   
<?php
// display results
if ($results)
{
  $total = (int) $results->response->numFound;
  $start = min(1, $total);
  $end = min($limit, $total);
?>
    <div>Results <?php echo $start; ?> - <?php echo $end;?> of <?php echo $total; ?>:</div>
    <ol>
<?php
  // iterate result documents
  foreach ($results->response->docs as $doc)
  {
?>
      <li>
        <table style="border: 1px solid black; text-align: left">

<?php
    // iterate document fields / values
    foreach ($doc as $field => $value)
    {
        if (!(is_array($value)))
        {
?>
            <tr>
                <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8'); ?></th>
                <td><?php echo htmlspecialchars($value, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
        }
        if (is_array($value))
        {           
            $iArr = 1;

            foreach ($value as $element)
            {
?>
            <tr>
                <th><?php echo htmlspecialchars($field, ENT_NOQUOTES, 'utf-8') . "[" . $iArr . "]"; ?></th>

                <td><?php echo htmlspecialchars($element, ENT_NOQUOTES, 'utf-8'); ?></td>
            </tr>
<?php
            $iArr++;
            }           
        }
    }
?>
        </table>
      </li>
<?php
  }
?>
    </ol>
<?php
}
?>

  </body>
</html>


Thanks again!  Getting closer to where I need to be.

Donovan Jimenez

Jun 21, 2012, 10:12:45 AM
to php-sol...@googlegroups.com
index them differently. Usually your solr index is a denormalized or reversed version of your database tables.

So, if an item is the unit of result you want returned, then you need one Solr document per item. For the data import, do your select on the items joined to the companies.

Donovan


kfmnla

Jun 21, 2012, 12:11:21 PM
to php-sol...@googlegroups.com
Hello Donovan,

I'm sorry but I don't think I understand.  Could you please elaborate a bit, if possible?

Thank you very much.

Donovan Jimenez

Jun 21, 2012, 12:51:48 PM
to php-sol...@googlegroups.com
In code it'd look more like this:

<?php
...
  foreach ($companies as $company) {
    $items = ...

    foreach ($items as $item) {
      $doc = new Apache_Solr_Document();

      $doc->company_id = $company['company_id'];
      $doc->company_name = $company['name'];
      // rest of company fields

      $doc->item_id = $item['item_id'];
      $doc->item_title = $item['title'];
      // rest of item fields

      // add document - best to do in batches
    }
  }

...
?>

one document per item, with the company information included in it. Solr is not a database, so you have to denormalize (repeat) the company data and any other things that are important to be searched on or returned with an item result.
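
The same idea as a self-contained sketch that can be run without Solr: plain arrays stand in for database rows and for Solr documents, and the `denormalize` function name and the nested `items` key are assumptions made only for this illustration.

```php
<?php
// Sketch of denormalizing company -> items data into one flat record per
// item. Plain arrays stand in for database rows and Solr documents.
function denormalize(array $companies)
{
    $docs = array();
    foreach ($companies as $company) {
        foreach ($company['items'] as $item) {
            // The company fields are repeated on every item record.
            $docs[] = array(
                'company_id'   => $company['company_id'],
                'company_name' => $company['name'],
                'item_id'      => $item['item_id'],
                'item_title'   => $item['title'],
            );
        }
    }
    return $docs;
}

// Sample data modeled on the results shown earlier in the thread.
$companies = array(
    array(
        'company_id' => 4,
        'name'       => 'Target Corp.',
        'items'      => array(
            array('item_id' => 13, 'title' => 'iPod Touch'),
            array('item_id' => 14, 'title' => 'Nerf Gun'),
        ),
    ),
);

$docs = denormalize($companies);
echo count($docs), " documents\n";
```

A search that matches one item then returns only that item's record, with the company name still available for display.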



kfmnla

Jun 25, 2012, 12:17:51 PM
to php-sol...@googlegroups.com
Hi Donovan,

Just getting back to this again.  Thanks again for taking the time to reply.

I guess what I am confused about may be two things:

1)  What you say about "de-normalizing" seems to make sense.  My question is where to do it?  In other words, where would this code sample that you just presented actually be placed?  It seems that including it within the sample PHP script would be too late in the processing cycle since we are performing queries against the already formatted Solr data.  Would it need to be a standalone process that would run immediately after running a full data import from the mySQL database, perhaps as part of a regularly scheduled cron job?  Or...

2)  Does my DataImportHandler and associated data-config.xml need to be modified so as to index the mySQL database information differently?  If so, I'm struggling with how to do that.  For example, my current data-config.xml for the previously-mentioned store=>items scenario (one store to many items) looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/my_DB"
              user="********"
              password="********"/>
  <document>
    <entity name="STORE"
            query="select id,company_name from retail_companies">
       <field column="id" name="store_id"/>
       <field column="company_name" name="store_name"/>
       <entity name="ITEM"
               query="select id,title,description,price from store_items where retail_company_id ='${STORE.id}'">
              <field column="id" name="item_id"/>
              <field column="title" name="item_title"/>
              <field column="description" name="item_desc"/>
              <field column="price" name="item_price"/>
       </entity>
    </entity>
  </document>
</dataConfig>


Instead, would something like the following be allowable:

  <document>
    <entity name="STORE"
            query="select id,company_name from retail_companies">
       <field column="id" name="store_id"/>
       <field column="company_name" name="store_name"/>
    </entity>
  </document>
  <document>
       <entity name="ITEM"
               query="select id,title,description,price from store_items where retail_company_id ='${STORE.id}'">
              <field column="id" name="item_id"/>
              <field column="title" name="item_title"/>
              <field column="description" name="item_desc"/>
              <field column="price" name="item_price"/>
       </entity>
  </document>

But then how would I get the store_name to be part of the ITEM document?

Thank you.

Donovan Jimenez

Jun 25, 2012, 1:22:04 PM
to php-sol...@googlegroups.com
You should do 2. I don't use the data import handler myself; are joins allowed? If so, I'd do this:

<document>
       <entity name="ITEM_STORE"
               query="select si.id as item_id, si.title as item_title, si.description as item_desc, si.price as item_price, rc.id as store_id, rc.company_name as store_name from store_items si left join retail_companies rc on si.retail_company_id = rc.id">
       </entity>
  </document> 

If not, then do this (the inversion of what you previously had):

<document>
    <entity name="ITEM" query="select id,title,description,price,retail_company_id from store_items">

          <field column="id" name="item_id"/>
          <field column="title" name="item_title"/>
          <field column="description" name="item_desc"/>
          <field column="price" name="item_price"/>
           <entity name="STORE"  query="select id,company_name from retail_companies where id = ${ITEM.retail_company_id}">

              <field column="id" name="store_id"/>
              <field column="company_name" name="store_name"/>
       </entity>
    </entity>
  </document> 

Using a join is always going to be better than making dependency queries in a tight loop, but maybe there are reasons not to do it with DIH? I don't know if the data import handler itself does any result caching. You'll probably find better answers from the solr mailing list and documentation.
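To make the target shape concrete, here is a minimal sketch of what the join produces: one flat document per item, with the store fields copied onto it. It runs on hypothetical in-memory rows rather than MySQL, the sample values are made up, and `buildItemDocuments` is just an illustrative name, not part of Solr or the PHP client.

```php
<?php
// Hypothetical sample rows standing in for the retail_companies and
// store_items tables; the values are made up for illustration.
$retailCompanies = [
    ['id' => 1, 'company_name' => 'Acme Outdoor'],
    ['id' => 2, 'company_name' => 'Bolt Hardware'],
];
$storeItems = [
    ['id' => 10, 'title' => 'Tent', 'description' => 'Two-person tent', 'price' => 99.0, 'retail_company_id' => 1],
    ['id' => 11, 'title' => 'Stove', 'description' => 'Camping stove', 'price' => 35.5, 'retail_company_id' => 1],
    ['id' => 12, 'title' => 'Hammer', 'description' => 'Claw hammer', 'price' => 12.0, 'retail_company_id' => 2],
];

/**
 * Builds one flat document per item, copying the store fields onto it.
 * This mirrors a left join driven by store_items: every item yields a
 * document, and the store fields are null when no company matches.
 */
function buildItemDocuments(array $storeItems, array $retailCompanies): array
{
    // Index companies by id once instead of searching per item.
    $companyById = array_column($retailCompanies, null, 'id');

    $documents = [];
    foreach ($storeItems as $item) {
        $company = $companyById[$item['retail_company_id']] ?? null;
        $documents[] = [
            'item_id'    => $item['id'],
            'item_title' => $item['title'],
            'item_desc'  => $item['description'],
            'item_price' => $item['price'],
            'store_id'   => $company['id'] ?? null,
            'store_name' => $company['company_name'] ?? null,
        ];
    }
    return $documents;
}

foreach (buildItemDocuments($storeItems, $retailCompanies) as $doc) {
    echo $doc['item_id'], ' ', $doc['item_title'], ' @ ', $doc['store_name'], "\n";
}
```

Each document carries its own store_name, so a match on an item returns only that item, and the company lookup happens once rather than once per item.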

Donovan


kfmnla

Jun 25, 2012, 1:56:35 PM6/25/12
to php-sol...@googlegroups.com
Donovan,

Thank you for your reply.

It would seem that your first choice, the join, might be the better approach, if allowed.  Doing the second option (the inversion) seems like it would give me a similar problem to what I have now.  That is, doing a search on store_name would result in multiple (extraneous) results when a store has more than one item.  Currently, I get the extraneous results when searching for an item, as I see all items for the store.

If you wouldn't mind, could you please remind me how to access the Solr mailing list.  I have not seen anything in the Solr Wiki that specifically addresses what we are discussing here.

Again, thank you for your generous time and consideration in responding to my questions.  I cannot thank you enough!

Donovan Jimenez

Jun 25, 2012, 3:11:02 PM6/25/12
to php-sol...@googlegroups.com
What you want as your end result really isn't clear to me. Your documents need to be based on your items if you want individually ranked item results. If you want all companies that exist in a search result, you can do a facet search along with the main search, so you'll get both a page of item results and company results from the entire result set. If you just want to search on company names, then you're better off having a separate set of company documents, either in a completely separate index or in the same index but segregated by a "type" field ("item record" vs. "company record").
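As a rough sketch of the facet idea, the request parameters could look something like the following. The field names `type` and `store_name` and the value `item` are assumptions about your schema, not something Solr defines, and `buildItemSearchParams` is only an illustrative helper. The `facet`, `facet.field`, `facet.mincount` and `fq` parameters themselves are standard Solr request parameters.

```php
<?php
/**
 * Builds Solr request parameters for an item search that also returns
 * per-store counts over the whole result set (not just the current page).
 * Assumes a "type" field that separates item documents from company
 * documents, and a "store_name" field suitable for faceting.
 */
function buildItemSearchParams(string $userQuery, int $start = 0, int $rows = 10): array
{
    return [
        'q'              => $userQuery,
        'start'          => $start,
        'rows'           => $rows,
        'fq'             => 'type:item',   // restrict matches to item documents
        'facet'          => 'true',        // turn faceting on
        'facet.field'    => 'store_name',  // count matches per store
        'facet.mincount' => 1,             // omit stores with no matching items
        'wt'             => 'json',
    ];
}

$params = buildItemSearchParams('kabab');

// The encoded query string can be appended to .../select? and tried
// directly in a browser before wiring it into PHP.
echo http_build_query($params), "\n";
```

With the PHP client, the query, offset and limit are already the first three arguments of search(), and the remaining parameters should go in its optional parameters array. Faceting generally behaves best on an untokenized string field, so store_name may need a string-typed copy in the schema.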

http://lucene.apache.org/solr/discussion.html has the mailing list for solr.

Donovan


kfmnla

Jun 25, 2012, 8:22:03 PM6/25/12
to php-sol...@googlegroups.com
What is needed has been somewhat evolving as we go through this process of learning about Solr and simply having someone make a decision.

In the interest of time, what has been decided for the near term is that the user needs to be able to enter a search word or words which will then be used to search for a match against the imported MySQL DB info, as follows:

1)  For now, only items (item_title or item_desc) from the store_items table.
2)  Eventually, items (item_title or item_desc) from the store_items table OR store (store_name) from the retail_companies table OR other DB elements from other DB tables, TBD.

Doing 1) above, working with only one DB table, seems to be straightforward.  The challenge has been working with the DataImportHandler, and all of its associated config files, to get multiple Solr documents created, one for each result desired (item, store, etc.).

The setup that was shown to you in a recent post defined one Solr document with entities for store and item.  If a search query matches a store, then not only the store is returned in the results but also all of the items.  If a search query matches an item, again the store along with all of the store's items (not just the one that matched) is returned.  This seems to demonstrate the need to have desired search result information defined in separate Solr documents.

The dataConfig/dataSource seems to support the creation of a single Solr document only.  It's not clear if/how to define multiple Data Import Request Handlers and multiple config files, if that is what is needed.  This must be easier to accomplish than it appears at this time.

Once this can be successfully accomplished, we should be fine for the time being.  It seems that Solr version 4.0 will support Join functionality between Solr documents, which sounds very useful for the future.

It seems that I need to get some clarification via the Solr mailing list as this issue does not appear to be specific to the Solr PHP Client, since it cannot identify what fields/values were matched.

Hiền Quản Trọng

Sep 4, 2013, 1:05:48 PM9/4/13
to php-sol...@googlegroups.com
Hi Donovan,
I have a question and need your help.
When I search with *:* it returns all the data I have indexed, but when I type a keyword into the search field I showed previously, it finds nothing. I really do not understand why.

Donovan Jimenez

Sep 4, 2013, 1:30:49 PM9/4/13
to php-sol...@googlegroups.com
I'm not actually clear on your problem, but I'm going to assume you're not getting results that you expect, and if I'm right the most likely reason is that your default query field(s) is not set up correctly in your Solr configuration.

1. Can you get the results you want directly from Solr with your browser, using its admin query interface?
2. Can you get results if you specify the exact field, like field:value, e.g. "keywords:foobar"?

That's where I'd start. References for how to configure solr can be found on the wiki: http://wiki.apache.org/solr/SolrRequestHandler specifically look at things like http://wiki.apache.org/solr/SchemaXml#The_Default_Search_Field  or if you're using dismax this: http://wiki.apache.org/solr/DisMaxQParserPlugin#qf_.28Query_Fields.29
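If it helps, the checks above can be turned into URLs to paste into a browser. This is only a sketch: the base URL, the `keywords` field and the `foobar` value are placeholders for your own setup, and `fieldQuery` / `selectUrl` are illustrative helpers, not part of the PHP client. `debugQuery=true` is a standard Solr parameter that includes the parsed query in the response, which shows which field a bare keyword was actually searched against.

```php
<?php
/**
 * Builds a field-qualified phrase query such as keywords:"foobar".
 * Inside a quoted phrase only the double quote and the backslash need
 * escaping, so addcslashes() on those two characters is enough here.
 */
function fieldQuery(string $field, string $value): string
{
    return $field . ':"' . addcslashes($value, "\"\\") . '"';
}

/**
 * Builds a select URL for the given query; $extra can add parameters
 * such as debugQuery.
 */
function selectUrl(string $baseUrl, string $query, array $extra = []): string
{
    $params = array_merge(
        ['q' => $query, 'rows' => 5, 'wt' => 'json', 'indent' => 'true'],
        $extra
    );
    return rtrim($baseUrl, '/') . '/select?' . http_build_query($params);
}

$base = 'http://localhost:8983/solr'; // assumption: adjust host, port and path

// Everything in the index, to confirm documents exist at all.
echo selectUrl($base, '*:*'), "\n";
// The same keyword pinned to an explicit field.
echo selectUrl($base, fieldQuery('keywords', 'foobar')), "\n";
// The bare keyword, with the parsed query included in the response.
echo selectUrl($base, 'foobar', ['debugQuery' => 'true']), "\n";
```

If the second URL finds documents and the third does not, the default search field (or qf for dismax) is the likely culprit.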



jmu...@dealstampede.com

Jan 6, 2014, 5:57:44 PM1/6/14
to php-sol...@googlegroups.com
Hello Solr Gurus,

I am having a similar issue as kfmnla, but the resolution isn't the same.

I am running a working Solr instance:
i.e. http://localhost:8983/solr/collection1/select?q=category%3A12%0A&wt=json&indent=true

But I am getting the same error as kfmnla:

exception 'Apache_Solr_HttpTransportException' with message ''0' Status: Communication Error' in /app/www/SolrPhpClient/Apache/Solr/Service.php:339
Stack trace:
#0 /app/www/SolrPhpClient/Apache/Solr/Service.php(1201): Apache_Solr_Service->_sendRawGet('http://localhos...')
#1 /app/www/solr_test.php(39): Apache_Solr_Service->search('category:12', 0, 10)
#2 {main}

I am using the CurlNoReuse transport mechanism and set it in my code as follows...

require_once(dirname(__FILE__) . '/SolrPhpClient/Apache/Solr/Service.php');
require_once(dirname(__FILE__) . '/SolrPhpClient/Apache/Solr/HttpTransport/CurlNoReuse.php');
$transportInstance = new Apache_Solr_HttpTransport_CurlNoReuse();


// create a new solr service instance - host, port, and webapp
// path (all defaults in this example)
$solr = new Apache_Solr_Service('search.dealstampede.com', 8983, '/solr/collection1/');
$solr->setHttpTransport($transportInstance);

I even echoed out the final $queryString the Service.php generates and tried it manually against my Solr Instance and it works.

I know it's got to be something simple I am doing wrong. Any thoughts?

Many thanks,

Jim

Donovan Jimenez

Jan 6, 2014, 6:40:31 PM1/6/14
to php-sol...@googlegroups.com
You used "search.dealstampede.com" as your host when constructing the Apache_Solr_Service... was that intentional?  Usually it's going to be the internal domain name or IP of your Solr instance. I can't say for certain that's what your problem is, but it's the most suspicious part, since the error is telling you that it couldn't connect to the server you told it to use.



jmu...@dealstampede.com

Jan 6, 2014, 7:29:44 PM1/6/14
to php-sol...@googlegroups.com
Thanks for the quick reply. I did use search.dealstampede.com and changed it to localhost for illustration purposes only.

The verbatim code I used:

  require_once(dirname(__FILE__) . '/SolrPhpClient/Apache/Solr/Service.php'); 
  require_once(dirname(__FILE__) . '/SolrPhpClient/Apache/Solr/HttpTransport/CurlNoReuse.php');   
  $transportInstance = new Apache_Solr_HttpTransport_CurlNoReuse();

  // create a new solr service instance - host, port, and webapp
  // path (all defaults in this example)
  $solr = new Apache_Solr_Service('search.dealstampede.com', 8983, '/solr/collection1/');
  $solr->setHttpTransport($transportInstance);

I could move the php page to the Solr box, but would prefer not to.

Can you think of any reason why this wouldn't work? The final querystring works if I paste it in the browser - but the curl call does not work.
Jim

Donovan Jimenez

Jan 6, 2014, 8:19:54 PM1/6/14
to php-sol...@googlegroups.com
Make sure the host you're using resolves the way you expect on the web box, and also make sure any firewalls allow the TCP connection between the boxes.

The simplest way to check without relying on PHP is to run something like curl or wget against the admin URL from the web box (assuming you have one of those installed). If you don't have anything like that installed, you can see what comes back (or is put into the error log) from a PHP one-liner like:

php -r 'echo file_get_contents("http://search.dealstampede.com:8983/solr/collection1");'
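If you want to separate a DNS problem from a blocked port, a slightly longer check along these lines may help. It is a minimal sketch: `checkEndpoint` is an illustrative name, it only handles host names and IPv4 addresses, and it tests the raw TCP connection rather than anything Solr-specific.

```php
<?php
/**
 * Reports whether a TCP connection to $host:$port can be opened, and if
 * not, whether name resolution or the connection itself failed.
 */
function checkEndpoint(string $host, int $port, float $timeoutSeconds = 5.0): array
{
    $ip = $host;
    if (filter_var($host, FILTER_VALIDATE_IP) === false) {
        // gethostbyname() returns its input unchanged when lookup fails.
        $ip = gethostbyname($host);
        if ($ip === $host) {
            return ['ok' => false, 'stage' => 'dns', 'detail' => "could not resolve $host"];
        }
    }

    $errno = 0;
    $errstr = '';
    // @ suppresses the PHP warning; the failure is reported via the return value.
    $socket = @fsockopen($ip, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return ['ok' => false, 'stage' => 'connect', 'detail' => "$ip:$port [$errno] $errstr"];
    }

    fclose($socket);
    return ['ok' => true, 'stage' => 'connect', 'detail' => "$ip:$port is reachable"];
}

// Runs only when called from the command line with arguments, e.g.:
//   php check.php search.dealstampede.com 8983
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    print_r(checkEndpoint($argv[1], (int) $argv[2]));
}
```

A "dns" failure points at name resolution on the web box; a "connect" failure with a timeout or refusal points at a firewall, a host-level port restriction, or Solr not listening on that interface.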

Jim Murphy

Jan 6, 2014, 8:23:17 PM1/6/14
to php-sol...@googlegroups.com
Oddly enough, this does return the desired result.

php -r 'echo file_get_contents("http://search.dealstampede.com:8983/solr/collection1/select?wt=json&json.nl=map&q=category%3A12&start=15&rows=10");'

Donovan Jimenez

Jan 6, 2014, 8:30:07 PM1/6/14
to php-sol...@googlegroups.com
Try using the FileGetContents transport (it's the default if you don't override it) rather than CurlNoReuse on your Apache_Solr_Service instance. If that works but the curl transport doesn't, maybe there's a misconfiguration or something missing from your PHP installation (around the curl module).
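A quick way to see which transports your PHP install can support at all is something like the sketch below. `transportPrerequisites` is just an illustrative helper. It checks that the curl extension is loaded and that allow_url_fopen is enabled, which file_get_contents() needs in order to fetch http:// URLs.

```php
<?php
/**
 * Reports whether the PHP-side prerequisites for each transport style
 * are present. This says nothing about the network path to Solr.
 */
function transportPrerequisites(): array
{
    return [
        // The curl-based transports need the curl extension.
        'curl' => extension_loaded('curl') && function_exists('curl_init'),
        // file_get_contents() over HTTP needs allow_url_fopen enabled.
        'file_get_contents' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
    ];
}

foreach (transportPrerequisites() as $name => $available) {
    echo $name, ': ', $available ? 'available' : 'missing', "\n";
}
```

If both report available and the request still fails, the problem is more likely between the two machines than in the PHP installation.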



Donovan Jimenez

Jan 6, 2014, 8:49:30 PM1/6/14
to php-sol...@googlegroups.com


On Mon, Jan 6, 2014 at 8:23 PM, Jim Murphy <jmu...@dealstampede.com> wrote:

Jim Murphy

Jan 6, 2014, 9:42:54 PM1/6/14
to php-sol...@googlegroups.com
I went to curl because the default didn't work. But Curl does work in other things I am doing on the site, which is what has me perplexed. It would seem Heroku (where I am hosted) is not allowing the curl call using that 8983 port. I can use curl to call out and pull in other sites on port 80, but not other ports.




Jim Murphy

Jan 6, 2014, 9:54:56 PM1/6/14
to php-sol...@googlegroups.com
Donovan,

That was it. Darned Heroku was blocking the script from accessing content on any port other than 80.
I moved the file to the Solr box and it works right out of the box.

Down the rabbit hole I went.

Thanks so much for your help. I appreciate it.

Jim
