Strange error when posting a document - a technical issue?


Paul Stubbe

Mar 2, 2015, 5:04:47 AM
to dezi-...@googlegroups.com
I'm probably doing something wrong, but I don't know where to go from here. Can anyone point me in the right direction?
I will try to explain what I get.

I'm using dezi-client-php.

I get this kind of error: 
     {"build_time":"0.92304","success":0,"msg":"Failed to POST doc","code":"500"}

     This is the exception message I get from $client->index($doc).

When I post a Dezi_Doc with dezi-client-php, I rawurlencode the ['uri'] part of the Dezi_Doc.
For almost all my files this works fine, but for some files with French tokens, as in "Appréciation", it results in this kind of error.

I think (I'm not sure) that I get the error message from the Search-OpenSearch-Engine-Lucy-0.18/lib/Search/OpenSearch/Engine/Lucy.pm module
in sub POST
    my $exists = $self->GET( $doc->url );

    if ( $exists->{code} != 200 ) {
        return { code => 500, msg => 'Failed to POST doc' };
    }

I do not understand what could be wrong with the $doc->url.

Is this a known issue?
Does anybody know what kind of encoding I should use for French tokens in the URL part?

Thanks,

Paul


Peter Karman

Mar 3, 2015, 10:02:56 AM
to dezi-...@googlegroups.com
On 3/2/15 4:04 AM, Paul Stubbe wrote:

> When I post a Dezi_Doc with dezi-client-php I rawurlencode the ['uri']
> part of the Dezi_Doc.
> For almost all my files this works fine. But for some files with
> 'french' tokens like in: "Appréciation" this will result in this kind of
> error.
>
> I think (I'm not sure) that I get the error message from the
> Search-OpenSearch-Engine-Lucy-0.18/lib/Search/OpenSearch/Engine/Lucy.pm
> module
> in sub POST
>
> my $exists = $self->GET( $doc->url );
>
> if ( $exists->{code} != 200 ) {
> return { code => 500, msg => 'Failed to POST doc' };
> }
>
> I do not understand what could be wrong with the $doc->url
>
> Is this a known issue?
> Does anybody know what kind of encoding I should use for french tokens
> in the url part?


That smells like an encoding bug somewhere, either on the server side
or on the PHP client side.

Can you try and create a test case via PR against this repo?

https://github.com/karpet/dezi-client-php

You could add a failing test here:

https://github.com/karpet/dezi-client-php/blob/master/t/001-client.t


--
Peter Karman . www.peknet.com . @peterkarman

Peter Karman

Mar 10, 2015, 10:37:21 AM
to dezi-...@googlegroups.com
On 3/2/15 4:04 AM, Paul Stubbe wrote:

> I do not understand what could be wrong with the $doc->url
>
> Is this a known issue?
> Does anybody know what kind of encoding I should use for french tokens in the url part?
>

Not sure if my previous reply was detailed enough. I'm sorry if it was too terse.

Dezi expects and works exclusively with UTF-8. So if your URLs are not UTF-8 encoded then things
might not work as expected.

That said, it's certainly possible that there is a bug on the server side with encoding/decoding URL
values. A repeatable test case would really help to both debug and prove that the problem gets fixed.
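
On the UTF-8 point: rawurlencode() works on bytes, so the same accented character produces different percent-escapes depending on the encoding of the input string. The following is only a sketch of normalizing a URI to UTF-8 before encoding it. encode_uri_as_utf8() is a hypothetical helper, not part of dezi-client-php; it assumes the mbstring extension is available and that any input that is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper, not part of dezi-client-php.
// Assumptions: mbstring is loaded; non-UTF-8 input is ISO-8859-1.
function encode_uri_as_utf8($uri) {
    if (!mb_check_encoding($uri, 'UTF-8')) {
        // Re-encode the bytes as UTF-8 before percent-encoding them.
        $uri = mb_convert_encoding($uri, 'UTF-8', 'ISO-8859-1');
    }
    return rawurlencode($uri);
}

$latin1 = "t\xE9st";      // "tést" as ISO-8859-1 bytes
$utf8   = "t\xC3\xA9st";  // "tést" as UTF-8 bytes

echo rawurlencode($latin1), "\n";        // t%E9st
echo rawurlencode($utf8), "\n";          // t%C3%A9st
echo encode_uri_as_utf8($latin1), "\n";  // t%C3%A9st
```

If the URIs come from a source with a different legacy encoding, the third argument to mb_convert_encoding() would need to change accordingly.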

HTH,
pek

Paul Stubbe

Mar 10, 2015, 1:50:53 PM
to dezi-...@googlegroups.com
Peter,

When I run the following sample code:

<?php
    require_once 'Dezi_Client.php';
    require_once 'Dezi_Doc.php';

    $html = "<html><body> TEST </body></html>";

    // Build the document; the whole URI is passed through rawurlencode().
    $docargs = array();
    $docargs['mime_type'] = 'text/html';
    $docargs['uri'] = rawurlencode("//test.site/téstfrench");
    $docargs['mtime'] = time();
    $docargs['size'] = strlen($html);
    $docargs['content'] = $html;

    $doc = new Dezi_Doc($docargs);
    try {
        $args = array('server' => 'http://search.intra:5000');
        $client = new Dezi_Client($args);
        $client->index($doc);
    } catch (Exception $e) {
        // The failure shows up as the exception message.
        echo $e->getMessage();
    }
?>

I get this result:

{"build_time":"0.80622","success":0,"msg":"Failed to POST doc","code":"500"}

I hope this helps you.
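
Whether the literal "téstfrench" in a script like the one above is UTF-8 depends on how the PHP file itself was saved, and the thread does not establish that this is the cause of the error. One way to check is to dump the raw bytes of the URI before it is encoded. A minimal diagnostic sketch, where describe_bytes() is a hypothetical helper and not part of dezi-client-php:

```php
<?php
// Hypothetical diagnostic helper, not part of dezi-client-php.
// Reports the raw bytes of a string and whether they form valid UTF-8.
function describe_bytes($s) {
    // With the /u modifier, preg_match() fails on subjects that are not valid UTF-8.
    $is_utf8 = (preg_match('//u', $s) === 1);
    return sprintf("%s utf8=%s", bin2hex($s), $is_utf8 ? 'yes' : 'no');
}

echo describe_bytes("t\xC3\xA9st"), "\n"; // 74c3a97374 utf8=yes
echo describe_bytes("t\xE9st"), "\n";     // 74e97374 utf8=no
```

Calling describe_bytes() on the unencoded URI string from the sample would show which of the two byte sequences the script actually contains.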
   

Peter Karman

Mar 10, 2015, 6:17:36 PM
to dezi-...@googlegroups.com
Thanks very much.

I've added it to the test suite:
https://github.com/karpet/dezi-client-php/commit/1941a131d375017eb2686812fefe6dd245de349a

However, it passes for me. I'm using PHP 5.4 stock RPM on a CentOS 7 box, against a fresh install of
the latest Dezi 0.4.1.

What versions of Perl and Dezi are you running?

Paul Stubbe

Mar 11, 2015, 6:39:15 AM
to dezi-...@googlegroups.com, pe...@peknet.com
Peter,

I'm running the following.

Server 1:
    PHP Version 5.3.3
    dezi-client-php 0.002003 (17 Oct 2012)

    One modification in the constructor for a new client:
            try {
                $resp = $pest->get('/');
            }
            catch (Pest_NotFound $err) {
                // 404
                //die(sprintf("404 response from server: %s/\n", $this->server));
                throw new Exception(sprintf("404 response from server: %s/\n", $this->server));
            }
            catch (Pest_UnknownResponse $err) {
                //die(sprintf("Initial connect failed to server: %s/\n", $this->server));
                throw new Exception(sprintf("Initial connect failed to server: %s/\n", $this->server));
            }

Server 2: Search Server

    Perl -v: This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi
    Dezi -v: 0.004001

Now that I know that it works on your side I will look for a solution on my side.

Greetings and thanks (like always) for the support.

Paul
