HTTP ERROR: 404 (NOT_FOUND) when clicking a query result link for a linked document in the web interface

jagadeesha kanihal

Aug 31, 2017, 1:57:52 PM
To: MG4J
I am trying out big.mg4j on a very small corpus and querying it through the web interface.
Although query results appear on the query page, clicking a result link gives a 404 error instead of showing the corresponding document.

I tried using plain mg4j (not big.mg4j), but the problem exists there as well.
Is there a problem with the HTTP server code linked with mg4j?

Here is the code that I am using and the screenshots of the error.
 

package try.mg4j;

import it.unimi.di.big.mg4j.document.FileSetDocumentCollection;
import it.unimi.di.big.mg4j.query.Query;
import it.unimi.di.big.mg4j.tool.IndexBuilder;

import java.io.File;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.commons.io.FileUtils;

public class TryMg4j {
    /**
     * Indexes a directory of text document files and starts the query server.
     * From the command line, this can be run as
     *   mvn exec:java -Dexec.mainClass="try.mg4j.TryMg4j"
     *     -Dexec.args="corpus index"
     * Then visit http://localhost:4242/Query
     *
     * @param args [0]=/path/to/corpus/dir [1]=/path/to/index/dir
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        final String corpusPath = args[0], indexPath = args[1];
        final File corpusDir = new File(corpusPath),
                indexDir = new File(indexPath);
        assert corpusDir.isDirectory() && indexDir.isDirectory();
        // Collect the .txt files directly under the corpus directory (non-recursive).
        Collection<File> docFiles =
                FileUtils.listFiles(corpusDir, new String[]{"txt"}, false);
        // Arguments for FileSetDocumentCollection: factory, collection file, then the documents.
        ArrayList<String> mg4jArgs = new ArrayList<String>();
        mg4jArgs.add("-f");
        mg4jArgs.add("it.unimi.di.big.mg4j.document.tika.TextDocumentFactory");
        final File collectionFile = new File(indexDir, "corpus.collection");
        mg4jArgs.add(collectionFile.getAbsolutePath());
        for (File docFile : docFiles) {
            mg4jArgs.add(docFile.getAbsolutePath());
        }

        // Build the collection, index it, then start the web query interface.
        FileSetDocumentCollection.main(mg4jArgs.toArray(new String[]{}));
        IndexBuilder.main(new String[]{
                "-S", collectionFile.getAbsolutePath(),
                (new File(indexDir, "cs635")).getAbsolutePath()
        });
        Query.main(new String[]{
                "-h", "-i", "it.unimi.di.big.mg4j.query.FileSystemItem",
                "-c", collectionFile.getAbsolutePath(),
                (new File(indexDir, "cs635-text")).getAbsolutePath()
        });
    }
}



jagadeesha kanihal

Aug 31, 2017, 2:14:50 PM
To: MG4J
These are the Maven MG4J versions that I'm using.

        <!--<dependency>-->
            <!--<groupId>it.unimi.di</groupId>-->
            <!--<artifactId>mg4j</artifactId>-->
            <!--<version>5.2</version>-->
        <!--</dependency>-->

        <dependency>
            <groupId>it.unimi.di</groupId>
            <artifactId>mg4j-big</artifactId>
            <version>5.4.3</version>
        </dependency>

Sebastiano Vigna

Aug 31, 2017, 7:14:55 PM
To: mg...@googlegroups.com

> On 31 Aug 2017, at 19:57, jagadeesha kanihal <jagadk...@gmail.com> wrote:
>
> I am trying out big.mg4j on a very small corpus and querying it through the web interface.
> Although query results appear on the query page, clicking a result link gives a 404 error instead of showing the corresponding document.

If the URL uses the file:// scheme, some browsers will not let you follow a link to a file unless the linking page was itself loaded from a file. Which URL gives you the 404?
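
To answer that question quickly, the links copied from the result page can be classified by scheme. The following is a minimal sketch using only java.net.URI; the class name LinkSchemeCheck and the method isFileLink are hypothetical and not part of MG4J.

```java
import java.net.URI;

public class LinkSchemeCheck {
    /** Returns true when href is an absolute URL whose scheme is "file". */
    public static boolean isFileLink(String href) {
        String scheme = URI.create(href).getScheme();
        return scheme != null && scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        // Pass the result links copied from the browser as arguments.
        for (String href : args) {
            System.out.println(href + " -> "
                    + (isFileLink(href) ? "file link" : "not a file link"));
        }
    }
}
```

A link reported as "not a file link" is fetched from the HTTP server, so a 404 on it points at the server side rather than at browser restrictions.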

Ciao,

seba

jagadeesha kanihal

Sep 1, 2017, 2:40:25 AM
To: MG4J, vi...@di.unimi.it

As you can see, the URL is not file://. There is something wrong with the HTTP servlet.

Using a Jetty server seems to work.

Can you please verify FileSystemItem in mg4j?

jagadeesha kanihal

Sep 1, 2017, 1:31:37 PM
To: mg...@googlegroups.com
@vigna
Any updates? Are you able to reproduce the issue? If you need log files, please let me know.

Sebastiano Vigna

Sep 1, 2017, 8:21:28 PM
To: mg...@googlegroups.com

> On 1 Sep 2017, at 19:31, jagadeesha kanihal <jagadk...@gmail.com> wrote:
>
> @vigna
> Any updates? Are you able to reproduce the issue? If you need log files, please let me know.
>

I'm on vacation, so it won't be that quick, but yes, please send me the log.

Ciao,

seba

Sebastiano Vigna

Sep 9, 2017, 2:13:09 PM
To: mg...@googlegroups.com

> On 31 Aug 2017, at 19:57, jagadeesha kanihal <jagadk...@gmail.com> wrote:
>
> I am trying out big.mg4j on a very small corpus and querying it through the web interface.
> Although query results appear on the query page, clicking a result link gives a 404 error instead of showing the corresponding document.
>
> I tried using plain mg4j (not big.mg4j), but the problem exists there as well.
> Is there a problem with the HTTP server code linked with mg4j?
>

OK, I replicated the problem. I think HttpFileServer is not working for some reason (Jetty evolves very quickly). I'll try to understand what's wrong...
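
For reference, what a document link needs from the server is small: map the URL path to a file, answer 200 with its contents, and answer 404 otherwise. The sketch below shows that behaviour with the JDK's built-in com.sun.net.httpserver package instead of Jetty. It is not MG4J's HttpFileServer; the class name CorpusFileServer, the /doc/ prefix and port 8080 are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CorpusFileServer {
    static final String PREFIX = "/doc/"; // hypothetical URL prefix

    /** Maps a request path to a regular file under base, or null if there is none. */
    public static Path resolve(Path base, String requestPath) {
        if (!requestPath.startsWith(PREFIX)) return null;
        Path file = base.resolve(requestPath.substring(PREFIX.length())).normalize();
        // Reject paths that escape the base directory and anything that is not a file.
        return (file.startsWith(base) && Files.isRegularFile(file)) ? file : null;
    }

    /** Starts a server that serves files under root at PREFIX. */
    public static HttpServer start(Path root, int port) throws IOException {
        final Path base = root.toAbsolutePath().normalize();
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext(PREFIX, exchange -> {
            // getPath() returns the percent-decoded path component of the request URI.
            Path file = resolve(base, exchange.getRequestURI().getPath());
            if (file == null) {
                exchange.sendResponseHeaders(404, -1); // no response body
                exchange.close();
                return;
            }
            byte[] body = Files.readAllBytes(file);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(Paths.get(args[0]), 8080); // args[0] = corpus directory
    }
}
```

Requesting a path under /doc/ for an existing file and for a missing one, and comparing the responses with what the MG4J web interface returns, can help narrow down whether the 404 comes from path mapping or from the handler never being reached.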

Ciao,

seba

jagadeesha kanihal

Sep 26, 2017, 2:11:15 AM
To: MG4J
Hi,
Any updates on a fix for this issue?