Lucere API: What should it be?


Troy Howard

Nov 11, 2010, 5:46:40 AM
to luc...@googlegroups.com
All,

This post is meant to get the discussion started about designing the
new API. As a side note, I'm cross posting this to the mailing list.

What I'd love to see are some sort of "top down" pseudo-code snippets
that show how you envision a new API interacting with your code.


Some ideas/questions:

- Maybe for querying we should implement IQueryable<T>? How would that look?

- Maybe for indexing we should implement IObservable<T>/IObserver<T>?
How would that look?

- How can we facilitate parallelization? What kinds of domain entities
should be serializable so that you can send them across a wire as part
of a distribution model?

- How should transactions and locking work?

- What kind of architectural patterns make sense for this problem domain?

- We should totally implement IDisposable!.. or should we? Maybe not
everything needs to be disposable or should be. What do you think?

- Generic collections and IEnumerable<T> interfaces... Great... but
where exactly? What about collections that don't have a .NET BCL
implementation already? Existing libraries for that? or roll our own?

- Injectable behaviours using delegates like Action<T> or Func<T>...
for filtering, scoring, sorting?
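
As a starting point for that last bullet, here is a minimal C# sketch of injecting filtering and sorting through Func<T> delegates. The Hit type and the HitPipeline.Apply helper are invented for illustration and do not exist in Lucene.Net; scoring could be injected the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type, used only for this sketch.
public sealed class Hit
{
    public string Title { get; set; }
    public float Score { get; set; }
}

public static class HitPipeline
{
    // Filtering and sorting are passed in as delegates instead of
    // being expressed as subclasses of a filter or comparator type.
    public static IEnumerable<Hit> Apply<TKey>(
        IEnumerable<Hit> hits,
        Func<Hit, bool> filter,
        Func<Hit, TKey> sortKey)
    {
        return hits.Where(filter).OrderByDescending(sortKey);
    }
}
```

A caller could then write something like `HitPipeline.Apply(hits, h => h.Score > 0.5f, h => h.Score)` with no extra classes.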

That's just a start of some of the things floating around in my head
at the moment. I want to know what you think and I *really* want to
see some pseudo-code examples of how you think the API should work.


Thanks,
Troy

Ayende Rahien

Nov 11, 2010, 5:50:50 AM
to luc...@googlegroups.com
Troy,
I would say that the first thing you want to do is get an actual implementation in place.
Everything that you are talking about is a high-level concept that can be added at a later step.


--
You received this message because you are subscribed to the Google Groups "Lucere" group.
To post to this group, send email to luc...@googlegroups.com.
To unsubscribe from this group, send email to lucere+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/lucere?hl=en.


Ayende Rahien

Nov 11, 2010, 5:51:21 AM
to luc...@googlegroups.com
My suggestion is to take the current Lucene.NET drop and start .Nettifying it.

Simone Chiaretta

Nov 11, 2010, 5:57:45 AM
to luc...@googlegroups.com
Yep, I agree... starting from scratch doesn't really make sense.
As Ayende suggests, I'd take the latest Lucene.Net (2.9.2) and start changing getter/setter methods to properties, replacing the classes used as enums with real enums, and so on.
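
To illustrate the kind of change meant here, a rough before/after sketch in C#. All type and member names below are simplified stand-ins chosen for the example; they are not copied from the Lucene.Net source.

```csharp
using System;

// Before: a Java-style "class used as an enum" plus getter/setter methods.
public sealed class StoreJavaStyle
{
    public static readonly StoreJavaStyle YES = new StoreJavaStyle("YES");
    public static readonly StoreJavaStyle NO = new StoreJavaStyle("NO");

    private readonly string name;
    private StoreJavaStyle(string name) { this.name = name; }
    public override string ToString() { return name; }
}

public class FieldJavaStyle
{
    private float boost = 1.0f;
    public float GetBoost() { return boost; }
    public void SetBoost(float value) { boost = value; }
}

// After: a real enum and properties.
public enum Store { No, Yes }

public class Field
{
    public Field()
    {
        Boost = 1.0f;
        StoreMode = Store.Yes;
    }

    public float Boost { get; set; }
    public Store StoreMode { get; set; }
}
```

The behaviour is the same in both versions; only the surface that callers see changes.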

Furthermore, there is another open source project that just started with a similar (if not the same) goal:

Might be worth joining forces with them.

Simone
--
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
Blog: http://codeclimber.net.nz
RSS: http://feeds2.feedburner.com/codeclimber
twitter: @simonech

Any sufficiently advanced technology is indistinguishable from magic
"Life is short, play hard"

Ayende Rahien

Nov 11, 2010, 5:58:35 AM
to luc...@googlegroups.com
Not having IDisposable is one of the things that I really hate about Lucene, FWIW

Ciaran Roarty

Nov 11, 2010, 6:00:54 AM
to luc...@googlegroups.com
On 11 November 2010 10:51, Ayende Rahien <aye...@ayende.com> wrote:
My suggestion is to take the current Lucene.NET drop and start .Nettifying it.
 
+1 for me.
 

Simone Chiaretta

Nov 11, 2010, 6:09:11 AM
to luc...@googlegroups.com
Another option could be to keep Lucene.Net as it is now and write a wrapper layer on top of it that maps a .NET-like API onto the current implementation of Lucene.
It's not the best solution, as some of the problems would remain (like the missing IDisposable, or the fact that you cannot use lambdas or anonymous classes for the various sorting methods), but it can still be a start.

Simone


Ciaran Roarty

Nov 11, 2010, 8:08:19 AM
to luc...@googlegroups.com
Sorry, to be clear, I think the best way to start is to take Lucene.Net 2.9.2 and make it a real .NET library rather than the line by line port it is.
 
Subsequently, the team will a) have deep knowledge of how Lucene.Net works and b) be able to review the 3.x version in Java and decide what to do next.
 
Ciaran

Goru

Nov 11, 2010, 5:33:19 PM
to Lucere
I totally agree with Ciaran. I think the start should be to simply try and
make it a real .Net library. While doing that we can look into
architectural changes if required.

On Nov 11, 8:08 am, Ciaran Roarty <ciaran.roa...@gmail.com> wrote:
> Sorry, to be clear, I think the best way to start is to take Lucene.Net
> 2.9.2 and make it a real .NET library rather than the line by line port it
> is.
>
> Subsequently, the team will have a) deep knowledge of how Lucene.Net works
> and b) be able to review the 3.x version in Java and decide what to do next.
>
> Ciaran
>

Digy

Nov 11, 2010, 7:46:02 PM
to luc...@googlegroups.com
Assuming that .NETifying can be done on Lucene.Net 2.9.2, what will be the
gain in having more .NET-like code?
Will it perform better, or will it just be better-looking code? Even if it
performs better, what will you do at the next release of Lucene Java?

I think people underestimate the cost of porting new Lucene Java releases.
Before starting to make a "real .NET library", I would suggest taking the
initial port of Lucene 3.0.2 (http://hg.slace.biz/lucene-porting/downloads),
which was created by an automated tool, and trying to get it compiled. (I don't
even say it should work, just get it compiled!)

DIGY

--

Troy Howard

Nov 11, 2010, 8:01:57 PM
to luc...@googlegroups.com
Missing IDisposable is a huge pain point for us as well.

As a stop gap, we've been using a simple wrapper to add IDisposable
support. In a nutshell, the wrapper object implements IDisposable,
exposes the wrapped object via a property, and takes an Action<T> in
the constructor to define a dispose action. When Dispose is called on
the wrapper it invokes the Action<T>. An extension method is included
which makes using it a bit more light-weight.

It's worth noting that this is a generally useful class for
implementing IDisposable integration on any class that doesn't
implement it.

Here are the classes for that with an example of how to use it:

https://gist.github.com/673545
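
For readers without access to the gist, the description above can be sketched roughly as follows. This is a reconstruction from the prose only, not the gist's actual code; the names Disposable<T>, Value, and AsDisposable are assumptions made for the example.

```csharp
using System;

// Wraps any object and runs a caller-supplied action when disposed.
public sealed class Disposable<T> : IDisposable
{
    private readonly Action<T> onDispose;
    private bool disposed;

    public Disposable(T value, Action<T> onDispose)
    {
        if (onDispose == null) throw new ArgumentNullException("onDispose");
        Value = value;
        this.onDispose = onDispose;
    }

    // The wrapped object.
    public T Value { get; private set; }

    public void Dispose()
    {
        if (disposed) return; // repeated Dispose calls are harmless
        disposed = true;
        onDispose(Value);
    }
}

public static class DisposableExtensions
{
    // Lets call sites write: obj.AsDisposable(o => o.Close())
    public static Disposable<T> AsDisposable<T>(this T value, Action<T> onDispose)
    {
        return new Disposable<T>(value, onDispose);
    }
}
```

Assuming `writer` is a Lucene.Net IndexWriter, a call site might then read `using (var w = writer.AsDisposable(x => x.Close())) { w.Value.AddDocument(doc); }`.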

There's room for improvement on that concept but it basically works.
The larger issue that we find is that calling Close on Lucene objects
doesn't always free resources... Which is one of the reasons it needs
some serious re-writing (not just refactoring) to work correctly on
.NET.

Thanks,
Troy

Christopher Currens

Nov 11, 2010, 8:16:58 PM
to Lucere
I wrote a fairly long reply but decided to scrap it because I can say
most of what I wanted to say in just a few sentences. ".Net-ifying"
the code won't truly help us. The code and API were designed for the
JRE, and while JRE and .Net have similar functionality, they are still
very different. Java Lucene is a robust, fast indexing library for
Java, but doesn't translate well to .Net. Square peg, round hole. It
takes some major work to get it to fit, and even when you do, it's not
perfect. We could waste time trying to fit a Java-designed library
into the CLR, but ultimately, we'd run into the same bottlenecks that
we already hit with the current port. There's no denying that either
approach is going to be an intense amount of work, which is why it is
so beneficial for us to take some time and really get a good idea of
the best way to go about this using the .Net framework. Measure
twice, cut once.

I don't expect to get a huge performance gain in indexing, searching
or writing, but a re-write of Lucene for the CLR would both improve
memory management and ease implementation into external projects.
Lucene looks fairly straightforward when you break it down, but
there's a lot of code in there, why waste time with a re-write when
what you would really get is simply better looking code? Maybe, just
maybe, you'll get better memory management. On a project this large,
you'd be foolish not to plan ahead.

-Christopher

Troy Howard

Nov 11, 2010, 9:09:41 PM
to luc...@googlegroups.com
So, the general theme I'm getting out of this is that most of you feel
that refactoring the existing codebase would be more reasonable than
re-writing it from scratch.

This is a very pragmatic response. I partially agree and disagree...
and as such, our plan of action includes both concepts, rather than
just one or the other.

Probably a good way to start this discussion is explain what my
intended plan of action is.

The first step will be extracting the interfaces from the 3.0.2 port
and placing them in their own project, as well as converting the
static enums to real enum types in the same library.

The next step will be re-writing the unit tests to work against those
interfaces. The unit tests will all fail because there will be no
implementations backing them. This is good.

While performing those coding tasks, we will be engaging in community
discussion to derive an ideal interaction contract for the API. Once
we've come up with a basic plan for how the API should work, we will
refactor the interfaces (note: still no implementation!!) until they
look and feel how we want and support whatever we think they should
support. Then the unit tests and example apps will need to be
refactored to match.

After that we will implement mocks of all the types which contain
static data, hopefully allowing the unit tests to pass based on the
mock behaviour and data. Unit tests will probably pass but
integration tests will probably not pass.
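
A minimal sketch of what "interface first, mock second" could look like for a single type. The IDirectory contract and its members are hypothetical, invented for this example; the real interfaces would be extracted from the ported code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical contract extracted from a storage class.
public interface IDirectory
{
    bool FileExists(string name);
    void WriteFile(string name, byte[] data);
    byte[] ReadFile(string name);
}

// Mock backed by a dictionary, so unit tests need no disk access.
public sealed class InMemoryDirectory : IDirectory
{
    private readonly Dictionary<string, byte[]> files =
        new Dictionary<string, byte[]>();

    public bool FileExists(string name)
    {
        return files.ContainsKey(name);
    }

    public void WriteFile(string name, byte[] data)
    {
        // Store a copy so later changes by the caller do not leak in.
        files[name] = (byte[])data.Clone();
    }

    public byte[] ReadFile(string name)
    {
        byte[] data;
        if (!files.TryGetValue(name, out data))
            throw new FileNotFoundException(name);
        return (byte[])data.Clone();
    }
}
```

Unit tests written against IDirectory can run against this mock now and against a disk-backed implementation later without changing.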

Then we will start implementing those interfaces. At this stage, we may
find that we can refactor and improve the Lucene.Net automated 3.0.2
port to comply with the new interface... or maybe not.. or maybe only
partially. Where it doesn't fit we can either change it or write new
code.

Once the library is implemented and passes all the tests, we'll start
doing the same kind of port of the contrib libraries. After that we
will enter into maintenance mode where we attempt to integrate changes
from the Java Lucene project into our library. As Ayende said, keeping
up probably won't be that difficult assuming the Java developers
maintain their current awesome habits of explaining everything in good
detail and answering questions we may have.

Since we won't be part of the ASF, there will be no rigid expectations
set for our release schedule. This means that we can take as long as
necessary to do it right. I'm not afraid of being slow to release if
necessary, as long as what we release is quality code.

One other thing that should be stated: We will retain file-level
compatibility with all other Lucene implementations.

As far as aimee.net goes.. I like the idea of performing the
refactoring and making that a series of codeproject articles. I don't
want to interfere with his plans for that because I think it will be a
valuable exercise because the scope is small, so it can be done
quickly, and the articles will be valuable to people who are learning
about refactoring.

In fact, the aimee.net project relieves pressure from Lucere to do the
same thing and allows us to extend the initial release cycle longer.
We may find that just as we are entering into the implementation
stage, aimee.net is making its first release of a refactored
codebase... Which means we could just use the aimee.net code instead
of the Lucene.Net code to implement the contract we've designed.

So because of our intention to use design-by-contract and test driven
development, we do not want to just jump in and start adding
IDisposable and changing getters and setters to properties and what
not... Specifically, we don't want to be focused on implementation
details in the initial stage, we want to be focused on creating a
great user experience for consumers of the library by designing a
great interface first.

This may seem like an unpragmatic approach, but honestly, the pragmatic
approach is what Lucene.Net is already doing. This project is meant to
diverge from that and lean a little more in the idealistic direction.
It may be important for our community to understand that we are not
attempting to *port* the library, we are attempting to *reimplement*
it. This is very different and we are prepared for the challenges that
this approach will bring and positive about our ability to see it
through.

Thanks,
Troy

Troy Howard

Nov 11, 2010, 9:18:19 PM
to luc...@googlegroups.com
Christopher makes a very good point.

Lucene was originally written in 2000 and was probably written for
J2SE 1.2 / 1.3 ... At that time Java did not support annotations,
enums or generics and it still doesn't support lambdas.

It makes sense why the Lucene library seems so primitive compared to
the much more advanced modern-day .NET features. The Lucene project
itself is working on fixing this and re-defining their API to use
modern Java 5/6 features. It would be silly of us to work with the old
API, as Lucene won't even be using that moving forward. Instead, we
should be looking at the most recent release and the current
development branch to see where it's headed and design our API to be
able to support the API changes they will be making in the future.

The whole goal is to get un-stuck from the past.

Thanks,
Troy

Ciaran Roarty

Nov 12, 2010, 1:19:35 AM
to luc...@googlegroups.com
Troy

I think this is a good set of processes to follow. However, it feels a
bit all or nothing which is where my thoughts - as Ayende initially
suggested - of taking 2.9.2 as a base came from.

I am happy to start with 3.0.2 ported and compiling instead of 2.9.2
but I think a pragmatic approach subsequent to that is to take X
amount of use cases and see if the approach works: index a Document,
retrieve a document using a Query.... For example. This would allow us
to prove the approach without going too deep too quickly.

I fully support the dev. of tests and the introduction of Mockable
Objects but I'd like to make sure that the file format stays the same
so I'd like a concrete-ish implementation of those core use cases
pretty early.

As for contrib projects, I'd far rather define the core initially and
get that working. Document, Field, IndexWriter etc .... There aren't
that many public objects required to support some very valuable use
case proofs.

I think the performance gains will be considerable and we should try
to benchmark from the start to prove this.

I note that the bare skeleton projects in the solution target .NET 4.
I am happy with this, is everybody else? Perhaps .NET 3.5 might be an
easier sell? I would not want to go any lower: 3.5 is 2.0 under a
slightly different guise after all.... The only real excuse for
someone not using it in production being the inclusion of .NET 2.0
SP2.

Finally, thanks for setting this project up; I really despaired of the
line by line approach going forward though I thank the Lucene.Net team
for their great work.

Ciaran

Troy Howard

Nov 12, 2010, 2:00:52 AM
to luc...@googlegroups.com
I agree that incremental proofs and metrics are valuable. I also agree
that a more iterative approach to implementation might be a better
idea.

One way we could do that would be to work in layers. Though it is not
explicitly stated anywhere Lucene does seem to be nicely organized
into layers:

- Disk Access (Directory, File, Lock, etc..)
- Streamable Read and Write ( Various readers/writers )
- Persistable Domain Objects (Document, Field, Term, etc)
- Logical Domain Objects (Queries, Scorers, Searchers, Analyzers, etc)

A staged approach that works from the bottom up to implement the
interfaces and pass the unit tests, starting with the disk layer might
be a good approach.

I like the idea of benchmarking as this goes on. This will provide an
opportunity to show how each of the decisions we make impacts
performance, and also set some goals for us if we find that our
implementation is slower for some reason.

Regarding using 2.9.2 or 3.0.2 as the base... I'd like to start with
3.0.2 because of the API changes that are already present in that
build, and because, in all likelihood, this project will not progress
as quickly as the Java Lucene project. That means we should start at
their current release so that the gap which is inevitably created by
their forward progress will be as small as possible.

Unfortunately, we don't have a functioning port of 3.0.2 available
from Lucene.Net, however I fully expect to see one by the end of the
year. In the bitbucket repo that Aaron Powell set up for testing
porting mechanisms, there's a fairly complete port in the
JavaToVbCSharpConverter subdirectory. It doesn't compile on my
machine, but I think I can make that work.

Here's that link:

http://hg.slace.biz/lucene-porting

Assuming that interfaces extracted from this build won't be terribly
different than the final product, we could start by taking just the
disk classes, extracting their interfaces, porting the unit tests that
apply to that layer (not included in the above package unfortunately),
and then filling in the implementation with the Lucene.Net automated
port code.

Establish a set of benchmarks based on those unit tests, and then
refactor that layer as needed for perf/desired API, verifying through
each refactoring that a) it continues to work as expected and b)
performance doesn't slip.
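
A rough sketch of the "performance doesn't slip" check, using only System.Diagnostics.Stopwatch. The Benchmark helper, the best-of-N strategy, and the tolerance factor are assumptions made for illustration; a real benchmark suite would need warm-up runs and more careful statistics.

```csharp
using System;
using System.Diagnostics;

public static class Benchmark
{
    // Runs the action several times and returns the best elapsed time.
    public static TimeSpan Best(Action action, int runs)
    {
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");
        TimeSpan best = TimeSpan.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            if (sw.Elapsed < best) best = sw.Elapsed;
        }
        return best;
    }

    // True when the candidate takes at most baseline * tolerance.
    public static bool WithinBudget(TimeSpan baseline, TimeSpan candidate, double tolerance)
    {
        return candidate.TotalMilliseconds <= baseline.TotalMilliseconds * tolerance;
    }
}
```

After each refactoring, the same operation could be timed against the stored baseline and the change rejected when WithinBudget returns false.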

This could be a nice rinse and repeat process to work our way up to
the top API layer.


Thanks,
Troy

Troy Howard

Nov 12, 2010, 3:06:29 AM
to luc...@googlegroups.com
Ok, chewing on it a bit.... 2.9.2 is probably a better starting point
for us to work from.

A fully functioning library that is already field tested is going to
make a much more solid basis of comparison than whatever we take out
of 3.0.2... We're going to be mangling the API anyway so I guess API
level is less relevant. Also, I don't think 3.0.2 changed much in the
API vs 2.9.2, other than the obvious -- omitting deprecated/obsolete
elements.

So maybe our standard is to start from 2.9.2, but only incorporating
the non-deprecated/non-obsolete elements. If that sounds good to
everyone, along with the layered approach I described earlier, I'll
start this ball rolling.

That said, the questions I initially posed are still relevant, and
haven't yet been discussed:

- What should the new API look like, and what should it support?

The earlier we start implementation, the earlier we need to work out
those design requirements. We need to make sure that whatever
implementation we end up with is able to support our new API's needs
without too much hackage. I don't want to end up refactor-thrashing
(kind of like thread thrashing but for coding. ;))... Top level API
needs can easily cascade refactorings deep into the core layers. That
will be an arduous process if we have to iterate too many times.

Thanks,
Troy

Prescott Nasser

Nov 12, 2010, 3:22:30 AM
to luc...@googlegroups.com
Staged approach sounds solid to me. Aaron's port works for me (although I treat warnings as errors and there are roughly 1500 of them). I'd rather go off this than 2.9.2; the changes made to the API are based on a very iterative approach, reflecting what the Java Lucene team has found over time. I think it's good to start with that. Even if it's buggy, we are going to be gutting it anyway.

We definitely have to benchmark. I would like to see us follow the progress against Lucene.Net and Java Lucene as each layer/piece is updated and refactored. That way we can benchmark and say "our disk reads are X, Lucene.Net's reads are Y, and Java's reads are Z". Obviously, the Java benchmark is for kicks, but clearly we can use Lucene.Net as a baseline: if we can't beat it, then we know we have a problem.

Working our way up should let us get in-depth knowledge of Lucene, and it shouldn't affect the API: disk access is disk access (right?).
 
~P

Ciaran Roarty

Nov 12, 2010, 5:06:21 AM
to luc...@googlegroups.com
I can see value in 2.9.2 and 3.0.2. 2.9.2 is the pragmatic approach, I think, and means we can learn Lucene.Net from the ground up ( and refer to Lucene to see the changes George, Digy et al have made to the code to make it work on .NET).
 
My preference would be to take use cases and work through the layers, but I can still see value in the layered approach. I think if we lose sight of the Document, Fields, etc. interaction then we might build something unusable.
 
Troy asked - What should the new API look like, and what should it support?
To me, this starts as a question about which version of the Framework because if we are rewriting the API then we should make full use of the version of the Framework we are targeting.... 2.0 means at most generics in the API, 3.5 affords Linq, lambdas etc.... and 4? Does 4 give us anything to use in a public API, probably not...... but I bow to anyone's superior knowledge.
 
Do we think there is a performance benefit of targeting .NET4?
 
Ciaran

Troy Howard

Nov 12, 2010, 6:58:31 AM
to luc...@googlegroups.com
I'd say 3.5 is appropriate. 4.0 doesn't provide much that we can
use... Well, I'm pretty excited about code contracts, and the
BigInteger type might be handy. It takes away the "eventually the
system can't handle your level of scale" issue. This is a constant
problem for us at my day job, as we are constantly handling very large
volumes of data (anywhere from 1 terabyte up to about 1/2 a petabyte
of documents and email, and we generally have numerous jobs of that
size going on at once, which turn around in less than a month).
Application level use of BigInteger along with Lucene's VarInt storage
would be a nice combination.

Anyhow, I couldn't go back to 2.0 and I don't think that doing so
would be in anyone's best interest. Also, (and this is becoming a
convenient excuse) Lucene.Net already covers legacy framework
compatibility.

Regarding 2.9.2 vs 3.0.2.... If we can get a working build of 3.0.2
going, well.. step one is to make sure Lucene.Net gets that build into
their hands. If we can share a common implementation of 3.0.2 to work
from then I think that would be a better place to start. If we're
looking at many months before that's ready for use... It's a toss up.
I do like the idea of having something solid and well field-tested to
base testing and comparison on (ie 2.9.2)... so I'm leaning in that
direction at this point, because I think it will be a while before
Lucene.Net 3.0.2 is really fully fleshed out. My assumption is that a
release made under deadline pressure is going to be a weak release and
probably ship with bugs or at least reduced performance.

That said, I'm still not totally opposed to simply taking the latest
build of Java Lucene (3.x or 4.0) and extracting/porting the
interfaces from that and just manually implementing everything to
fulfill those contracts. It would take a long time, but really, I
think it might end up being a more healthy process than attempting to
improve and refactor the Lucene.Net code.

Prescott -- Did you say you were able to compile Aaron's port? I get
massive issues with the code that need manual porting with all
versions. Mostly related to uses of generics. Maybe I'm missing
something?

Thanks,
Troy

Prescott Nasser

Nov 12, 2010, 7:03:42 AM
to luc...@googlegroups.com
I didn't do anything but download Aaron's 3.0.2 post-processed code and hit build.
 
Using 4.0 would be nice to give me an excuse to really start playing with the new features. Otherwise I think 3.5sp1 is the minimum we should bother looking at.

~Prescott 

Troy Howard

Nov 12, 2010, 7:20:05 AM
to luc...@googlegroups.com
Prescott,

Interesting...

I made a local clone from the HG repo at
(http://hg.slace.biz/lucene-porting) and, using a plain-jane install of
Visual Studio 2010 Express, opened the solution file, hit build and
got the errors listed below (21 errors, 4 warnings)... After
digging into the errors a bit, fixing them as I went, I noticed they
chained into other errors, and in total, there were quite a few errors
to address littered around the code with some very obvious syntax
issues, incorrect use of generics, missing classes, etc...

It's ~4:20am here in Portland, Oregon, so I'm not about to try to fix
all of them before going to bed.. ;)

Are we looking at the same code base? Is there another revision of it
available that I'm unaware of?

Thanks,
Troy


Build output (project: Lucene.Net). All files are under
C:\dev\lucene\lucene-porting\JavaToVbCSharpConverter\Lucene.Net 3.0.2\Lucene.Net 3.0.2 - AfterSomeDumpPostProcess\
Each entry is listed as: description (file, line, column).

Error 1: Invalid token '(' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 68, 27)
Error 2: ; expected (Analysis\StopAnalyzer.cs, 68, 56)
Error 3: ; expected (Analysis\StopAnalyzer.cs, 68, 95)
Error 4: Invalid token '=' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 70, 18)
Error 5: Invalid token ';' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 70, 29)
Error 6: Invalid token '=' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 71, 28)
Error 7: Invalid token '(' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 71, 82)
Error 8: Invalid token ')' in class, struct, or interface member declaration (Analysis\StopAnalyzer.cs, 71, 95)
Error 9: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 78, 11)
Error 10: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 88, 11)
Error 11: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 95, 20)
Error 12: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 97, 14)
Error 13: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 97, 55)
Error 14: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 106, 20)
Error 15: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 111, 19)
Error 16: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 112, 26)
Error 17: Expected class, delegate, enum, interface, or struct (Analysis\StopAnalyzer.cs, 113, 26)
Error 18: Type or namespace definition, or end-of-file expected (Analysis\StopAnalyzer.cs, 115, 3)
Warning 19: Type parameter 'K' has the same name as the type parameter from outer type 'Lucene.Net.Util.Cache.Cache<K,V>' (Util\Cache\Cache.cs, 32, 37)
Warning 20: Type parameter 'V' has the same name as the type parameter from outer type 'Lucene.Net.Util.Cache.Cache<K,V>' (Util\Cache\Cache.cs, 32, 40)
Warning 21: Type parameter 'K' has the same name as the type parameter from outer type 'Lucene.Net.Util.Cache.SimpleMapCache<K,V>' (Util\Cache\SimpleMapCache.cs, 75, 45)
Warning 22: Type parameter 'V' has the same name as the type parameter from outer type 'Lucene.Net.Util.Cache.SimpleMapCache<K,V>' (Util\Cache\SimpleMapCache.cs, 75, 48)
Error 23: The namespace '<global namespace>' already contains a definition for 'SavedStreams' (Analysis\StopAnalyzer.cs, 101, 18)
Error 24: Elements defined in a namespace cannot be explicitly declared as private, protected, or protected internal (Analysis\Standard\StandardAnalyzer.cs, 109, 25)
Error 25: Elements defined in a namespace cannot be explicitly declared as private, protected, or protected internal (Analysis\Standard\StandardAnalyzer.cs, 109, 25)

Prescott Nasser

Nov 12, 2010, 7:25:26 AM
to luc...@googlegroups.com
Apologies, I downloaded the uploaded file attached to the upgrade to VS2010 by Wyatt (https://issues.apache.org/jira/browse/LUCENENET-377)
 
I *thought* this was one of the 2.9.2 pre-releases, reorgs for vs2010. I could be off.

Troy Howard

Nov 12, 2010, 7:33:07 AM
to luc...@googlegroups.com
Ok, that makes more sense.. that's version 2.9.2.

I was under the impression you were successfully building a 3.0.2
port. If you were, well, I think we would all be very happy to have
that codebase... ;)

Thanks,
Troy

Gaurave Sehgal

Nov 12, 2010, 10:45:14 AM
to luc...@googlegroups.com
I agree that 3.5 is appropriate as of now, but 4.0 is really useful if we want to use the TPL. Also, by the time this project is released, 4.0 will be in much wider use, and we would then need to upgrade to 4.0 anyway. All such upgrades are just syntactic updates. Since we are starting this project from scratch, I think we should go with 4.0 regardless of whether its functionality is useful to us right now.

Thanks
Gaurave

Christopher Currens

Nov 12, 2010, 11:57:52 AM
to Lucere
In regards to .Net 3.5 vs 4, I agree with Gaurave. While I don't have
a whole lot of experience with .net 4, there are a few things I think
would be especially helpful with multi-threading/parallel processing
and the associated thread-safety. The TPL is an obvious benefit, but
even if we decided not to use the TPL, the new
System.Collections.Concurrent may be worth looking at for thread
safety. Thread-safe collections like these could be an easy way to
deal with implementing multisearchers and the like. Again, though,
since I haven't used .net 4 much, I can't say 100% how useful these
things are or if there are any other framework improvements we could
utilize.
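
As a rough illustration of the idea, here is a sketch that fans a query out to several searchers with Parallel.ForEach and gathers hits in a ConcurrentBag<T>, both of which ship with .NET 4. Modelling a searcher as a plain Func is an assumption made to keep the example self-contained; it is not how Lucene's MultiSearcher is written.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSearch
{
    // Each "searcher" is modelled as a function from query text to hit titles.
    public static List<string> SearchAll(
        IEnumerable<Func<string, IEnumerable<string>>> searchers,
        string query)
    {
        // ConcurrentBag allows adds from many threads without explicit locks.
        var results = new ConcurrentBag<string>();

        Parallel.ForEach(searchers, searcher =>
        {
            foreach (string hit in searcher(query))
                results.Add(hit);
        });

        // Bag order is unspecified, so sort to get a stable result.
        return results.OrderBy(s => s, StringComparer.Ordinal).ToList();
    }
}
```

A real multi-searcher would merge by score rather than by title, but the thread-safety pattern would be the same.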

-Christopher

Peter Mateja

Nov 12, 2010, 11:58:56 AM
to luc...@googlegroups.com
I'd also throw my vote behind starting with .Net 4.0.  One aspect of this project that I'm personally (not really professionally) interested in is to remain functional on the mono stack.  While the most recent stable branch of mono (2.6.x) only supports up to 3.5, the current trunk state (2.8.x) of mono is now supporting .Net 4.0.  I'd think by the time Lucere is in a decent state this should be close to being stable.

Peter Mateja
peter....@gmail.com

Troy Howard

Nov 12, 2010, 2:18:42 PM
to luc...@googlegroups.com
Ok, lots of compelling reasons to move forward, not too many
compelling reasons not to. 4.0 sounds good.

Thanks,
Troy
