A bit lost

Ciaran Roarty

Nov 17, 2010, 3:46:34 PM

to lucer...@googlegroups.com

Hi

I fully admit to being a bit lost about the approach at the moment so please
accept my apologies in advance.

I've got the source repo down to my machine and I can see that a lot of work
has been done so far; but currently I can't build the solution because the
tests don't build. Is this the current state of play? I don't understand
where I can 'jump in' and be of use.

As it is, I've decided to go and have a look at the Java source and see if I
can match up what's been done.

I've unfortunately had a stomach bug this week and this is the first chance
I've had to look at Lucere in anger.

Ciaran

Troy Howard

Nov 17, 2010, 4:44:00 PM

to lucer...@googlegroups.com

Generally speaking, we should try to avoid checking in incomplete code
that doesn't compile.

In this case though, Christopher asked me if he should check in what
he had (which had broken unit tests) or wait until he had time to fix
them. It's my opinion that checking in earlier is better than waiting.
This way others can help you to accomplish your goals. I'd rather have
broken code to fix than no code at all.

Ideally, Christopher would have posted a message to the list to the
effect of "I finished up the interfaces in lucere.definition and
started in on the unit tests for the same but haven't had time to
complete the unit tests. I went ahead and checked this in, could
someone please finish up the unit tests?"....

That said, it's better to just omit from your check-in things that
aren't compiling. Failing that, you could set the build profile for the
project, or the individual file's build behaviour, to not compile. This way,
things that can build, do build, and things that don't, don't hold us
up.

An obvious place to jump in at this point would be to finish up those
broken tests. ;)

Beyond that, any interfaces that have not yet been defined or any unit
tests that have not yet been ported, are open to be claimed as a work
task.

The way to claim something is just to post publicly that you're going
to do XXX and start in on it. You should check-in anything complete as
soon as you can do that (without breaking other code, see previous).
If we end up with overlapping work, it's because someone didn't follow
this policy. Even still that's not the end of the world, just less
efficient. ;)

The longer you take to claim a piece of work, the more likely someone
else is to have started on it. The longer you take to commit your
work, the more likely someone else is to have already done it and
committed before you. That said, if someone has claimed something, it
doesn't make a lot of sense to start working on it unless that person
has gone unresponsive and an inordinate amount of time has passed.
It's also better to "unclaim" work earlier than later if you don't
have enough free time to work on it. At the moment, we have a large
body of willing developers, so sharing work, delegating, and
communicating well about what you can and can't do will be important
for us to leverage this support to its fullest.

Thanks,
Troy

> --
> You received this message because you are subscribed to the Google Groups "Lucere Development" group.
> To post to this group, send email to lucer...@googlegroups.com.
> To unsubscribe from this group, send email to lucere-dev+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/lucere-dev?hl=en.
>
>

Digy

Nov 17, 2010, 5:01:22 PM

to lucer...@googlegroups.com

> Christopher asked me if he should check in what he had.......
What do you use as a communication medium? I didn't see anything on the
mailing lists. I also don't see anything new in
https://hg01.codeplex.com/lucere

Is there anything I missed that I should be following?

DIGY

Ciaran Roarty

Nov 17, 2010, 5:02:34 PM

to lucer...@googlegroups.com

Troy

Thanks for that update - that makes a lot of sense. I am just getting to
grips with the project.

Ciaran

Troy Howard

Nov 17, 2010, 5:03:12 PM

to lucer...@googlegroups.com

Chris and I work for the same company. We spoke in person.

We should keep discussion like this on the mailing list. My apologies.

Thanks,
Troy

Ciaran Roarty

Nov 17, 2010, 6:40:55 PM

to lucer...@googlegroups.com

Ok, I've done work on TestDocument.cs and I've added Moq.dll but I don't
seem able to push the changes to Codeplex..... can anyone help? I'm using
Tortoise Hg.

Ciaran

Ciaran Roarty

Nov 17, 2010, 6:45:15 PM

to lucer...@googlegroups.com

The following is what I get when I try a hg push

pushing to https://hg01.codeplex.com/lucere
searching for changes
abort: push creates new remote heads on branch 'default'!
[command interrupted]

Troy Howard

Nov 17, 2010, 6:54:18 PM

to lucer...@googlegroups.com

For UI steps: First step would be to commit to your local repository.
Next, go to Synchronize and then select Push.

There may be some configuration you need to do to get it to know that
the codeplex site is where you want to push and to know your user
credential.

The error message you're seeing is because someone has committed since
you updated your local copy. To Mercurial this means that either you
need to do a pull/merge before pushing, or you should make a new
branch.

See:

https://developer.mozilla.org/en/Mercurial_FAQ

Under "How do I deal with "abort: push creates new remote heads!"?" Section

That gives full details and instructions, but those are for
commandline hg. For TortoiseHg, in short: if you feel confident doing
the merge, do a pull, then a merge, and then a push. Otherwise, if
you'd rather push a branch and have someone else merge it later,
select the "Push New Branch" checkbox.

Thanks,
Troy

Hakeem

Nov 17, 2010, 8:26:05 PM

to Lucere Development

I'm kind of in the same situation. Not sure where to get started as
I'm a Lucene n00b. I have the first edition of the In Action book that
I'm using to get up to speed. Do you guys know if there have been
significant changes to the core since that book came out in '04/'05? I
ran a comparison of the TOC with the 2nd ed. and didn't find
significant additions other than the case studies.

Thanks!

Hakeem

Nov 17, 2010, 9:58:22 PM

to Lucere Development

OK I just re-read Troy's notes from the *Plan of action* thread

Each layer will need to be worked on in series:
- Starting with the 3.0.2 Java code base, read the Java code and
create the appropriate .NET interfaces, enums, etc.
- Update existing test cases to use the new interfaces and to use
MbUnit framework
- Create new unit tests where needed (see Java code for new ones for
3.0.2, or just create them if you notice they are lacking coverage)

Since Lucene.NET is a direct port of the Java code and some people
might be more hands on with C# than Java (I'm talking about myself
here ;)), I was wondering if it would be much quicker to read the
Lucene.NET code and create interfaces etc. As I see it Lucene.NET only
has about 38 interfaces defined, too few in a library with so many
types. So we would still have a lot to work on even with C# code as
the source. I also see that Lucene.NET code is directly new-ing up
other classes so we have room there to plan for factories and other
design patterns. In essence, start with proven practices.

Also, I noticed that the namespaces/interfaces/classes in Lucere don't
have a 1-1 correspondence with what is in the Java library. Do we have
some doc somewhere that indicates what package/namespace in the source
is mapped to which namespace in the target?

Thanks!

Troy Howard

Nov 17, 2010, 10:46:46 PM

to lucer...@googlegroups.com

Unfortunately, working from Lucene.Net won't work, as we're targeting
Lucene 3.0.2 and Lucene.Net has not made it to that level of
compatibility yet.

Regarding the namespace mappings, I'll be publishing a list tonight
for that mapping, along with a system for claiming work and keeping
track of work to be done.

Regarding going from Java to .NET... This is a bit of an issue,
because there are some tricky conversion problems.

One of the things we're doing with this conversion is flattening
nested classes and making all classes public and not sealed with an
empty constructor (when we get to implementation).

There are some other subtleties to the migration process, such as
fixing up the weird naming conventions like "NumDocs" should be
"DocumentCount", and "Freq"->"Frequency", etc..

Another thing to note that may be confusing is scoping:

In Java, the default scoping is "package-private" which equates to
.NET's "internal". So if you saw a class like:

public class Foo
{
    int bar;
}

It would be necessary, when migrating, to expose bar via the interface
with get/set allowed. This is because another class, within the Lucene
package, could have read/write access to that member (and there are
instances in the code where this is taken advantage of).

However if that class was listed as:

public class Foo
{
    final int bar;
}

You should only expose a get, as the final keyword is equivalent of
"readonly" in C# for that usage.

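To make the scoping point concrete, here is a minimal Java sketch. The names Foo, FinalFoo and ScopeDemo are hypothetical, not taken from the Lucene source. It only shows that a field with no modifier can be written by any other class in the same package, while a final field can be read but never reassigned, which is why the first needs get/set on the .NET interface and the second needs only get.

```java
// Hypothetical stand-ins for Lucene classes; all live in the same (default) package.
class Foo {
    int bar; // package-private: readable and writable from anywhere in the package
}

class FinalFoo {
    final int bar; // package-private but final: assigned once in the constructor

    FinalFoo(int bar) {
        this.bar = bar;
    }
}

public class ScopeDemo {
    // Plays the role of "another class within the package" writing to Foo.bar directly.
    static int bump(Foo foo) {
        foo.bar = foo.bar + 1;
        return foo.bar;
    }

    public static void main(String[] args) {
        Foo foo = new Foo();
        System.out.println(bump(foo));           // direct write from outside Foo
        System.out.println(new FinalFoo(7).bar); // read is fine; "x.bar = 8" would not compile
    }
}
```
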
Another gotcha is nested classes. We should flatten nested class
hierarchies and expose those classes via public interfaces. One thing
that might throw you for a loop as a C# dev is this:

public class Foo
{
    public static class Bar
    {
        // ...
    }
}

You may think "oh, the nested class Bar is static, I don't need to
make an interface for that". Unfortunately the static keyword, in this
context in Java, doesn't mean what you might think it does. It simply
means the nested class is not tied to an instance of the containing
class, so it can be instantiated on its own. It's not static in the
sense of "you can't create an instance of it", as it would be in C#.
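
A small Java sketch of the difference, using hypothetical names (NestedDemo, Bar, Inner). A static nested class can be instantiated as often as you like without an outer object, whereas a non-static inner class needs an enclosing instance. Either way it is an ordinary instantiable type, so it still wants an interface on the .NET side.

```java
public class NestedDemo {
    private final String name;

    public NestedDemo(String name) {
        this.name = name;
    }

    // Static nested class: no hidden reference to a NestedDemo instance.
    public static class Bar {
        public String describe() {
            return "Bar";
        }
    }

    // Inner (non-static) class: every instance is bound to an enclosing NestedDemo.
    public class Inner {
        public String describe() {
            return "Inner of " + name;
        }
    }

    public static void main(String[] args) {
        // Two separate Bar objects, created without any NestedDemo instance.
        NestedDemo.Bar first = new NestedDemo.Bar();
        NestedDemo.Bar second = new NestedDemo.Bar();
        System.out.println(first != second);

        // An inner class instance must be created through an outer instance.
        NestedDemo outer = new NestedDemo("outer");
        NestedDemo.Inner inner = outer.new Inner();
        System.out.println(inner.describe());
    }
}
```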

There are a lot of scoping rules in Java that are different from .NET
scoping rules and keywords are reused with very different meanings in
different contexts. For example, "final" when used in a class
declaration means "sealed" but on a field in a class, means
"readonly".

Some other issues are Java annotations (the counterpart of .NET
attributes), anonymous classes, and which .NET BCL classes to use for
which Java library classes.

Some quick notes on that:

java.io.Reader <-> System.IO.TextReader .... BUT! .Net's TextReader
doesn't support Reset(), which is used in Java Lucene. We'll need to
make our own types.

java.lang.Number <-> System.ValueType *this is imperfect... but works

java.io.Serializable <-> System.Runtime.Serialization.ISerializable
*this only really maps to writeObject(..) from Java class. for
readObject support, that must be an implementation detail where a
specific constructor is on any serializable class.

java.util.Collection<T> <-> System.Collections.Generic.IList<T>
java.util.Vector<T> <-> System.Collections.Generic.IList<T>
java.util.Map<K, V> <-> System.Collections.Generic.IDictionary<TKey, TValue>

java.lang.Cloneable <-> System.ICloneable
* Deep copy or Shallow Copy? Who knows?
Brad Abrams (coauthor of Framework API Design Guidelines) discusses
this issue in a blog post here:

http://blogs.msdn.com/b/brada/archive/2003/04/09/49935.aspx

and basically says don't use this.. but it's all over Java Lucene.

java.lang.Comparable <-> System.IComparable<T>
java.util.Comparator<T> <-> System.Collections.Generic.IComparer<T>
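
On the java.io.Reader note above: this is a minimal Java sketch of the reset() behaviour that a .NET replacement type would have to reproduce. ReaderResetDemo and its helper methods are hypothetical names. It uses StringReader, whose reset() rewinds to the start of the string when no mark has been set; other Reader subclasses may not support reset() at all.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderResetDemo {
    // Reads up to n characters and returns them as a String.
    static String readChars(Reader reader, int n) throws IOException {
        char[] buffer = new char[n];
        int read = reader.read(buffer, 0, n);
        return read <= 0 ? "" : new String(buffer, 0, read);
    }

    // Reads a prefix, rewinds with reset(), then reads the same prefix again.
    static String[] readTwice(String text, int n) {
        try {
            Reader reader = new StringReader(text);
            String first = readChars(reader, n);
            reader.reset(); // no mark was set, so StringReader goes back to the start
            String second = readChars(reader, n);
            return new String[] { first, second };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = readTwice("lucene", 3);
        System.out.println(result[0] + " " + result[1]);
    }
}
```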

Some other notes.. use int/string/object instead of Int32/String/Object.

If there is a method GetXXX which takes no parameters, make it a
property with a getter only. If there is a method with exactly the
same name but it's SetXXX and takes a parameter that is the same type
as returned by GetXXX (and the Set method has no return value), add a
setter to the property. However, if there's a SetXXX method that takes
some other type, include the SetXXX method in the interface.
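
As a rough illustration of those three cases, here is a hypothetical Java class (FieldSketch and its members are made up for this example, not copied from Lucene). The comments say which shape each member would take on the .NET interface under the rule above.

```java
public class FieldSketch {
    private float boost = 1.0f;
    private final String name = "title";

    // getBoost() + setBoost(float): same type, void setter
    // -> one property with both get and set.
    public float getBoost() {
        return boost;
    }

    public void setBoost(float boost) {
        this.boost = boost;
    }

    // getName() with no matching setter -> property with get only.
    public String getName() {
        return name;
    }

    // setBoost(String) takes a different type than getBoost() returns
    // -> stays a SetBoost method on the interface.
    public void setBoost(String text) {
        this.boost = Float.parseFloat(text);
    }

    public static void main(String[] args) {
        FieldSketch field = new FieldSketch();
        field.setBoost("2.5");
        System.out.println(field.getName() + " " + field.getBoost());
    }
}
```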

If there is a method that takes no parameters and returns a value,
it should probably be converted to a property getter even if the method
name doesn't start with "Get". There are of course exceptions to
this. For example, boolean next() shouldn't be a property. It probably
means the type is an enumerator, so in the interface we should
implement IEnumerator<T>, which has bool MoveNext(). It also has
T Current { get; }, which would replace whatever method on the Java
enumerator held the current value. However, there might be more than
one value that is current.

In that case you should make a new type for that enumeration...

Example:

Java:
public class Foo
{
    // returns the current Bar
    public Bar getBar()
    {
        // ...
    }

    // returns the current Baz
    public Baz getBaz()
    {
        // ...
    }

    // moves the enumeration forward or returns false
    public boolean next()
    {
        // ...
    }
}

.NET:

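public class Foo : IEnumerator<BarBaz>
{
    // MoveNext(), Current, Reset() and Dispose() go here
}

// Holds both "current" values exposed by the Java enumerator
public class BarBaz
{
    public Bar Bar { get; }
    public Baz Baz { get; }
}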

So... it can be a bit tricky to map things correctly and won't be exactly 1:1.

Thanks,
Troy

Sergey Mirvoda

Nov 18, 2010, 1:18:04 AM

to lucer...@googlegroups.com

The most difficult task for me is handling exceptions.

The Java exception hierarchy is huge.

These articles helped me a lot:

http://pclt.cis.yale.edu/pclt/exceptions.htm

http://www.artima.com/designtechniques/exceptions.html

--Regards, Sergey Mirvoda

Digy

Nov 18, 2010, 5:03:44 AM

to lucer...@googlegroups.com

> java.io.Serializable <-> System.Runtime.Serialization.ISerializable
> *this only really maps to writeObject(..) from Java class. for readObject
> support, that must be an implementation detail where a
> specific constructor is on any serializable class.

There may be no need for a specific constructor for each class.
For example:

[Serializable]
public class XXX : System.Runtime.Serialization.IObjectReference
{
    // .NET's equivalent of Java's readResolve()
    public object GetRealObject(System.Runtime.Serialization.StreamingContext context)
    {
        // Return the object that should replace the deserialized instance.
        return this;
    }
}

DIGY
