Segments.joinWithNext(int) changes code ids


krsk...@gmail.com

Aug 30, 2022, 4:49:43 PM
to okapi-devel
While debugging an Okapi application, I noticed that Segments.joinWithNext(int) (and probably all the join methods) changes the code id of an isolated closing code whose other half exists in another segment.

Let's say we have this XML document:
```
<run1>
For <run2>example</run2>:
Hello, <run3>Tom and Mary</run3>
</run1>
```

We run a segmentation program that splits this into:
```
<run1>
For <run2>example</run2>:
```
and
```
Hello, <run3>Tom and Mary</run3>
</run1>
```

At this point, the code id for <run1> and </run1> is 1.

If joinWithNext(0) is applied to this Segments object, the code id for </run1> gets bumped to 4.

Is this by design?

Kuro

jimbo

Sep 1, 2022, 11:33:20 AM
to okapi...@googlegroups.com, krsk...@gmail.com

Hi Kuro,

To review what we discussed in the Okapi meeting: as originally implemented, the various join methods would renumber inline codes, as well as destroy any properties associated with the Segment. This is probably a bug. To that end, I have recently added methods like "joinAll(boolean keepCodeIds)" that can be used to preserve the original ids. I missed "joinWithNext", which doesn't provide the keepCodeIds option.

I will submit a PR that makes sure that all methods in the Segments class have an option to preserve the original code ids.
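
For readers following along, the difference between renumbering and preserving ids can be sketched with a toy model in plain Java. This does not use Okapi's classes: `JoinSketch`, `Code`, `join`, and the `isolated` flag are hypothetical stand-ins, and the renumbering rule (an isolated code always receives a fresh id) is only an assumption that happens to reproduce the 1 to 4 change reported in this thread. It is not Okapi's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {
    enum Kind { OPENING, CLOSING }

    /** Toy inline code (hypothetical, not Okapi's Code class). */
    static final class Code {
        final Kind kind;
        final int id;
        final boolean isolated; // true if the other half lives in another segment

        Code(Kind kind, int id, boolean isolated) {
            this.kind = kind;
            this.id = id;
            this.isolated = isolated;
        }
    }

    /** Joins two segments' codes; renumbers them unless keepCodeIds is true. */
    static List<Code> join(List<Code> first, List<Code> second, boolean keepCodeIds) {
        List<Code> joined = new ArrayList<>(first);
        joined.addAll(second);
        if (keepCodeIds) {
            return joined; // original ids survive the join
        }
        // Assumed renumbering rule: paired codes share a new id,
        // isolated codes are each given the next free id.
        List<Code> renumbered = new ArrayList<>();
        Map<Integer, Integer> newIdOfOpening = new HashMap<>();
        int nextId = 1;
        for (Code c : joined) {
            int newId;
            if (c.kind == Kind.CLOSING && !c.isolated && newIdOfOpening.containsKey(c.id)) {
                newId = newIdOfOpening.get(c.id);
            } else {
                newId = nextId++;
                if (c.kind == Kind.OPENING && !c.isolated) {
                    newIdOfOpening.put(c.id, newId);
                }
            }
            renumbered.add(new Code(c.kind, newId, c.isolated));
        }
        return renumbered;
    }

    /** Builds the two segments from the example and returns the id of </run1> after joining. */
    static int closingRun1Id(boolean keepCodeIds) {
        List<Code> seg0 = List.of(
                new Code(Kind.OPENING, 1, true),    // <run1>, closed in the next segment
                new Code(Kind.OPENING, 2, false),   // <run2>
                new Code(Kind.CLOSING, 2, false));  // </run2>
        List<Code> seg1 = List.of(
                new Code(Kind.OPENING, 3, false),   // <run3>
                new Code(Kind.CLOSING, 3, false),   // </run3>
                new Code(Kind.CLOSING, 1, true));   // </run1>, opened in the previous segment
        List<Code> joined = join(seg0, seg1, keepCodeIds);
        return joined.get(joined.size() - 1).id;
    }

    public static void main(String[] args) {
        System.out.println("renumbered: " + closingRun1Id(false)); // prints "renumbered: 4"
        System.out.println("preserved: " + closingRun1Id(true));   // prints "preserved: 1"
    }
}
```

In this toy model, the `keepCodeIds` flag simply skips the renumbering pass, so <run1> and </run1> still share id 1 after the join.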

thanks!

Jim


krsk...@gmail.com

Sep 1, 2022, 11:45:30 AM
to okapi-devel
Thank you for the clarification, Jim.
Although I volunteered to add a test case, I am now thinking it's easier and quicker for you to add one. Let me know if I should come up with a test case.
