import java.util.stream.IntStream;
import java.util.stream.Stream;

public interface Segments {
  /**
   * Returns a {@code Stream} of the {@code CharSequence}s for all of the segments in the source
   * sequence. Iteration starts at the beginning of the sequence and moves forwards to the end.
   *
   * @return a {@code Stream} of the {@code CharSequence}s of all segments in the source sequence
   */
  Stream<CharSequence> subSequences();

  /**
   * Returns the segment that contains index {@code i}. Containment is inclusive of the start index
   * and exclusive of the limit index.
   *
   * <p>Specifically, the containing segment is defined as the segment with start {@code s} and
   * limit {@code l} such that {@code s ≤ i < l}.</p>
   *
   * @param i index in the input {@code CharSequence} to the {@code Segmenter}
   * @return the segment that contains index {@code i}, which may also be the segment that starts
   *     at {@code i}
   * @throws IllegalArgumentException if {@code i} is less than 0 or greater than the length of the
   *     input {@code CharSequence} to the {@code Segmenter}
   */
  Segment segmentAt(int i);

  /**
   * Returns a {@code Stream} of all {@code Segment}s in the source sequence. Iteration starts with
   * the first segment and moves forwards to the end of the sequence.
   *
   * <p>This is equivalent to {@code segmentsFrom(0)}.</p>
   *
   * @return a {@code Stream} of all {@code Segment}s in the source sequence
   */
  Stream<Segment> segments();

  /**
   * Returns a {@code Stream} of all {@code Segment}s in the source sequence where all segment limits
   * {@code l} satisfy {@code i < l}. Iteration moves forwards.
   *
   * <p>This means that the first segment in the stream is the same as what is returned by
   * {@code segmentAt(i)}.</p>
   *
   * <p>The word "from" is used here to mean "at or after", with the semantics of "at" for a
   * {@code Segment} defined by {@link #segmentAt(int)}. We cannot describe all of the segments as
   * being "after" {@code i}, since the first segment might contain {@code i} in the middle, meaning
   * that in the forward direction its start position precedes {@code i}.</p>
   *
   * <p>{@code segmentsFrom} and {@link #segmentsBefore(int)} create a partitioning of the space of
   * all {@code Segment}s.</p>
   *
   * @param i index in the input {@code CharSequence} to the {@code Segmenter}
   * @return a {@code Stream} of all {@code Segment}s at or after {@code i}
   */
  Stream<Segment> segmentsFrom(int i);

  /**
   * Returns whether offset {@code i} is a segmentation boundary. Throws an exception when
   * {@code i} is not a valid index position for the source sequence.
   *
   * @param i index in the input {@code CharSequence} to the {@code Segmenter}
   * @return {@code true} if offset {@code i} is a segmentation boundary, {@code false} otherwise
   * @throws IllegalArgumentException if {@code i} is less than 0 or greater than the length of the
   *     input {@code CharSequence} to the {@code Segmenter}
   */
  boolean isBoundary(int i);

  /**
   * Returns all segmentation boundaries, starting from the beginning and moving forwards.
   *
   * <p><b>Note:</b> {@code boundaries()} does not return the same elements as
   * {@code boundariesAfter(0)}. The difference follows from the strict inequality condition in
   * {@code boundariesAfter}, together with the fact that 0 is the first boundary returned from the
   * start of an input sequence.</p>
   *
   * @return an {@code IntStream} of all segmentation boundaries, starting at the first
   *     boundary with index 0, and moving forwards in the input sequence
   */
  IntStream boundaries();

  /**
   * Returns all segmentation boundaries after the provided index. Iteration moves forwards.
   *
   * @param i index in the input {@code CharSequence} to the {@code Segmenter}
   * @return an {@code IntStream} of all boundaries {@code b} such that {@code b > i}
   */
  IntStream boundariesAfter(int i);

  /**
   * Returns all segmentation boundaries on or before the provided index. Iteration moves backwards.
   *
   * <p>The phrase "back from" indicates two things: 1) the boundaries are "on or before" the
   * input index; 2) the direction of iteration is backwards (towards the beginning).
   * "On or before" means that the result set is every {@code b} where {@code b ≤ i}, which is a
   * weak inequality, whereas "before" alone might suggest the strict inequality {@code b < i}.</p>
   *
   * <p>{@code boundariesBackFrom} and {@link #boundariesAfter(int)} create a partitioning of the
   * space of all boundaries.</p>
   *
   * @param i index in the input {@code CharSequence} to the {@code Segmenter}
   * @return an {@code IntStream} of all boundaries {@code b} such that {@code b ≤ i}
   */
  IntStream boundariesBackFrom(int i);

  class Segment {
    public final int start;
    public final int limit;
    public final int ruleStatus = 0;

    /**
     * Returns the subsequence represented by this {@code Segment}.
     *
     * @return a new {@code CharSequence} object that is the subsequence represented by this
     *     {@code Segment}
     */
    public CharSequence getSubSequence() { ... }
  }
}
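
The inequalities in the Javadoc above can be checked against a small stand-alone sketch. It does not implement `Segments` or use a real `Segmenter`; the class name `BoundarySemanticsSketch`, the hard-coded boundary array, and the static helper methods are assumptions made purely for illustration. The sketch also treats an index equal to the input length as an error in `segmentAt`, since no segment satisfies `s ≤ i < l` there; the interface documentation does not say what should happen in that case.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class BoundarySemanticsSketch {
    // Hypothetical boundaries for an 11-character input such as "hello world",
    // giving the segments [0,5), [5,6) and [6,11).
    static final int[] BOUNDARIES = {0, 5, 6, 11};

    /** Returns {start, limit} of the segment satisfying start <= i < limit. */
    static int[] segmentAt(int i) {
        for (int k = 0; k + 1 < BOUNDARIES.length; k++) {
            if (BOUNDARIES[k] <= i && i < BOUNDARIES[k + 1]) {
                return new int[] {BOUNDARIES[k], BOUNDARIES[k + 1]};
            }
        }
        // Assumption of this sketch: any index with no containing segment is rejected.
        throw new IllegalArgumentException("no segment contains index " + i);
    }

    /** All boundaries b with b > i, in ascending order. */
    static IntStream boundariesAfter(int i) {
        return Arrays.stream(BOUNDARIES).filter(b -> b > i);
    }

    /** All boundaries b with b <= i, in descending order. */
    static IntStream boundariesBackFrom(int i) {
        int n = BOUNDARIES.length;
        // Walk the array from its last element to its first.
        return IntStream.rangeClosed(1, n).map(k -> BOUNDARIES[n - k]).filter(b -> b <= i);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(segmentAt(5)));                    // [5, 6]
        System.out.println(Arrays.toString(boundariesAfter(0).toArray()));    // [5, 6, 11]
        System.out.println(Arrays.toString(boundariesBackFrom(6).toArray())); // [6, 5, 0]
    }
}
```

Index 5 is both the limit of the first segment and the start of the second, and the inclusive-start rule assigns it to the second. Boundary 0 is absent from `boundariesAfter(0)` because of the strict inequality, and for any index the two boundary streams together cover every boundary exactly once.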