I was imagining this:
Splitter splitter=Splitter.on(',');
List<String> values=splitter.split(line);
String name=values.get(3);
String postcode=values.get(10);
etc....
Instead I get an Iterable back from split. At first glance, this seems
inconvenient when splitting tabular data, where each column index has
a specific meaning.
What is the recommended way to get a value with a certain index out of
the returned Iterable?
Looking round Guava, I see there is Iterables.get or
Iterables.toArray. So perhaps something like this:
String[] values=Iterables.toArray(splitter.split(line), String.class);
String name=values[3];
String postcode=values[10];
Is this the best I can do using the Guava splitter, or am I missing
something?
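For reference, index access on a plain Iterable comes down to walking its iterator, which is roughly what Iterables.get has to do. Below is a minimal JDK-only sketch of that idea. The class name, the elementAt helper and the sample data are hypothetical, and Arrays.asList stands in for splitter.split(line) so the example runs without Guava.

```java
import java.util.Arrays;
import java.util.Iterator;

class IterableIndexSketch {
    // Hypothetical helper: walks the Iterable until it reaches the requested index.
    static <T> T elementAt(Iterable<T> items, int index) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        Iterator<T> it = items.iterator();
        // Discard the elements that come before the one we want.
        for (int i = 0; i < index && it.hasNext(); i++) {
            it.next();
        }
        if (!it.hasNext()) {
            throw new IndexOutOfBoundsException("no element at index " + index);
        }
        return it.next();
    }

    public static void main(String[] args) {
        // Stand-in for splitter.split(line); an assumption for illustration.
        Iterable<String> fields = Arrays.asList("a", "b", "c", "d");
        System.out.println(elementAt(fields, 3));
    }
}
```

Each lookup restarts from the beginning, so reading several columns this way repeats the walk; copying to a List or array once is likely the better fit when more than one or two columns are needed.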
List<String> values = ImmutableList.copyOf(splitter.split(line));
That's better than what I was going to use, but still not great.
Could splitToList() return an ArrayList<String> ? No problems with
circular dependencies there?
List<String> values = splitter.split(line);
But why is it not great?
--
Johan Van den Neste
> So we have
> List<String> values = ImmutableList.copyOf(splitter.split(line));
> That's better than what I was going to use, but still not great.
> Could splitToList() return an ArrayList<String> ? No problems with
> circular dependencies there?
I don't like the idea of adding splitToList() / splitToArrayList() /
splitToImmutableList() to the Splitter itself. What would then stop us
from adding splitToSet(), splitToImmutableSet(), splitToArray()? I
think that while it would help readability in some cases, it would
also "pollute" the API and make it harder to learn.
Kent Beck touches on this subject in "Implementation Patterns" (pages
87-88 - sections: "Conversion", "Conversion method", "Conversion
Constructor"). I agree with him. Some quotes:
- "It’s not worth introducing a new dependency just to have a
convenient expression of conversion."
- "conversion methods become unwieldy when there are an unbounded
number of potential conversions"
- "These disadvantages lead me to use conversion methods sparingly and
only in situations where I am converting to objects of similar type."
(great book BTW)
I think that, unless we could get some kind of FluentIterable (which
does not seem likely, due to circular dependency requirements), an
Iterable is the perfect return type:
- you may use it "as is" if you simply want to iterate on the
splitting result
- you may dump it in the collection of your choice (whether ArrayList
or some custom collection). It may not read as nicely, but it's
flexible
- unless you decide you want to put the result in a Collection, the
objects are not allocated (as would be the case if, for example, the
Splitter returned an array - like String.split()).
For now, I would do:
List<String> splitList = Lists.newArrayList(splitter.split(line));
or, with static imports
List<String> splitList = newArrayList(splitter.split(line));
If I see myself using it in many places, I guess I could encapsulate
it in some kind of static method.
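A rough sketch of what such a static method might look like, written against the JDK only so it runs on its own. The class name, toList and the split stand-in are hypothetical; with Guava the body of toList would simply be the Lists.newArrayList call shown above, and the Iterable would come from splitter.split(line).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitHelperSketch {
    // Hypothetical helper: copies any Iterable (for example the result of
    // splitter.split(line)) into a new ArrayList so columns can be read by index.
    static List<String> toList(Iterable<String> parts) {
        List<String> result = new ArrayList<>();
        for (String part : parts) {
            result.add(part);
        }
        return result;
    }

    // Stand-in for splitter.split(line); an assumption so the sketch has no
    // Guava dependency. The -1 limit keeps trailing empty fields.
    static Iterable<String> split(String line) {
        return Arrays.asList(line.split(",", -1));
    }

    public static void main(String[] args) {
        List<String> values = toList(split("Smith,London,N1 9GU"));
        System.out.println(values.get(2));
    }
}
```

Call sites then read as one line per row, which keeps the copying detail out of the parsing code.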
- Etienne
On Mar 19, 5:49 pm, Nikolas Everett <nik9...@gmail.com> wrote:
>But why is it not great?
The code above is dealing with the *how* (the technical details of
copying data around between different formats) as well as the *what*
(splitting comma delimited text into separate fields). Code that deals
only with *what* you are trying to achieve is easier to understand and
to maintain.
> I prefer it to return an Iterable. That way splitter.split(stringBuffer) doesn't have a chance of creating a one squidillion entry list.
>an Iterable is the perfect return type:
> - you may use it "as is" if you simply want to iterate on the splitting result
> - you may dump it in the collection of your choice (whether ArrayList or some custom collection). It may not read as nicely, but it's flexible
> - unless you decide you want to put the result in a Collection, the objects are not allocated (as would be the case if, for example, the Splitter returned an array - like String.split()).
Those are all valid points. The fact that the Splitter has the option
of returning an Iterable is a big improvement over String.split.
Iterable is certainly the most flexible type. As always, there is a
trade off between flexibility and convenience.
There are two very different use cases for a String splitter.
The first is where you have data of an unknown length. For instance,
separating the text on a web page into distinct words. Iterable is
great for this use case.
The second use case is where the data is of a fixed length and a known
format. For example, tabular text. In this case, Iterable is not
ideal. The data is of a known length, so memory allocation is not an
issue. The extra task of converting the Iterable to a List every time
you call split doubles the number of calls required to use the API.
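To make the fixed-format case concrete, here is a small sketch of reading a row by named column index. Everything in it is an assumption for illustration: the class name, the column positions, the three-column layout and the use of String.split as a stand-in for the Guava splitter.

```java
import java.util.Arrays;
import java.util.List;

class FixedColumnsSketch {
    // Hypothetical column layout; real data would define its own positions.
    static final int NAME = 0;
    static final int CITY = 1;
    static final int POSTCODE = 2;
    static final int EXPECTED_COLUMNS = 3;

    // Splits one row and checks that it has the expected number of columns.
    static List<String> parseRow(String line) {
        // Stand-in for copying splitter.split(line) into a List.
        List<String> values = Arrays.asList(line.split(",", -1));
        if (values.size() != EXPECTED_COLUMNS) {
            throw new IllegalArgumentException(
                "expected " + EXPECTED_COLUMNS + " columns but found " + values.size());
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> row = parseRow("Smith,London,N1 9GU");
        System.out.println(row.get(NAME) + " lives in " + row.get(CITY)
            + ", postcode " + row.get(POSTCODE));
    }
}
```

Because the row length is known up front, the column count can be validated once and every later access is a plain get with a named index.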
>I don't like the idea of adding splitToList() / splitToArrayList() /
>splitToImmutableList() to the Splitter itself. What would then stop us
>from adding splitToSet(), splitToImmutableSet(), splitToArray()? I
>think that while it would help readability in some cases, it would
>also "pollute" the API and make it harder to learn.
Yes. Every member of the API should "pull its weight". Adding ALL
those methods would indeed be bad design. I would argue that
supporting a whole other group of users by providing one extra
function is worth it. By only providing an Iterable, you are ignoring
this second use case.
As a concrete suggestion, I would add a function
List<String> splitToList()
I have used this approach. It has some maintainability problems. What
happens if you comment out or remove one of the calls to itr.next()?
The rest of your data is now going into the wrong columns. I've found
using set column numbers is clearer.
It is also not ideal for data with a large number of columns where you
are only using 2 or 3 of the columns. You need a lot of calls to
itr.next() just to get at the columns you actually want.
The advantage of this method is that it is easier to change the data
format itself by inserting or removing columns (where this is
possible / desired).
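For anyone following along, this is roughly what the positional style looks like when only two columns out of five are wanted. It is a sketch under assumptions: the class name, the skip helper, the five-column layout and the sample row are all made up for illustration, and a plain List iterator stands in for splitter.split(line).iterator().

```java
import java.util.Arrays;
import java.util.Iterator;

class IteratorColumnsSketch {
    // Hypothetical helper: discards up to count elements from the iterator.
    static void skip(Iterator<?> it, int count) {
        for (int i = 0; i < count && it.hasNext(); i++) {
            it.next();
        }
    }

    // Reads name (column 1) and postcode (column 4) purely by position.
    // Removing or reordering any call below silently shifts later columns.
    static String[] readNameAndPostcode(Iterator<String> itr) {
        skip(itr, 1);                 // column 0 is not used
        String name = itr.next();     // column 1
        skip(itr, 2);                 // columns 2 and 3 are not used
        String postcode = itr.next(); // column 4
        return new String[] {name, postcode};
    }

    public static void main(String[] args) {
        // Stand-in for splitter.split(line).iterator(); sample data is invented.
        Iterator<String> itr =
            Arrays.asList("17", "Smith", "x", "y", "N1 9GU").iterator();
        String[] result = readNameAndPostcode(itr);
        System.out.println(result[0] + " / " + result[1]);
    }
}
```

The skip counts encode the column layout implicitly, which is the maintainability concern described above.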
Thanks. That's perhaps a bit nicer.