Optional delimited fields

jimbo68

Feb 2, 2012, 2:01:56 PM
to beanio-users
I have an odd requirement: I need to create a CSV file in which every
field is optional. Each field carries a datatype identifier that wraps
the actual data in parentheses. In the example below, the data for
ADDRESS2 in the 2nd line is null, so the requirement is to leave that
field out entirely. Is there a setting or something else in BeanIO that
would accommodate this?

FIRSTNAME(TEST1),LASTNAME(TEST2),ADDRESS1(TEST3),ADDRESS2(TEST4)
FIRSTNAME(TEST1),LASTNAME(TEST2),ADDRESS1(TEST3)

Kevin

Feb 2, 2012, 7:53:12 PM
to beanio-users
Hi jimbo,

With BeanIO, a CSV or delimited stream must have positionally fixed
fields to correctly map them to a field definition.

So using your example, if last name is not present, but a separator
was still included as shown below, it might be possible to use BeanIO
and simply create your own type handler to parse the value inside the
().

FIRSTNAME(TEST1),,ADDRESS1(TEST3)

I'm guessing this isn't the case though, in which case there is
considerably more work to extend BeanIO in order to support this
format.
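
For the case where the empty positions are preserved, the unwrapping
step of such a type handler could look roughly like the sketch below.
TaggedField, unwrap, and wrap are hypothetical names, and this is plain
Java rather than an actual BeanIO type handler; wiring it into BeanIO is
left out, and it assumes parentheses never appear inside a value.

```java
import java.util.Optional;

// Hypothetical helper, not part of BeanIO: handles "NAME(value)" tokens.
final class TaggedField {
    private TaggedField() {}

    // Returns the text between the first "(" and the trailing ")",
    // or empty if the token is null, blank, or not shaped like NAME(value).
    static Optional<String> unwrap(String token) {
        if (token == null || token.isEmpty()) {
            return Optional.empty();
        }
        int open = token.indexOf('(');
        if (open < 1 || !token.endsWith(")")) {
            return Optional.empty();
        }
        return Optional.of(token.substring(open + 1, token.length() - 1));
    }

    // Inverse operation: wraps a value in its tag.
    static String wrap(String tag, String value) {
        return tag + "(" + value + ")";
    }
}
```

An empty position (the ",," in the example above) comes back as an empty
Optional, which the caller can map to null.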

Out of curiosity, is there any standardized specification for this
file format? Is it really CSV or just comma delimited? How are
parentheses escaped if they appear in a value? Are other characters
escaped? What industry are you in that uses this format?

Thanks,
Kevin

jimbo68

Feb 6, 2012, 9:43:47 AM
to beanio-users
Hey Kevin,

Thanks for your response. It is actually just a comma delimited file
that I have to create. The file is to be imported into a vendor
application we have (which I know very little about). I imagine
reading the file would be quite difficult using BeanIO but luckily we
don't have to do that. I don't think it's an industry standard and I
don't know why the vendor didn't just use XML although I get the
feeling that the product is old. I don't think escaping the
parentheses is necessary due to the data types and upstream
validation. Also, I was giving an address example for simplicity but
the actual data is financial.

Is there an implementation or other solution that I can look at to not
write the delimiters for null fields?

Thanks!

Jim

Kevin

Feb 6, 2012, 12:54:42 PM
to beanio-users
Hi Jim,

This may be a bit of a hack, but it could work (I didn't test it
myself), or at least get you on the right track... A BeanIO delimited
stream format writes records using a String array, so the position of
a field in the mapping file will always dictate the position of the
field in the array object passed to the writer. So with that in mind,
you can probably override the default writer with something like the
following:

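import java.io.IOException;
import java.io.Writer;

import org.beanio.stream.RecordWriter;

public class TagDelimitedWriter implements RecordWriter {

    private static final char delim = ',';
    // Tag names, in the same order as the fields in the mapping file
    private static final String [] field = {
        "FIRSTNAME",
        "LASTNAME",
        "ADDRESS1",
        "ADDRESS2"
    };

    private Writer out;

    public TagDelimitedWriter(Writer out) {
        this.out = out;
    }

    // The delimited format passes each record as a String array
    public void write(Object recordObject) throws IOException {
        int fieldCount = 0;
        String [] value = (String[]) recordObject;
        // Guard against a record with fewer values than tag names
        for (int i=0,j=Math.min(field.length, value.length); i<j; i++) {
            // Skip null or empty values, delimiter included
            if (value[i] != null && !"".equals(value[i])) {
                if (++fieldCount > 1) {
                    out.write(delim);
                }
                out.write(field[i]);
                out.write("(");
                out.write(value[i]);
                out.write(")");
            }
        }
        out.write("\n");
    }

    public void flush() throws IOException {
        out.flush();
    }

    public void close() throws IOException {
        out.close();
    }
}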

You'll then need to create a RecordWriterFactory for this like so:

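import java.io.Writer;

import org.beanio.stream.RecordWriter;
import org.beanio.stream.RecordWriterFactory;

public class TagDelimitedWriterFactory implements RecordWriterFactory {

    public RecordWriter createWriter(Writer out)
            throws IllegalArgumentException {
        return new TagDelimitedWriter(out);
    }
}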

And register its use in your mapping file:

<stream name="..." format="delimited">
    <writer class="example.TagDelimitedWriterFactory" />
    ...
</stream>
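
The skip-empty logic can be sanity-checked without BeanIO on the
classpath. The following is a standalone sketch of the same loop;
TagDelimitedDemo and its methods are hypothetical names used only for
illustration, and it assumes null or empty strings mark absent fields.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Standalone sketch of the skip-empty logic; no BeanIO types involved.
final class TagDelimitedDemo {
    private static final String[] FIELDS = {
        "FIRSTNAME", "LASTNAME", "ADDRESS1", "ADDRESS2"
    };

    private TagDelimitedDemo() {}

    // Writes one record, omitting null/empty values and their delimiters.
    static void writeRecord(Writer out, String[] values) throws IOException {
        boolean first = true;
        int n = Math.min(FIELDS.length, values.length);
        for (int i = 0; i < n; i++) {
            String v = values[i];
            if (v == null || v.isEmpty()) {
                continue;
            }
            if (!first) {
                out.write(',');
            }
            out.write(FIELDS[i]);
            out.write('(');
            out.write(v);
            out.write(')');
            first = false;
        }
        out.write('\n');
    }

    // Convenience wrapper that returns the formatted line as a String.
    static String format(String... values) {
        StringWriter sw = new StringWriter();
        try {
            writeRecord(sw, values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

Calling TagDelimitedDemo.format("TEST1", "TEST2", "TEST3", "") yields the
second sample line from the original question, followed by a newline.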

If you need to support multiple record types, the solution will get a
bit trickier, but you could add a field in the first position of the
record that identifies the record type using a literal value (with
ignore="true"), and use the literal value to determine the field list
in your writer...
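
A rough sketch of that lookup, in plain Java with no BeanIO types:
MultiRecordFormatter, the record type literals "NAME" and "ADDR", and
their field lists are all made up for illustration, and the sketch
assumes the literal arrives as the first element of the String array.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MultiRecordFormatter {
    // Hypothetical record types and tag lists, for illustration only.
    private static final Map<String, String[]> FIELDS_BY_TYPE = new LinkedHashMap<>();
    static {
        FIELDS_BY_TYPE.put("NAME", new String[] {"FIRSTNAME", "LASTNAME"});
        FIELDS_BY_TYPE.put("ADDR", new String[] {"ADDRESS1", "ADDRESS2"});
    }

    private MultiRecordFormatter() {}

    // values[0] is the record-type literal; it selects the tag list
    // and is not itself written to the output.
    static String format(String[] values) {
        String[] tags = FIELDS_BY_TYPE.get(values[0]);
        if (tags == null) {
            throw new IllegalArgumentException("Unknown record type: " + values[0]);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tags.length && i + 1 < values.length; i++) {
            String v = values[i + 1];
            if (v == null || v.isEmpty()) {
                continue; // skip absent fields, delimiter included
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(tags[i]).append('(').append(v).append(')');
        }
        return sb.toString();
    }
}
```

The same skip-empty loop is reused; only the tag list changes per
record type.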

Thanks,
Kevin

jimbo68

Feb 6, 2012, 2:37:26 PM
to beanio-users
Ok, I got your code working and it appears it will solve my issue. I
really appreciate your help on this!!!!

Jim