Fixed sized stream with variable length last field?

pakman

Sep 10, 2012, 7:40:40 PM
to bea...@googlegroups.com
I have a file with fixed-length fields, except for the last field, which has a variable length of up to 25 characters.
I have tried all sorts of combinations of length, minLength, maxLength, and strict=false, but with no success.
Is this supported?

Thanks.

pakman

Sep 10, 2012, 7:54:16 PM
to bea...@googlegroups.com

OK, 10 minutes later I figured out an ugly hack around this: use a repeating field of length="1".

<record name="sample" class="xxx.yyy.Record">
  <field name="field1" length="3" type="string" rid="true" regex="[A-Z]{3}" />
  <field name="field2" length="5" type="int" rid="true" regex="[0-9]{5}" />
  <field name="field3" length="3" type="int" />
  <field name="descriptionHack" length="1" type="char" minOccurs="1" maxOccurs="25" collection="array"/>
</record>


and my setter:

public void setDescriptionHack(char[] descriptionHack) {
    this.description = new String(descriptionHack);
}

If someone has a cleaner solution, please share.

Kevin

Sep 11, 2012, 11:01:55 AM
to bea...@googlegroups.com
Hello,

I didn't think of that use case, so you're right that it's currently not supported.  I assume the last field is not padded then, right?  I'm working on some issues for a 2.0.2 release, and then I can see what it would take to support this.

One possible addition to your hack might be to wrap the field in a String segment (although I didn't actually try it myself).

<segment name="descriptionHack" class="java.lang.String">
  <field name="c" setter="#1" getter="toCharArray" length="1" type="char" minOccurs="1" maxOccurs="25" collection="array"/>
</segment>

Thanks,
Kevin

pakman

Sep 12, 2012, 2:43:38 PM
to bea...@googlegroups.com

Hey Kevin, thanks for your quick response.  Unfortunately, using a segment as you suggest gives me this error:

org.beanio.BeanIOConfigurationException: Invalid segment 'descriptionHack', in record 'default', in stream 'default': No suitable constructor found for bean class 'java.lang.String'

I started to play with segments because I came across a similar scenario: consecutive unpadded fields. It is the same case as the original, except that it is not limited to the last field in the record but applies to the last n fields. That is, the very last field might not even be present in the record, in which case the next-to-last field becomes the unpadded one.

My hackish solution doesn't work for this case; I get the following mapping error:

org.beanio.BeanIOConfigurationException: Invalid field 'lastFieldHack', in record 'default', in stream 'default': Cannot determine field position, field is preceded by a component with indeterminate or unbounded occurences


messy...

pakman

Sep 12, 2012, 2:45:26 PM
to bea...@googlegroups.com
By the way, this is the field mapping causing the issue:

<field name="nextToLastFieldHack" length="1" type="char" minOccurs="1" maxOccurs="25" collection="array" />
<field name="lastFieldHack" length="1" type="char" minOccurs="1" maxOccurs="50" collection="array" />

pakman

Sep 12, 2012, 4:55:17 PM
to bea...@googlegroups.com

New hack for multiple variable-length fields:

<segment name="descriptionVarField" class="xxx.StringWrapper" minOccurs="0">
  <field name="chars" length="1" type="char" minOccurs="1" maxOccurs="25" collection="array" />
</segment>
<segment name="longCompositionVarField" class="xxx.StringWrapper" minOccurs="0">
  <field name="chars" length="1" type="char" minOccurs="0" maxOccurs="25" collection="array" />
</segment>

and StringWrapper.java:

public class StringWrapper {
    private String value;

    public char[] getChars() {
        return value.toCharArray();
    }

    public void setChars(char[] array) {
        this.value = new String(array);
    }

    @Override
    public String toString() {
        return value;
    }
}

Kevin

Sep 13, 2012, 2:32:27 PM
to bea...@googlegroups.com
Hello,

I think you've lost me...  I don't see how it's possible to reliably parse multiple variable-length fields at the end of a fixed-length record.  How do you know how many fields to parse, and what lengths they are?

Thanks,
Kevin

Pablo Krause

Sep 13, 2012, 2:55:01 PM
to bea...@googlegroups.com

The record has a maximum of n "fixed" length fields.
From last to first, the fields are optional; i.e., if the last field is not needed, it simply is not there in the file.  No padding, nothing.  If the next-to-last field is not needed either, once again it simply is not there.  The record just ends.
The fields are of "fixed" length as long as there is another field after them. If a field is the last one in the record, it is not padded to its "fixed" length; the record simply ends.

Example:

Spec:

field 1: numeric, length 4. Mandatory.
field 2: numeric, length 2. Mandatory.
field 3: text, length (up-to) 6. Optional only if next field is not present. Padded only if next field is present.
field 4: text, length (up-to) 100. Optional. Never padded to full length.

Sample:

000011AAA   BBBBBBBBBB<eol>
222233CCCC<eol>
444455<eol>

Explanation:

First record: contains all fields, but the last field is NOT padded up to its full size.  Field 3 is padded to fill its full length.
Second record: Does not contain field 4.  This makes field 3 the last field on the record so it is NOT padded to fill its full length.
Third record: Does not contain fields 3 and 4.
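
To make the rule concrete, here is a minimal plain-Java sketch (not BeanIO code) that slices the three sample records by the widths in the spec and stops as soon as the record runs out. The class and method names are made up for illustration, and it assumes fields are left justified and space padded.

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingFieldSlicer {
    // Field widths from the spec above: 4, 2, up to 6, up to 100.
    static final int[] WIDTHS = {4, 2, 6, 100};

    /** Splits one record into fields; fields missing from the end are simply absent from the result. */
    static List<String> slice(String record) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        for (int width : WIDTHS) {
            if (pos >= record.length()) {
                break; // the record ended, so the remaining fields are not present
            }
            // The last field present may be shorter than its nominal width
            int end = Math.min(pos + width, record.length());
            fields.add(stripTrailingSpaces(record.substring(pos, end)));
            pos = end;
        }
        return fields;
    }

    /** Drops the space padding of a field that is followed by another field. */
    static String stripTrailingSpaces(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(slice("000011AAA   BBBBBBBBBB")); // [0000, 11, AAA, BBBBBBBBBB]
        System.out.println(slice("222233CCCC"));             // [2222, 33, CCCC]
        System.out.println(slice("444455"));                 // [4444, 55]
    }
}
```

This only works because every field that is followed by another one is padded to its full width; a short field can only ever be the last one in the record.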

Makes sense?
--
Pablo

Kevin

Sep 14, 2012, 6:36:26 PM
to bea...@googlegroups.com
Hi Pablo,

OK, that makes sense. I didn't understand that you were conditionally padding the fields at the end of the record.

This is not currently supported, but I might have a workaround if we can assume that the fields at the end of the record are all space padded and left justified.  If that's the case, we can pad the record to its maximum possible length as it's read from the stream, before BeanIO extracts the fields.  The only downside is that you might get an empty string (instead of null) for some values that weren't in the input stream.

To do this, you'll need to create a custom fixed-length parser factory, which can simply extend the BeanIO default and wrap its record reader like this:

package pakman;

import java.io.*;

import org.beanio.stream.*;
import org.beanio.stream.fixedlength.FixedLengthRecordParserFactory;

public class CustomFixedLengthParserFactory extends FixedLengthRecordParserFactory {

    private Integer recordLength;

    @Override
    public RecordReader createReader(Reader in) throws IllegalArgumentException {
        final RecordReader reader = super.createReader(in);
        return new RecordReader() {
            public Object read() throws IOException, RecordIOException {
                String record = (String) reader.read();
                if (record != null) {
                    record = pad(record);
                }
                return record;
            }
            public void close() throws IOException {
                reader.close();
            }
            public int getRecordLineNumber() {
                return reader.getRecordLineNumber();
            }
            public String getRecordText() {
                return reader.getRecordText();
            }
        };
    }

    private String pad(String record) {
        if (recordLength == null) {
            return record;
        }

        int n = recordLength - record.length();
        if (n <= 0) {
            return record;
        }

        StringBuilder s = new StringBuilder(record);
        for (int i=0; i<n; i++) {
            s.append(' ');
        }
        return s.toString();
    }

    public Integer getRecordLength() {
        return recordLength;
    }

    public void setRecordLength(Integer recordLength) {
        this.recordLength = recordLength;
    }
}


A simple mapping file could then be configured something like this:

<beanio xmlns="http://www.beanio.org/2012/03"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">

  <stream name="stream" format="fixedlength">
    <parser class="pakman.CustomFixedLengthParserFactory">
      <property name="recordLength" value="20" />
    </parser>
    <record name="user" class="map">
      <field name="firstName" length="10" />
      <field name="lastName" length="10" />
    </record>
  </stream>

</beanio>


With this configuration, the 'lastName' field could be less than 10 characters or missing altogether and BeanIO will still map your beans correctly.
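
As a rough illustration of what the padding does to a short record, here is a plain-Java sketch that does not use BeanIO at all. It pads a line to 20 characters and then slices it the way a two-field, 10+10 layout would. The class name and the `field` helper are hypothetical stand-ins, and the `trim()` call is only an assumption about how padding is removed from a field.

```java
public class PadThenSliceDemo {
    // Matches the recordLength property in the mapping above
    static final int RECORD_LENGTH = 20;

    /** Right-pads a record with spaces, the same idea as pad() in the custom factory. */
    static String pad(String record) {
        StringBuilder s = new StringBuilder(record);
        while (s.length() < RECORD_LENGTH) {
            s.append(' ');
        }
        return s.toString();
    }

    /** Stand-in for fixed-length field extraction: take a slice and drop the padding. */
    static String field(String padded, int start, int length) {
        return padded.substring(start, start + length).trim();
    }

    public static void main(String[] args) {
        String full = String.format("%-10s", "Joe") + "Smith"; // lastName present but unpadded
        String shortRecord = "Joe";                            // lastName missing altogether
        for (String line : new String[] {full, shortRecord}) {
            String padded = pad(line);
            System.out.println("firstName=[" + field(padded, 0, 10)
                    + "] lastName=[" + field(padded, 10, 10) + "]");
        }
        // prints:
        // firstName=[Joe] lastName=[Smith]
        // firstName=[Joe] lastName=[]
    }
}
```

The second line of output shows the downside mentioned earlier: a field that was missing from the input comes back as an empty string rather than null.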

A similar approach could be used for writing the file (just trim the record instead of padding it), but I'll leave that implementation to you.
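
For the writing side, the core of it might look like the helper below. This is only a sketch in plain Java: `RecordTrimmer` and `trimTrailingSpaces` are made-up names, and wiring the helper into a custom record writer is not shown. It assumes the trailing fields are left justified and space padded, so any spaces at the end of the record are padding and not data.

```java
public class RecordTrimmer {
    /** Removes trailing spaces so the last field present is written unpadded. */
    static String trimTrailingSpaces(String record) {
        int end = record.length();
        while (end > 0 && record.charAt(end - 1) == ' ') {
            end--;
        }
        return record.substring(0, end);
    }

    public static void main(String[] args) {
        // A fully padded 20-character record whose second field is empty
        String padded = String.format("%-20s", "Joe");
        System.out.println("[" + trimTrailingSpaces(padded) + "]"); // [Joe]

        // Padding between fields is kept; only the end of the record is trimmed
        String both = String.format("%-10s%-10s", "Joe", "Smith");
        System.out.println("[" + trimTrailingSpaces(both) + "]");   // [Joe       Smith]
    }
}
```

Note that this cannot tell padding apart from a value that legitimately ends in spaces, so it is only safe when trailing spaces carry no meaning.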

Thanks,
Kevin