Alessio Spadaro
Aug 12, 2010, 6:48:48 AM
to Fixedformat4j User List
Hello,
first of all, thank you for this library, it's a real time-saver
and it's really easy to use.
I'm using ff4j to export a relatively large number of rows (about
4M), each containing eleven fields: strings, two numbers and
two dates. The first attempt had a projected run time of 25 minutes,
and a brief profiling run with jvisualvm revealed a hotspot in the
export method. With the following changes I brought the run time for
the same job down to 3.5 minutes.
One obvious improvement for date fields annotated with
FixedFormatPattern is to cache SimpleDateFormat instances in a
ThreadLocal (from DateFormatter):
    private static class ThreadLocalDFCache
            extends ThreadLocal<Map<String, DateFormat>> {
        @Override
        protected Map<String, DateFormat> initialValue() {
            return new HashMap<String, DateFormat>();
        }
    }

    private static ThreadLocalDFCache cache = new ThreadLocalDFCache();

    DateFormat getFormatter(String pattern) {
        DateFormat df = cache.get().get(pattern);
        if (df == null) {
            System.out.println("Cache per " + pattern);
            df = new SimpleDateFormat(pattern);
            cache.get().put(pattern, df);
        }
        return df;
    }
The reason for using ThreadLocal is that SimpleDateFormat is not
thread-safe.
This alone gives an 18% performance boost in my use case (from 32s to
26s on 100K rows) and has no relevant side effects.
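For anyone on Java 8 or later, the same per-thread cache can be sketched more compactly. This is only an illustration of the idea above, not code from fixedformat4j: the class name DateFormatCache and its methods are made up for the example.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of fixedformat4j.
final class DateFormatCache {

    // Each thread gets its own pattern -> DateFormat map, so a
    // SimpleDateFormat instance is never shared between threads.
    private static final ThreadLocal<Map<String, DateFormat>> CACHE =
            ThreadLocal.withInitial(HashMap::new);

    private DateFormatCache() {
    }

    // Returns the calling thread's formatter for the pattern,
    // creating it on first use.
    static DateFormat get(String pattern) {
        return CACHE.get().computeIfAbsent(pattern, SimpleDateFormat::new);
    }

    // Convenience wrapper: format a date with the cached formatter.
    static String format(String pattern, Date date) {
        return get(pattern).format(date);
    }
}
```

Repeated calls such as DateFormatCache.get("yyyyMMdd") on the same thread return the same instance, so the cost of constructing a SimpleDateFormat is paid once per pattern per thread.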
The largest speedup comes from caching annotation metadata. If we
accept that a single instance of FixedFormatManager won't see the
annotated fields change (I can live with this) we can aggressively
cache annotation-related metadata and reuse it, achieving a 5x speedup
(down from 26s to 5s) on the very same load. This change only affects
two methods: export and getFormatInstructions. Export has been
refactored to gather and cache method annotations and reuse them on a
per-class basis:
    List<Object[]> annotations =
            getAnnotatedMethods(fixedFormatRecordClass);
    for (Object[] anno : annotations) {
        Method method = (Method) anno[0];
        if (anno[1] instanceof Field) {
            Field fieldAnnotation = (Field) anno[1];
            String exportedData = exportDataAccordingFieldAnnotation(
                    fixedFormatRecord, method, fieldAnnotation);
            foundData.put(fieldAnnotation.offset(), exportedData);
        } else if (anno[1] instanceof Fields) {
            Fields fieldsAnnotation = (Fields) anno[1];
            Field[] fields = fieldsAnnotation.value();
            for (Field field : fields) {
                String exportedData = exportDataAccordingFieldAnnotation(
                        fixedFormatRecord, method, field);
                foundData.put(field.offset(), exportedData);
            }
        }
    }
using this helper method:
    private List<Object[]> getAnnotatedMethods(Class fixedFormatRecordClass) {
        List<Object[]> annotations =
                methodAnnotationMap.get(fixedFormatRecordClass);
        if (annotations == null) {
            annotations = new ArrayList<Object[]>();
            Method[] allMethods = fixedFormatRecordClass.getMethods();
            for (Method method : allMethods) {
                Field fieldAnnotation = method.getAnnotation(Field.class);
                Fields fieldsAnnotation = method.getAnnotation(Fields.class);
                if (fieldAnnotation != null) {
                    annotations.add(new Object[] {method, fieldAnnotation});
                } else if (fieldsAnnotation != null) {
                    annotations.add(new Object[] {method, fieldsAnnotation});
                }
            }
            methodAnnotationMap.put(fixedFormatRecordClass, annotations);
        }
        return annotations;
    }
(Bear with me for the poor code quality..)
A very similar approach has been used for getFormatInstructions.
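A tidier variant of the per-class cache could use a small typed holder instead of Object[] and a ConcurrentHashMap, so that lookups stay safe if one manager is shared between threads. The sketch below is self-contained and runs on Java 8 or later; the @Column annotation, AnnotatedGetter, AnnotationCache and SampleRecord are all hypothetical stand-ins, not fixedformat4j types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the library's field annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Column {
    int offset();
    int length();
}

// Typed pair replacing the Object[] {method, annotation} entries.
final class AnnotatedGetter {
    final Method method;
    final Column column;

    AnnotatedGetter(Method method, Column column) {
        this.method = method;
        this.column = column;
    }
}

// Hypothetical cache: reflection runs once per record class.
final class AnnotationCache {
    private static final ConcurrentMap<Class<?>, List<AnnotatedGetter>> CACHE =
            new ConcurrentHashMap<>();

    private AnnotationCache() {
    }

    // Returns the cached list, scanning the class on first request.
    static List<AnnotatedGetter> annotatedGetters(Class<?> recordClass) {
        return CACHE.computeIfAbsent(recordClass, AnnotationCache::scan);
    }

    // Collects every public method carrying @Column.
    private static List<AnnotatedGetter> scan(Class<?> recordClass) {
        List<AnnotatedGetter> found = new ArrayList<>();
        for (Method method : recordClass.getMethods()) {
            Column column = method.getAnnotation(Column.class);
            if (column != null) {
                found.add(new AnnotatedGetter(method, column));
            }
        }
        return Collections.unmodifiableList(found);
    }
}

// Sample record used only to exercise the cache.
class SampleRecord {
    @Column(offset = 1, length = 10)
    public String getName() {
        return "Ada";
    }

    @Column(offset = 11, length = 8)
    public String getBirthDate() {
        return "18151210";
    }

    // Not annotated, so the scan skips it.
    public String getIgnored() {
        return "x";
    }
}
```

Calling AnnotationCache.annotatedGetters(SampleRecord.class) twice returns the same list instance, so only the first call pays for reflection. As with the approach above, this assumes the annotations on a class never change while the cache is alive.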
I implemented it by extending FixedFormatManagerImpl, which required
changing some of its private methods to protected (I didn't see
any reason for them to remain private).
I also made other changes, such as using StringBuilder instead of
StringBuffer and using log4j directly (still investigating this), as
commons logging was misbehaving.
I'll be glad to contribute a patch if desired.
Best regards,
Alessio Spadaro