NOTICE: Pentaho 6.0 upgrade from VFS 1.x to VFS 2.x



Oct 12, 2015, 12:59:17 PM
to kettle-developers

With the release of Pentaho 6.0 on the horizon, I wanted to let you know that there are some important changes coming, and some may break your plugins, "glue code", etc. This often happens at major release boundaries, so hopefully this won't come as a huge surprise.

Pentaho has upgraded their use of VFS libraries from version 1.x (a custom version forked from the Apache project) to basically Apache VFS 2.1-SNAPSHOT. We are working with the project maintainers to release a proper 2.1 so we don't have snapshot artifacts in our release. In fact we have locked down a particular commit in the 2.1 branch and have built our own release version for Apache VFS 2.1.

The relevant commits are here:

How does this impact you? If you use org.apache.commons.vfs.* or KettleVFS and consume org.apache.commons.vfs.FileObject or other VFS 1 classes, then your code will no longer compile against 6.0 artifacts. If you use VFS 1 by itself and include it as a plugin dependency, this may work OK, unless you try to "trade objects" between your plugin and other Kettle classes.

To upgrade, you should be able to simply change the package name from org.apache.commons.vfs.* to org.apache.commons.vfs2.*. If your code still won't compile, please consult the above pull requests to see where other changes were needed.
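As a rough illustration of that package rename, here is a minimal sketch that rewrites VFS 1 package references in a string of Java source. The class name VfsPackageMigrator and its migrate method are hypothetical helpers made up for this example; they are not part of Kettle or Apache Commons VFS, and a plain search-and-replace in your IDE does the same job.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Kettle or Commons VFS.
final class VfsPackageMigrator {
    // Matches the VFS 1 package prefix including its trailing dot. Because the
    // dot must directly follow "vfs", references that already use
    // "org.apache.commons.vfs2." are left untouched.
    private static final Pattern VFS1_PREFIX =
            Pattern.compile("\\borg\\.apache\\.commons\\.vfs\\.");

    // Returns the source text with every VFS 1 package reference renamed to VFS 2.
    static String migrate(String source) {
        return VFS1_PREFIX.matcher(source).replaceAll("org.apache.commons.vfs2.");
    }

    public static void main(String[] args) {
        String before = "import org.apache.commons.vfs.FileObject;\n"
                + "import org.apache.commons.vfs.FileSystemException;\n";
        // Prints the two import lines with the vfs2 package name.
        System.out.print(migrate(before));
    }
}
```

This only covers the rename itself. Any API differences between VFS 1 and VFS 2 still have to be fixed by hand after recompiling.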

In general, a good rule of thumb at major release boundaries is to:

a) before the Pentaho release, compile your code against the latest libraries (6.0-SNAPSHOT at this point is good enough, but I think the "real" ones are
b) after the release, test your plugins (original and/or recompiled) by dropping them into the data-integration/plugins folder and making sure they still operate successfully.
c) If you need to re-release for 6.0, please update your entries in the Marketplace.

When 6.0 is released, there will be "What's New?" documentation available, and as usual you can direct questions for the community here in this Google Group.

Matt Burgess
Lead Software Engineer
Pentaho Corporation, a Hitachi Data Systems company

Brandon Jackson

Oct 19, 2015, 7:04:45 PM
to kettle-developers
We tried Pentaho Data Integration 6.0, both EE and CE, with our existing VFS-backed ETL. It failed with no real error messages other than 'file not found', and many of the steps received no input. The exact same ETL succeeds in every version of PDI prior to 6.

Are there any unit tests against sftp? I noticed that the unit tests for zip files use something like file:///c:/ with 3 slashes, while the sftp URL example ktr uses only 2. Are there some file path conventions that changed?
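On the slash count specifically: in general URL syntax the two slashes introduce the authority (host) part, and a third slash starts the path. A file URL usually has an empty host, which is why it shows three slashes in a row, while an sftp URL names a host between the second and third slash. The sketch below uses the JDK's java.net.URI to show this; it does not use the VFS parser and says nothing about whether PDI 6 changed its path handling. The host example.com, the user name, and both paths are made-up values.

```java
import java.net.URI;

// Illustration of generic URL structure using the JDK only (not Commons VFS).
final class VfsUrlSlashes {
    // Summarizes which part of the URL is the host and which is the path.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "host=" + uri.getHost() + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Empty host: the third slash is the first character of the path.
        System.out.println(describe("file:///c:/data/archive.zip"));
        // Named host: only two slashes appear before the host name.
        System.out.println(describe("sftp://user@example.com/home/user/input.csv"));
    }
}
```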

Our has:
#VFS Test

If there are some debug points that you can suggest, I could get more specific and file a JIRA.
Did every step in PDI 6 have to get updated, or was KettleVFS the only location of the updated code around that vfs apache library?



Matt Burgess

Oct 19, 2015, 8:20:15 PM
IIRC, in Kettle it was almost exclusively limited to KettleVFS, but since some steps were contributed over time and there was no enforcement over the methods used, it's possible some steps use VFS directly and had to be updated.

Also some changes to certain steps were made to increase usability although every effort was made to ensure backwards compatibility.

We are in the process of updating Jira cases to reflect what was done for these in 6.0. If you have a reproduction transformation (not tied to your data of course), please let me know (or find/file the appropriate Jira) and we will take a look.

