Adding MDI support to a PDI step

Pedro Alves

May 13, 2016, 6:27:21 AM
to pentaho-...@googlegroups.com

New blog post at http://pedroalves-bi.blogspot.pt/2016/05/adding-metadata-injection-support-to.html

-pedro


----------------


Adding Metadata Injection Support to a Pentaho Data Integration Step

As I wrote in a previous blog post - and not nearly as well as Jens did on his blog - Metadata Injection is kind of a big deal around here. You may have used it or you may only have heard about it, but I'm sure you will use it in the future.

Simply put - the concept of Metadata Injection is what allows a transformation to change itself at run time, dynamically adapting as needed to different inputs, different rules, different outputs.

In order to do that, the individual steps have to support it. We've been doing a huge amount of catch-up work to increase the list of steps that support it, but that's not enough - we need your help! I'd like each of you to also add MDI support to the steps you've been contributing to the marketplace.

In order to facilitate it, the engineering team prepared the following instructions on how to do it (and here's a link to a concrete implementation)

Adding Metadata Injection Support to Your Step


You can add metadata injection support to your step by marking the metadata class and the step’s fields with injection-specific annotations. You use the @InjectionSupported annotation to specify that your step is able to support metadata injection. Then, you use either the @Injection annotation to specify which fields in your step can be injected as metadata, or use the @InjectionDeep annotation for fields more complex than usual primitive types (such as string, int, float, etc.).

InjectionSupported

Use the @InjectionSupported annotation in the metadata class of your step to indicate that it supports metadata injection. This annotation has the following parameters.

Parameter | Description
localizationPrefix | Indicates the location for your messages in the /messages/messages_.properties file. When the metadata injection properties are displayed in PDI, the description for each field is retrieved from the localization file using a key made of this prefix followed by the field name.
groups | Optional. Names the groups used to arrange your fields. The fields appear under these groups in the ETL Metadata Injection step properties dialog.

For example, with the localizationPrefix parameter set to "Injection.", the description for the "FILENAME" field is retrieved from the messages file with the key "Injection.FILENAME". The following @InjectionSupported annotation sets that prefix and also declares the optional "GROUP1" and "GROUP2" groups.
@InjectionSupported(localizationPrefix="Injection.", groups = {"GROUP1","GROUP2"})
If your step already supports metadata injection through a pre-6.0 method, for example by returning an object from the getStepMetaInjectionInterface() method, you need to remove both the injection class and the getStepMetaInjectionInterface() method from the metadata class. Once they are removed, calls to getStepMetaInjectionInterface() resolve to the base class (BaseStepMeta) implementation, which returns null. The null value indicates that your step does not support pre-6.0 style metadata injection. If your step never used this type of implementation, you do not need to add this method to the metadata class or modify it.
Although inheritance applies to injectable fields specified by the @Injection and @InjectionDeep annotations, you still need to apply the @InjectionSupported annotation to any step inheriting the injectable fields from another step. For example, if an existing input step has already specified injectable fields through the @Injection annotation, you do not need to use the @Injection annotation for fields you inherited within the step you create. However, you still need to use the @InjectionSupported annotation in the metadata class of your step even though that annotation is also already applied in the existing input step.
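To make the inheritance rule and the localization key a bit more concrete, here is a minimal, self-contained sketch. The two annotations are local stand-ins that only mimic the shape described above, so that the code compiles without Kettle on the classpath (in a real step you would import them from org.pentaho.di.core.injection). ExistingInputMeta, MyInputMeta and InjectionKeys are hypothetical names, and the lookup logic is my own illustration, not PDI's actual implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Local stand-ins (assumption): same parameter names as in the text above.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface InjectionSupported {
  String localizationPrefix();
  String[] groups() default {};
}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface Injection {
  String name();
  String group() default "";
}

// Hypothetical existing input step that already declares an injectable field.
@InjectionSupported(localizationPrefix = "Injection.", groups = {"GROUP1", "GROUP2"})
class ExistingInputMeta {
  @Injection(name = "FILENAME", group = "GROUP1")
  protected String filename;
}

// Hypothetical new step: FILENAME is inherited as-is,
// but the class-level annotation is repeated here.
@InjectionSupported(localizationPrefix = "Injection.", groups = {"GROUP1", "GROUP2"})
class MyInputMeta extends ExistingInputMeta {
  @Injection(name = "ENCODING", group = "GROUP2")
  private String encoding;
}

class InjectionKeys {
  // Builds prefix + name for every @Injection field of the class and its ancestors.
  static List<String> localizationKeys(Class<?> metaClass) {
    // Only an annotation placed directly on metaClass is accepted.
    InjectionSupported supported = metaClass.getDeclaredAnnotation(InjectionSupported.class);
    if (supported == null) {
      throw new IllegalArgumentException(metaClass.getName() + " lacks @InjectionSupported");
    }
    List<String> keys = new ArrayList<>();
    for (Class<?> c = metaClass; c != null && c != Object.class; c = c.getSuperclass()) {
      for (Field f : c.getDeclaredFields()) {
        Injection inj = f.getAnnotation(Injection.class);
        if (inj != null) {
          keys.add(supported.localizationPrefix() + inj.name());
        }
      }
    }
    return keys;
  }
}
```

Calling InjectionKeys.localizationKeys(MyInputMeta.class) yields the keys for both the new ENCODING field and the inherited FILENAME field, which are the keys you would then describe in your messages file.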

Injection

Each field (or setter) you want to be injected into your step should be marked by the @Injection annotation. The parameters of this annotation are the name of the injectable field and the group containing this field:
@Injection(name = "FILENAME", group = "FILE_GROUP") - on the field or setter
This annotation has the following parameters.
Parameter | Description
name | Indicates the name of the field. If the annotation is declared on a setter (a conventional setter with no return value and a single parameter), the type of that parameter is used for data conversion, as if the annotation were declared on a field of that type.
group | Indicates the group containing the field. If group is not specified, the root is used when the field is displayed in the ETL Metadata Injection step properties dialog.

This annotation can be used:
·      On a field of a simple type (string, int, float, etc.)
·      On the setter of a simple type
·      On an array of simple types
·      On a java.util.List of simple types. In this case, the List must be declared with its generic type.
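The four placements listed above could look roughly like this. It is only a sketch: the Injection annotation is a local stand-in with the same parameter names, so the block compiles on its own, and TextInputSketchMeta and InjectableNames are hypothetical names. The small scanner at the end is just a way to show that all four placements are discoverable through reflection; it is not how PDI itself does it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.List;
import java.util.TreeSet;

// Local stand-in (assumption) for org.pentaho.di.core.injection.Injection.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface Injection {
  String name();
  String group() default "";
}

// Hypothetical metadata class showing each supported placement.
class TextInputSketchMeta {
  // 1. On a field of a simple type.
  @Injection(name = "FILENAME", group = "FILE_GROUP")
  private String filename;

  private int rowLimit;

  // 3. On an array of simple types.
  @Injection(name = "EXTENSIONS", group = "FILE_GROUP")
  private String[] extensions;

  // 4. On a List declared with its generic type.
  @Injection(name = "FIELD_NAMES")
  private List<String> fieldNames;

  // 2. On a setter: void return type, exactly one parameter.
  @Injection(name = "ROW_LIMIT")
  public void setRowLimit(int rowLimit) {
    this.rowLimit = rowLimit;
  }
}

class InjectableNames {
  // Collects the injectable names declared on fields and on setter-shaped methods.
  static TreeSet<String> of(Class<?> metaClass) {
    TreeSet<String> names = new TreeSet<>();
    for (Field f : metaClass.getDeclaredFields()) {
      Injection inj = f.getAnnotation(Injection.class);
      if (inj != null) {
        names.add(inj.name());
      }
    }
    for (Method m : metaClass.getDeclaredMethods()) {
      Injection inj = m.getAnnotation(Injection.class);
      boolean setterShaped = m.getParameterCount() == 1 && m.getReturnType() == void.class;
      if (inj != null && setterShaped) {
        names.add(inj.name());
      }
    }
    return names;
  }
}
```

ROW_LIMIT is annotated on the setter rather than on the rowLimit field, which is handy when the setter does some validation or normalization you want injection to go through.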
Beyond these types, there are special rules for enums, arrays, and data type conversions.

Enums

You can mark any enum field with the @Injection annotation. For enum fields, metadata injection converts source TYPE_STRING values into the enum values of the same name. So that your users know which values they can supply, describe all possible values in the documentation of your step.
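The "same name" conversion behaves much like the standard Enum.valueOf call, so a rough sketch of the idea is shown here. CompressionType and EnumInjectionSketch are hypothetical names, and how PDI itself reacts to null or unknown values is not stated above, so both choices below (pass null through, reject unknown names with a list of allowed values) are assumptions of this sketch.

```java
import java.util.Arrays;

// Hypothetical enum used by a step's metadata class.
enum CompressionType { NONE, ZIP, GZIP }

class EnumInjectionSketch {
  // Maps an injected string to the enum constant with exactly the same name.
  static <E extends Enum<E>> E fromInjectedString(Class<E> type, String value) {
    if (value == null) {
      return null; // assumption: no value injected, leave the field unset
    }
    try {
      return Enum.valueOf(type, value); // case-sensitive match on the constant name
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown value '" + value + "', expected one of "
              + Arrays.toString(type.getEnumConstants()), e);
    }
  }
}
```

Because the match is on the exact constant name, "GZIP" converts but "gzip" does not, which is why listing the accepted values in your step's documentation matters.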

Arrays

An @Injection annotation can also be added to an array field:
@Injection(name="FILES")
private String[] files;
The metadata object can also have a more complex structure:
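MyStepMeta.java
import org.pentaho.di.core.injection.Injection;
import org.pentaho.di.core.injection.InjectionDeep;

public class MyStepMeta {
  // Tells metadata injection to look inside OneFile for annotated fields.
  @InjectionDeep
  private OneFile[] files;

  // Declared static so an instance can be created without an enclosing MyStepMeta.
  public static class OneFile {
    @Injection(name="NAME", group="FILES")
    public String name;
    @Injection(name="SIZE", group="FILES")
    public int size;
  }
}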
Metadata injection creates one object for each row of the injection information stream, so the number of objects equals the number of rows in that stream. If different injections (like NAME and SIZE in the example above) are loaded from different information streams, you have to make sure that both streams contain the same number of rows.
Note: Instead of an array, you could use java.util.List with generics.
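To see why the row counts have to match, here is a small sketch of the "one object per row" idea, with NAME and SIZE arriving from two separate streams. The streams are modelled as plain lists, and OneFile and RowsToObjectsSketch are hypothetical stand-ins. The explicit size check is my own addition to make the constraint visible; it is not a description of what PDI does when the counts differ.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the nested class of the example above.
class OneFile {
  String name;
  int size;
}

class RowsToObjectsSketch {
  // Row i of each stream fills object i, so both streams must have the same length.
  static List<OneFile> build(List<String> nameRows, List<Integer> sizeRows) {
    if (nameRows.size() != sizeRows.size()) {
      throw new IllegalArgumentException(
          "NAME stream has " + nameRows.size() + " rows but SIZE stream has " + sizeRows.size());
    }
    List<OneFile> files = new ArrayList<>();
    for (int i = 0; i < nameRows.size(); i++) {
      OneFile file = new OneFile();
      file.name = nameRows.get(i);
      file.size = sizeRows.get(i);
      files.add(file);
    }
    return files;
  }
}
```

With two NAME rows and two SIZE rows you get two OneFile objects. With two NAME rows and one SIZE row there is no sensible way to pair them, which is the situation the paragraph above warns about.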

Data Type Conversions

Conversion from the RowSet to the simple type of a field is handled by the DefaultInjectionTypeConverter class. Currently supported data types for fields are string, boolean, integer, long, and enum. You can also define custom converters for specific fields by declaring them in the 'converter' attribute of the @Injection annotation. These custom converters extend the InjectionTypeConverter class.
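The general shape of a custom converter might look like the sketch below. Please read it as an illustration of the mechanism only: TypeConverterStandIn and its stringToBoolean method are placeholders I made up so the block compiles on its own, and they are not the real InjectionTypeConverter API. Check the actual class for the methods you can override. The Injection annotation is again a local stand-in, and HeaderMeta, YesNoConverter and ConverterSketch are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Placeholder base class (assumption): NOT the real InjectionTypeConverter.
class TypeConverterStandIn {
  boolean stringToBoolean(String value) {
    return Boolean.parseBoolean(value);
  }
}

// Custom converter that accepts "Y" or "yes" instead of "true".
class YesNoConverter extends TypeConverterStandIn {
  @Override
  boolean stringToBoolean(String value) {
    return "Y".equalsIgnoreCase(value) || "yes".equalsIgnoreCase(value);
  }
}

// Local stand-in annotation with a 'converter' attribute.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Injection {
  String name();
  Class<? extends TypeConverterStandIn> converter() default TypeConverterStandIn.class;
}

// Hypothetical metadata class that asks for the custom converter on one field.
class HeaderMeta {
  @Injection(name = "HEADER_PRESENT", converter = YesNoConverter.class)
  boolean headerPresent;
}

class ConverterSketch {
  // Finds the boolean field with the given injection name and sets it via its converter.
  static void inject(Object meta, String name, String value) {
    try {
      for (Field f : meta.getClass().getDeclaredFields()) {
        Injection inj = f.getAnnotation(Injection.class);
        if (inj != null && inj.name().equals(name) && f.getType() == boolean.class) {
          TypeConverterStandIn converter = inj.converter().getDeclaredConstructor().newInstance();
          f.setAccessible(true);
          f.setBoolean(meta, converter.stringToBoolean(value));
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not inject " + name, e);
    }
  }
}
```

Injecting "Y" into HEADER_PRESENT sets the field to true here, whereas the default conversion in this sketch would only accept "true".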

InjectionDeep

Only the fields of the metadata class (and its ancestors) are checked for annotations, which works well for simple structures. If your metadata class contains more complex structures (beyond primitive types), use the @InjectionDeep annotation so that the annotations inside those complex fields are inspected as well.
Example:
@InjectionDeep

This annotation can be used on an array or a java.util.List of complex classes.
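One way to picture what "inspecting inside complex fields" means is a recursive scan that descends into the element type of any @InjectionDeep array or List. The sketch below does exactly that with local stand-in annotations, so it runs without Kettle. DeepScanSketch is a hypothetical helper and the "GROUP.NAME" output format is just for display; none of this is PDI's actual implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-ins (assumption) for the Pentaho annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Injection {
  String name();
  String group() default "";
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InjectionDeep {}

class MyStepMeta {
  @Injection(name = "TARGET_FIELD")
  String targetField;

  @InjectionDeep
  OneFile[] files;

  static class OneFile {
    @Injection(name = "NAME", group = "FILES")
    String name;
    @Injection(name = "SIZE", group = "FILES")
    int size;
  }
}

class DeepScanSketch {
  // Returns every injectable name, sorted, including those reachable through @InjectionDeep.
  static List<String> injectableNames(Class<?> metaClass) {
    List<String> names = new ArrayList<>();
    collect(metaClass, names);
    Collections.sort(names);
    return names;
  }

  private static void collect(Class<?> type, List<String> out) {
    for (Field f : type.getDeclaredFields()) {
      Injection inj = f.getAnnotation(Injection.class);
      if (inj != null) {
        out.add(inj.group().isEmpty() ? inj.name() : inj.group() + "." + inj.name());
      }
      if (f.isAnnotationPresent(InjectionDeep.class)) {
        collect(elementType(f), out); // descend into the complex type
      }
    }
  }

  // Element type of an array or of a List declared with a generic type argument.
  private static Class<?> elementType(Field f) {
    if (f.getType().isArray()) {
      return f.getType().getComponentType();
    }
    Type generic = f.getGenericType();
    if (generic instanceof ParameterizedType) {
      Type arg = ((ParameterizedType) generic).getActualTypeArguments()[0];
      if (arg instanceof Class) {
        return (Class<?>) arg;
      }
    }
    return f.getType();
  }
}
```

Without the @InjectionDeep marker on files, the scan would stop at the top level and only TARGET_FIELD would be found. With it, NAME and SIZE from OneFile are picked up too.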