I thought it was architected with a service-oriented, extensible system in mind. I also thought data lineage was one of its big features? But other Apache projects, such as Apache Atlas, have taken on data lineage as well.
There are also a few abandoned GitHub projects (search "kylo docker") that tried to run it in a Docker container while leaving all the other parts (NiFi, Spark, Hive) as external components.
It also needs an easy, open marketplace/plugin system so people can write custom transforms and validators, and/or use data feeds to transform other data feeds. Perhaps it also needs to extend the ways it exposes data in the data lake for user/tool consumption, but that can get out of scope really quickly.
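
To make the plugin idea a bit more concrete, here is a rough sketch of what a registry for custom transforms and validators could look like. This is not Kylo's API; every name here (`transform`, `validator`, `run_feed`, `trim_strings`, `has_id`) is hypothetical and only illustrates the shape of such a system.

```python
from typing import Callable, Dict, List

# Hypothetical types: a record is a plain dict, and plugins are plain functions.
Record = Dict[str, object]
Transform = Callable[[Record], Record]
Validator = Callable[[Record], bool]

# Hypothetical in-memory registries keyed by plugin name.
TRANSFORMS: Dict[str, Transform] = {}
VALIDATORS: Dict[str, Validator] = {}


def transform(name: str) -> Callable[[Transform], Transform]:
    """Decorator that registers a custom transform under the given name."""
    def decorator(fn: Transform) -> Transform:
        TRANSFORMS[name] = fn
        return fn
    return decorator


def validator(name: str) -> Callable[[Validator], Validator]:
    """Decorator that registers a custom validator under the given name."""
    def decorator(fn: Validator) -> Validator:
        VALIDATORS[name] = fn
        return fn
    return decorator


@transform("trim_strings")
def trim_strings(record: Record) -> Record:
    # Strip surrounding whitespace from every string field.
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


@validator("has_id")
def has_id(record: Record) -> bool:
    # Accept only records that carry a non-null "id" field.
    return record.get("id") is not None


def run_feed(records: List[Record],
             transform_names: List[str],
             validator_names: List[str]) -> List[Record]:
    """Apply the named transforms in order, then keep records passing all validators."""
    accepted = []
    for record in records:
        for name in transform_names:
            record = TRANSFORMS[name](record)
        if all(VALIDATORS[name](record) for name in validator_names):
            accepted.append(record)
    return accepted
```

With something like this, a feed would just be configured by listing plugin names, e.g. `run_feed(rows, ["trim_strings"], ["has_id"])`, and a marketplace would mostly be a way to distribute and discover the registered functions.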