Welcome and thanks for the question!
Indeed, one major draw of this over OC is control: it is nice to have an open source solution that you can set up on your own server if it comes to that. Better yet, being open means you can add to it and tweak it.
However, there are also a ton of functional improvements, since we are working with more types of media. OC does for generic text what we want to do for images, audio, video, and, yes, generic text too. Along the same lines, we are designing the system to be modular enough to support custom media types. For instance, news_text might offer extraction and markup tools that are specific to news articles (e.g. source identification).
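To make the modular idea a bit more concrete, here is a rough sketch of how a custom media type could plug in its own tools. This is purely illustrative and not actual project code: `MediaType`, `REGISTRY`, `register`, and `identify_sources` are names made up for this example, and the "according to ..." heuristic is a toy stand-in for real source identification. Only the `news_text` name and the source-identification idea come from the description above.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical: a tool takes raw content and returns a list of annotations.
Tool = Callable[[str], List[dict]]


@dataclass
class MediaType:
    """A media type bundles a name with the tools that apply to it."""
    name: str
    tools: Dict[str, Tool] = field(default_factory=dict)

    def tool(self, tool_name: str):
        """Decorator that attaches a tool function to this media type."""
        def decorator(fn: Tool) -> Tool:
            self.tools[tool_name] = fn
            return fn
        return decorator


# Hypothetical global registry, keyed by media type name.
REGISTRY: Dict[str, MediaType] = {}


def register(media_type: MediaType) -> MediaType:
    REGISTRY[media_type.name] = media_type
    return media_type


# A custom media type with one news-specific tool.
news_text = register(MediaType("news_text"))


@news_text.tool("source_identification")
def identify_sources(text: str) -> List[dict]:
    """Toy heuristic: capitalized names that follow 'according to'."""
    pattern = r"according to ([A-Z]\w*(?: [A-Z]\w*)*)"
    return [
        {"type": "source", "text": m.group(1), "start": m.start(1), "end": m.end(1)}
        for m in re.finditer(pattern, text)
    ]


if __name__ == "__main__":
    article = "Prices rose last month, according to Jane Doe."
    run = REGISTRY["news_text"].tools["source_identification"]
    print(run(article))
```

The point of the registry is that the core system only needs to know about media type names and tool names, so a new type (or a new tool on an existing type) can be added without touching anything else.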
Does that help clear up the ambiguity? This is a very young project so I can't imagine much of this vision is clear yet. Anyone else have thoughts to add?