1- Yes, Tranquility writes events directly to Druid without an external buffer like Kafka. It uses in-heap buffers at each end (your end and Druid's).
2- 200K/sec isn't a problem in general, though you'll probably need to partition your data across a few peons. I think the usual guidance is something like 5–50K events/sec per peon, depending on data complexity. Pre-aggregating data in your app, if possible, always helps.
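As a back-of-envelope sketch (the per-peon numbers here are just the rough range above, not measured figures), the partition count falls out of simple division:

```python
import math

def partitions_needed(total_eps, per_peon_eps):
    """How many peons (partitions) are needed to absorb total_eps events/sec."""
    return math.ceil(total_eps / per_peon_eps)

# 200K/sec at a conservative 25K/sec per peon -> 8 partitions
print(partitions_needed(200_000, 25_000))  # -> 8
```

At the optimistic end of the range (50K/sec per peon) you'd need 4; at the pessimistic end (5K/sec) more like 40, which is why pre-aggregation is worth the effort.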
3- The issue with Kafka -> Druid is that if you need to partition your dataset across multiple consumers, you have two choices. You can have one consumer group, in which case everything works OK, but if you lose a realtime node then you can't query data that hasn't been handed off yet (temporarily, if you recover the node later, or permanently, if you don't). Or you can have two consumer groups, both ingesting the same data, which gives you better availability but has a serious problem: Druid segments all have partition numbers, and the broker assumes that two segments with the same partition number contain the same data. It will happily mix results from segment 0 in consumer group A with segment 1 in consumer group B. Those segments will likely not be consistent across consumer groups, so your queries won't be consistent either. Tranquility's approach is to push data into Druid rather than having Druid pull it, which means Tranquility can ensure that multiple replicas of the same partition number actually do have the same data.
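To make the failure mode concrete, here is a toy illustration (not Druid code, and the numbers are made up): the broker treats any segment with a given partition number as a valid replica, so if two independent consumer groups produced slightly different segments, a query can mix them and return a total that matches neither group:

```python
# Segments keyed by partition number, as produced by two independent
# Kafka consumer groups that each read the whole topic.
group_a = {0: {"clicks": 100}, 1: {"clicks": 250}}  # group A's view: total 350
group_b = {0: {"clicks": 90},  1: {"clicks": 260}}  # group B's view: total 350

def broker_query(replicas_by_partition):
    """Sum 'clicks' across partitions, picking one replica per partition
    the way a broker would -- it assumes all replicas are interchangeable."""
    return sum(replicas[0]["clicks"] for replicas in replicas_by_partition.values())

# The broker mixes replicas across groups: partition 0 from A, partition 1 from B.
mixed = {0: [group_a[0]], 1: [group_b[1]]}
print(broker_query(mixed))  # 100 + 260 = 360, consistent with neither group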
4- Tranquility actually uses the event receiver firehose, so there won't be any performance difference; it's the same thing under the hood. The main things Tranquility buys you are automatic management of indexing service tasks and client-side load balancing. You just tell it your schema and how many partitions and replicas you want, and it takes care of getting the right data into the right peons. If you talk to Druid directly, you'll have to manage all of that yourself.
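For intuition, the client-side routing Tranquility manages can be sketched like this (illustrative only; the hashing scheme and `peon-p-r` names are hypothetical, not Tranquility's actual implementation). The key property is the one from point 3: every replica of a partition receives the same events.

```python
import hashlib

PARTITIONS = 3  # hypothetical partition count you configured
REPLICAS = 2    # hypothetical replica count you configured

# Hypothetical task endpoints, one peon per (partition, replica) pair
tasks = {(p, r): f"peon-{p}-{r}"
         for p in range(PARTITIONS) for r in range(REPLICAS)}

def route(event_key):
    """Pick a partition by hashing the event key, then return all replica
    peons for that partition -- the same data goes to every replica."""
    h = int(hashlib.md5(event_key.encode()).hexdigest(), 16)
    partition = h % PARTITIONS
    return [tasks[(partition, r)] for r in range(REPLICAS)]

print(route("user-42"))  # two peons, both for the same partition
```

With a direct firehose setup you would be writing this bookkeeping (plus task creation and failover) yourself.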