This really doesn't tell us much of anything.
* What's your hardware setup like? Is the data in RAM or on disk? What
does CPU/disk usage look like when you run queries?
* What does your data look like? Do you have an example?
* What does the mapping look like?
* What do the queries that you've tried look like?
* Have you tried increasing the number of shards? If you have 2 shards
in total with X replicas and 3 machines, each query is still only
distributed across 2 shards. Maybe that's your bottleneck, in which
case increasing replicas/boxes wouldn't help.
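
To make the shard point concrete, here's a rough back-of-the-envelope
sketch in Python. It's a simplified model, not the actual routing logic
of any search engine: it assumes a single query runs on exactly one
copy (primary or replica) of each shard, and the function names are
made up for illustration.

```python
def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Number of shard copies in the cluster (primaries plus their replicas)."""
    return primary_shards * (1 + replicas)


def max_nodes_busy_per_query(primary_shards: int, nodes: int) -> int:
    """Upper bound on how many machines can work on ONE query.

    Assumption: a query touches one copy of each shard, so it can never
    be spread over more machines than there are shards.
    """
    return min(primary_shards, nodes)


if __name__ == "__main__":
    nodes = 3
    # Hypothetical layouts: (number of shards, number of replicas).
    for shards, replicas in [(2, 0), (2, 1), (2, 2), (6, 1)]:
        copies = total_shard_copies(shards, replicas)
        busy = max_nodes_busy_per_query(shards, nodes)
        print(f"shards={shards} replicas={replicas} "
              f"copies={copies} max_nodes_per_query={busy}")
```

In this model the per-query figure stays at 2 with 2 shards no matter
how many replicas you add, and only goes up once there are more shards.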