To understand the robots.txt protocol, start by studying
http://www.robotstxt.org/ .
In a nutshell, it works by disallowing rather than by allowing
things, and it's prefix-based. The Allow directive is an extension
that's honored by some robots, but not really part of the
original protocol. You should sit down and think about all the
sections of the site and the types of URLs you don't want robots
to crawl, identify a common prefix for each, and disallow that. If
there's an exception to a rule, use the Allow directive for the
specific exception. Everything that is not disallowed is allowed.
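To make the prefix idea concrete, here is a minimal sketch using Python's standard urllib.robotparser. The paths, the host www.example.com and the "ExampleBot" user agent are all made up for illustration. Note that this particular parser applies the first rule that matches, so I put the Allow exception before the broader Disallow; other robots (Google, for one) pick the most specific match instead, so don't read this as how every crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: two disallowed prefixes and one exception.
ROBOTS_TXT = """\
User-agent: *
Allow: /catalog/search/help
Disallow: /catalog/search/
Disallow: /checkout/
"""


def build_parser(text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def allowed(path):
    """Return True if a polite robot may crawl this path."""
    # Host and user agent are placeholders, not real values.
    return parser.can_fetch("ExampleBot", "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/catalog/search/?q=shoes",
                 "/catalog/search/help",
                 "/checkout/cart",
                 "/products/shoes"):
        print(path, allowed(path))
```

Anything under a disallowed prefix is refused, the one Allow exception gets through, and a path that matches no rule at all (/products/shoes here) is allowed by default.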
Keep in mind that the robots.txt file does not manage access to
the site. It's just a polite request to crawl or not crawl certain
parts of the site, much like a "keep off the grass" sign in a park.
In contrast, the .htaccess file is used to control responses for
various URLs (including, but not limited to, rewriting them as
needed), as well as actually controlling access to parts of the
website (by blocking unauthorized access, for instance).
The X-Robots-Tag HTTP header is a robots directive (and thus, like
anything to do with robots directives, not binding) and can be
used when you cannot add robots meta tags to certain files or file
types. Google and most reliable robots honor it, just as they
would a robots meta tag. A rogue robot honors nothing, just
saying.
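For what it's worth, here is a rough sketch of how a well-behaved robot might read that header. The function names are my own invention and the logic is deliberately simplified: it ignores bot-specific prefixes such as "googlebot: noindex" and only looks at the plain comma-separated directives.

```python
def robots_directives(headers):
    """Collect lowercase directives from any X-Robots-Tag headers.

    `headers` is a list of (name, value) pairs, as a response may
    carry the header more than once.
    """
    directives = set()
    for name, value in headers:
        if name.lower() == "x-robots-tag":
            for part in value.split(","):
                part = part.strip().lower()
                if part:
                    directives.add(part)
    return directives


def may_index(headers):
    """Return True unless the response asks not to be indexed."""
    found = robots_directives(headers)
    # "none" is shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & found)


if __name__ == "__main__":
    pdf_response = [("Content-Type", "application/pdf"),
                    ("X-Robots-Tag", "noindex, nofollow")]
    html_response = [("Content-Type", "text/html")]
    print(may_index(pdf_response))
    print(may_index(html_response))
```

The point is that the decision sits entirely with the robot: the server only sends the header, and nothing stops a rogue robot from skipping this check.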
If Magento developers advise using those robots directives the way
you used them, then they aren't really competent in the matter.
Also, IMO, a well-built product shouldn't require quite so much
manipulation of the robots.txt file. I don't know of any
e-commerce platform that is good, or even just acceptable, right
out of the box, but I've long avoided dealing with them anyway.
Christina