Yes, the client is expected to fetch the hyper-schema to understand what it can do with the document. Although this may feel different, it's really not an unusual pattern for the web. When a web browser retrieves an HTML document, it often has to download dozens of additional resources (CSS, JS, and images) in order to completely handle the request. It might also seem inefficient that the client has to make two requests. But, considering that schemas should be cacheable, that extra request should only have to happen once each time the schema changes. Those first requests might take a little longer, but subsequent requests will be more efficient because you download less content than you would if the link data were embedded in every resource. So the service is more efficient overall because of the separation of concerns.
Is this still HATEOAS? Absolutely. The point of HATEOAS is that everything necessary to describe the flow of control is included in the response. There should be no out-of-band knowledge necessary. In the case of JSON Hyper-Schema, the client downloads the resource, follows the HTTP Link header to get the hyper-schema, and then processes the hyper-schema to know what it can do next. There is an extra level of indirection, but the spirit of the principle is in no way violated.
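To make that flow concrete, here is a rough Python sketch of the two client-side steps: finding the hyper-schema URL in the Link header, then reading the available links out of the hyper-schema. I'm assuming the server uses `rel="describedby"` and that the hyper-schema has a top-level `links` array of `rel`/`href` objects. The function names and the URLs in the usage note are made up for illustration, and the header parsing is deliberately simplistic (it won't cope with commas inside parameter values).

```python
import re


def find_schema_url(link_header, rel="describedby"):
    """Return the target URL of the first link with the given rel, or None."""
    # Each link value looks like: <url>; rel="describedby"; other="params"
    for match in re.finditer(r'<([^>]*)>([^,]*)', link_header):
        url, params = match.group(1), match.group(2)
        rel_match = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        # A rel parameter may hold several space-separated relation types.
        if rel_match and rel in rel_match.group(1).split():
            return url
    return None


def available_links(hyper_schema):
    """Map each link relation in the hyper-schema to its href template."""
    return {link["rel"]: link["href"] for link in hyper_schema.get("links", [])}
```

For example, `find_schema_url('</schemas/user>; rel="describedby"')` gives `"/schemas/user"`, and once that schema has been fetched and parsed, `available_links(schema)` tells the client which relations it can follow next.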
As for the behavior of the root URL, I usually return a little resource with a short description of the service. It doesn't serve any real purpose other than to fill the void, so the shorter the better. Again, the schema should be cacheable, so the first request ends up being like a HEAD request. It's just a check to see if you need to download an updated hyper-schema or if you can use the cached version. If the HTTP Link header specifies a hyper-schema other than the one that is already cached, another request is necessary, but that should be rare. So again, in the long run, it's more efficient to do it this way.
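A minimal sketch of that "only fetch when the Link header points somewhere new" behavior might look like the following. `SchemaCache` and the `fetch` callable are hypothetical names of mine, not part of any library, and this version ignores expiry entirely. A real client would more likely lean on ordinary HTTP caching (`Cache-Control`, `ETag`) than on a hand-rolled dictionary.

```python
class SchemaCache:
    """Remember hyper-schemas by URL so each one is downloaded only once."""

    def __init__(self, fetch):
        # fetch is a hypothetical callable: schema URL -> parsed hyper-schema
        self._fetch = fetch
        self._schemas = {}

    def get(self, schema_url):
        # Only hit the network when the Link header named an unseen schema.
        if schema_url not in self._schemas:
            self._schemas[schema_url] = self._fetch(schema_url)
        return self._schemas[schema_url]
```

With this in place, the client reads the schema URL from the Link header on every response and calls `cache.get(url)`. The second request only happens the first time a given hyper-schema URL shows up.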