This is one of my favourite topics, thanks to some recent success.
When I hired the "super testers of 2013", I found that how you grade the answer matters more than the strength of the question.
I do support some limited "play" games or exercises, such as the "test this" exercise with a login screen, or a pen with a torch function.
When it comes to the grading of answers, much of it comes down to the hiring manager's instinct. However, a system is often required to support that.
I used six questions, each targeting a cultural aspect of the project, a core competency, a chance to sell oneself, or a problem to solve.
Each question was graded by making notes and scoring each of the following areas: efficiency/delivery, technical skills, and team player/fit.
There was also an "other" category for adding a further area of focus where necessary.
Grades were out of 3, with bonus points awarded for anything interesting.
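The rubric above can be sketched as a simple scoring sheet. This is a minimal illustration only; the class and function names are hypothetical, and the original system was presumably pen and paper.

```python
from dataclasses import dataclass, field

# The three standard grading areas from the rubric above
AREAS = ("efficiency/delivery", "technical skills", "team player/fit")

@dataclass
class QuestionScore:
    """Notes and grades for one interview question; each area is out of 3."""
    notes: str = ""
    scores: dict = field(default_factory=dict)  # area -> grade, 0..3
    other: dict = field(default_factory=dict)   # optional extra focus areas
    bonus: int = 0                              # bonus points for anything interesting

    def total(self) -> int:
        return sum(self.scores.values()) + sum(self.other.values()) + self.bonus

def candidate_total(question_scores) -> int:
    """Sum across all six questions for one candidate."""
    return sum(q.total() for q in question_scores)

# One question's grading for a hypothetical candidate
q1 = QuestionScore(
    notes="strong exploratory approach on the login-screen exercise",
    scores={"efficiency/delivery": 2, "technical skills": 3, "team player/fit": 2},
    bonus=1,  # spotted an edge case nobody else mentioned
)
print(candidate_total([q1]))  # 2 + 3 + 2 + 1 = 8
```

The numbers are only a guide, as noted below; the notes field is there so the sheet also serves as a record of the interview.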
It worked very well and helped keep an account of the interview.
The numbers helped guide us but were not the sole measure of success.
On occasion, where there was disagreement, we would revisit the JD and the type of person required, then hold a second grading and voting session.
Overall it was great fun, and together with the pre-interview questions it helped me drive the chancers and salesmen away, and sift out some truly superstar testers.