Disclaimer: I have been a SIGIR SPC (formerly area coordinator) four
times, most recently 2010. I have been general co-chair of SIGIR
(2003) and I have had many papers accepted and rejected by SIGIR. I
like to think that my experiences inform rather than bias my point of
view, but you might think otherwise.
In this message, I'm going to be critical. I may allude to different
ways of doing things, but it is premature to start promoting (or
dissing) particular approaches before we decide what we want,
understand what we do now, and consider any contemplated changes as a
whole.
Point 1. SIGIR suppresses the communication of approximately 400
research results for 2 months, due to the requirements for blinding,
and non-submission to another venue. Why is this? In days of yore,
and in many (most) other fields, the purpose of conferences is for
early communication of results, not their resting place, or even the
final word on their correctness.
Point 2. In the four times that I've been SPC, I've never been in a
position to make an informed decision about papers under
consideration, in particular ones that "could go either way." This
has occurred for several reasons. One is that I have not had access
to the ranked list of papers, and a priority queue of those requiring
discussion due to controversy. A second is that I have not had the
opportunity to add my input to the priority queue.
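As a rough illustration of the two artifacts mentioned in Point 2 (a
ranked list of papers and a discussion queue ordered by controversy),
here is a minimal Python sketch. It assumes each paper has a list of
numeric reviewer scores. The function names, the use of the mean score
for ranking, and the use of score spread as a stand-in for
"controversy" are assumptions made for illustration, not a description
of any actual SIGIR tooling.

```python
import heapq
import statistics


def ranked_list(reviews):
    """Return paper ids ordered by mean reviewer score, best first.

    `reviews` maps a paper id to a list of numeric scores
    (a hypothetical data layout chosen for this sketch).
    """
    return sorted(reviews, key=lambda p: statistics.fmean(reviews[p]), reverse=True)


def discussion_queue(reviews):
    """Return paper ids ordered by reviewer disagreement, most contested first.

    Disagreement is measured as the population standard deviation of the
    scores; this is only one possible proxy for "controversy".
    """
    # heapq is a min-heap, so negate the spread to pop the largest first.
    heap = [(-statistics.pstdev(scores), paper) for paper, scores in reviews.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    toy = {"A": [5, 5, 5], "B": [1, 5, 3], "C": [2, 3, 2]}
    print("ranked:", ranked_list(toy))
    print("discuss first:", discussion_queue(toy))
```

Letting committee members add their own input could be as simple as
letting them push additional entries onto the same queue.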
Point 3. I believe that decisions were taken this year that would not
have been the consensus of the committee of the whole, had they had the
opportunity to avail themselves of the discussion -- and to cast a
vote. Of course, you need ad hoc subsets of the committee to study
papers and make recommendations, but the end result should be
*ratified* by the committee of the whole. For this ratification to be
informed, the necessary discussion and materials must be available to
all committee members (subject to conflict of interest restrictions).
Point 4. Identifying conflicts of interest *seriously* compromises
the blinding process. [The merits of blinding are stipulated here,
but argued elsewhere.] A much better approach is used in the NSERC
granting committee of which I have been a member: the voting members
on every paper consist of about 2/3 of the committee -- these members
are chosen using a balanced design, with the constraint that no
committee member votes on a conflict of interest. Any member can
participate in the discussion (subject to conflict of interest, which
is known only to the chair and the affected members), but only the
designated votes count. The other members cannot tell whether a
member is excluded from the vote due to chance or COI.
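The voting scheme in Point 4 can be sketched in a few lines of Python.
This is a greedy approximation under stated assumptions, not NSERC's
actual procedure: the function name `assign_voters`, the `conflicts`
mapping, and the "least-loaded first with random tie-breaking" rule are
all invented here for illustration.

```python
import random


def assign_voters(members, papers, conflicts, fraction=2 / 3, seed=0):
    """Choose the voting members for each paper.

    members   -- list of committee member ids
    papers    -- list of paper ids
    conflicts -- dict: paper id -> set of members with a conflict of
                 interest on that paper (hypothetical layout)
    fraction  -- share of the committee that votes on each paper

    Returns a dict: paper id -> set of voting members.
    """
    rng = random.Random(seed)
    k = round(len(members) * fraction)
    load = {m: 0 for m in members}  # votes assigned to each member so far
    voters = {}
    for paper in papers:
        # Conflicted members are never eligible to vote on this paper.
        eligible = [m for m in members if m not in conflicts.get(paper, set())]
        if len(eligible) < k:
            raise ValueError(f"too many conflicts on {paper!r} to seat {k} voters")
        # Shuffle, then stable-sort by load: the least-loaded members are
        # picked first, and ties among them are broken at random.
        rng.shuffle(eligible)
        eligible.sort(key=lambda m: load[m])
        chosen = eligible[:k]
        for m in chosen:
            load[m] += 1
        voters[paper] = set(chosen)
    return voters


if __name__ == "__main__":
    # Hypothetical committee of 9 and 4 papers, with one conflict.
    members = [f"m{i}" for i in range(9)]
    papers = ["p0", "p1", "p2", "p3"]
    result = assign_voters(members, papers, {"p0": {"m0"}})
    for paper, chosen in result.items():
        print(paper, sorted(chosen))
```

Because some unconflicted members are left out of every vote, an
observer who sees only the voter list cannot tell a conflict from the
luck of the draw, provided the number of conflicts per paper is small
relative to the non-voting third.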
Point 5. There is no opportunity for rebuttal, so decisions can be
made on incorrect assumptions. I personally am not such a fan of
"online rebuttal" but that's getting ahead of the game, as I promised
I would not do. When you submit to a journal, you can address the
referees' criticisms, ensuring a very strong chance that a decent
result will actually be published.
Point 6. Why on earth are conferences trying to usurp journals? Why
is submission to both a conference and a journal forbidden? Maybe there
should be a SIGIR journal, with all submissions automatically
considered for it, and the interesting ones selected for presentation.
Point 7. In other fields, conferences are not the graveyard for
completed work. For example, in medicine, one might publish a
protocol -- the design of an experiment intended to show something.
Then the community could reach consensus about whether that
experiment, if conducted, would support the researcher's hypothesis.
That's knowledge. Important knowledge. But not a "significant
contribution to the state of the art."
Point 8. Before journals were usurped by high-stakes conferences,
there were letters to the editor, technical correspondence -- and
errata. I recall clearly that I found a counterexample to a paper
published in SIGMOD, and there was no place to publish it -- SIGMOD
wanted only new results, not contradictions of previous ones. This
is crazy. I don't think SIGIR has a process any better than SIGMOD.
SIGIR Forum doesn't cut it. It needs to be an official, reviewed,
organ of the publication venue.
Point 9. Statistical tests are overused and misused. They are
backwards. The *first* question you should ask of a result is "is it
substantive?" not "is it significant?" If it is substantive, you
then want to ask "is it real?" and at that time you should look for
confounds and the possibility that the result is due to chance. The
whole idea that "all you want to know is whether A is better than B"
without quantifying "how much better" is *way* off base -- but de rigueur
in IR. Can SIGIR influence what we consider acceptable reporting of
results?
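One way to report "how much better" rather than only "is it
significant" is to give the size of the difference together with an
interval around it. The sketch below is a minimal example, assuming
paired per-topic scores for two systems. The function name
`paired_effect`, the choice of a bootstrap percentile interval, and the
toy numbers are all assumptions for illustration; the scores are made
up and do not come from any real system.

```python
import random
import statistics


def paired_effect(scores_a, scores_b, n_boot=10_000, seed=0):
    """Mean per-topic difference (A - B) with a 95% bootstrap interval.

    Returns (mean_difference, lower, upper). The interval is a plain
    percentile bootstrap over topics; other interval methods exist.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired, one pair per topic")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)
    # Resample topics with replacement and record the mean difference.
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot)
    )
    # Integer arithmetic avoids floating-point surprises in the indices.
    lower = boot_means[n_boot * 25 // 1000]
    upper = boot_means[n_boot * 975 // 1000 - 1]
    return statistics.fmean(diffs), lower, upper


if __name__ == "__main__":
    # Made-up per-topic scores for two hypothetical systems.
    a = [0.31, 0.42, 0.25, 0.50, 0.38, 0.44, 0.29, 0.36]
    b = [0.30, 0.40, 0.27, 0.45, 0.37, 0.41, 0.30, 0.33]
    mean_diff, lower, upper = paired_effect(a, b)
    print(f"A - B = {mean_diff:.3f}  (95% interval {lower:.3f} to {upper:.3f})")
```

A reader can then judge first whether the difference is large enough
to matter, and only afterwards whether the interval is tight enough to
believe it.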
Point 10. There is much more to scientific inquiry than a statistical
test. One must have some sort of prior hypothesis, and a chain of
logic that makes a falsifiable claim. Then the falsifiable claim must
be validated. *All* attempts to validate must be reported. I'm going
to make an elementary point just for emphasis: If I conduct 100
experiments to determine which of two fair coins shows heads more
often, about 5 of my experiments will, on average, show at the 5%
level that one is significantly more likely to come up heads than the
other. Yet many papers report on the order of 100 coin tosses, not to
mention the ones that were conducted and not reported.
Also, pairwise tests among, say, 30 methods, are totally bogus. There
are 30*29/2 = 435 pairs. So if I test 30 coins in this manner, *of
course* I'm going to find one that is significantly [sic] better than
the rest.
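The arithmetic behind Point 10 can be checked directly. The sketch
below assumes a 5% significance level and, for the "at least one false
positive" figure, assumes the tests are independent (pairwise
comparisons among the same methods are not, so treat that number as a
rough guide). The function names are invented for this example, and the
Bonferroni threshold is shown as one common correction, not as a
recommendation from the original message.

```python
import math

ALPHA = 0.05  # assumed per-test significance level


def n_pairs(k):
    """Number of distinct pairs among k methods: k*(k-1)/2."""
    return math.comb(k, 2)


def expected_false_positives(m, alpha=ALPHA):
    """Expected number of 'significant' results among m tests of true nulls."""
    return m * alpha


def prob_at_least_one_false_positive(m, alpha=ALPHA):
    """Chance of one or more false positives, assuming independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_threshold(m, alpha=ALPHA):
    """Per-test threshold that keeps the overall error rate at most alpha."""
    return alpha / m


if __name__ == "__main__":
    print("experiments with two fair coins: 100")
    print("  expected false positives:", expected_false_positives(100))
    print("  P(at least one):", round(prob_at_least_one_false_positive(100), 4))
    m = n_pairs(30)
    print("pairs among 30 methods:", m)
    print("  expected false positives:", expected_false_positives(m))
    print("  Bonferroni per-test threshold:", bonferroni_threshold(m))
```

With 30 methods there are 435 pairs, so at the 5% level one should
expect roughly 20 spurious "wins" even if all methods are identical.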
Point 11. Lots of scientific inquiry requires no statistics. Many of
Einstein's predictions could be verified empirically only by
exceedingly rare astronomical events. Nobody said "well, you need 50
events before you can publish your work in Annalen der Physik." In
fact, his work was published with 0 events. And when the first event
occurred, the observations were published, too. Statistics is but one
tool used to eliminate possible explanations (other than the
hypothesis) for observed phenomena. It is neither necessary nor
sufficient.
That's all for now. Thanks for reading.
Gord
The only way to ensure review quality is to assign at least one senior
reviewer to every marginal paper, and to make sure the reviewers
discuss it. Insightful and helpful reviews come from more senior
researchers and more experienced reviewers.
Scoring reviewers would be a good way of identifying the good ones.
More importantly, there should be a tutorial or training session on
reviewing, to teach people how to write helpful reviews and what,
besides one-line reviews, makes a review bad.
Le