Friday, January 23, 2015

More on reviewing

Dominique Sportiche sent me this interesting breakdown of the NIPS experiment mentioned by Alex C in the comments section (here). It also provides a possible model that presents in a more formal idiom one hypothesis I floated for the MRC results, namely that once one takes account of the clear winners and the clear losers, the messy middle ends up a tossup. At any rate, the details are interesting, for the results as interpreted in the link come very close to assuming that acceptance is indeed a crapshoot. Let me repeat, again, that this does not make the decision unfair. Nobody has a right to have their paper presented or their work funded. Arbitrary processes are fair if everyone is subject to the same capricious decision procedures.
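A minimal sketch of that hypothesis (all the numbers here are my own illustrative assumptions, not anything taken from the NIPS data): give each paper a "true quality", have two independent committees score it with noise, and accept the top quarter. The committees then agree on the clear winners and the clear losers but come close to flipping coins in the band around the cutoff.

```python
import random

random.seed(1)

N = 4000       # papers
NOISE = 0.5    # reviewer noise (std dev) relative to unit-variance "true quality"
RATE = 0.25    # acceptance rate

quality = sorted(random.gauss(0, 1) for _ in range(N))  # sorted, so index = rank

def accepted():
    """One committee: rank papers by noisy scores, accept the top RATE fraction."""
    scored = sorted(range(N), key=lambda i: quality[i] + random.gauss(0, NOISE))
    return set(scored[int((1 - RATE) * N):])

a, b = accepted(), accepted()

def agreement(band):
    """Fraction of papers in `band` on which the two committees made the same call."""
    return sum((i in a) == (i in b) for i in band) / len(band)

losers = range(0, N // 2)                      # bottom half: clear rejects
middle = range(int(0.65 * N), int(0.85 * N))   # band straddling the 75% cutoff
winners = range(int(0.95 * N), N)              # top 5%: clear accepts

for name, band in [("clear losers", losers), ("messy middle", middle),
                   ("clear winners", winners)]:
    print(f"{name:13s} agreement between two committees: {agreement(band):.2f}")
```

With these made-up parameters the two committees agree almost perfectly at the extremes and only a little better than chance near the cutoff, which is the pattern the linked analysis reads off the NIPS experiment.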

That raises an interesting question, IMO. What makes for a clear winner or loser? Here I believe that taste plays a very big role, and though I am a big fan of disputing taste (in fact, I think it is about the only thing worth real discussion: de gustibus disputandum est), there is no doubt that it can move around quite a bit and not be easy to defend.

Still, what else to do? Not review? Randomly accept? This would likely delegitimize the reported work. So, I have no great ideas here. It seems that our judgments are not all that reliable when it comes to judging current work. Are you surprised? I'm not.


Let me add one more point. How do we decide what's good and what's not? Well, one influence, I suspect, is what gets published/accepted/funded and what doesn't. If correct, then we can get a reinforcing process where the top and bottom ends of the acceptance/rejection process are themselves influenced by what was and wasn't done before. This implies that even where there is consensus, it might itself be based on some random earlier decisions. This is what can make it hard for novelty to break through, as we know it to be (see here for a famous modern case).
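The reinforcing process has a classic toy model, the Polya urn (a standard textbook illustration, not anything specific to reviewing): start with one "accepted" and one "rejected" ball, and at each step draw a ball at random and return it together with another of the same colour. Success breeds success, and the early random draws largely fix the long-run proportions.

```python
import random

def polya_share(steps, rng):
    """Fraction of 'accepted' balls after `steps` draws, starting from 1 vs 1."""
    accepted, total = 1, 2
    for _ in range(steps):
        if rng.random() < accepted / total:  # draw an 'accepted' ball...
            accepted += 1                    # ...and reinforce that colour
        total += 1
    return accepted / total

rng = random.Random(7)
shares = [polya_share(5000, rng) for _ in range(8)]
print([round(s, 2) for s in shares])
```

Each run is governed by the same symmetric rule, yet the eight long-run shares scatter widely: which colour "wins" is settled mostly by the first few draws, which is the sense in which later consensus can rest on random earlier decisions.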

One of the most counterintuitive probability facts I know of is the following: if one takes two fair coins and starts flipping them, and one coin gets, let's say, 10 heads "ahead" of the other, how long will it take the second coin to "overtake" the first wrt heads? Well, a hell of a long time (the lead does get erased eventually with probability 1, but the expected waiting time is infinite). This indicates that random effects can be very long lasting.
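A quick simulation of that fact (my own sketch; the 10-head lead and the flip budget are arbitrary choices): the difference in heads between the two coins performs a random walk starting at 10, and we watch how long it takes to hit 0.

```python
import random

random.seed(42)

def flips_to_catch_up(lead=10, max_flips=100_000, rng=random):
    """Flip both coins each round until the trailing coin erases `lead`;
    return the round count, or None if it hasn't happened within max_flips."""
    diff = lead
    for t in range(1, max_flips + 1):
        # coin1 head minus coin2 head: diff moves by -1, 0, or +1 each round
        diff += (rng.random() < 0.5) - (rng.random() < 0.5)
        if diff == 0:
            return t
    return None

results = [flips_to_catch_up() for _ in range(500)]
caught = [t for t in results if t is not None]
print("still ahead after 100k rounds:", results.count(None), "of 500")
print("median rounds to catch up (when it happened):",
      sorted(caught)[len(caught) // 2])
```

Even with a budget of 100,000 rounds per trial, a noticeable fraction of trials end with the first coin still ahead, and the catch-up times that do occur have a very heavy tail: the simulation shows concretely why the expected waiting time diverges.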

One last comment: fortunately, I suspect that most science advances based on (at most) a dozen papers per field per year (and maybe fewer). Of course, it is hard to know ex ante which dozen these will be. But that suggests that over time, the noisy methods of selection, though they are of great personal moment, may make little difference to the advancement of knowledge. Humbling to consider, isn't it?


  1. Dominique Sportiche sent me this comment which I share with you with his permission. It is a positive suggestion, rather than just more whining. Comments appreciated.

    Hi Norbert, I did not have grants in mind so much as conferences and was trying to think of ways to improve the selection process which I see pretty much as a lottery (I wonder if people agree with me), a thought that is consistent with this NIPS story, perhaps even more so in linguistics where standards can be fuzzy.
    This is bad for the field, and also for younger researchers whose acceptance at conferences is a big deal.

    All this to say that I prefer easychair to easyabs (linguist list) in particular because it has a feature - which must be turned on - that I really like (even though I may be in a minority??)

    Once a review of an abstract is entered, the reviewer can (anonymously) see all other reviews and comment on them, and it is possible to change one's grade or one's comments on the basis of comments by other reviewers (the grade history remains, however). For conferences I review which have this feature turned on, I always read the other reviews. I have changed my grade on the basis of remarks by other reviewers. I have commented on other people's reviews.

    It seems to me that this is very valuable: it keeps the reviewers more honest (they know someone could look very carefully at what they write), allows for discussion among reviewers, provides further input for the selection committee, and makes the whole process more transparent. I see no downside. Are there some?

    1. I completely agree, that is a great feature of EasyChair (one of the many reasons why it is preferable to EasyAbs). Unfortunately it seems that many reviewers aren't actually interested in discussions and ignore these comments --- at least that's been my experience so far, but I'll also admit that the sample size is fairly small since most conferences I've reviewed for had the feature deactivated. Of course there's also the issue that reviews sometimes aren't uploaded until one or two days before the deadline, which might limit a reviewer's willingness to rethink their evaluation.

      I would actually like to see an even more open system where everybody with an EasyChair account can read the abstract and comment on reviews (or at least give them a point rating). That has at least three advantages: 1) additional incentive to write thoughtful reviews rather than one-liners or rants, 2) more feedback from the community and all the positives that entails, 3) students get some insight on the reviewing process and how to write good abstracts.

      One could even enforce that reviewers for an abstract may only participate in the discussion until the reviewing deadline while comments may be made later on, too. Then it is in the reviewer's interest to upload their review asap so that they can defend their evaluation as long as possible.

      Finally, usernames can be hidden if people wish to remain anonymous, but there should be an option for non-anonymous reviews and comments, which in turn should count more than anonymous ones.

      tl;dr crowdsource the reviewing process

  2. Perhaps a bit tangential, but I don't think arbitrary processes are necessarily fair if everyone is subject to the same capricious decision procedures. Arbitrary processes are only fair if everyone is subject to the same capricious decision procedures and the starting conditions for everyone are the same (or at least equal/equitable in some relevant sense).

    Perhaps this doesn't matter when viewed solely from the context of conference acceptances, but I can imagine it mattering when viewed from a larger context. It's hard for me to say much on this since I'm at a point in my career still very far removed from something like applying for a job and thus don't really have any familiarity with the process. However, I can at least imagine that it might matter when looking for a job if you have other things working against you—such as graduating from an institution that doesn't have 'prestige' associated with it, (structural) sexism, (structural) racism, etc.—and your competition doesn't have those things working against them. In other words, if one thinks about the randomness just in the context of whether a particular paper is accepted to a conference, then sure, it might be fair. But, in the larger picture, such randomness probably isn't fair to someone who has other things working against them from the start.