Sunday, November 1, 2015

Brains, grammars and hype (part 2)

In the previous post (here), I showed how Frankland and Greene (F&G) identifies a role-sensitive region of cortex, and sub-areas within that region that are differentially sensitive to the doer and done-to roles. In other words, if correct, F&G offers a hypothesis about where roles like doer and done-to get coded. Finding a region sensitive to thematic parameters would be a useful contribution given our vast ignorance concerning the brain bases of anything (discussed here). Let me repeat this loudly lest I not be heard: FINDING A REGION SENSITIVE TO THEMATIC PARAMETERS WOULD BE A USEFUL CONTRIBUTION GIVEN OUR VAST IGNORANCE CONCERNING THE BRAIN BASES OF ANYTHING. However, F&G claims to do a whole lot more than this. Here I want to consider whether it does. So the question for what follows: does F&G explain how the brain codes thematic information, as it appears to claim to do?

No. Not really. The paper may have identified a region that correlates with role information, but F&G’s claim that it explains how brains code such information seems to me quite overblown.[1] Here’s what I mean.

What would it mean to show how brains code such information? F&G tells us. In the abstract, it takes its discovered empirical results to support the following claim:

At a high level, these regions may function like topographically defined data registers, encoding the fluctuating values of abstract semantic variables. This functional architecture, which in key respects resembles that of a classical computer, may play a critical role in enabling humans to flexibly generate complex thoughts.

What’s this mean? Those familiar with earlier critiques of connectionism should recognize the allusions. People like Fodor and Pylyshyn, Marcus, and Gallistel argued that brains had a Turing rather than a connectionist architecture. They provided various arguments for this, including observations about the systematicity of cognition (in particular in language), which makes perfect sense if one assumes that brains embody read/write memories, with variables and the valuation of variables being key elements. Most of the arguments provided were behavioral (though see Gallistel for more direct arguments that brains cannot be connectionist either). F&G is clearly pointing to these claims in the abstract above (indeed, Fodor and Pylyshyn, Marcus and Pinker are noted in the bibliography in relation to this). So, F&G clearly intends its results to be an argument in favor of Turing architectures and a challenge for connectionist architectures. However, if this is the intent, I don’t see that F&G’s argument adds anything to the earlier behavioral arguments. Why not?

F&G notes that its results are consistent with Turing architectures, but they are also consistent with most connectionist models, so far as I can tell. There is nothing in these models that prevents the hidden layers (appropriately tuned) from isolating doer and done-to roles. Indeed, this is regularly done in such models for other abstract categories. So, if F&G intends to use its results to argue for classical architectures, then it is unclear to me what it has actually added to the arguments advanced by Fodor & Pylyshyn, Marcus or Gallistel. Note, I have nothing against the conclusion that connectionist architectures are bad neural models (less coyly: I am pretty confident that connectionist architectures suck). What I don’t see is that F&G adds anything to the previous arguments. I would go further (as you probably knew I would). The concluding discussion section of F&G notes that there is a “class of models that use matrix operations to combine spatially distributed representations into conjunctive representations…that could potentially be augmented…[to] encode conjunctive representations for distinct semantic roles” (11737). For the uninitiated, this is connectionist speak. In other words, as F&G notes, its results do not argue against a connectionist conception in favor of a more classical Turing view. Or more correctly, the F&G results do not add anything to the earlier (completely compelling) arguments. So, if F&G intends its “how” contribution to consist in an argument for a classical architecture and against a connectionist one, then, by its own admission, it fails.[2]

What else could the “how” mean? Another possible contrast is between the kinds of codes the brain uses to track information; in particular does the brain use a place code or a rate code to track doers and done-tos. Let me expand a bit.

One line of thinking (that F&G says its results endorse) exploits geography to code information: “functional segregation corresponding to spatial segregation” and binding of variables to values executed by bringing the two into spatial proximity. This contrasts with another view wherein binding is signaled through temporal proximity (synchronization) rather than spatial. F&G claims that its results (my emphasis)

“suggest that such temporal correlations may be unnecessary in this case because the bindings may instead be encoded through the instantiation of distributed patterns of activity in spatially dissociable patches of cortex devoted to representing distinct semantic variables” (11736).

However, as the paper notes, and as the mealy-mouthed modals (“may be unnecessary,” “may instead be encoded”) indicate, this conclusion is not particularly well supported by its experiments. Or, more correctly, F&G’s tools preclude a strong choice between the two. As F&G notes, the hunt was conducted using fMRI, and because fMRI has limited temporal resolution (on the order of 1000 ms) it cannot generally “see” rate codes. The best that F&G can conclude is that because it was able to localize roles in geographically proximate yet distinct locales, this suggests that a place coding of roles might be right, though not to the exclusion of rate codes. The logic is that place codes require segregated (proximate?) geography and this was found. Hence the finding supports the claim that for role information the brain uses a place code. But this conclusion does not follow. To establish it firmly one needs the converse: if segregated regions then place code. But this is not obviously true. Moreover, and here I am asking, do neuro people believe that any time they can localize functions in different (nearby) places this is evidence for place codes? Sounds wrong to me, but, hey, I don’t do this.[3]
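To make the resolution point concrete, here is a toy sketch of my own, not anything from F&G. It assumes two made-up coding schemes over two hypothetical cortical patches: a place code, where the role is signaled by which patch is active, and a timing-based code, where the role is signaled by when a single shared patch fires. The function `slow_scan` is a crude stand-in for a measurement that can only report activity summed over a roughly one second window. All names and numbers are illustrative assumptions, and nothing here models real fMRI or real neurons.

```python
# Toy model (all values hypothetical): 2 patches, 10 time bins of 100 ms each.
N_BINS = 10

def place_code(role):
    """Role is signaled by WHICH patch is active."""
    active = [1] * N_BINS
    silent = [0] * N_BINS
    return (active, silent) if role == "doer" else (silent, active)

def timing_code(role):
    """Role is signaled by WHEN a shared patch fires:
    doers on even bins, done-tos on odd bins. The second patch stays silent."""
    offset = 0 if role == "doer" else 1
    spikes = [1 if t % 2 == offset else 0 for t in range(N_BINS)]
    return (spikes, [0] * N_BINS)

def slow_scan(activity):
    """Coarse measurement: total activity per patch over the whole window."""
    return tuple(sum(patch) for patch in activity)

# The coarse measurement separates the roles under the place code...
print(slow_scan(place_code("doer")), slow_scan(place_code("done-to")))    # (10, 0) (0, 10)
# ...but not under the timing-based code, where both roles look identical.
print(slow_scan(timing_code("doer")), slow_scan(timing_code("done-to")))  # (5, 0) (5, 0)
```

In this toy setup a slow probe that finds a spatial difference has found something, but its failure to find a temporal difference tells us nothing, since it could not have seen one anyway.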

I should add that the second experiment is the crucial one for this conclusion, and it is less robust than the first as F&G notes. The bifurcation of lmSTC into doer and done-to areas is quite subtle empirically and some of the participants in the UMD discussion thought that the data here was quite brittle. Again, this is beyond my pay grade.

F&G, then, really says very little (if anything) about the how question. In fact, it never really addresses it except tangentially. “Where?,” not “how?”, is what F&G addresses.  Let me squawk about this for a moment.

IMO, neuro types often confuse “how does X work?” with “where is X located?” Why they think answering one answers the other I do not know. I don’t object to the claim that knowing where things are in the brain might be/is likely to be a good first step in figuring out how the brain does what it does. But reading F&G (and this paper is hardly unique) leads me to think that CNers (cognitive neuroscientists) can’t tell the difference between where and how. And this is a problem.

One consequence of the confusion is that it denigrates the cognitive work that it presupposes. F&G relies on an unanalyzed conception of thematic roles. In fact, it relies on a truism: that sentences like John saw Mary do not mean the same as Mary saw John, and that the difference has something to do with the fact that what the first sentence says about John and Mary is effectively the reverse of what the second says about them. This is a truism, or as close to one as might be imagined. However, as any linguist knows, there are many different theories to explain how this truism is true. Some exploit theta roles, some grammatical roles, some the internal/external distinction, some first vs second merge, some predicate argument structure with 1st and 2nd argument positions of a predicate, some Deep Structures, some kernel sentences, etc. When a linguist asks how thematic information is represented, s/he means how can we distinguish between these apparently different conceptions, all of which code/represent the observed doer/done-to difference. F&G cannot tell us which of these is right, nor does it intend to. This “how?” question is beyond the technical reach of current neuro apparatus. That’s not a criticism. Here is the criticism: by confusing where with how, F&G continues the tradition of treating distinctions beyond the range of its probes as non-questions, rather than as questions beyond the resolution of its methods. The fact is that cognitive probes into the structure of brains are right now far more powerful than the currently most fashionable technology in neuro-science. fMRI might generate pretty pictures, but it's a pretty coarse technology. Right now, behavioral methods generally allow us to probe brain structure in a far more refined way than neuro methods do. That CN technology cannot usefully probe well motivated behaviorally based claims is what we should expect, and is what we find.
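For the programmers in the audience, the underdetermination point can be put as a minimal sketch. The two encodings below are hypothetical stand-ins of my own for two of the options just listed (explicit role labels vs 1st and 2nd argument positions of a predicate). They are not anyone's actual theory, and the class and function names are invented for illustration.

```python
from typing import NamedTuple

class ThetaRep(NamedTuple):
    """Encoding 1 (hypothetical): arguments carry explicit role labels."""
    predicate: str
    agent: str
    patient: str

class PositionalRep(NamedTuple):
    """Encoding 2 (hypothetical): roles are implicit in argument order."""
    predicate: str
    args: tuple  # (1st argument, 2nd argument)

def theta_encode(subj, verb, obj):
    # Toy assumption: simple active transitive, subject = doer.
    return ThetaRep(predicate=verb, agent=subj, patient=obj)

def positional_encode(subj, verb, obj):
    return PositionalRep(predicate=verb, args=(subj, obj))

def doer_theta(rep):
    return rep.agent

def doer_positional(rep):
    return rep.args[0]

sentences = [("John", "saw", "Mary"), ("Mary", "saw", "John")]

# Both encodings give the same answer to "who did it?" for each sentence.
same_answers = all(
    doer_theta(theta_encode(*s)) == doer_positional(positional_encode(*s)) == s[0]
    for s in sentences
)

# Both encodings also keep the two sentences distinct.
theta_distinct = theta_encode(*sentences[0]) != theta_encode(*sentences[1])
positional_distinct = positional_encode(*sentences[0]) != positional_encode(*sentences[1])

print(same_answers, theta_distinct, positional_distinct)  # True True True
```

Any probe that only registers the doer/done-to difference is satisfied by either encoding, which is why it cannot choose between them.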

A second feature of the where/how confusion is that it leads one to abstract away from the most serious question in the neuro-sciences. Call it Gallistel’s question: how do brains embody mental constructs?  For example, how does wetware code for a variable or a value thereof? How do brains read and write to memory, bind a variable, distinguish between types and tokens?  Nobody knows. In fact, as Gallistel has observed, most CNers don’t even understand that this is the “how?” question that needs addressing (see here for discussion). The cognitive literature, including that in linguistics, has shown that we need these notions. Much of current neuroscience assumes that brain architectures that cannot do any of this (indeed that apparently deny, if Gallistel is right, that brains ever do this) are serviceable. This is partly abetted by the fact that current thinking fails to distinguish where from how. F&G is another example of this wider confusion.

I could go on, but I won’t. F&G makes a contribution: it identifies one possible place where role information in some sense (however it is represented and whether it is specifically linguistic or not) might live. Given the current state of neuroscience, this is not nothing. However, the paper’s rhetoric (BS really) is way over the top. The introduction and conclusion motivate the investigation by pointing to really big issues (in particular recursion and Turing architecture). It purports to address these issues but in truth it can’t. The results are neutral wrt them. In the process, F&G sows lots of confusion and makes lots of simple errors, thereby making it hard to find the useful kernel in the morass. This leads me to one final observation.

I have heard it argued that without the overstatement and the BS the paper could never have been published. This is sometimes said in apparent justification of the BS and hype. If so, neuroscience is in really bad shape. Moreover, I am skeptical that the hype is necessary, though I am sure that even if it is, it is odious to sling it nonetheless. Let me vent.

First, I doubt that a more measured presentation would have prevented publication. The result is not trivial and could have been presented as relevant to finding where linguistically/conceptually important concepts live in brain tissue.

Second, wanting to get published is no excuse for BS. This is not show business. BS goes against the fundamental values of the scientific enterprise and should not be tolerated, even if it might be useful career-wise.[4] The big problem is that such BS is fast becoming part of standard practice. And like all S it greases a slippery slope: BS facilitates publication, we become more indulgent towards it and this will serve to further BSify research and publication. There is no excuse for this, or at least not one that should pass the smell test (and BS does smell). Whatever F&G has told us about brains, it is mired in overstatement and self-promotion. That’s the main reason many have reacted so strongly, and rightly so.[5] And that’s too bad because F&G does have something to tell us of interest.

[1] Steve Pinker’s tweet highlights these F&G ambitions as well. It reads: “The most important paper in cognitive neuroscience in many years: How does the brain represent who did what to whom.” Note the “how.” I wonder if the tweet would have had the same impact if we replaced ‘how’ with ‘where.’ I can’t tell, though I think that the howish version sounds far more interesting. And this is exactly the problem.
[2] In the discussion section, F&G observes relations between its results and some previous findings in the literature. An interesting one relates to deficit studies showing that insult to the lmSTC results in “who did what to whom” problems for stimuli presented aurally and visually. This suggests the possibility, as F&G notes, that this area is not linguistically dedicated. In other words, this area might be part of an “amodal language of thought.” If this is so, it might be interesting to see if analogous areas in non-linguistically endowed animals can similarly discriminate doers from done-tos. This might even have some interesting linguistic significance concerning the theoretical utility of theta roles as discussed here. F&G leaves the linguistic status of lmSTC for future research. Hope it gets done.
[3] Also, how important is the proximity? Say that doers were found in one area and done-tos were found several sulci away. Would this be a problem for place codes? I don’t know. At any rate, the relation between being localizable and being place coded strikes me as looser than F&G suggests. In fact, I could imagine that even were rate codes employed to code some functional feature the sources generating the relevant rates might nonetheless localize somewhat.  I don’t know that this is so, but nothing F&G says leads me to think that this is impossible or even false. So a question to cognoscenti: is this inference from localizable to place code legit?
[4] IMO, BS is the most corrosive feature of much current research. As Frankfurt has argued, it might be even worse than lying, for unlike the latter it has no regard for truth whatsoever. Stan Dehaene was the editor for the paper and he should really have removed this BS from the paper. He knows better.
[5] BTW, F&G does not get its BS right either. See the box marked “Significance” on the first page of the paper. It suggests that the problem of theta roles is the same as the problem of recursion. This is false. The roles that F&G addresses have nothing to do with Humboldt’s making infinite use of finite means. Here we have a finite set of possible sentences templatically specifiable wrt roles of two arguments. Recursion gives you sentences with many doers and many done-tos, in fact unboundedly many. F&G has nothing to say about where the brain codes this.
