Monday, May 23, 2016

The return of behaviorism

There is a resurgence of vulgar Empiricism (E). It’s rampant now, but be patient: it will die out soon enough, as the groundless, extravagant claims made on its behalf will, yet again, prove sterile. But it is back and getting an airing in the popular press.

Of the above, the only part that is likely difficult to understand is what I intend by ‘vulgar.’ I am not a big fan of the E-weltanschauung, but even within Empiricism there are more and less sophisticated versions. The least sophisticated in the mental sciences is some version of behaviorism (B). What marks it out as particularly vulgar? Its complete repudiation of mental representations (MR). Most of the famous E philosophers (Locke and Hume for example) were not averse to MRs. They had no problem believing that the world, through the senses, produces representations in the mind and that these representations are causally implicated in much of cognitive behavior. What differentiates classical E from classical Rationalism (R) is not MRs but the degree to which MRs are structured by experience alone. For E, the MR structure pretty closely tracks the structure of the environmental input as sampled by the senses. For R, the structure of MRs reflects innate properties of the mind in combination with what the senses provide of the environmental landscape. This is what the debate about blank/wax tablets is all about. Not whether the mind has MRs but whether the properties of the MRs we have reduce to sensory properties (statistical or otherwise) of the environment. Es say ‘yes,’ Rs ‘no.’

Actually this is a bit of a caricature. Everyone believes that the brain/mind brings something to the table. Thus, nobody thinks that the brain/mind is unstructured, as such brains/minds cannot generalize, and everyone believes that brains/minds that do not generalize cannot acquire/learn anything. The question then is really how structured the brain/mind is. For Es the mind/brain is largely a near perfect absorber of environmental information with some statistical smoothing techniques thrown in. For Rs, extracting useful information from sensory input requires a whole lot of given/innate structure to support the inductions required. Thus, for Es the gap between what you perceive and what you acquire is pretty slim, while for Rs the gap is quite wide and bridging this gap requires a lot of pre-packaged knowledge. So everyone is a nativist. The debate is what kinds of native structure is imputed.

If this is right, the logical conclusion of E is B. In particular, in the limit, the mind brings nothing but the capacity to perfectly reflect environmental input to cognition. And if this is so, then all talk of MRs is just a convenient way of coding environmental input and its statistical regularities. And if so, MRs are actually dispensable and so we can (and should) dump reference to them. This was Skinner’s gambit. B takes all the E talk of MRs as theoretically nugatory given that all MRs do is recapitulate the structure of the environment as sampled by the senses. MRs, on this view, are just summaries of experience and are explanatorily eliminable. The logical conclusion, the one that B endorses, is to dump the representational middlemen (i.e. MRs) that stand between the environment and behavior. All the brain is, on this view, is a way of mapping between stimulus inputs and behavior, all the talk of MRs just being misleading ways of talking about the history of stimuli. Or, we don’t need this talk of minds and the MR talk it suggests, we can just think of the brain as a giant I/O device that “somehow” maps stimuli to behaviors.

Note that without representations there is no real place for information processing and the computer picture of the mind. Indeed, this is exactly the point that critics of E and B have long made (e.g. Chomsky, Fodor, and Gallistel, to name three of my favorites). But, of course, the argument can be aimed in the reverse direction (as Jerry Fodor sagely noted, someone’s modus ponens can be someone else’s modus tollens): ‘If B then the brain does not process information’ (i.e. the contrapositive of ‘If the brain processes info then not B’). And this is what I mean by the resurgence of vulgar E. B is back, and getting popular press.

Aeon has a recent piece against the view of the brain as an information processing device (here). The author is Robert Epstein. The view is B through and through. The brain is just a vehicle for pairing inputs with behaviors based on reward (no, I am not kidding). Here is the relevant quote (13):

As we navigate through the world, we are changed by a variety of experiences. Of special note are experiences of three types: (1) we observe what is happening around us (other people behaving, sounds of music, instructions directed at us, words on pages, images on screens); (2) we are exposed to the pairing of unimportant stimuli (such as sirens) with important stimuli (such as the appearance of police cars); (3) we are punished or rewarded for behaving in certain ways.

No MRs mediate input and output. I/O is all there is. 

Misleading headlines notwithstanding, no one really has the slightest idea how the brain changes after we have learned to sing a song or recite a poem. But neither the song nor the poem has been ‘stored’ in it. The brain has simply changed in an orderly way that now allows us to sing the song or recite the poem under certain conditions. When called on to perform, neither the song nor the poem is in any sense ‘retrieved’ from anywhere in the brain, any more than my finger movements are ‘retrieved’ when I tap my finger on my desk. We simply sing or recite – no retrieval necessary (14).

No need for memory banks or MRs. All we need is “the brain to change in an orderly way as a result of our experiences” (17). So sensory inputs, rewards, behavioral outputs. And the brain? That organ that mediates this process. Skinner must be schepping nachas!

Let me end with a couple of references and observations.

First, there are several very good long detailed critiques of this Epstein piece out there (Thx to Bill Idsardi for sending them my way). Here and here are two useful ones. I take heart in these quick replies for it seems that this time around there are a large number of people who appreciate just how vulgar B conceptions of the brain are. Aeon, which published this piece, is, I have concluded, a serious source of scientific disinformation. Anything printed therein should be treated with the utmost care, and, if it is on cog-neuro topics, the presumption must be that it is junk. Recall that Vyvyan Evans found a home here too. And talk about junk!

Second, there is something logically pleasing about articles like Epstein’s; they do take an idea to its logical conclusion. B really is the natural endpoint of E. Intellectually, its vulgarity is a virtue, for it displays what much E succeeds in hiding. Critics of E (especially Randy and Jerry) have noted its lack of fit with the leading ideas of computational approaches to neuro-cognition. In an odd way, the Epstein piece agrees with these critiques. It agrees that the logical terminus of E (i.e. B) is inimical to the information processing view of the brain. If this is right, the brain has no intrinsic structure. It is “empty,” a mere bit of meat serving as physiological venue for combining experience and reward with an eye towards behavior. Randy and Jerry and Noam (and moi!) could not agree more. On this behaviorist view of things the brain is empty and pretty simple. And that’s the problem with this view. The Epstein piece has the logic right, it just doesn’t recognize a reductio, no matter how glaring.

Third, the piece identifies B’s fellow travellers. So, not surprisingly embodied cognition makes an appearance and the piece is more than a bit redolent of connectionist obfuscation. In the old days, connectionists liked to make holistic pronouncements about the opacity of the inner workings of the neural nets. This gave it a nice anti-reductionist feel and legislated questions about how the innards of the system worked unaskable. It gave the whole theory a kind of new age, post-modern gloss with an Aquarian appeal. Well, the Epstein piece assembles the same cast of characters in roughly the same way.

Last observation: the critiques I linked to above both dwell on how misinformed this piece is. I agree. There is very little argumentation and what there is, is amazingly thin. I am not surprised, really. It is hard to make a good case for E in general and B in particular. Chomsky’s justly famous review of Skinner’s Verbal Behavior demonstrated this in detail. Nonetheless, E is back. If this be so, for my money, I prefer the vulgar forms, the ones that flaunt the basic flaws. And if you are looking for a good version of a really bad set of Eish ideas, the Epstein article is the one for you.


  1. While it is fairly easy as a linguist to ignore E/B revivals among psychologists, I find it surprising and frustrating to find the same trends among linguists, especially linguists familiar with the old arguments. It is hard to imagine features like Accusative or Plural being extractable from the sensory input, but some phonologists do not think the same carries over to phonological data structures like distinctive features and syllables. Archangeli and Pulleyblank (2015. Phonology without universal grammar. Frontiers in Psychology 6.) recently told us to "See Mielke, 2004 on why features cannot be innately defined, but must be learned." Similarly, E. Dresher and D.C. Hall have both offered arguments against the innateness of features, with Dresher (Dresher, B. Elan. 2015. The arch not the stones: Universal feature theory without universal features. Nordlyd 41:165–181.) suggesting that "There is a growing consensus that phonological features are not innate, but rather emerge in the course of acquisition."
    However, Mielke's book and the papers that adopt or build on its conclusions do not mention the arguments for the logical necessity of innate representational primitives given by Fodor, Chomsky, or anyone else. None of these are mentioned in the book at all. Mielke (27) asserts that "Chomsky and Halle's assumption that distinctive features are innate is treated in subsequent literature as if it were a conclusion", but Mielke is ignoring the centuries of discussion on the topic that is more general than phonology: the acceptance of a universal innate feature set is a specific conclusion based on a general argument made by linguists, philosophers and psychologists. Where Mielke does look beyond phonology, he restricts himself to syntax, and concludes that "Most of the evidence for UG is not related to phonology, and phonology has more of a guilt-by association status with respect to innateness" (34). All to say that Empiricism is on the rise even among generative linguists. A propos of which, I recently came across this nice old discussion: Katz, Jerrold J., and Thomas G. Bever. 1976. The fall and rise of empiricism. In An integrated theory of linguistic ability, ed. Thomas G. Bever, Jerrold J. Katz, and D. Terence Langendoen, 11–64. New York: Thomas Y Crowell Company.

    1. Charles Reiss suggests that my and Daniel Hall’s critique of innate phonological features has something to do with a revival of Empiricism (E) within phonology. This claim is completely false. On the contrary, our proposal is that the notion of a fixed innate list of phonological features has been problematic on empirical grounds, and is not conceptually necessary because there is an innate mental mechanism for creating distinctive features in the course of language acquisition.

      On our view, learners must arrive at contrastive feature hierarchies that account for the contrasts and phonological activity in their language. Because these hierarchical structures are limited to features that are contrastive, there are strict limits on how many features can be posited for a given inventory. For example, an inventory with three elements (say, vowels) allows exactly two contrastive features; a four-vowel inventory can have a minimum of two and a maximum of three; and in general, an inventory of n elements requires a minimum of the smallest integer ≥ log₂n and a maximum of n–1 contrastive features. Combined with an auditory system that allows us to make certain sound discriminations and perhaps favours some over others, Contrastive Hierarchy Theory (CHT) thus accounts for why phonological systems tend to resemble each other, as well as for why phonologies pattern the way they do.
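
      The bounds just stated are easy to check numerically. The following Python sketch only illustrates the arithmetic; the function name `contrastive_feature_bounds` is made up for this example and is not part of CHT or of any published implementation.

```python
def contrastive_feature_bounds(n):
    """Return (minimum, maximum) number of contrastive features for n segments.

    Hypothetical helper for illustration only: it encodes the two bounds
    stated in the text, ceil(log2(n)) and n - 1, and nothing else about CHT.
    """
    if n < 1:
        raise ValueError("an inventory needs at least one element")
    # (n - 1).bit_length() equals the smallest integer >= log2(n) for n >= 1,
    # and avoids floating-point rounding.
    minimum = (n - 1).bit_length()
    maximum = n - 1
    return minimum, maximum


for size in (3, 4, 5, 8):
    print(size, contrastive_feature_bounds(size))
```

      For a three-vowel inventory this gives (2, 2) and for a four-vowel inventory (2, 3), matching the figures in the paragraph above.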

      It should be clear that CHT is a Rationalist (R) theory of phonology. We do not repudiate mental representations, or the concept of distinctive features as a primitive of the system, or innate mechanisms that construct the relevant features. Unlike what would be required by an E theory, contrastive hierarchies are not observable in the data. Nor, in the words of the excellent article by Katz & Bever that Reiss recommends, can they “be learned on the basis of procedures for segmenting and classifying speech that presuppose only inductive generalizations from observable distributional regularities.” As is characteristic of R theories, in CHT “the essential properties of language underlie the surface form of sentences and are thus unobservable.” And, as Katz & Bever write of Chomsky’s grammar, CHT “is clearly rationalistic, since, on it, linguistic knowledge is determined by unobservable mental structures that are invariant from language to language.” In CHT, it is the contrastive feature hierarchy, not the individual feature, that is universal.

      One final lesson we can learn from Katz & Bever is that particular formalisms or primitives do not in themselves make a theory E or R. Transformations, for example, can be part of an E theory, as in Zellig Harris’s conception; the same formal operations can also be reconceived as part of an R theory, as Chomsky did. Similarly, emergent phonological features can be part of an E theory or an R theory, depending on what else the theory says. I can’t speak for other theories of emergent features, but CHT is an R theory.

  2. Thanks for responding, Elan. Since we agree on so much, it is very useful for me to argue with you. Your response helps me to see how I could have been clearer and also to better understand our points of disagreement. I should have been clearer in saying that even among Rationalist generative phonologists who believe in mental representations (like me and you) there is debate about "what kinds of native structure is imputed". I think your response indicates that you are slightly more towards the E end of the spectrum than, say, SPE, in that you want to derive feature systems from the set of possible "sound discriminations" the auditory system can make (with perhaps some built-in biases, biases which would be R-ish) along with the notion of contrast. My own take on more specific innate phonological features is that there exist higher-level (speech-specific) equivalence classes. In other words, there are distinctions that can be detected by the auditory system but that cannot be used for phonological contrasts. To give a gross analogy that may or may not be useful, we can "hear" sentence length, but the length of a sentence cannot be syntactically relevant. Closer to phonology, we can hear the contrast between positional variants of speech sounds if they are isolated (say syllable-initial and syllable-final [f]), but these auditory differences cannot be phonologically contrastive. Sapir's Nootka example of glottalization timing w.r.t. oral release also comes to mind. If such higher-level-than-auditory-jnds categories exist, then there is more innate structure than CHT posits, in my understanding.
    One last poke at the idea of auditory discrimination being the relevant level of analysis for the learner: This would make it hard to explain how learners generalize across speakers. Mommy and Daddy potentially sound very different when saying "cat". I think the level of abstraction at which language acquisition works must be significantly higher than auditory jnds. I imagine you agree that we need a level of speech perception for the CHT, no?

    1. Thanks for the response, Charles. I’m glad that you acknowledge that we are both Rationalists, and I agree that the dispute between us is about the nature of the innate structures we posit. I don't dispute that phonological features are innate in some sense. I believe that the concept of a feature is built-in, and is not learned through experience. I just don't believe that there is a pre-compiled list of specific phonological features. I do not agree, though, that my position is in any sense "more towards the E end of the spectrum than, say, SPE". I'm not sure there is a meaningful sliding scale of E-ness within R theory. But even if there is, it is not relevant to our disagreement, which is about the nature of the innate structures, and not about any restrictions on what is learnable, or what sort of mental representations may be proposed.

      One of the consequences of CHT is that phonological features are necessarily abstract with respect to the phonetics. This abstractness takes several forms. First, there is the SPE type of abstractness: a segment might sound like one thing at the surface but might be something else in lexical representation. Features are abstract in other ways as well. Take, for example, a feature like [low], which supporters of a list of universal features might nominate as an innate feature that plays a role in many vowel systems. I think most would agree that there are no clear-cut boundaries that would tell a learner which vowels are [+low] and which are [-low] (assuming binary features, for the sake of this discussion). It's a relative term with many different instantiations. And the same goes for other vowel features like [high], [back], [round]. This relative nature of features is captured well in CHT, where features are inherently contrastive and do not require a fixed universal definition. This inherently contrastive function of features also requires them to generalize across speakers.

      There is a more fundamental sense in which phonological features are abstract in CHT. Because features are contrastive, it is not the case that every vowel that is phonetically low is assigned [+low], and the same for the other vowel features. Consonantal features are also abstract in this way. If a language distinguishes /t/ and /s/, the contrast between them could be based on [continuant], or [strident], or a place distinction. Similarly, coronal consonants may be contrastively [coronal] or not; /n/ might be contrastively [nasal], or just [sonorant]; and so on.

      So here is an example of a theory that I think is wrong. Suppose it is proposed that humans come equipped with an innate list of features that enables them to immediately assign the correct (surface) representations to every utterance they hear. Then we would be able to say that surface representations are essentially given, and all that learners need to work on is the rest of the system. Such a theory posits innate features, but I think it is a theory that Leonard Bloomfield would have been happy with, because it supposes that features can be directly extracted from the signal.

      It is the abstractness of representations relative to the signal that guarantees that the above theory cannot be correct. A learner who hears [ε] cannot know right off the bat if it is [+low] or [-low] (or has any contrastive value of [low] at all); a learner who hears [u] might be able to perceive that it is high and round, but cannot conclude that it is [+high] or [+round]; a learner who hears [h] cannot immediately know if it is set apart from other consonants by [+glottal], or if it forms part of a [+back] series with /x/ and other segments, etc.

      Finally, in the spirit of R approaches to the mind, I think the most relevant grounds on which to debate the correctness of CHT are empirical ones: how well does the theory account for the phonological component of grammars, and does it contribute to an account of how grammars can be acquired?

    2. In addition to agreeing that we are Rationalists, Elan, it seems we agree that the "Bloomfieldian" view you lay out is not plausible. I would point out a distinction between children having initially rich, even fully specified, representations as opposed to them having "correct", i.e. "same as adult", representations. Based on our long-term discussions on these topics, I think you favor a building up of representations via contrast, whereas I favor a potential pruning of initial rich representations.