Wednesday, October 3, 2012

Universal Grammar

Everyone believes that humans have a Universal Grammar (UG). Why?  Because it is a one step conclusion licensed by a trivial (and it is trivial) inference from one obvious factual premise (viz. humans are linguistically capable beings) and one major premise (viz. if humans are linguistically capable then there are some mental properties on which this capacity rests).  As UG is the name we give to these mental properties there cannot be a real debate about whether humans have a UG.  What has been contentious is what UG looks like. In what follows I discuss two features that generative linguists attribute to it: (i) UG is exclusive to humans and (ii) UG is linguistically specific.  Chomsky, for example, has claimed both properties for UG. Let’s consider these in turn.

First, what does “species-specific” mean? One thing it means is that UG is a property of humans the way that bipedalism or opposable thumbs are. Human genetics ensures that individual humans normally (i.e. exempting pathological cases) come equipped with the capacity to walk erect, to grab stuff and to acquire and use a language. So, just as tigers are biologically built to have stripes, and salmon to return to their birthplaces to spawn, so too humans come biologically equipped to develop linguistic facility. The empirical basis for this observation is overwhelming and not at all subtle. Anyone who observes language acquisition in the wild cannot fail to notice that human children (regardless of socio-economic status, religious affiliation, birth marks, head size, overall IQ or anything else short of pathology), when reared in a linguistic environment, come to acquire linguistic competence in the native language they are exposed to. This observational truism leads smoothly to a related truism: that the capacity to develop such linguistic facility is due to mental equipment that individual humans share simply in virtue of being human.

These truisms conceded, even if UG is species specific to humans it does not imply that UG is exclusively a human endowment. After all the observation that humans come genetically packaged with a four chamber heart does not imply that other animals do not.  Nonetheless, as a matter of fact, it appears that whatever humans have that allows them to develop linguistic competence is not widely shared.  Concretely, so far as we can tell (and take it from me, many investigators have tried to tell!), nothing does language like humans do. Apes don’t. Dolphins don’t. Parrots don’t. Or at least they don’t obviously. While it takes considerable effort to show that other animals show language-like behavior, nobody will win a Nobel for demonstrating that 5 year olds (or even 2 year olds) talk.  Thus, it’s a safe bet that whatever is going on in the human case is qualitatively different from what we see in other animals.

This said, it is worth noting that the program of describing UG wouldn’t change much were it established that other animals had one. Depending on which animals it was, it might raise additional questions of how these UGs evolved (other apes? look for common ancestor; apes and dolphins? look for language as correlate of brain size; birds and bees? who knows). If other animals talked we could (at least in principle) investigate UG by studying how they acquired and used language, though the difficulties of studying UG in this way should not be underestimated. The biggest bonus would likely arise if we decided to treat these non-human talkers as possible targets for the kinds of experiments that are morally and legally forbidden on humans (cut them up, put them in Skinner boxes). If they really talked like us we might be squeamish about treating them the way we treat white mice, chimps and cute bunny rabbits. Then again, considering the unquenchable (blood thirsty?) desire for pure knowledge that Homo sapiens regularly displays, even pleading animals might not be safe from our inquisitive minds. However, excluding such scenarios, finding another species that talked just like we do would not substantially change the research problem. In fact, it would not make it appreciably different from studying UG by investigating the grammatical properties of different languages (English, Chinese, ASL etc.), something that generative grammarians already do in spades. So though it appears as a matter of fact that UG is exclusively a feature of humans, if the aim is to describe UG it does not much matter that this is so.

Let’s now consider the suggestion that UG is a linguistically specific capacity, to be understood as the claim that UG’s cognitive mechanisms are sui generis, different from the cognitive mechanisms at work in other areas of cognition. There is a stronger and a weaker version of this claim. The stronger one is that all (or most or many) of the cognitive powers that go into linguistic competence differ from those that support other cognitive capacities (e.g. the capacity to identify objects, recognize and “read” other minds, understand causal interactions, navigate home, keep track of where and when you hid your food, etc.). The weak one is that UG enjoys at least one cognitively distinctive feature.  Current speculation among generativists leans towards the weak claim. Much current research in syntax (especially that which flies under the flag of the Minimalist Program) aims to reduce the linguistically specific mechanisms of UG to a small core.  Chomsky, for example, has argued that the only real distinctive feature of UG is (hierarchical) recursion (a product of the operation Merge), the property whereby the outputs of rules can be treated as inputs to these same rules. This allows for the generation of endlessly large linguistic objects, a fact that sits well with the observation that there appears to be no upper bound on the size of admissible phrases and sentences in natural languages. 
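
For readers who like to see such things concretely, here is a toy sketch of the recursion just described: a single combining operation whose outputs can be fed back in as its own inputs. Everything in it (the names `Phrase`, `merge`, `depth`, `embed`, and the example words) is my own illustrative assumption and not a claim about how Merge is formalized in the Minimalist literature. In particular, it uses ordered pairs for readability, whereas Merge is usually taken to form unordered sets.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Phrase:
    """A binary-branching object built by merge (hypothetical toy type)."""
    left: "SyntacticObject"
    right: "SyntacticObject"


# A syntactic object is either a word (str) or a previously built Phrase.
SyntacticObject = Union[str, Phrase]


def merge(a: SyntacticObject, b: SyntacticObject) -> Phrase:
    """Combine two objects into a new one that merge itself accepts as input."""
    return Phrase(a, b)


def depth(obj: SyntacticObject) -> int:
    """Depth of hierarchical embedding: 0 for a word, else 1 + deepest daughter."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(obj.left), depth(obj.right))


def embed(sentence: SyntacticObject, n: int) -> SyntacticObject:
    """Wrap a sentence n times inside 'Mary thinks that ...'."""
    for _ in range(n):
        sentence = merge("Mary", merge("thinks", merge("that", sentence)))
    return sentence


base = merge("John", "left")
# Each round of embedding adds three more layers; nothing in merge sets a ceiling.
print(depth(base), depth(embed(base, 1)), depth(embed(base, 2)))
```

The point of the sketch is only that no upper bound is written into `merge`: any limit on the size of the resulting objects comes from outside the operation (memory, time, patience), which is the sense in which the rule system itself generates endlessly large objects.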

How reasonable is this second claim? To my mind, it is almost ineluctable for the following reasons. First, if, as discussed above, humans have UG but other animals do not, then one plausible reason for this is that humans have at least one mental power that other animals don’t. The alternative is that human cognition is not qualitatively different from that of other animals but only quantitatively so: all animals share the same basic mechanisms, it is just that humans have more horse-power under the cranial hood. This option is a favorite of those excited by general learning theories. They tend to be of an empiricist bent (something we will discuss in a later post). On this view, the same cognitive powers are used in every area of cognition, including language. There are two kinds of puzzles this empiricist conception runs into in the domain of language. The first is the species specificity problem noted above: why do only humans talk? The second is the separability of linguistic competence from other forms of cognition. It appears that linguistic competence is independent of most other kinds of cognitive competence, e.g. IQ, face recognition, etc. Why so, if all involve the same general all purpose cognitive powers? Were linguistic competence a product of general cognitive factors, it would be natural to expect success in acquiring linguistic competence to correlate tightly with other cognitive achievements. But it appears that it does not; both the rich and the poor, the high IQed and the low, those with good memories and bad all seem to acquire linguistic competence at roughly the same rate and in roughly the same way.

The second reason for thinking that UG involves at least one special cognitive feature is that it would be quite surprising biologically if it did not.  Many animals have (almost) unique capacities (think echo location in bats, or navigation in ants).  These capacities supervene on distinctive cognitive powers (e.g. in ants, the built in capacity to form a compass oriented map using sun position as anchor). Why should it be any different with humans and language? We are not surprised to find that other animals are specifically built to do the special things they do, why should humans be any different? 

Third, the alternative, that all learning relies on general purpose mechanisms, is as coherent as the idea that all perception relies on a general purpose sensing mechanism.  Just as seeing involves mental mechanisms and cognitive apparatus different from hearing or smelling or touching or tasting so too learning language is different from learning faces or learning to recognize objects or fixing causal interactions.  Gallistel and King (in Memory and the Computational Brain: 221) make the point well:
…a very general truth about learning mechanisms [is] they do not learn universal truths.  The relevant universal truths are built into the structure of a learning mechanism. Indeed, absent some built-in relevant universal truths and the strong constraints they place on the form of the representation that can be extracted from a given experience, learning would not be possible.
As linguistic representations, from all that we currently know, are formally quite different from other cognitive objects, it would be surprising if their peculiar properties did not require some built in language specific mental mechanisms to allow for their acquisition and use, just as in the case of honey-bees and ants with respect to navigation.

Many resist these conclusions about UG. The idea that UG involves at least one linguistically specific feature is considered particularly controversial. But for the general reasons noted above, I can’t see why anyone would assume anything different. This judgment is reinforced once one takes a look at the linguistic competence humans have in more detail. Linguistic objects are very distinctive. UG must be able to accommodate these distinctive properties. If this means that UG invokes special cognitive powers, we should not be in the least surprised, nor perturbed. That’s the way biology works.


  1. "The second reason for thinking that UG involves at least one special cognitive feature is that it would be quite surprising biologically if it did not."

    Well, then. The Universe works following "unsurprising" biological rules. This is not an argument, it's a statement of something you think. There is no single argument, based on independent grounds, for claiming anything close to UG in this post. This is not argumentative, this is apologetic.

  2. Though I will only haphazardly reply to comments there is something alluring about the first one. So let me try to clarify my intended point. The point I was trying to make concerns who carries the burden of proof. Those that argue for language being the product of general cognitive capacities take the position, for roughly Occammish reasons I believe, that the null hypothesis is that there is nothing cognitively special about UG or FL. However, in studying other animals there is no hesitation in attributing to them analogous powers. So when studying singing in birds, weaving in spiders, navigation in ants, communication in bees, we see highly specific powers presupposed as a matter of course. Against this background, it is human language that is being treated exceptionally. There is nothing exceptional about the assumption that UG has special features and, I would argue, the assumption to the contrary is the one that carries the burden of proof. In fact, I believe that it requires demonstrating how the intricate features of languages follow from these general assumptions. And please, no mention of similarity and analogy. So, the intent was to shift the burden of argument and place it where I think it belongs.

  3. I agree with the contents of the posting, just three quick points.

    As for the first claim, as uncontroversial as it should be that there is a UG, the point of course can't really be made on the basis of the observation that other species "don't talk" (although I take this to be a deliberate oversimplification on your part). For the past 15 years or so Chomsky in particular has been pushing the (in my view, very plausible) idea that narrow syntax yields a language of thought, with externalization ancillary (as he likes to put it). But if that's accurate, the fact that other species don't talk tells you very little about whether or not they have a recursive narrow syntax. Chomsky likes to jokingly refer to this idea, which apparently was around at some point, according to which higher primates have language but don't use it, so as not to be enslaved by us (or something like that). Of course, that's an odd suggestion, but the more externalization is seen as a mere add-on, the harder it gets to decisively exclude a scenario like this for higher primates. Thus, what's really necessary is a detailed argument against complex thought structures in those species, something that's much harder to get to than the mere observation that they "don't talk" (externalize). And of course lots of experiments show that they have complex stuff going on, although they don't seem to be able to relate concepts from different domains very well -- this seems to be the contribution of an I-language. But again, this is a somewhat harder argument to make.

    Wrt. the second point, it seems to me to be necessary to distinguish between competence and performance somewhat more carefully than you do here. Acquisition is performance, after all, enabled by UG. The competence system is very likely to be domain-specific, but that doesn't mean the performance systems feeding it in acquisition, interpretation, externalization etc. are. Charles Yang's work, for instance, assumes a domain-specific UG which matures into an I-language on the basis of a domain-general (Bayesian) learning algorithm. It's necessary to distinguish parameter *structure* (part of the competence system) vs. parameter *setting* (performance); the parameters may well be domain-specific, but the way they're set in acquisition may hark back to domain-general learning algorithms. I see nothing incoherent about this view. Do you agree?

    Finally, the HCF paper is commonly misunderstood as saying that UG contains only recursion. That's falsely suggested in the abstract of the paper (which was written by Marc Hauser, who I think later acknowledged the mistake), but it's not what the paper says. Rather, it says that FLN plausibly contains recursive Merge *and* the interface mappings -- that's a big difference. Moreover, HCF speculate towards the end of the paper that Merge might actually be a domain-*general* mechanism that got exapted to language! Although this is mostly a speculative addendum (at least that's how I remember it), overall the paper is somewhat different from the way it's usually summarized.

  4. Point 1: agreed. Just a nice way of making the obvious point that there is currently no reason to think that any species but humans has linguistic facility of any kind. Some have argued that birdsong shows analogous complexity. I doubt it and will blog on this soon. For now, let me just express skepticism and also accept your main point.

    It's an interesting question whether all learning is domain-general. You are right that Yang supposes this, but he actually has problems with the kinds of issues that Dresher and Fodor have identified: since parameters are not independent, incremental parameter setting (incremental learning) becomes problematic. There are various ways to finesse this real problem, one of which is to back off a general learning theory and suppose that there are special features about learning language (Lisa Pearl's thesis investigates this). As the issues involved are far from resolved (indeed barely clearly articulated) I am loath to commit hostages to one or another view of these matters.

    I accept the third point. Thx. One addendum: so far as I can tell, even should every computational feature of UG have counterparts in the cognitive systems of other animals, it will still be useful to identify a modular faculty of language and study its properties. We have reasonable evidence that within humans language is a separable cognitive organ. So it's worth asking what its properties are even if it is jerry-built from general cognitive mechanisms. I don't believe that this is correct, but even were it so, it is a worthwhile object of study. After all, the cells that make up stomachs work pretty much the way cells that make up kidneys do, but it is worth studying the structure of kidneys separate from the study of stomachs.

    Thx for your comments.

  5. Thanks for your response. Looking forward to reading what you have to say about the birdsong issue. I'll check out the acquisition literature you mention, thanks for that too. Much to learn here...

  6. Happy to see the issues being discussed. The whole debate is about "the suggestion that UG is a linguistically specific capacity, to be understood as the claim that UG’s cognitive mechanisms are sui generis, different from the cognitive mechanisms at work in other areas of cognition." If Universal Grammar means anything at all, then it should presumably be about grammar.

    But see Tomasello's several books about the differences between chimps and humans that involve prerequisites for language.

    The top contender for UG is recursion, but recursion isn't domain-specific (see Jackendoff and Pinker), and then there's Pirahã... And in any case, unless we are to assume that binary branching, invisible functional categories, N, V, A, constraints on word order, etc are part of UG, then we no longer need to assume that all languages are underlyingly the same. See also

  7. Tomasello (T) seems to be going after a straw person. The issue is not whether non-linguistically specific cognitive attributes facilitate the acquisition of linguistic competence. All can agree that there may be, the only question being how exactly they bear on the issue. What's at issue, and I have never seen T address these issues, is whether this is sufficient. Let him take his favorite handful of phenomena -island effects, cross over, anaphor binding, principle C, the CED etc.- and show how non-linguistically specific constraints could account for their well described and catalogued properties. I wish him luck but do not intend to hold my breath. Blue is not my color.

    I discuss Pirahã in the first post. As you will see if you look there, I think that the whole discussion has nothing whatsoever to do with what linguists (at least Chomsky and his friends) mean by UG. As for recursion, the kind we find in language seems quite distinct, and that's enough to get the ball rolling.

  8. This comment has been removed by the author.

  9. There is a difference between being pretty sure that a trait is largely biologically grounded and knowing how. I doubt that anyone looks for the environmental determinants of the emergence of stripes in tigers, or web building in spiders. The assumption is that this is "genetically" based. Note the scare quotes: here this means not environmentally driven. If I see a tiger wet I look for the environmental cause. Not if I see a tiger striped. This quick and dirty distinction does not require a deeper grounding. We might become surprised and decide that stripes come from high concentration oxygen intake (or some other environmental cause), but boy would I be surprised.

    Is this enough? No. Is it a start? Yes.

    The analogy between stripes and language relates to this point, i.e. is the main causal source environmental or "genetic" (which includes epigenetic factors and developmental ones)? That's it. But sometimes that's enough for the point at hand. Sorry I confused you.

  10. Before assuming anything let me just ask: when it comes to language, do you believe there is a clear distinction between what you call biological and environmental causes that makes them mutually exclusive? Further, on your view what is biologically grounded?

  11. Nobody believes that they are mutually exclusive. In fact both are causally decisive. I talk English of the west island Montreal variety because that was the source of the PLD AND I have a UG of the kind I have. So, not mutually exclusive. However, and this is a big deal: I believe that there is no learning without an inductive bias. By assumption such a bias is GIVEN and so not acquired/learned (this is the Hume-Goodman problem of induction). The problem is then to discern the nature of the bias. Is it general or domain specific or a bit of both or... But whatever it is, it is "innate" in the sense that I mean, as it is what learning presupposes.

  12. Based on what you say here, you are in wide agreement with most, possibly all, players - certainly connectionists acknowledge built in biases of the kind you describe. So can you be a bit more specific about what you think is 'built in'?

  13. Read reply to Alex. There I built in most of principle A-C of binding theory, c-command and locality domains. Them's my biases.

  14. Thanks. Just so I understand: most of principle A-C of binding theory, c-command and locality domains are encoded by the human genome and instantiated in our brains? I am not asking how and where [as I assume we do not know that yet], only that. And they are in the brain of every human, not just speakers of English, right?

  15. Yup. I build in the binding theory, or the functional equivalent thereof (I actually think that most of BT can be reduced to the theory of movement properly construed, but this does not change the nature of what is built in substantially). I don't know if it is in the genome, but it is NOT learned. It's what learning presupposes. They are in the brain of everyone: part of FL. It is another example of the POS, but this time with more cross linguistic support than Y/N questions which is not nearly as widespread a phenomenon.

  16. Thanks. You say: I don't know if it is in the genome, but it is NOT learned. - Does this mean there are further options? If so what are they?

  17. Damn if I know. Epigenesis, God, Paul Krugman when he isn't battling the Republicans? Does anyone really know how things like learning biases get coded in the genome? Does anyone know how the bee dance or the dead reckoning behavior of ants gets coded in the genome? But, seriously, I don't know, or, right now, care. That question is way above my pay grade.

  18. Fair enough. What is your theory of learning then? Chomsky often complains that the other side does not define learning properly, so I am interested: how is it defined on your view?

  19. I have none, but there are many off the shelf models I am happy with, though truth be told, this is way above my pay grade. The versions I have in mind are e.g. the stuff by Yang, Berwick's thesis stuff and more recently his paper with Niyogi (Cognition 1996) on learning parameters, Wexler's stuff. These seem fine to me. Frankly, most learning models supplemented with a more (to me) realistic hypothesis space reflecting the specifics of the learning problem for language would do as well for me.

    There is a thought out there that I frankly cannot understand: that generativists as such must be hostile to statistical models of learning. This is nuts. What is objectionable is not the stats but the associationism. Bayesian models can be adapted to my ends so long as what we are evaluating are realistic models of grammar. Counting is fine (at least in principle), as long as one counts the right things.

  20. So you require at minimum a constrained hypothesis space that is somehow instantiated in the brain?

    As for the perception of generativists' hostilities, I can of course only guess, but would imagine remarks like this contribute to same:

    "Take linguistics. If you want to get a grant, what you say is "I want to do corpus linguistics" - collect a huge mass of data and throw a computer at it, and maybe something will happen. That was given up in the hard sciences centuries ago." [Chomsky, 2012, p. 19]

    No one denies that sometimes funding is awarded for 'all the wrong reasons' but this comment seems to be based on a priori dismissal, not on an actual case...
