Tuesday, September 1, 2015

How I spent my summer vacation

Like all avid GGers, I spend part of my summer vacation rereading some of the greatest hits. This year, this included rereading Current Issues in Linguistic Theory (CILT). If you haven’t done so recently, you should go out and read it (again) now, before the BBC series comes out on public TV. It is a fantastic little book and very timely, for it lays out more clearly than anything else I know what the original GG enterprise took the central questions of interest to be. It is worth knowing what these were (and still are), for it helps prevent energetically chasing off in the wrong direction in pursuit of answers to questions of dubious utility. In other words, knowing what you are interested in, what the research questions are, helps you to avoid wasting time. Remember: those things not worth doing are not worth doing well. And there are many too many things that are really not worth doing. I will mention one such enterprise below that seems to have recently stirred the teapot tempestuously once again.

This said, back to CILT. The book starts with a short important chapter on the goals of ling theory: what are the things we want to explain? Chomsky points to two central questions: (i) Linguistic Creativity: how do competent speakers go (in various kinds of uses (e.g. production, comprehension)) from utterances to structural descriptions of those utterances and (ii) The Logical Problem of Language Acquisition: how do kids/acquirers go from primary linguistic data to their acquired generative grammars.  These are the two questions we want to answer and this involves limning the fine structure of Gs and FL/UG. More abstractly CILT provides the following two important mnemonic diagrams.

(1)  utterances → A → structural description
(2)  PLD → B → generative grammar

A is a place-holder for (at least) a particular G and B for the theory of FL/UG.  The aim of inquiry is to describe the innards of A and B.  Or as Chomsky puts it:

The perceptual model A is a device that assigns a full structural description D to a presented utterance U, utilizing in the process its internalized generative grammar G, where G generates a phonetic representation R of U with the structural description D…The learning model B is a device which constructs a theory G (a generative grammar G of a certain langue) as its output on the basis of primary linguistic data (e.g. specimens of parole) as input…We can think of general linguistic theory as an attempt to specify the character of device B. We can regard a particular grammar as, in part, an attempt to specify the information available in principle (i.e. apart from limitations of attention, memory, etc.) to A that makes it capable of understanding an arbitrary utterance, to the highly non-trivial extent that understanding is determined by the structural description provided by the generative grammar. (26)

Thus G and FL are seen as causally relevant factors in explaining two kinds of performance: normal discourse and acquisition. The aim of linguistics is to describe these two mechanisms, which causally contribute to these two kinds of “behavior” (i.e. talking and language acquisition).

What is the empirical criterion of adequacy for the two cases? The relevant measure of evaluation for the first is that it “correctly describes the linguistic intuition of the speaker” (26). Note the singular intuition! What we call linguistic intuitions reflect a speaker’s grammatical intuition (i.e. the sense of his/her language). That’s why they are important. But the thing we want our theory of G to match is the singular, the plural being interesting to the degree that it reveals this. We return to this anon.

The relevant measure for evaluating proposals about (2) is that the Gs B selects correspond to “the speakers’ linguistic intuition, in the case of particular languages” (27). Thus, the adequacy of B is judged in relation to how good the Gs it selects are in describing a native speaker’s actual grammatical intuition (i.e. the mental structures that underlie linguistic facility).

The aim, then, is to describe the features of real cognitive objects: both the Gs that speakers actually have and the procedures for constructing these Gs that speakers come equipped with. These are the objects of inquiry and what linguistic theories should aim to model.

Why does Chomsky take these as the two central problems? Because of two basic very big and very obvious facts. The first, and the one that he hammers again and again in CILT, is the fact of linguistic creativity, by which Chomsky intends the following:

…a mature native speaker can produce a new sentence of his language on the appropriate occasion, and other speakers can understand it immediately, though it is equally new to them. Most of our linguistic experience, both as speakers and hearers, is with new sentences; once we have mastered a language, the class of sentences with which we can operate fluently is so vast that for all practical purposes (and, obviously, for all theoretical purposes), we may regard it as infinite. (7)

So, the fact of linguistic creativity implicates mastery of a recursive procedure  (aka a G) that is used by native speakers in understanding and producing utterances. And the fact that Gs are required to explain this creativity means that a native speaker must acquire such a G in order to be fluent. So, the aim of linguistics is to describe these Gs and explain how they are acquired.

Chomsky also notes a second interesting capacity that G knowledge endows a native speaker with: the “ability to identify deviant sentences and, on occasion, to impose an interpretation on them” (7).

As we all know, this second capacity (aka: native speaker linguistic intuitions) has proven to be an excellent window into the structure of a native speaker’s language-specific capacity. The generative enterprise has relied on this capacity to probe the structure of G and FL. It is what licenses GG’s reliance on linguistic intuitions (note the ‘s’ here) as guides to the structure of linguistic intuition (note the absence of an ‘s’). So, Gs explain (in part) how linguistic creativity is possible and their structure can be probed by querying native speakers’ evaluations of the acceptability and interpretation of products of these Gs.

It is worth noting that the second capacity does not follow from the fact that speakers possess Gs. The latter could have been true without it being true that speakers could usefully reflect on G products. Speakers could have used Gs to speak and understand without having reliable linguistic intuitions useful for probing this capacity.

Chomsky discusses skepticism regarding such judgment data in chapter 3. Not surprisingly he concludes that it’s the best thing we’ve got and that “[w]e neglect such data at the cost of destroying the subject” (56). However, as Chomsky noted in CILT such judgments are not “sacrosanct and beyond any conceivable doubt” (56). Some such data might be bad (as any data in any area might be). We can firm it up by looking for “consistency among speakers of similar backgrounds” as well as “for a particular speaker on different occasions” (56). In other words, such data as a class are fine, though particular instances are reasonably challenged.

Chomsky notes a second important check on the reliability of such data. I call it the proof-of-the-pudding test: “The possibility of constructing a systematic and general theory” also matters. Theory tests data just as much as data tests theory. With 60 years of hindsight we can conclude that such data has been very useful and reliable precisely because the theories built on it have proven to be remarkably insightful.

So, CILT picks out two central questions for linguistic investigation and explains why they should be cynosures of further inquiry. It also outlines the kind of data that is relevant in pursuing these questions. Moreover, and most famously, in chapter 2 Chomsky outlines what he takes to be the relevant measures of theoretical adequacy: observational, descriptive and explanatory. Let’s turn to this next.

Chomsky identifies three levels of adequacy for grammatical description: (i) observational adequacy, (ii) descriptive adequacy and (iii) explanatory adequacy.

Observational adequacy is “the lowest level” and is achieved “if the grammar presents the observed primary data correctly” (29). As Chomsky is quick to point out (see his note 1), what constitutes the relevant observable data is not at all straightforward. One measure of relevance involves the “possibility for a systematic theory.” Moreover, in an important sense, what linguists are looking for are data that bear on linguistic structure, and so good data are those that are sensitive to these structures. Sadly, however, linguistic structure is not itself observable and is accessible to a speaker only via an utterance that embodies it. Some utterances are good windows into these structures and so judgments based on these are generally useful (that’s why quite often unacceptability is a better window into G than acceptability). But some are not. Useful data allow one to infer grammatical structure from its effects in the visible utterance, and which data do this is not always obvious. As Chomsky puts it:

The problem of determining what data is valuable and to the point is not an easy one. What is observed is often neither relevant nor significant, and what is relevant and significant is often very difficult to observe, in linguistics no less than…anywhere in science.

What’s a descriptively adequate description? It’s one that “gives a correct account of the linguistic intuition of the native speaker, and specifies the observed data (in particular) in terms of significant generalizations that express the underlying regularities in the language” (28). So, a descriptively adequate description will enumerate the properties of the G that the speaker has internalized. Given that Gs are recursive rule systems, they will (implicitly) embody regularities characteristic of the language they generate.

Last of all we get to explanatory adequacy. Theories of grammar achieve this level if they provide “a general basis for selecting a grammar that achieves the second level of success over other grammars consistent with the relevant observed data that do not achieve this level of success” (28). Explanatory adequacy is a predicate of theories of FL. Descriptive adequacy is a predicate of Gs. Explanatorily adequate FLs are those that derive descriptively adequate Gs relative to some specification of PLD. As is clear, issues of descriptive and explanatory adequacy are intimately intertwined, with considerations of both bearing on the adequacy of each. Like it or not, claims about descriptive adequacy commit hostages to explanatory adequacy no less than do claims about the latter for the former. Given the close connection between explanatory adequacy and Plato’s Problem and the PoS issues that surround it, Chomsky’s vision of linguistics demands that these concerns be at the center of every linguist’s attention (sad to say, IMO, this is hardly the case nowadays).

Chapter 2 does a very nice job operationalizing these notions in the context of linguistic theory circa the early to mid 60s. There is still lots to learn by reading these discussions (especially, IMO, the section on levels of adequacy in semantics).  It is also worth carefully re-reading section 2.4 where Chomsky sums up the discussion of the importance of the measures. Here is his blunt assessment (52):

…three levels of adequacy have been sketched…Of these, only the levels of descriptive and explanatory adequacy (and ultimately only the latter) are of sufficient interest to justify further discussion.

This makes perfect sense given the two questions CILT highlights. If you are interested in human language, then the name of the game is ultimately to describe the properties of FL/UG. All else is interesting to the degree that it contributes to this end. Unfortunately, much current work on language seems to assume that discussions of FL/UG are at best premature and quite often little more than cow pie. Many are happy to limit their interests to “covering the data,” aiming primarily for observational adequacy, with some pretensions to descriptive adequacy. Chomsky has some choice remarks about this. Here are two:

It is important to bear in mind that a grammar that assigns correctly the mass of structural descriptions (remote as this is from present hopes) would still be of no particular linguistic interest unless it also were to provide some insight into those formal properties that distinguish a natural language from arbitrary, enumerable sets of structural descriptions. At best, such a grammar would help to clarify the subject matter for linguistic theory, just as a fourteenth century clock depicting the positions of the heavenly bodies merely posed, but did not even suggest an answer to the questions to which classical physics addressed itself. (52-3)

In other words, work that fails to at least suggest something about the structure of FL/UG is of very dubious value given the central questions of GG. A corollary suggests itself: it is always worth explicitly asking what light some piece of work sheds on FL/UG. If the answer is unclear, then this is very much worth knowing.

Here’s the second quote:

Comprehensiveness of [data, NH] coverage does not seem to me to be a serious or significant goal in the present stage of linguistic science. Gross coverage of data can be achieved in many ways, by grammars of very different forms. Consequently, we learn little about the nature of linguistic structure from the study of grammars that merely accomplish this…[I]t is only by studying the properties of grammars that achieve higher levels of adequacy and by gradually increasing the scope of description without sacrificing depth of analysis that we can hope to sharpen and extend our understanding of the nature of linguistic structure. (53)

I see no reason to think that we have finally reached a stage where big data work will shed much light on the structure of grammar. In fact, I would go further. I doubt that data coverage in the big data/corpus linguistic sense will ever be of linguistic interest. Why? Because it is seldom driven by the impulse to uncover the basic properties of linguistic structure. An explanatory theory aims to uncover the basic operations and principles that descriptively adequate Gs deploy. There is no doubt that these principles interact with many other non-linguistic factors in everyday speech. But if your interest is in these principles and operations, then it takes a good argument to conclude that looking at speech in the wild, or at lots of it, will reveal what these principles are. It’s not how things proceed in the “real” sciences, so why think that this is the right way of doing linguistics? Beats me.

One last bon mot from Chomsky: He makes an important distinction between exceptions and counter-examples. The latter are important, the former not so much, or not obviously much. A counter-example is interesting because it contradicts a principle, and principles are what FL/UG is all about. An exception need not. It may simply show what is already conceded, that our theories do not aim towards broad data coverage, i.e. text fidelity. Here’s Chomsky on this:

Examples that lie beyond the scope of a grammar are quite innocuous unless they show the superiority of some alternative grammar. They do not show that the grammar as already formulated is incorrect. Examples that contradict the principles formulated in some general theory show that, to at least this extent, the theory is incorrect and needs revision. (55)

What we are interested in is the failure of principles, not in the failure of coverage.

CILT should be required reading for all GGers. From where I sit, its theoretical and methodological observations are as relevant today as they were when first written. In fact, they may be more relevant today. As a field becomes technically more sophisticated it can lose its bearings. Technique substitutes for insight. Keeping one’s eyes on the central questions of interest is a useful prophylactic against this. CILT has a very clear research agenda. It has a clear target of explanation and outlines relevant criteria of success. It has the virtue of being clear about these things. If you too are interested in these fundamental questions, then nightly chanting from the pages of CILT will serve you well.


  1. Hi Dr. Hornstein,

    Have you read Hilary Putnam's latest blog post? How would you disagree?



    1. Putnam was my thesis advisor and is a very smart guy, much smarter than I am at any rate. But his comments on this topic have always struck me as both misinformed and rudely dismissive. First of all, it is not at all clear that Chomsky has anything to say about the fund of concepts natively available to us. This is much more Fodor's bag. What Chomsky does assume is that there are natively provided restrictions on the form of grammars and that this has an impact on how sentences are understood and pronounced. Nothing that Putnam says in his note bears on this.

      Second, he really doesn't discuss Fodor's point, which makes an important distinction between concept acquisition and belief fixation. Fodor's point is that there is a deep sense in which ALL theories of learning assume that it is a process of selection from a pre-specified set of available options. If this is so, then we don't acquire new concepts though we may fix new beliefs. He thinks that there is a way out of this if we could decompose all concepts to a primitive base, but he, like Putnam, doesn't believe that this is viable. Now Putnam thinks that this is absurd. Say he is right. What he has not shown is that Fodor's argument is wrong, only that we don't understand how acquisition works. Fodor might agree. Recall, his is a conditional claim.

      So Putnam's remarks are no more sensible now than they were when he first made them. The real problem is that he doesn't really address the points that either Chomsky or Fodor made.

      Two last points: As Chomsky has rightly noted, there is no "innateness controversy." There can't be. EVERYONE assumes some innate bases for acquisition and learning. We call these biases now. The question is not whether but which. Putnam's argument from incredulity is, again, irrelevant to this point. Second, Chomsky's views on evolution, given his more recent (20 years!) minimalist musings, are much more interesting than Putnam's caricature. As I've discussed this a lot elsewhere on the blog I will leave these comments with this bare assertion here.

      So, what do I think? That a really smart guy has decided not to engage with the issues for reasons that have always befuddled me.
