Monday, February 15, 2016

The wonder of subjacency

I am currently teaching our Grad syntax 2 course and, not surprisingly, it focuses on the Minimalist Program (MP). Given my predilections (and the influence of Howard Lasnik) I find that one can best appreciate MP by starting with Government-Binding Theory (GB). Jairo Nunes, Kleanthes Grohmann and I used GB as backdrop to MP in our intro book (here). But every time I teach this course I become more and more impressed with the virtues of GB. It is a pretty neat little theory, and, for my money, it still provides the best set of analytical tools in linguistics. In fact, were I charged with the task of describing a new construction or writing the G of a language I would render it in a GB idiom, minimalist technology be damned. However, this is not what I wanted to write about here. Rather, I wanted to sing the praises of one particular sub-part of GB that dealt with a topic that has largely fallen out of research favor but that stands as one of the great scientific accomplishments of Generative Grammar (GG). The topic? Islands and Subjacency. What follows is why I consider it such an achievement.

As everyone knows, Chomsky’s aim in developing the theory of Subjacency (S) was to unify Ross’s islands, the latter having been discovered and described about a decade earlier. The locus classicus of this effort is On Wh Movement (OWM) where Chomsky lays out the story in gory detail. Here’s a question: what did the unification add to Ross’s original discussion?

One thing it added was unification. Looked at theoretically, Ross’s islands are a motley, a list of domains opaque to movement. From the get-go, it was hard to believe that this list was what FL/UG coded. There has to be some underlying method. Chomsky’s goal was to find it. I do not actually recall his theoretical discontent being widely shared across the GG community (but, in my experience, GG hardly ever suffers from the mental unease that poor theory regularly generates in Chomsky). At any rate, in unifying Ross’s islands, OWM tries to explain why the islands we find are the islands we have. In fact, OWM tries to tie the existence of islands to general computational considerations, thereby providing what is, in retrospect, an excellent paradigm of Minimalist thinking. Here is what OWM says:

… the island constraints can be explained in terms of general and quite reasonable computational properties of formal grammar (i.e. subjacency, a property of cyclic rules that states, in effect, that transformational rules have a restricted domain of potential application; SSC, which states that only the most prominent phrase in an embedded structure is accessible to rules relating it to phrases outside; PIC, which stipulates that clauses are islands subject to the language specific escape hatch..). If this conclusion can be sustained, it will be a significant result, since such conditions as CNPC and the independent wh-island constraint seem very curious and difficult to explain on other grounds. (p. 89; On WH Movement, my emphasis).

So the list-like nature of the islands becomes comprehensible when viewed from a more general computational perspective. And this is indeed a virtue.

But, and I want to emphasize this, this unification is not, as it stands, an empirical argument in favor of S. Taken at face value, what OWM demonstrates is that it is possible to unify Ross’s islands on a more rational basis, but just unifying them does not show that this unification is empirically fecund or justified.

Happily, the unification proved to be empirically very fertile indeed. OWM provides two ways that S logic could lead to the discovery of novel data.

First, it provides a general method for discovering which kinds of dependencies should be subject to island effects. OWM has a long and interesting discussion of comparative constructions and notes that given the nature of the unification proposed, comparatives should be formed by movement. This was somewhat unconventional at the time (though the work is based on some earlier work by Richie Kayne that argued for this conclusion). In fact, the most carefully worked out theory of comparatives (due to Bresnan) treated comparatives as products of a deletion operation, rather than as products of movement. If memory serves, there was quite a bit of very vigorous debate on this topic over the next little while, including at the UCSD conference where OWM was originally presented. This debate became quite heated and gave lowly grad students like me an appreciation of the old adage: when elephants fight what gets hurt is the grass. At any rate, this was one consequence of the unification that OWM emphasizes. 

A digression: could Ross’s analysis have been used as a diagnostic of movement? This is, in effect, what OWM does. It assumes that if dependency D obeys islands yet allows unbounded dependency (btw, this second conjunct is a critical yet often ignored part of S reasoning) then the dependency must be the product of movement. Could Ross’s theory be interpreted in the same way? Not really. Recall that, for Ross, what makes an island an island is not the movement (movement out of islands was fine for Ross). Rather, what makes an island is chopping the resumptive pronoun that movement leaves behind. In other words, for Ross, islands restrict chopping, not movement. For Chomsky, S restricts the movement and resumption is analyzed as a non-movement dependency precisely because it does not show island effects.[1] Given this, comparatives are a very good place to empirically distinguish Ross’s theory from S-theory. Why? Because comparatives have no apparent resumptive analogues like DP movement cases do (*John is taller than Bill is it/that/such). But if there are no resumptives then there can be no chopping and so no expectation of islands. This would make deletion the natural generative operation sub-serving comparatives. Thus OWM’s argument that comparatives are actually products of movement was an empirical argument for the S view of islands.[2]

The second empirical argument for the OWM unification came from a crop of new islands. Thus, the OWM story implied that complex DPs should be islands for extraction. This implied that we should find subject islands (which we more or less do: *What do pictures of hang in the National Gallery) but also that objects should be islands (which is far less evident: What did Bill paint pictures of). OWM spends some time trying to get out from under the problems that object extraction creates. To the degree that it succeeds, the predicted presence of subject islands is an empirical plus for S-theory.[3]

The third empirical argument in favor of S is by far the best and, if my recollection is correct, the most wow-inducing. It’s successive cyclic A’ movement. The unification of islands predicted that unbounded movement (movement that shows no island effects) is nonetheless derivationally bounded in that it is made up of a series of small bounded steps. Ross’s theory made no such prediction. Indeed, prior to S-theory there was no reason to believe it to be true. Unbounded dependencies were considered to be perfectly reasonable operations. S-theory implies that, at least for one class of dependencies (i.e. movement), such unboundedness is an illusion. This was (and is) a hell of an implication. There are many many languages where there is little evidence suggesting that anything like this is true (English being a good example of one). But, as we soon discovered (and by ‘we’ I mean GGers), it was TRUE (insert fireworks and brass bands here).

I was a grad student in Cambridge when the empirical evidence started trickling in. Jean-Yves Pollock gave versions of the deservedly famous paper he co-wrote with Richie Kayne on stylistic inversion in French. If memory serves, Esther Torrego’s equally excellent paper was floating around when I was still a Cambridge denizen. As most now know, this trickle soon became a torrential stream of results with many languages providing overt evidence for cyclic Wh movement (Irish, Chamorro, a.o.). At any rate, that this implication of S-theory was apparently true (or at least had non-obvious data that could be explained by it) was stunning. This is what good science does: its theories imply something unexpected and the unexpected turns out to be the case. It was great. And this was the evidence that really sold S-theory.

Let me emphasize the important argumentative structure: Unifying islands as in OWM implies that all movement, even that which does not manifest island-like properties, is local. Thus islands imply successive cyclic C to C movement. The discovery that this prediction holds is stunning confirmation of the unification of Ross’s islands in OWM and a strong confirmation of S-theory.

So, if anyone asks you what unifying islands brought to the table, successive C to C movement (or edge of domain to edge of next higher domain) is one of the biggies. It served a bit like the Syntactic Structures analysis of affix-hopping and do-support in that it sold S-theory with its aha effect and thereby made it widely accepted.

One last virtue: S-theory served as a bridge to other parts of cogsci. For example, S-theory had very natural interpretations in the context of parsing theories (e.g. Berwick and Weinberg) and learnability theories (e.g. Culicover and Wexler). S-theory served as a grammatical bridge to, IMO, the richest interaction between GG and other parts of cognition witnessed to date. In fact, like the income of most Americans, GG has receded from this high point, which is really too bad.

So, what did S-theory add? It unified islands, allowed for a refinement of our understanding of movement, led to the postulation of new islands, implied that long movements were made up of short steps and served as a productive bridge to other parts of cognition.

And it has one last virtue of contemporary relevance. It serves (IMO) as an excellent (maybe even the best) example we have of what linguistic theory should aim for. It is our poster child for GG’s scientific bona fides, which makes it odd that S-theory appears not to be a central part of the grad syntax curriculum anymore. I say this on the basis of very cursory investigation, actually just one or two discussions with recently minted PhDs. For the reasons noted above, this is too bad. It is a beautiful GG discovery and deserves to be regularly trumpeted as one of GG’s great achievements. So, next time you are at a party and there is a lull in the conversation, remember the wonders of S-theory.

[1] Note that current work arguing that resumption involves movement raises interesting questions about S-theory. Given my partiality to this excellent idea (Demirdache is a leading exponent of this line of thinking), I think that it is worth revisiting some of the OWM assumptions, though I will refrain from doing so here.
[2] There was even independent dialectal evidence in favor of the movement analysis of comparatives: John is taller than what Bill is. The ‘what’ sure looks like a relative pronoun. This observation was due to Kayne, if I recall correctly (way to go Richie!).
[3] Wh islands are another “novel” island, one that in fact Ross argued at length did not exist. As you all know, the status of Wh islands is somewhat variable cross-linguistically and even among speakers of the same language. Sprouse’s thesis shows that they more or less display island-like acceptability signatures. However, whatever their status (maybe they are semantic rather than syntactic as some have argued), theoretically, they fell under the OWM unification only if one makes additional assumptions about the structure of C (how many “escape” hatches it contains). This assumption was usefully investigated empirically by Reinhart and Comorovski. They showed that the degree of freedom that S-theory allowed for was in fact empirically consequential, thus providing an indirect argument in favor of the unification along OWM lines.


  1. McCawley, in his syntax textbook, called successive-cyclic movement "implausible" (p. 531f of the 2nd edition, 1998) because it requires intermediate steps in the absence of conditioning factors (i.e. no Q feature in the intervening C's).

    What is the current consensus on how to motivate those intermediate steps in a Minimalist computation?

    2. While I no longer believe in uninterpretable features and "interface crashes" and all that jazz, I always really liked Heck & Müller's Phase Balance idea — roughly, that the system recognizes that a computational cycle is about to conclude, and so it moves everything that still has unmet needs to the edge to give it a chance to fulfill its needs on the next cycle.

      As for consensus, that is of course a different issue. My sense is that people have taken the facts concerning the Irish complementizer system (among others) to show that the featural makeup of complementizers is dissociable from clause typing (i.e., that a "wh feature" is not the same as a "Q feature"), and are therefore content to simply posit [wh] on intermediate non-interrogative heads. Needless to say, positing [wh] features whose only purpose is to drive intermediate wh-movement is just a redescription of the problem... I think maybe people have lost sight of that. Or, more charitably, perhaps they have lost interest in the question.

    3. Peter's question and Omer's remarks are interesting enough to promote them to a post. So I am going to do this.