Monday, July 7, 2014

Comments on lecture 3, part III

Here’s part 3. First two are here and here. Chomsky’s lectures are here.

4.     The Halting Problem

As usual, let’s assume that the earlier objections don’t derail the project (which clearly they don’t) and let’s keep following Chomsky’s logic.  Here is one more problem that Chomsky addresses. We know why XPs move and why they can stop. The next question is why they must stop.  Rizzi called this the “halting problem” (no, it’s not related to the real halting problem, though it does sound like it might be eh?). The issue is why a WH (or a DP in an agreeing Spec) does not move any further once there. Chomsky attributes this to uninterpretability of the resulting structure at the CI interface. Let’s look at the details.

The relevant structure is illustrated by (4). This illustrates that criterial agreement “freezes” the DP preventing further movement.  Why?
                        (4)  *What does John wonder [what [C+Q [ Bill ate]]]
Chomsky suggests that (4) is not syntactically illicit but is illegible at CI. Why?  Chomsky does not distinguish between the +Q-C in Wh questions and the one in Yes/No questions.  Empirically, this is a necessary assumption given the observation that a verb can take an embedded WH question as complement iff it can also take a Y/N question. This only makes sense if the Qs in both are the same. If this is so, we can ask why (4) cannot be interpreted like (5):
                        (5) What does John wonder if/whether Bill ate
This has the same structure as (4) but for if/whether. Note that (4) cannot be interpreted as a degraded version of (5). What is less clear to me is why not. There is a +Q-C there, just as in (4). So what’s the problem? We know from matrix clauses that Y/N questions do not require an overt if/whether to license the Y/N interpretation. So, the embedded Q should not need an overt WH morpheme to license the interpretation. Like I said, Chomsky asserts that this derivation has CI problems, but I really don’t see why.[1]

Why does Chomsky attribute the problem with (4) to CI interpretation? He has few other options. Though it is true that agreement suffices to disambiguate the application of the MLA, movement will do so as well (i.e. movement will not cause problems for the MLA). But I don’t think that Chomsky wants to say that what blocks further movement should be traced to the details of uF valuation. Though he could offer the following story: The WH can have its features valued in CP (or via Agree before moving there) and when features can be valued they must be. If so, the WH in the embedded position must have its features valued. If we further assume that feature values cannot be over-written, then if the WH moves further it cannot Agree with the matrix WH and so the matrix {XP, YP} cannot be labeled (recall, we need more than mere feature identity, we need feature agreement). Note that this relies on some substantive details about feature valuation (e.g. it’s not optional, it’s indelible, uFs cannot “stack”). Perhaps these details follow from the minimal theory of feature checking. I leave this to those with a better sense of what is minimal here. However, I suspect that Chomsky does not want to tie his theories to the specifics of feature checking algorithms (I don’t think that I would) and if not, he needs a CI interface story.[2] Whatever the upshot of all of this, we should recall that examples like (4) are orthogonal to the MLA. Chomsky’s CI interface account is there to plug a hole in his story. It does not follow from it and it should be fine if something else explained (4).

So, the MLA has a story for successive cyclic movement, though it is unclear that it is much different from the standard story in the literature, IMO. When all the details are considered the relevant accounts all assume that without agreement in “Spec XP positions,” Gs require movement and with agreement in Spec XP positions Gs forbid movement. The virtues of the MLA are not empirical, but theoretical (viz. that it purportedly follows from the minimal assumptions regarding the algorithm that provides labels in endocentric configurations that the interfaces can read). Just how simple these assumptions are I leave for you to decide. IMO, they ain’t that minimal or simple. But this is partly a matter of taste, I think. I return to a broader consideration of Chomsky’s system towards the end of this exegesis.

5.     Fixed Subject Condition Effects (ECP) and the EPP

Chomsky argues that the MLA is able to unify two further well-known effects given reasonable ancillary assumptions. The two are the subject/object asymmetry in the ECP and the EPP illustrations in (6):
                        (6)       a. Which man did you wonder if Bill liked
                                    b. *Which man did you wonder if liked Bill
                                    c. (*there) arrived a beautiful woman
                                    d. arrivato una bella ragazza
                                    e. (*I) ate a pizza
                                    f.  (I) ho mangiato una pizza

As Chomsky notes (under prodding from David P), the EPP fact that interests Chomsky is the one involving (null) expletives (6c,d), not null subjects together with referential null pronouns (6e,f). It’s not actually clear to me why Chomsky makes this distinction as the account he gives seems to cover both cases. I suspect that the issue is empirical, as I will explain below.

Here’s the account of the EPP. Chomsky assumes that English T0 is weak (careful!). He treats it as having the same labeling powers as lexical roots (viz. it can’t do it). This is true even after the tense and agreement features lower from C0 onto T0. So how is a label assigned by the MLA? Via agreement with a subject in Spec TP. The resultant phi features provide the label that weak T cannot provide on its own. So in (6c), without there the past tense T0 head by itself cannot provide a label. If there is inserted, however, agreement between there and the “T’” suffices to license a phi-label.

This story raises two questions. First, why is agreement necessary at all? Why can’t the expletive alone label the structure, just like v,n,a suffice to label the structures in e.g. {n, Root} configurations?  This would be empirically awkward, but it is not clear what prevents this theoretically, given Chomsky’s assumptions.  Second, it’s odd that the expletive suffices to license agreement given the standard assumption (Chomsky assumes this too) that in expletive constructions it’s the associate that determines the agreement features on T. But if T agrees not with there but with a beautiful woman then how does the MLA explain these EPP effects? It must be that there agrees with T0 in a way that the associate cannot. What way is that? Unless we are told, it looks like we are again assuming that the EPP is basically the “I-need-a-specifier” condition.

What on Chomsky’s account explains the English contrast with Italian? It assumes that in Italian T0 is strong and thus suffices to license a label on T0 without a DP in its Spec. Thus, Italian T0 is like n,v,a in English.  This account eschews null expletives. Rather the Italian TP in (6d) is a simple {Y, XP} structure with Y being the strong T0.[3] 
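For those who like to see the logic spelled out, here is a toy sketch of the weak/strong T story as just described. It is only an illustration under loud assumptions: the names `Head`, `Phrase` and `label_tp` are made up for this sketch, "agreement" is reduced to matching a toy phi string, and nothing here settles the questions raised above about whether the expletive or the associate is what agrees with T0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Head:
    name: str
    strong: bool = True        # assumption: a weak head cannot label on its own
    phi: Optional[str] = None  # toy phi-feature bundle, e.g. "3sg"


@dataclass(frozen=True)
class Phrase:
    label: str
    phi: Optional[str] = None  # toy phi-feature bundle of the phrase in Spec


def label_tp(t: Head, spec: Optional[Phrase] = None) -> Optional[str]:
    """Return a label for the 'TP', or None if no label can be assigned.

    Strong T (the Italian-style case) labels by itself.
    Weak T (the English-style case) needs an agreeing phrase in Spec,
    and the shared phi features supply the label.
    """
    if t.strong:
        return t.name
    if spec is not None and spec.phi is not None and spec.phi == t.phi:
        return f"phi:{t.phi}"
    return None  # unlabeled: the configuration the account rules out


# Hypothetical lexical entries, for illustration only.
english_t = Head("T", strong=False, phi="3sg")
italian_t = Head("T", strong=True, phi="3sg")
there = Phrase("there", phi="3sg")

print(label_tp(english_t))         # no Spec: weak T cannot label
print(label_tp(english_t, there))  # agreeing Spec supplies a phi label
print(label_tp(italian_t))         # strong T labels alone
```

The sketch makes one thing plain: everything hinges on the `strong` flag, which is stipulated per language rather than derived from anything else.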

The explanation Chomsky gives easily extends to explain the unacceptability of (6e). T0 is weak and a lexical subject is needed to license the labeling via the MLA. The problem arises with the Italian examples. Here’s what I mean. There is good reason to think that in cases like (6f) there is a pronominal-like element in the Spec TP position. The reason is that such sentences are understood as having thematic subjects and these null pronouns are bindable. If there is nothing there at all, this is hard to understand. Ok, so say there is a null pronoun (aka pro) there. This is ok for Chomsky so long as we assume that this pro can agree with T0 and this agreement is what affords the label. If we assume this, then pro has phi-features.

Now to English: if Italian pro has phi-features then the minimal assumption is that English pro does too. But then why is (6e) unacceptable? It seems that what English needs is not merely an agreeing element in Spec XP, but an overt agreeing element. But this seems to obviate the need for Chomsky’s more elaborate assumptions concerning the weak status of T0 in English. At the least, it suggests that Weak/Strong here has more to do with the SM system than with the CI system. The problem is that it is not clear what this has to do with labeling.

Are phrase labels required for SM interpretation? Maybe, though specific values seem not to be. What I mean is that identifying that something is an XP might be important for phrasal phonology. But distinguishing VPs from TPs from CPs is not obviously relevant. The question then is whether one needs a labeling algorithm to determine whether something is an XP. Can one identify a structure as XP in the absence of identifying a particular head? It would seem that one can do so easily at least in the relevant cases: any {XP,YP} configuration will be a MaxP for the purposes of SM (as will any {X, YP}). So it would seem that for these purposes the MLA is not required, at least conceptually. But if this is so, then the English/Italian contrast becomes not a property of the MLA but a fact about TPs in English requiring overt subjects in contrast to those in Italian. Why? Well because. Does the MLA provide a better story? Not so far as I can tell. It all comes down to an idiosyncratic difference between Italian and English and embedding this difference in MLA technology does not appear to do much work.

However, this might be unfair. Chomsky argues that the same account can explain Fixed Subject Constraint (FSC) effects (originally discovered by Perlmutter in his thesis, I believe, and tied together with the availability of pro in the relevant G). Chomsky ties them together as well. Recall that in English T0 is weak. If so, we need something in “Spec TP” to allow the MLA to label the structure. That serves to block A’-movement (and DP movement as well, one should note[4]). Note the copy left behind will not suffice as it is part of a chain with links outside the “TP” domain. So the subject WH cannot move, on pain of not labeling the TP and causing interface interpretation problems.

Chomsky observes that this predicts that in Italian, where T0 is strong, we should not find FSCs.  And this is plausibly correct (but see below).

In sum, Chomsky argues that the MLA serves to unify EPP and ECP (more accurately FSC) effects and that is another good argument in its favor, in addition to the conceptual ones he uses to motivate the elimination of labeling in the CS.[5]

Here are some potential problems with this analysis: First, though unacceptable, sentences like (6b) are hardly uninterpretable.[6] Indeed, these cases are a bit like the student seems sleeping. These have perfectly obvious CI interpretations and so do examples like (6b). It’s not clear why this should be so if they cannot be interpreted at the CI interface.

Second, we know that FSCs appear in non-interrogatives as well; this is known as the that-t effect. However, it is also well known that the unacceptability of these seems to vary across speakers. On Chomsky’s account, this would predict that T0 is strong for the speakers who accept them. But this further predicts that those speakers should find sentences like arrived a nice boy/entered a well dressed dog perfectly acceptable. I have my doubts, but it’s worth looking to find out.

Third, it’s not clear to me why the indicated reasoning doesn’t block movement from subjects altogether. Why is (7) fine?
                        (7) Which man do you believe t saw Mary
How does the MLA label the embedded TP if the Wh moves?  In other words, why is moving the subject bad if there is something overtly in C but fine if there isn’t?  Chomsky’s proposal does not obviously distinguish these cases. In fact, where Chomsky’s story differs from the traditional one is in tracing FSCs to something about “SpecT-T” relations. The standard accounts tie them to “C-Spec T” relations.[7] By taking the “Spec T-T” relation as central, it’s quite unclear how to account for the obviation of FSCs once the C is deleted.[8]

Fourth, as David Pesetsky noted during the lecture, Chomsky really misdescribes the Italian data. On his story, it should be possible to extract a WH from “Spec T” position in Italian because Italian T0 is strong. However, this seems to be incorrect. In cases where the morphology indicates where the subject has moved from, we find that we cannot WH move from Spec T (this is discussed in a great paper by Brandi and Cordin (here)). Now, one might say that in these dialects T0 is weak, and that might be true. But, one would then also expect no analogues of (6d) in these dialects, and this seems to be incorrect (see Brandi and Cordin 115: (13)/(14)). Of course, appearances here may be deceiving. So, let’s chalk this up as another puzzle.

In sum, Chomsky offers some intriguing connections between the MLA and two more well known FL effects, the FSC and the EPP. IMO, the analyses are at best suggestive. There are many loose ends. Chomsky is aware of this and seems not to really be bothered, for in his opinion the strength of the proposal lies in its conceptual simplicity. The empirical benefits are a bonus (maybe even a big one) but the problems are tolerable, for the proposal depends less on their viability than on the fact that, in Chomsky’s opinion, the MLA is the conceptually optimal account of labeling given the minimal basic operation Merge. In the last part, I return to consider the conceptual lay of the land once again.

[1] Note that a similar problem does not occur with DPs in “Spec TP” positions. In such cases, TP will be in the complement domain of the C phase head. Thus, before it can move out of the CP, Transfer will remove it from the purview of the computational system. Note that this requires that Cs be strong phases and that T must “inherit” its features from C, otherwise we could generate a TP either with a weak C or no C and then Transfer would not serve to prevent a hyper-raising derivation. Note that such derivations appear to exist (i.e. some Gs allow hyper-raising). Consider this another puzzle for the analysis.
[2] Observe that for this kind of story to work, we need to assume that it is the moving WH that has uFs, contrary to the assumption that uFs are limited to phase heads and are checked in Probe/goal configurations.
Note too that this is effectively a greed based account: no movement without feature checking. This makes a lot of sense if agreement takes place in {XP,YP} configurations. Once the features of an XP are valued in {XP, YP} no feature checking is possible and so things stay put. Thus greedy movement suffices to block this. The problem, of course, is that it is not clear how conceptually necessary it is for I-merge to be greedy and why I-merge would be greedy but E-merge would not be (for the cognoscenti, I am trying to develop a slippery slope argument for Q-feature checking: if that were so, then both E and I merge would be greedy).
[3] It is worth noting that Chomsky’s assumptions regarding T0 make it hard to see why it exists at all. In English, it is indistinguishable from AGR heads: it has virtually no properties of its own and the properties it does have are extremely idiosyncratic. T has enjoyed a weird position in GG for a very long time. It’s odd within the Barriers framework and is no less odd within this one. Again, the odder its properties the greater the challenge for DP concerns.
[4] This seems to suggest that either non-finite T is strong (otherwise successive cyclic DP movement will be prohibited) or that non-finite “TP” doesn’t require a label. Either assumption strikes me as strange. Another puzzle? Another possibility is that non-finite TP is not subject to the EPP, as Castillo, Drury and Grohmann as well as Epstein and Seely proposed a while back.
[5] For the record, the analysis does not address ECP’s argument/adjunct asymmetries.
[6] Indeed, anyone who has taught undergrad syntax 2 will have encountered speakers that find these kinds of sentences to be only marginally unacceptable. Believe me, they do exist.
[7] Actually this is true of Rizzi’s proposal and the one in Aoun et al. The idea was that C was able to license the trace in subject position in some cases but not in others. Pesetsky and Torrego develop a version of this account: with movement of Nominative WH to Spec C obviating the need for T to C and that being the morphological reflex of T to C in embedded contexts. In both, however, the relation of interest in FSCs is that between the C and the expression in Spec TP.
[8] The obviation effects go beyond the removal of an overt C. We find them attenuated in cases where an adverb has been fronted:
                        (i) Who do you think that sooner or later will solve the problem
However, why this should be so on any of these accounts is not particularly clear.


  1. I'm sorry if this comes across as overly snarky, but it needs to be said: it's 2014 – why the heck are we still holding onto this outdated (by which I mean, debunked) idea that [Spec,TP] and phi-feature agreement have anything to do with one another in the general case? I'm not denying, btw, that they may have something to do with one another in the grammars of particular languages (though English might not be that good an example of this, given Locative Inversion; I'd hold up Bantu languages as better poster-children for that). But I wasn't under the impression that this was a set of lectures about the grammar of English/Bantu; this is supposed to be about UG/FL, right? And Icelandic (which is of the (6c/6e) type, not the (6d/6f) type) is a human language, yes?

    And just to head something off at the pass: I don't think this has the status of other "empirical puzzles" Norbert touched on in his discussion. We are talking about the agreement-[Spec,TP] connection being hardwired into the technology Chomsky chooses to employ (namely, agreement→labeling→halting). In other words, the technological choices made here are founded on a purported empirical observation that is false. I find this disheartening.

    1. I take it by 'we' you mean Chomsky? One of the things I had trouble triangulating on, both in the lectures and some recent papers, is what mechanism (rule) Chomsky thought lay behind feature valuation. Is it Agree in a probe/goal system or Spec-X agreement? I thought that the latter made more sense in the system he develops, but I could be wrong. This might bear on your issue, right? For if it is all set within a Probe/goal system there is no obvious relation between agreement and spec-x configurations. The problem then becomes to explain how the story gets agreement-->labeling-->halting, as you put it.

      Last point: there are many of you who have argued that the whole feature agreement stuff is not that important in MP accounts. I think that Dennis O has sort of made this point. It would be great to have a reaction from those who think this to the technology Chomsky deploys here and how they see it fitting in with the overall minimalist strategy. I have my own views about this that I will lay out in the last (yes I promise, last) set of comments on lecture 3 that I will put up sometime this week.

    2. I take it by 'we' you mean Chomsky?

      Well, yes, but I was referring to the broad scientific community that I count myself a part of.

      Is it Agree in probe/goal system or Spec-X agreement

      As discussed by many people in many places, Spec-X agreement cannot subsume probe/goal unless it's coupled with what I'll call "interface-vacuous" movement (i.e., movement where both PF and LF interpret the lower copy; see, e.g., Bobaljik 2002). And now here comes the "but": even if that is what's going on in Icelandic when we see agreement with low nominatives, this kind of movement does not satisfy subjecthood needs w.r.t. effects like (6c)/(6e). (Otherwise, the equivalent of "arrived.PL some people.NOM", with no expletive, would have been okay in Icelandic.) And so I maintain: regardless of the Agree vs. Spec-X issue, the [Spec,TP]-agreement connection is spurious as far as data like (6) are concerned.

    3. Ok, Icelandic takes care of the speculation I had wrt Italian dialects. Agreement is not enough. One still needs an expletive. This, of course, is true in English existential constructions too. At any rate, it seems that there is no derivation of the EPP without many more details to be spelled out. In fact, the EPP, as you note, becomes an SM interface condition. As I mentioned in the comments, this leaves successive cyclic raising somewhat unclear as well as WH movement from subjects without Cs. But, so be it. BTW, I agree that as of now, there is no good account of the data Chomsky discusses.