Tuesday, February 16, 2016

More on subjacency

Peter Svenonius (once again) asks the right question and Omer (once again) has interesting things to say about it (see here). Take a look. Here is my take on the issues. Please chime in with yours.

I agree with Omer that there may not be much of a consensus right now about how to deal with successive cyclicity in detail. However, so far as I can tell, there is general agreement that it has something to do with the PIC (this is the current analogue of the subjacency principle, bounding nodes and the domains they create).

As you know, there are two extant versions of the PIC, the more favored one being the one wherein a complement of a phase head is rendered inaccessible at the NEXT phase (usually when the next phase head is accessed). This comes close to coding the old idea of a subjacent domain (as was observed early on, one has access to the domain one is in and the next one (i.e. no counting)). The strong version of the PIC is less in favor, but it has some charms, for it would force something like a "no edge skipping" requirement on the grammar (e.g. the old Rizzi idea that one could "skip" the most immediate Comp would be ruled out). There are purported arguments against this stronger version, but they never struck me as dispositive (and there are Legate-style arguments against it). At any rate, there is consensus that phases should derive cyclicity.
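To make the difference between the two versions concrete, here is a minimal sketch, not a claim about anyone's actual formalization. It assumes a clause is flattened into a list of heads from highest to lowest, that only C and v are phase heads, and that something sitting at a phase head's own index counts as being at that phase's edge. The names `PHASE_HEADS`, `accessible`, `probe` and `goal` are all illustrative assumptions.

```python
# Assumption for illustration: only C and v are phase heads (D etc. left out).
PHASE_HEADS = {"C", "v"}


def accessible(spine, probe, goal, version="weak"):
    """Can the head at index `probe` reach a position at index `goal`?

    `spine` lists heads from highest to lowest, so a larger index means
    more deeply embedded. A goal located at the index of a phase head is
    treated as sitting at that phase's edge; a goal at a larger index is
    inside that head's complement.
    """
    if goal <= probe:
        raise ValueError("goal must be lower in the spine than the probe")
    # Phase heads below the probe whose complement contains the goal.
    blockers = [h for h in range(probe + 1, goal) if spine[h] in PHASE_HEADS]
    if version == "strong":
        # Strong PIC: a complement is closed as soon as its phase is left.
        return not blockers
    # Weak PIC: the complement of H is closed only once a higher phase
    # head Z (at or below the probe, above H) has been reached.
    for h in blockers:
        if any(spine[z] in PHASE_HEADS for z in range(probe, h)):
            return False
    return True


spine = ["C", "T", "v", "V", "C", "T", "v", "V"]
print(accessible(spine, 1, 3, "weak"))    # T looking into the complement of v
print(accessible(spine, 1, 3, "strong"))
print(accessible(spine, 2, 6, "weak"))    # matrix v looking at the embedded vP edge
print(accessible(spine, 2, 4, "weak"))    # matrix v looking at the embedded CP edge
```

On these assumptions the weak version lets T see into the complement of v while the strong version does not, and under either version the matrix v can reach the embedded CP edge but nothing below it, which is the edge-to-edge pattern.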

How closely is this tied to features, uninterpretable or otherwise? Logically speaking, not that closely so far as I can tell. The issue of features is tied to whether movement is optional or obligatory. If Greed or uninterpretability drives movement, then C features might be needed: yes if Greed is strong, and no if something like uninterpretability suffices to drive one to the phase edge as a last resort (Phase Balance or Boskovic's take on the same idea). There is some evidence that intermediate Cs can have features, as we know. The generalization to all languages is a standard GG move. So, the idea is not empirically nuts. Of course, WHY this should be true is unclear in the absence of something like Greed, but then maybe this is an argument for a strong version of Greed.

This goes against the current fashion. It seems that nowadays movement is free again (free at last, free at last, thank the lord, free at last!). But then there is no requirement that there be intermediate features to drive movement, nor that there be uninterpretable features on WH to force it to move. The WH moves or it does not. If it does, then it must move to the intermediate C for PIC reasons. If it doesn't then no convergence. More specifically, what one needs are language specific requirements that force a given G to have a WH up top overtly in some languages (something like the old strong feature) or some kind of Rizzi Criterion that is fulfilled in G variable ways. This seems generally assumed in current technology, so no biggie here.

There is one last idea that has been tied to successive cyclicity: Chomsky's current idea about labels. Oddly, for Chomsky, the fact that there are languages where there appears to be agreement in non WH Cs with a moving WH is a big problem. Agreement should obviate further movement. Of course one can get fancier here with different features having different effects on labeling (and so movement), but this begins to hand code in the property we want explained (not a good thing to do).

That's the way things look from where I sit. So, there are several ways of getting edge to edge movement all involving the PIC in some fashion and thereby recoding the old subjacency criterion. I want to emphasize this: this is not a new explanation but a recoding of the old one (not that this is a bad thing).

Two last points: what is less clear to me is how this all hooks up with islands. Chomsky, it seems to me, is reluctant to take islands as G-real phenomena. He seems inclined to take the view he once criticized, viz: that islands are performance residues of complexity. I am skeptical myself, but it is a logical possibility. The Sprouse stuff has convinced me that it is likely false. 

This leaves the question of how to code islands in phases. That's easy (as anyone who has tried will attest). The problem is that the coding follows from nothing (why are D edges different from C edges? Why weak PIC rather than strong? Why transfer when the next phase head is chosen rather than when the next phase is completed? Why C and D and v as phases? Why weak vs strong phases?). In fact, the coding just recapitulates the machinery in classical Subjacency theory. Or, Minimalism has not given us any insight into the details of subjacency as of this date. So, islands stand as having no good deep explanation beyond the one that Chomsky already provided for Subjacency that I quoted in the body of the earlier post.

Second: we really would love to tie island effects with ECP effects as Barriers and Cinque-Rizzi tried to do. Why? Because the domains for bounding and ECP are so damn similar. It would be really odd (IMO too odd to be tolerable) were these driven by different mechanisms given that their domains are virtually identical. So, we need to find a way of finally addressing ECP questions within MP. In particular we need to find a way of unifying them in ways more conceptually acceptable than the Barriers/Lasnik-Saito theory did.

So island effects are currently no better understood within minimalist theory than they were within GB. The GB story can be smoothly translated into technologically acceptable minimalist terms, but doing so provides no insight. Moreover, some parts of the old theory, such as the ECP part dealing with adjuncts vs arguments and their differing locality conditions, really have no good minimalist counterparts (does anyone really think gamma marking is part of FL/UG?). That's how I see things. You?


  1. Regarding intermediate movement, I have always been fond of a perspective that I first came across in Greg's thesis, though it might have already floated around earlier. The idea is that only the final target site needs to have a feature that triggers movement; the successive cyclic landing sites are inserted during the mapping from derivation trees to derived trees.

    One can interpret that as Heck & Müller's Phase Balance, but the idea is more general than that since it completely abstracts away from any temporal ordering of construction-building steps, computational cycles that are about to conclude and so on. We could just as well think of intermediate movement as a sort of Oregon trail where a phrase starts moving in the hope that it will eventually find a landing site with a suitable feature to license its displacement. Different perspectives, but the same basic process.

    1. But how does that explain the observations to the effect that the putative intermediate locations often show signs of some kind of disturbance (complementizer change in Irish as documented by McCloskey 1979; a grab bag of similar effects compiled by Zaenen 1983)? The difference between camping at a site before moving on to the next one vs. checking out that it's OK to fly over and then flying over it seems to me like a difference without a difference.

      I think those observations really did shift the conception of the way 'movement' worked away from the Ross-Postal one-fell-swoop conception towards Chomsky's (I recall Edwin Williams suggesting 'telegraph poles' as a metaphor that could be used to express the essential idea without getting tangled up in controversies about movement vs deletion).

    2. I'm with Avery on this one, with a twist. I can see how limiting the domain makes sense from a computational point of view (you move there because you must for derivational reasons) but why should traces be inserted in the derived object? They have no obvious interpretive reflex, they don't "mean" anything there nor do they sound like anything there. So why put them in when mapping to the derived object? It just makes no sense. It does make sense derivationally.

      It also raises another question: does this mean that the locality we find imposed by cyclicity becomes otiose when viewed from the perspective of derivation trees? And if so, does this constitute an argument AGAINST entirely dispensing with derived objects as linguistically significant objects?

    3. Cyclicity needs to be posited either way; neither representation format gives it to you for free. You have to assume some mechanism like subjacency, phases, something that rules out one-fell-swoop movement. The only thing that changes under the perspective I outlined is that this mechanism is no longer enforced in the derivation tree but in the mapping to derived trees. The same goes for disturbances: they are part of the mapping to derived trees, just like a +PAST T-head is "disturbed" in different ways depending on the verb in the same clause. That disturbances exist isn't a logical necessity in either scenario.

      So what about phases? The idea behind phases is that they somehow reduce memory burden and thus can be regarded as third factors. At this point that is indeed something that doesn't make much sense from the perspective I describe above. But it doesn't make much sense with features for intermediate movement either. At this point, it doesn't make sense from any formal perspective because you could just as well have something like Cooper storage for movers rather than putting them in escape hatches.

      But the key phrase here is at this point. Our understanding of syntax is still lacking when it comes to the complexity of the mapping. We have lower and upper bounds, but they aren't particularly tight for now. It might easily be the case that inserting traces at key points can bring down the complexity of that mapping.

      Let me give you an example from phonology, where we already have a better grasp of the mappings thanks to recent work by Jane Chandlee and Jeff Heinz, among others: a progressive vowel harmony pattern is very natural, whereas a variant where only the last vowel displays harmony with the first vowel while all intermediate vowels stay the same is unheard of. 15 years ago, this seemed to be but a quirk of language because both patterns are easily generated by finite-state transducers. But now we have a much more fine-grained hierarchy of string transductions, and it turns out that the natural vowel harmony pattern is much simpler than the unnatural one.

      If traces induce a similar complexity reduction for mappings in syntax, that would immediately derive cyclicity (though it would not explain why CP and vP and possibly DP and PP are singled out). And before someone asks: one-fell-swoop movement does provably reduce the complexity of the derivation trees.

      [Side remark: if I remember correctly, the Collins & Stabler paper shows that the standard notion of movement as remerge only works if phases remain accessible for the entire derivation. So why should there be escape hatches? It seems that once you look close enough, phases just can't derive successive cyclic movement in a non-stipulative manner.]

    4. I know I am being dense, but I don't see how getting intermediate traces makes sense if it is something that happens on the way to the mapping of the derived objects, especially if "derived" object is just a fancy name for the Ss and Ms in pairs. There, intermediate "traces" are interpretively quite uninteresting. There is no obvious interface reason why they should be "put in" unless they are residues of bounded operations that have no choice but to put them in. Another way of saying the same thing is that elements in edges that are not interpreted are very weird bare output conditions. Of course, one can assume this, but they make no sense. And that, IMO, is a problem.

      Things are, of course, worse than this. As you note, one-fell-swoop movement is better along some complexity measures. Is there really any reason to think that the mapping to Ss and Ms is facilitated by intermediate traces? Is there reason to think that these intermediate traces will cause syntactic effects like agreement and binding? So, I see that one can recode the derivational story in representational terms, but from where I sit, these do not seem like reasonable representational objects. Of course, I could be wrong, but...

    5. If traces/copies reduce the complexity of the mapping, that's all the motivation it takes to put them in. Intermediate movement steps would then just be a crutch enforced by a system that likes to keep its computations simple, a nice third factor principle. Or did computability considerations suddenly fall out of fashion?

      If so, here's another argument: learnability. It might turn out that the mappings with traces/copies are learnable and those without aren't --- something along those lines has happened in phonology, where input strictly local and output strictly local mappings are learnable, but finite-state transductions are not.

      What about side-effects, could those be triggered by traces/copies in a natural way? Yes, if instead of traces you have copies and the processes that handle these are so simple that they cannot reliably distinguish copies from the real thing. That is actually how you get progressive vowel harmony with strictly output local mappings: these mappings cannot distinguish a basic i from a derived one and that's why a derived i also triggers harmony in the next vowel.

      For syntax all of this is little more than an educated guess at this point (there's that phrase again). But crucially the answers aren't decades away, I actually expect them to turn up very soon. Just like in computational phonology, where we have learned an overwhelming amount in the last 10 years. As soon as we have logical characterizations of all the mappings we see in phonology, it will be very easy to lift them to trees for syntax. I'd be surprised if that doesn't happen before the end of the decade. Patience, young padawan.

    6. I agree that if one can show that intermediate traces reduce the complexity of the mapping to the interfaces then this would be an argument that makes sense to me. Learnability less so. There is nothing incoherent about an FL that disallows long distance A'-dependencies. So, even were it to facilitate learnability, this does not explain why they exist in the first place, though given that they exist this is what we might expect.

      At any rate, I see where you are going. It does suggest however that Ss are structured much more elaborately than is required to support semantic interpretation. The outputs look less like CIs and more like LFs: structures that reflect both meaning considerations and syntactic ones. This is possible and worth thinking about. Thx.

    7. I do not think that traces are important here, and see no reason why they should reduce the complexity of the interface maps (nor even how to make sense of this statement). If we step back and consider intermediate landing site phenomena from a perspective neutral to movement vs slash-feature percolation, we see that the real linguistic generalization is that some constructions are sensitive to whether they lie on a path between two dependencies of an expression (as I think Avery was getting at).

      From the derivational perspective, the information that matters, namely, whether you are on the path of some mover, is present and available in principle to every head at the time it is first merged. There is no need to 'insert' intermediate landing sites (because there is no need to build up derived structure).

  2. Re Norbert's comment that there's general consensus that something like the PIC has to be implicated in deriving successive cyclicity, I'd like to register one voice of dissent, in that I'm still expecting an interfacey, non-syntactic understanding of the relevant effects to emerge. The fact is virtually all of the relevant evidence for obligatory cyclic stop-offs (in some languages, with some extraction types) comes from effects seen in complement clauses, so I suspect there may be something to be learned by looking at complement clauses more carefully, not least since there's recently been a lot of interesting work on the syntax and semantics of complement clauses which focuses on the question of what kind of semantic object a complement clause is. It's surely relevant that the semantic properties of a given complement clause condition whether or not it's an island (factive v non-factive etc), and indeed that the semantic properties of adjunct clauses condition their opacity too (Rob Truswell's work). Thus it strikes me as plausible that this stuff on complement clauses will reveal that successive cyclic stop-offs in clause edges do something which is required for clauses to be interpreted (maybe Kayne is right that all complement clauses involve movement of an operator to the edge, with an operator doing the job when there's no extraction), and so we can say stop-offs are motivated by semantic considerations rather than purely syntactic ones like the PIC. I hold out hope for this as I'm not really convinced by the case for PIC-based explanations.

    One other thing to bear in mind when it comes to thinking about different ways of modelling successive cyclicity (given the comments above about successive cyclic mvt being meaningless) is intermediate reconstruction, i.e. where A'-movement expands scope and binding possibilities. So, I'd want to know whether a mechanism like what's in Greg's thesis (mentioned by Thomas above) would be compatible with this. Of course assessing this question would require a worked-out theory of reconstruction, which takes us on a different tangent.

    1. Just a word: I tend to agree that phase theory to date has been less than enlightening. Where it works, it looks a lot like the old subjacency theory (not necessarily a bad thing). It hasn't added much new, save a new batch of terminology.

      As for the hope that we might be able to reduce successive cyclicity to some kind of semantic condition at CI, color me skeptical. There may be different kinds of complements, but declarative complements remain declarative whether or not we extract a wh from them and move it to a higher clause. Yet, this extraction of the wh triggers "agreement" effects that are absent when the same declarative has no semantic hole within. So, if the issue is complement selection we will need a much more refined notion of "declarative," one that distinguishes declaratives with extracted expressions from ones that don't. I don't believe that this distinction will be all that intuitive semantically. Anyway, that's what I suspect. However, this is virgin territory, so who knows.

      Last point: MPers really need to start thinking about islands and the antecedent government part of the ECP again. Right now, to my eye, we have no good idea how to understand these phenomena in a unified way that is also not very ad hoc. This domain of data was once the glory of GG. Now it is a recalcitrant backwater that most of us ignore.

    2. @Gary: I had a longer reply that got eaten by a browser crash (damn you, dwb!), so here's the abridged version: if by compatible you mean expressible, the answer is yes. You do not lose anything by removing intermediate features, you just move some things into the mapping that were previously part of the derivation. If you mean intuitively appealing, my remarks above apply: there are some things that seem rather stipulative from a derivational perspective, but that might just be because our formal understanding of the mappings is still lacking in several areas.

      If you want to know more about MG semantics, Greg Kobele's thesis and some of his recent papers contain very detailed and rigorous proposals. I don't think that he would analyze reconstruction in terms of intermediate movement nowadays, but I believe he had something along these lines in his thesis.

    3. I actually think that reconstruction is better seen derivationally as interpreting a mover before it gets to its final landing position. While I have an aesthetic preference to only permit this in actually moved-to (i.e. feature-checked) positions, the formal set up would very easily allow interpretation at any point from base generation up to the final landing site without changing the complexity theoretic properties of the system.

  3. It's funny that you should discuss subjacency now, as I've just finished reading Rob Chametzky's "A Theory of Phrase Markers" which includes a possible explanation of Islands that makes zero reference to subjacency. Chametzky argues that Islands are a result of non-canonical labelling and phrase construction procedures. Given Chomsky's new interest in labelling, maybe Chametzky's ideas can be brought into the MP.

    A very cursory glance at the set of Islands seems to suggest that they all arguably involve some sort of {XP, YP} structure, which are exactly those which are most problematic in POP.

    1. I should also note that there was a poster at the most recent NELS that suggested labelling is a condition on extraction.

    2. Yes, Rob's idea has been pursued by others. It essentially reduces all islands to adjunct islands. I am pretty sure that Uriagereka suggested this in one of his multiple SO papers. In my 2009 book, I think that I suggested that lack of labels on adjuncts might lie behind their opacity. So, this is an idea worth pursuing. It would be nice, however, to figure out how to squeeze ECP effects of the arg vs adjunct kind into the system. So, by all means let's explore this idea. It is a good one, and there are some serious attempts already in the literature outlining how this might get done.

      Last point: not sure Chomsky's idea will work, as his idea is to label under transfer, and that this is motivated by CI interpretation. That suggests no labels at AP or in the derivation. Not sure then how this will get us island effects. Nothing in the derivation or at AP will be labelled. This makes islands CI effects. Maybe, but this does go against the general idea that islands do not regulate LFy effects. But who knows. Worth pursuing.
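
The vowel harmony comparison from the first comment thread can be made concrete with a toy sketch. Everything in it is an illustrative assumption: words are strings of vowels only, the inventory is two front/back pairs (i/u and e/o), and the function names are made up. It is not taken from the Chandlee and Heinz work mentioned above; it only shows the kind of information each pattern has to consult.

```python
# Toy inventory (an assumption): i/u and e/o differ only in backness.
FRONT_OF = {"u": "i", "o": "e", "i": "i", "e": "e"}
BACK_OF = {"i": "u", "e": "o", "u": "u", "o": "o"}


def agree(vowel, trigger):
    """Return `vowel` with the backness of `trigger`."""
    return FRONT_OF[vowel] if trigger in "ie" else BACK_OF[vowel]


def progressive_harmony(vowels):
    """Each vowel agrees with the previous OUTPUT vowel.

    Only the last symbol written so far is consulted, so a derived vowel
    triggers harmony exactly like an underlying one.
    """
    out = []
    for v in vowels:
        out.append(agree(v, out[-1]) if out else v)
    return "".join(out)


def first_last_harmony(vowels):
    """Only the last vowel agrees with the first; the rest are untouched.

    The first vowel has to be remembered across the whole word.
    """
    if len(vowels) < 2:
        return vowels
    return vowels[:-1] + agree(vowels[-1], vowels[0])


print(progressive_harmony("iuo"))  # iie
print(first_last_harmony("iuo"))   # iue
```

In `progressive_harmony("iuo")` the second vowel surfaces as a derived i, and that derived i fronts the final vowel: the function never looks at whether the previous output was basic or derived. `first_last_harmony` instead has to carry the first vowel over an unbounded stretch of material, which is the intuitive sense in which the unattested pattern asks for more.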