Friday, February 14, 2014

Derivation Trees and Phrase Structure

Another snow day, another blog post. Last week we took a gander at derivation trees and noticed that they satisfy a number of properties that should appeal to Minimalists. Loudmouth that I am, I took things a step further and proclaimed that there is no good reason to keep using phrase structure trees now that we have this shiny new toy that does everything phrase structure trees do, just better. Considering the central role of phrase structure trees, that's a pretty bold claim... or is it?

Derivation Trees are Evolution, not Revolution

One point I didn't emphasize much in my previous post --- and was promptly called out for by Greg --- is that derivation trees are extremely similar to the kind of trees syntacticians work with on a daily basis. Recall that it takes only four steps to turn a derivation into the kind of phrase structure tree generated by Minimalist grammars:
  1. Add movement branches
    This step is necessary only because I like to save myself some typesetting by using derivation trees where Move nodes are unary branching. I can safely do that for standard MGs thanks to the SMC, which makes Move deterministic. If you do not like the SMC, or if you value clarity more than succinctness, then your derivations already have those movement branches anyway.
  2. Linearly order siblings
    MGs create phrase structure trees whose nodes are linearly ordered, so the step from unordered derivations to ordered phrase structure trees has to take care of this. But syntacticians like their phrase structure trees to be unordered, too, so this step can be skipped.
  3. Relabel interior nodes
    Recent developments in Minimalist syntax have moved the field towards unlabeled phrase structure trees, which means we can slack off once again and move on right to step 4.
  4. Remove/gray out all features
    One central idea of Minimalism is that representations containing unchecked features cannot be processed at the interfaces. So if you want interfaces to take phrase structure trees as their input, you have to encode which features have been checked, and removal is one way of accomplishing that.
So let's take stock and see what we have here. Step 1 is required only if your derivation trees have unary branching nodes, which has certain mathematical advantages but is mostly irrelevant for linguistic purposes. Step 2 is not only redundant, it would actually take us away from the kind of phrase structure trees syntacticians want. Two out, two more to go. Steps 3 and 4 highlight actual differences in what derivation trees and phrase structure trees look like (depending on what your favorite version of Minimalism is), but if you are at all familiar with how Minimalist analyses work, you'll know that interior node labels or the encoding of feature checking hardly ever enter the picture. Yes, there is work that makes crucial use of these aspects of phrase structure trees, for instance Norbert's ideas on adjunction.1 But these papers are the exception, with the majority of syntactic papers depending only on Merge, phrasal movement, c-command, and concepts that build on these more primitive notions (mostly Agree and locality constraints). And derivation trees handle those without major changes. I would wager that if you take a sufficiently large sample of Minimalist proposals from the last 15 years, over 90% of them could be easily stated for derivation trees rather than phrase structure trees --- and when I say "easily", I mean that a few changes in notation would be all it takes.

So the idea that derivation trees are the central object of syntax isn't that radical at all. As far as new perspectives go, it is amazingly backwards compatible. But it shifts things just enough to open up exciting new possibilities. In the next few weeks I will talk about several results that build in some way on derivation trees. To give you an idea of what those might look like, let's round out today's post with a concrete yet simple example. Our object of study: c-command.

D-Command: One Step Beyond c-Command

C-command is one of the most important structural relations in a variety of syntactic frameworks, in particular with respect to binding and NPI-licensing. The standard definition is a little on the sloppy side; here's a cleaned-up version:

c-command. For every tree t and nodes m and n of t, m c-commands n iff 1) m does not reflexively dominate n, and 2) every node properly dominating m properly dominates n.
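To make the definition concrete, here is a minimal Python sketch over a made-up tree (the node names and tree shape are my own, purely for illustration; clause 2 becomes a subset check on dominator sets):

```python
# A hypothetical toy tree, encoded as a child -> parent map.
parent = {
    "DP": "TP", "T'": "TP",
    "T": "T'", "VP": "T'",
    "V": "VP", "OBJ": "VP",
}

def proper_dominators(node):
    """All nodes properly dominating `node` (walk up the parent chain)."""
    out = set()
    while node in parent:
        node = parent[node]
        out.add(node)
    return out

def reflexively_dominates(m, n):
    return m == n or m in proper_dominators(n)

def c_commands(m, n):
    # clause 1: m does not reflexively dominate n
    if reflexively_dominates(m, n):
        return False
    # clause 2: every node properly dominating m properly dominates n
    return proper_dominators(m) <= proper_dominators(n)

print(c_commands("DP", "OBJ"))  # True: DP's sole dominator TP dominates OBJ
print(c_commands("OBJ", "DP"))  # False: VP dominates OBJ but not DP
```

Note that clause 2 also rules out n c-commanding into its own dominators: if n properly dominated m, n would have to properly dominate itself.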

As you can see, it is a very specific relation, too specific if you are a Minimalist. Epstein (1999) tried to reduce c-command to Merge,2 but if I remember correctly his definition ran into some problems with how it applied to terms, e.g. X c-commanding itself after it undergoes remerge. I haven't looked at this very carefully since my undergrad years, though, so I could easily be misremembering things. Whatever may be the case, our first question should be whether c-command can be defined over the kind of derivation trees we have in mind. The answer is yes, but not necessarily in an elegant fashion, and definitely not in an insightful one.

For derivation trees where Move nodes are binary branching, we can use the c-command definition for multi-dominance trees, which alters the second clause such that some node immediately dominating m properly dominates n (if it exists). If Move is unary branching, things get more complicated, as c-command in the phrase structure tree then corresponds to the union of specific subsets of the c-command relation and the dominance relation in the derivation tree. It can be done, but it is a purely technical exercise that tells us little about why c-command should play a central role in syntax. This is also a problem for the definition over derivation trees with binary branching Move nodes: it might be easier to state, but that does not make it any more illuminating.

So let's try something different instead: rather than translating c-command into an equivalent relation over derivation trees, let's see what kind of relations can be naturally stated over these trees. At their very core, derivation trees encode the timing of feature checking. Notice that timing is essentially dominance in the derivation tree. If a dominates b, then b must take place before a. If the two are not related by dominance, then it does not matter which one of the two occurs first in the structure-building process. Now does derivational dominance get us something like c-command? Not by itself. In the derivation tree below, the blue Merge node dominates the red one, yet the former does not c-command the latter in the corresponding phrase structure tree.
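The timing reading of dominance can be sketched in a few lines of Python (node names are hypothetical; any bottom-up schedule of the operations is a licit derivation):

```python
# Hypothetical derivation tree as a child -> parent map over operation
# nodes. Dominance fixes timing: a child must apply before its parent.
parent = {"Merge2": "Merge1", "Merge3": "Merge1", "Merge4": "Merge3"}

def valid_order(schedule):
    """True iff every operation applies before every operation dominating it."""
    pos = {op: i for i, op in enumerate(schedule)}
    return all(pos[child] < pos[par] for child, par in parent.items())

print(valid_order(["Merge4", "Merge3", "Merge2", "Merge1"]))  # True
print(valid_order(["Merge2", "Merge4", "Merge3", "Merge1"]))  # also True:
# Merge2 and Merge4 are unrelated by dominance, so either may come first
print(valid_order(["Merge1", "Merge2", "Merge3", "Merge4"]))  # False
```

So a derivation tree compactly represents a whole family of operation sequences, one per topological ordering.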

But remember that feature checking is also a natural concept in derivation trees. And if we use feature checking to home in on the dominance relations between specific nodes in a tree, something magical happens.

Let's say that a node n is an occurrence of lexical item l iff the label of n denotes an operation that checks a negative polarity feature of l. Here's the derivation tree above, with red and blue highlighting the occurrences of a and the, respectively. You can also check the table for the occurrences of each lexical item.

Lexical Item Occurrences
empty C-head none
empty T-head Merge2
the Merge5, Move3
girl Merge6
likes Merge4
a Merge7, Move1
boy Merge8

Some of you might have already guessed from this example what kind of command relation I am thinking of:

derivational command (d-command). For every derivation tree t and LIs l and l' in t, l d-commands l' iff l and l' are distinct and some occurrence of l properly dominates some occurrence of l'.

For example, that the DP the girl c-commands the DP a boy is captured by the fact that the head of the former, the, d-commands the head of the latter, a. D-command holds because Merge5 is an occurrence of the and dominates Merge7, an occurrence of a. However, a also d-commands the thanks to its higher occurrence Move1 dominating Merge5. A full table of all d-command relations is given below. As an extra stipulation, the highest head in the derivation d-commands all other lexical items.

Lexical Item d-Commandees
empty C-head empty T-head, the, girl, likes, a, boy
empty T-head the girl, likes, a, boy
the empty T-head, girl, likes, a, boy
girl none
likes the, girl, a, boy
a empty T-head, the, girl, likes, boy
boy none

Intuitively, d-command corresponds almost exactly to "c-command at some point during the structure-building process" (rather than "c-command at S-structure", if I may be so daring as to fall back on good ol' GB vocabulary). It derives this rather complex notion from the timing of negative polarity feature checking, which in turn indicates integration into a bigger structure. A node a c-commands b iff a is integrated at a later point than b. The only differences between c-command and d-command arise with respect to heads and their arguments:
  • Heads always d-command their specifiers, and
  • a head is d-commanded by one of its arguments (in particular specifiers) only if the latter does not remain in situ.
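The definition of d-command transcribes almost directly into Python. A minimal sketch follows, with a made-up derivation tree and occurrence sets (not the post's exact example, but set up so that x and y mutually d-command each other, like the and a above):

```python
# Hypothetical derivation tree as a child -> parent map over operation
# nodes; `occurrences` maps each lexical item to the operation nodes
# that check one of its negative polarity features.
parent = {"Merge2": "Move1", "Merge3": "Merge2", "Merge4": "Merge3"}
occurrences = {"x": {"Merge2"}, "y": {"Merge4", "Move1"}}

def properly_dominates(m, n):
    """True iff node m lies strictly above node n in the derivation tree."""
    while n in parent:
        n = parent[n]
        if n == m:
            return True
    return False

def d_commands(l, lp):
    """l d-commands l' iff the two items are distinct and some occurrence
    of l properly dominates some occurrence of l'."""
    return l != lp and any(
        properly_dominates(o, op)
        for o in occurrences[l] for op in occurrences[lp]
    )

print(d_commands("x", "y"))  # True: Merge2 dominates Merge4
print(d_commands("y", "x"))  # True: y's higher occurrence Move1 dominates Merge2
```

The mutual d-command between x and y mirrors how movement of a lexical item adds a high occurrence that dominates items merged below the landing site.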

Is d-Command a Substitute for c-Command?

Now is d-command a useful relation? Yes, provided c-command is.

First, the kind of c-command that d-command encodes seems to be exactly the one that matters for binding theory, where an anaphor must be locally c-commanded at some point during the derivation and a pronoun may never be locally c-commanded at any point.

Second, the minor differences between d-command and c-command are harmless. I can think of only one case where a specifier not c-commanding its head might be problematic, and that is possessor doubling constructions. In Bavarian German, for example, possessives usually take the form [DP [DP the Peter] his girlfriend ]. One could say that [DP the Peter] acts as an abstract subject, thus turning the outer DP into the binding domain for the possessive anaphor his, which is then locally bound by its specifier. Since binding requires c-command, this account cannot be ported to d-command unless the specifier actually undergoes movement to some higher position, for which there is no evidence. But this is just one out of many possible analyses, hardly a deal breaker for d-command.

What about the other direction, a head c-commanding its specifier? Well, that is essentially m-command, and whatever use you may have for m-command is a reason to prefer d-command to c-command. So d-command seems to cover at least as much empirical ground as c-command, if not more.

But what about linearization and Kayne's LCA, I hear you ask, where we definitely do not want a head to c-command its specifier? There are two answers here. For one thing, the LCA only needs to establish some linear order among the lexical items; nothing commits us to the assumption that this linear order is exactly the surface order we observe. So maybe heads do precede all their arguments in the order assigned by the LCA, mirroring the default predicate-argument order in mathematical logic, and this order is then slightly permuted later down the road to PF. In this scenario we also would no longer have to worry about supposedly non-linearizable structures such as [VP kiss John], because in-situ arguments never d-command their heads.3

More importantly, though, d-command isn't a suitable basis for the LCA because it encodes "c-command at some point during the structure-building process", whereas the LCA uses "c-command at S-structure". This c-command relation is a lot more complex over derivation trees.

surface d-command. For every derivation tree t and LIs l and l' in t, l sd-commands l' iff
  1. the highest occurrence of l properly dominates the highest occurrence of l', and
  2. there is no l'' distinct from l and l' such that
    • the highest occurrence of l properly dominates the lowest occurrence of l'', and
    • the lowest occurrence of l'' properly dominates the highest occurrence of l', and
    • the highest occurrence of l'' properly dominates the highest occurrence of l.
What's going on here? Clause 1 requires that l is in a higher surface position than where l' was immediately after its final structural integration step, and clause 2 ensures that l' isn't contained by some phrase that has undergone movement to a higher position than l.
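The extra complexity shows when the clauses are transcribed into Python. In the hypothetical setup below (again not the post's tree), the phrase containing y is first built at Merge3 and remerges at the root Move1, carrying y above x:

```python
# Hypothetical derivation tree and occurrence sets: z's phrase starts at
# Merge3 and remerges at Move1; x sits in between at Merge2; y at Merge4.
parent = {"Merge2": "Move1", "Merge3": "Merge2", "Merge4": "Merge3"}
occurrences = {"z": {"Merge3", "Move1"}, "x": {"Merge2"}, "y": {"Merge4"}}

def properly_dominates(m, n):
    while n in parent:
        n = parent[n]
        if n == m:
            return True
    return False

def depth(node):
    d = 0
    while node in parent:
        node, d = parent[node], d + 1
    return d

def highest(l):  # occurrence closest to the root
    return min(occurrences[l], key=depth)

def lowest(l):   # occurrence farthest from the root
    return max(occurrences[l], key=depth)

def sd_commands(l, lp):
    # clause 1: highest occurrence of l properly dominates highest of l'
    if not properly_dominates(highest(l), highest(lp)):
        return False
    # clause 2: no mover l'' carries l' to a position above l
    for lpp in occurrences:
        if lpp in (l, lp):
            continue
        if (properly_dominates(highest(l), lowest(lpp))
                and properly_dominates(lowest(lpp), highest(lp))
                and properly_dominates(highest(lpp), highest(l))):
            return False
    return True

print(sd_commands("z", "y"))  # True: z's landing site sits above y
print(sd_commands("x", "y"))  # False: y moved along with z past x
```

Compare this to the two-line body of d-command above: surface c-command needs the highest/lowest machinery and an intervention check just to undo the effects of movement.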

So from a derivational perspective, "surface c-command" is the less natural variant of c-command. In representational theories like GB, it was actually the more common one and played a crucial role in the Proper Binding Condition, for example. Such conditions have since been abandoned, with the LCA as the last bastion of surface c-command. Personally, I view this as a hint that there is something off with the LCA and that linearization is controlled by other means. But I'm sure there are other interpretations; feel free to share them in the comments section.

There are many more linguistic questions and issues to be explored here, but the bottom line is that derivation trees aren't all that different from phrase structure trees, yet they improve on them in interesting ways. The fact that a complex relation like c-command seems to be closely related to the more basic concept of derivational recency or prominence is just one example of that.

  1. Hornstein, Norbert and Jairo Nunes (2008): Adjunction, Labeling, and Bare Phrase Structure. Biolinguistics 2, 57--86
  2. Epstein, Samuel D., Erich M. Groat, Ruriko Kawashima, and Hisatsugu Kitahara (1998): A Derivational Approach to Syntactic Relations. Oxford University Press. Epstein, Samuel D. (1999): Un-principled Syntax and the Derivation of Syntactic Relations. In Samuel D. Epstein and Norbert Hornstein (eds.) Working Minimalism. MIT Press.
  3. Of course this is actually a disadvantage of d-command if you are a fan of Moro's Dynamic Antisymmetry.


  1. @T.G. (couldn't resist those initials)

    This is a bit off the main point, and stop me if you've heard this, but there is a Better Way to understand C-command than this (quoted from above):

    "The standard definition is a little on the sloppy side, here's a cleaned up version:

    c-command. For every tree t and nodes m and n of t, m c-commands n iff 1) m does not reflexively dominate n, and 2) every node properly dominating m properly dominates n.

    As you can see it is a very specific relation, too specific if you are a Minimalist."

    Richardson & Chametzky (1985) is the urquelle of How to Think Right about C-command. Chametzky (1996, 2000, 2011) keeps trying to explain it. The basic point is this. Don't ask "Does (node) X C-command (node) Y?"; ask rather "What are the C-commanding nodes of node X?" That is, take the point of view of the commandee, not the commander. Then, the set of C-commanding nodes for X is the set of all nodes that are sisters of nodes that dominate X (dominance reflexive). C-command is, as I think I've mentioned before, a generalization of the sister relation. It has, notice, no specifically linguistic content.

    As for Epstein's "derivational C-command"

    "Epstein (1999) tried to reduce c-command to Merge,2 but if I remember correctly his definition ran into some problems with how it applied to terms, e.g. X c-commanding itself after it undergoes remerge."

    I've castigated this in Chametzky (2011) (that's my chapter in the Oxford Handbook of Linguistic Minimalism). I'd link to some version of it, but I don't know how to. I do have a scan of the chapter that I could email to whomever. For those interested, the place to start is on p.317, Section 14.2.


    1. there is a Better Way to understand C-command
      I like this definition, and it seems to generalize straightforwardly to multi-dominance trees if one stipulates that no node can c-command a node it reflexively dominates. So at this point it is mostly a matter of taste whether one prefers dominance + sisterhood or derivational dominance between occurrences. Both seem fairly natural, but they do slightly different things.

      I'd link to some version of it, but I don't know how to.
      If the file is available online, then you can link to it using standard html markup. I would include the code here, but blogger automatically converts it into a link, and neither the "code" nor the "pre" tags work. If you can't get it to work, you can email me the URL and I'll add a link.

    2. For anyone who's interested, a manuscript version of Chametzky (2011) is now available for download.

    3. A lot of this discussion of derivation trees and phrase structure trees, and Rob Chametzky's (2011) paper on C-command discussing the derivational view versus the representational view remind me of an old Feynman lecture on the relationship between maths and physics, especially the bits on how there are multiple ways of looking at/thinking about the same facts in physics.

      Since a lot of us here have physics envy, I thought I would post the video link.

      Apologies if this is not terribly germane to the posts and is bordering on blog spam.
