What's in UG (part 2)

In the previous post (here) I provided some relevant background for discussing a forthcoming paper in Cognition by Cole, Hermon and Yanti (CHY) that argues against a UG interpretation of the Binding Theory (BT), and, by extension, against most PoS forms of reasoning to UG. Here we get into some details. There is one more post to follow. Told you this was going to be involved.

The paper argues that there are languages where the relatively clean morphological distinctions found in English between anaphors and pronominals are considerably less clear. Thus, there are some languages where some “semantically dependent expressions” (SDE, I return in a moment to explicate this notion) are completely exempt from BT, and there are many where there is no clear morphological distinction between reflexives and pronouns (as there appears to be in English). None of this is news. In fact, as CHY notes, and as has been known for quite a while even to typological illiterates like me, there are even languages where the morphological distinction between pronominals and anaphors is hard to discern.[1] In other words, there are languages where the functional surface vocabulary (FSV) relevant for binding does not cleanly reflect the underlying categories relevant for BT. Or, to put this another way, it’s easier to study BT in some languages than in others precisely because in some languages the overt FSV more clearly marks the underlying grammatical distinctions than in other languages. Isn’t this one of the reasons for doing comparative syntax? To find those languages where the system of interest is easiest to study? Does it matter that in some languages the system of interest is muddied by a lousy functional inventory? Not that I can see.[2]

However, CHY’s main argument is that this is a problem for BT and that it shows that UG is not key to understanding binding phenomena and that PoS problems do not support nativist conclusions in the domain of binding. Here’s how CHY puts it (p.3):

UG constitutes a general solution to the problem of the poverty of the stimulus, and would be expected to provide the solution to this dilemma in the realm of Binding, as well as other areas of syntax. If, however, Binding is not determined by UG, and, hence, must be learned by the language learner (as we claim below is in fact the case) then Binding must be learnable pace a priori claims to the contrary. For this reason, Binding constitutes a good test case for the claim that the facts of syntax generally cannot be accounted for without making the assumption that much of our knowledge of syntax is innate (i.e. determined by UG).

So the aim of CHY is to argue that BT facts can do without a UGish BT. However, what CHY actually shows is that it is often hard to fix which overt morphemes/expressions (if any) fall under which parts of BT, i.e. to fix the FSVs of a given language’s G. Thus, the fact that BT does not tell us how to categorize overt morphemes wrt binding categories is taken by CHY to show that a UG BT cannot play a part in solving PoS problems in the domain of binding. In other words, CHY argues that because BT is not by itself a full theory of acquisition (in particular, it does not provide a general account of fixing FSV), it does nothing at all. And this is a very poor argument against UG.

Let me be clear exactly what my criticism of CHY is: nowhere does CHY outline just what the PoS problem in the domain of binding is (as I did in the previous post). It doesn’t discuss how facts like (3) in the post here could be learned purely on the basis of PLD. Nor does CHY argue that the binding principles would not be useful in categorizing morphemes. The CHY argument is entirely based on the observation that the BT categories are often opaque (i.e. that the mapping from overt FSVs to BT relevant categories is often unclear).[3] However, as noted above, this cannot be an argument against the UG nature of BT because the claim that BT is part of UG never assumed that it could solve this problem (all by itself, though it almost surely contributes towards solving it). And a good thing too, for we know that whatever is in UG it had better not be a specification of which language-particular morphemes are subject to BT-A, B or C. If anything is learned, it’s this. So, the CHY conclusions do not, as a matter of logic, argue against the claim that BT is part of FL/UG. Let’s consider some further details.

The CHY argument comes in three parts. Part one is the observation that though many languages have a BT functional vocabulary that clearly separates BT anaphors from BT pronominals (see CHY section 2), this is not universally true.[4] In section 3 CHY notes that there are languages (a version of Javanese) with BT exempt anaphors, expressions that are anaphoric (in one sense) but do not fall under BT at all. CHY takes this to be problematic. In particular, CHY claims that the existence of BT exempt anaphors (p. 7):

…constitutes a serious challenge for UG-based approaches to Binding. The presence in a language of a form that is used anaphorically but which is exempt from the Binding requirements of UG would impose a considerable burden on the child acquiring the language…The existence of UG sanctioned categories for anaphora simplifies learning only if all anaphoric elements (in the non technical sense of “anaphoric” that includes both pronouns and reflexives) are subject to UG principles. Thus, in the context of a system containing a mixture of UG compliant and UG exempt elements, UG principles do not provide a solution to the poverty of stimulus problem. At the least, the distinction between exempt and non-exempt forms must be learned from experience…and it is doubtful that the data is sufficiently structured to make this distinction learnable on the basis of the distributional data that are available to the child.

What’s the problem? Not all forms that are “anaphoric” in the “non-technical sense” (i.e. they are semantic but not syntactic dependents) are anaphoric in the BT sense (i.e. subject to BT-A) and, CHY claims, there is no PLD that can help the LAD figure out if an expression belongs in one category or another. Let’s consider these claims much more closely, for they come close to inviting a form of magical thinking.

First, what is the “non-technical sense” of anaphoric? Linguists have two senses of “anaphoric.” The first is “semantic.” An expression is an anaphoric dependent of another expression if its interpretation presupposes/requires/relies on the interpretation of another. This is what I meant by SDEs above. English contains expressions like this. For example, ‘the others’ is an expression that has no interpretation sans a semantic antecedent (i.e. it must have a semantic antecedent). However, it is not subject to BT-A, as it can have an antecedent outside its own sentence. In fact, non-deictic pronouns (e.g. pronouns whose antecedent sits in a previous sentence) are another example.

(4)  a. Three of the men are wearing tuxedos. The others are sporting jeans.
     b. John ate an ice cream. Mary then kissed him.

In (4a), ‘the others’ depends for its interpretation on ‘three of the men’ in the previous sentence, as does ‘him’ on ‘John’ in (4b). These, then, are semantically anaphoric, but BT-A exempt. Are these then a problem for a UG account of BT-A? Only if categorizing them as exempt from BT-A is problematic. So is it? Is there no available positive PLD that might inform the LAD that despite appearances, these expressions are not subject to BT-A (i.e. that despite being semantic anaphors they are not syntactic anaphors)? Well, how about the data in (4)? Are these too exotic to be considered part of the PLD? Not from where I sit. Of course, I could be wrong, but it is certainly not “doubtful” that such data might be available to the LAD. In fact, I would go further: cross sentential dependencies of the sort in (4) are almost certainly part of the PLD.[5]

Is Javanese any different? Maybe, but as CHY notes (p. 6) the BT exempt anaphor it discusses can have discourse antecedents, in contrast to the BT-A forms that cannot. But this then is exactly the kind of evidence that would advise the LAD against categorizing them as syntactically anaphoric (i.e. subject to BT-A). If so, there is no PoS problem as regards the categorization of these overt expressions into more abstract syntactic categories (i.e. treating them (or not) as FSVs for the abstract grammatical category BT-anaphor).

Conclusion: CHY’s argument here fails. It is certainly right in concluding that the world would be a better place for the LAD if semantic anaphora were an infallible indicator of syntactic anaphora. A world where visible diagnostics are perfect indicators of underlying structure is a nice place (of course, this is no less true for measles and quarks than for BT-A anaphors). But, this does not mean that a world where the two pull apart (i.e. our own) implies that the distinction cannot be acquired on the basis of PLD. It all depends on the PLD and the learner, and CHY does not consider its own cited data in concluding that the PLD is sadly lacking. Is cross-sentential “anaphora” that exotic in the PLD that the LAD would never have access to it? Maybe, but I would like a lot more than simple assertion to that effect before concluding that it is so.

In fact, we can go further. CHY must assume that this categorization can be fixed on the basis of PLD (despite its apparent claim to the contrary). Why so? CHY argues that distinguishing BT-A from BT-A-exempt FSVs is not based on innate features of the LAD. That, in fact, is its main point (which, btw, must be correct given the obvious variation in surface forms). But if it’s not fixed via FL/UG then it must be learned. But to be learned there needs to be evidence that the LAD can use to learn it. Thus, there must be evidence in the PLD relevant to fixing the categorization. I mention this as CHY appears to deny this. Specifically, CHY asserts (p. 7):

At the least, the distinction between exempt and non-exempt forms must be learned from experience…and it is doubtful that the data is sufficiently structured to make this distinction learnable on the basis of the distributional data that are available to the child.

But as a matter of logic, this pair of claims is close to contradictory (that’s being weaselish; IMO it is a contradiction). If the categorization is learned, then the PLD must be “sufficiently structured” to allow it to be learned. And if the PLD is not “sufficiently structured” to allow it to be learned, then it must be innately determined. There is no third alternative. There is no learning fairy that fits neatly between these possibilities.[6] Luckily, relevant data appears to be plausibly accessible, and so there is no clear PoS problem as regards binning potential “anaphoric” FSVs into different BT relevant categories.

Last point: recall that even were it very hard to fix BT exemption on the basis of PLD, it would not argue against the classical BT being part of UG. Why? Because CHY does not address how BT-A compliant FSVs acquire all of their distributional properties. Specifically, CHY does not address how native speakers acquire competence wrt the data in (3) (see previous post). Recall that this is the PoS problem that BT was developed to address. And in this case, as we outlined in the previous post, there is a clear PoS problem. Thus, the LAD cannot inductively acquire knowledge of facts like (3) (in earlier post) as there is no relevant data in the PLD for fixing this knowledge. This argument stands regardless of how the problems with BT-A exempt expressions are resolved. In short, it is a very odd criticism of any theory (BT in this case) to say that because it does not solve a problem it was never intended to address, it also does not solve the problem it was constructed to address. It is odder still to conclude that because some other theory (not yet provided, I may add) solves a problem BT was not constructed to address, the same solution obviously solves the problem that BT did address.

Conclusion: this argument comes nowhere close to showing anything about the UG status of BT. And, for the record, I don’t believe that CHY has shown that categorizing FSVs as BT exempt cannot be inferred from a judicious use of the PLD (and, as I noted above, CHY can’t really assume this either).[7]

The next and last post addresses CHY’s second argument.

[1] The most well-known cases involve languages with so called “long distance anaphors.” Are these really anaphors or just bound pronouns? What’s the difference? Are they both or neither? Tough questions. But does linguistic theory need to provide strict answers? Or does it only need to provide ways of classifying things to the degree that they can be so classified? Can you spot the rhetorical question?
[2] Nor is looking for the easiest window into the mechanism limited to linguists. Ask your favorite neuroscientist about squid axons sometime and see why they were a research favorite for such a long time.
[3] For example, on p 9 CHY states:

…the division of anaphora into reflexives and pronouns cannot be simply a matter of compliance with UG principles…[this] suggest strongly that the pattern modeled in Binding Theory is not primarily due to the interaction among principles of UG…

This suggests that CHY (wrongly) understands BT as aiming to explain how FSV gets mapped into BT categories. And CHY is right. BT does not do this. But it does other things, like explain the pattern in (3), which is not attested in the PLD.

[4] Oddly, IMO, CHY groups English and Chinese together as well-behaved languages wrt BT. However, if English is the BT poster child (which it really shouldn’t be, see below) then Chinese makes things less transparent. So, for many speakers, ‘ziji’ can serve both as the overt form of the reflexive and of the long distance anaphor. True, ‘ta ziji’ distributes much like ‘himself,’ but Chinese has a simple ‘self’ form that English does not have and this might well make the categorization problem in Chinese harder than it is in English, at least for the canonical data. Of course, not even English is that well behaved, as the well-known facts about picture NPs indicate.
[5] JL, my friendly local acquisitionist, tells me that I am correct in thinking that such data are available in the PLD.
[6] There is a third option, but it is irrelevant to CHY’s concerns. It could be that some feature is not specified by PLD or intrinsically fixed by LAD. In such a case, the “parameter” could be random in the population (see Han, Lidz and Musolino LI (2007) for a worked out example in Korean “dialects”). However, even in this case, there is an innate learning principle at work: if no data relevant to P then flip a coin and set P among UG available options. As I noted, this possibility is not relevant for the CHY argument.
[7] CHY argues against the hypothesis that the BT exempt anaphor is ambiguously either a local anaphor or a pronoun depending on context. I sort of liked this proposal. CHY argues instead that the exempt anaphor is under-specified wrt being an anaphor and a pronominal. CHY proposes that this under-specification allows such anaphors to be simultaneously interpreted as anaphors and pronouns (as opposed to being one or the other in any given context). The argument against the ambiguity hypothesis involves strict vs sloppy readings under ellipsis. CHY assumes that BT-A anaphors only license sloppy readings under ellipsis. It further notes that the BT exempt anaphor in Javanese it discusses allows strict readings even when the anaphor in the ellipsis licensing antecedent would be locally bound (i.e. even when the ambiguity thesis should treat the relevant structure as one of unambiguous BT-A binding).

The form of this argument is fine (or sorta fine: it relies on the stipulation that expressions that are underdetermined wrt anaphor and pronominal features are simultaneously both anaphoric and pronominal semantically. So far as I know, this finely-tuned assumption does not follow from anything else about binding or under-specification, and so is ad hoc.). However, the premise seems to me empirically ill founded. If we assume, as CHY does, that English is well behaved wrt BT-A, then we predict that reflexive binding should never license strict readings under ellipsis in English. However, this is false (or at least there is some challenging counter-evidence). One standard case where the strict reading is acceptable is in (i):
(i)             John1 defended himself1 more competently than his lawyer did (defend him1)
In (i) the “defend John” reading is fine. For me, ditto for sentences like (ii) and (iii).  
(ii)           John1 defended himself1 ably in court. His1 lawyers did not (defend him1 ably)
(iii)          John1 admires himself1 greatly. Nobody else does (admire him1).
Thus, the diagnostic tool CHY uses to argue against the ambiguity thesis is not correct in general as paradigmatic reflexive antecedents can license strict readings in ellipsis sites.

Why? Well, this requires some theory of ellipsis (which CHY does not provide). The technical issue of relevance will be what licenses ellipsis. One standard theory sees ellipsis as deletion under identity. What are the relevant identity parameters? If morpho-phonological identity is one such, then the fact that the same FSV in Javanese is ambiguous between a reflexive and a pronoun could allow the strict reading to be readily available: the expression interpreted as a pronoun in the ellipsis site would be morpho-phonologically identical to the expression interpreted as a reflexive in the antecedent clause. It is well known that such conditions are grammatically relevant (e.g. ATB and Parasitic Gap constructions often require morphological case identity to be licit).
            Last point: it is worth observing that the strict/sloppy dichotomy CHY uses as a diagnostic is not part of BT. In fact, BT is mum concerning these matters. It is a strictly empirical question, which CHY argues one way. My point is not that CHY is wrong, but that the premise is dubious and this leaves the ambiguity hypothesis alive and kicking.

1 comment:

  1. Another similar case of what Norbert calls a semantically-dependent element (SDE) not being subject to BT-A or BT-B is epithets, which are subject to – of all things – BT-C:

    (1) John[i]'s mother thinks that Mary should support the idiot[i].

    (2) *John[i] thinks that Mary should support the idiot[i].

    Again, this echoes a point Norbert was making (unless I misunderstood it): the mapping between "is/isn't a semantic dependent" and "is/isn't subject to particular binding condition X" is hardly transparent. This is not news, not even to people who only look at English (not that I would ever support such conduct...).