Monday, November 16, 2015

What does typology teach us about FL?

I have been thinking lately about the following question: What does comparative/typological (C/T) study contribute to our understanding of FL/UG? Observe that I am taking it as obvious that GG takes the structure of FL/UG to be the proper object of study and, as a result, that any linguistic research project must ultimately be justified by the light it can shed on the fine structure of this mental organ. So, the question: what does studying C/T bring to the FL/UG table?

Interestingly, the question will sound silly to many.  After all, the general consensus is that one cannot reasonably study Universal Grammar without studying the specific Gs of lots of different languages, the more the better. Many vocal critics of GG complain that GG fails precisely because it has investigated too narrow a range of languages and has, thereby, been taken in by many false universals.

Most GGers agree with the spirit of this criticism. How so? Well, the critics accuse GG of being English- or Euro-centric and GGers tend to reflexively drop into a defensive crouch by disputing the accuracy of the accusation. The GG response is that GG has as a matter of fact studied a very wide variety of languages from different families and eras. In other words, the counterargument is that critics are wrong because GG is already doing what they demand.

The GG reply is absolutely accurate. However, it obscures a debatable assumption, one that indicates agreement with the spirit of the criticism: that only or primarily the study of a wide variety of typologically diverse languages can ground GG conclusions that aspire to universal relevance. In other words, both GG and its critics take the intensive study of typology and variation to be a conceptually necessary part of an empirically successful UG project.

I want to pick at this assumption in what follows. I have nothing against C/T inquiry.[1] Some good friends engage in it. I enjoy reading it. However, I want to put my narrow prejudices aside here in order to try and understand exactly what C/T work teaches us about FL/UG. Is the tacit (apparently widely accepted) assumption that C/T work is essential for (or at least, practically indispensable for or very conducive to) uncovering the structure of FL/UG correct?

Let me not be coy. I actually don’t think it is necessary, though I am ready to believe that C/T inquiry has been a practical and useful way of proceeding to investigate FL/UG. To grease the skids of this argument, let me remind you that most of biology is built on the study of a rather small number of organisms (E. coli, C. elegans, fruit flies, mice). I have rarely heard the argument made that one can’t make general claims about the basic mechanisms of biology because only a very few organisms have been intensively studied. If this is so for biology, why should the study of FL/UG be any different? Why should bears be barely (sorry I couldn’t help it) relevant for biologists but Belarusian be indispensable for linguistics? Is there more to this than just Greenbergian sentiments (which, we can all agree, should be generally resisted)?

So is C/T work necessary? I don’t think it is. In fact, I personally believe that POS investigations (and acquisition studies more generally (though these are often very hard to do right)) are more directly revealing of FL/UG structure. A POS argument if correctly deployed (i.e. well grounded empirically) tells us more about what structure FL/UG must have than surveys (even wide ones) of different Gs do. Logically, this seems obvious. Why? Because POS arguments are impossibility arguments (see here) whereas surveys, even ones that cast a wide linguistic net, are empirically contingent on the samples surveyed. The problem with POS reasoning is not the potential payoff or the logic but the difficulty of doing it well. In particular, it is harder than I would like to always specify the nature of the relevant PLD (e.g. is only child directed speech relevant? Is PLD degree 0+?). However, when carefully done (i.e. when we can fix the relevant PLD sufficiently well), the conclusions of a POS are close to definitive. Not so for cross-linguistic surveys.[2]

Assume I am right (I know you don’t, but humor me). Nothing I’ve said gainsays the possibility that C/T inquiry is a very effective way of studying FL/UG, even if it is not necessary. So, assuming it is an effective way of studying FL/UG, what exactly does C/T inquiry bring to the FL/UG table?

I can think of three ways that C/T work could illuminate the structure of FL/UG.

First, C/T inquiry can suggest candidate universals. Second, C/T investigations can help sharpen our understanding of the extant universals. Third, it can adumbrate the range of Gish variation, which will constrain the reach of possible universal principles. Let me discuss each point in turn.

First, C/T work as a source of candidate universals. Though this is logically possible, as a matter of fact, it’s my impression that this has not been where plausible candidates have come from. From where I sit (but I concede that this might be a skewed perspective) most (virtually all?) of the candidates have come from the intensive study of a pretty small number of languages. If the list I provided here is roughly comprehensive, then many, if not most, of these were “discovered” using a pretty small range of the possible Gs out there. This is indeed often mooted as a problem for these purported universals. However, as I’ve mentioned tiresomely before, this critique often rests on a confusion of Chomsky universals with their Greenbergian eponymous doubles.

Relevantly, many of these candidate universals predate the age of intensive C/T study (say dating from the late 70s and early 80s). Not all of them, but quite a few. Indeed, let me (as usual) go a little further: there have been relatively few new candidate universals proposed over the last 20 years, despite the continually increasing investigation of more and more different Gs. That suggests to me that despite the possibility that many of our universals could have been inductively discovered by rummaging through myriad different Gs, in fact this is not what actually took place.[3] Rather, as in biology, we learned a lot by intensively studying a small number of Gs and via (sometimes inchoate) POS reasoning, plausibly concluded that what we found in English is effectively a universal feature of FL/UG. This brings us to the second way that C/T inquiry is useful. Let’s turn to this now. 

The second way that C/T inquiry has contributed to the understanding of FL/UG is that it has allowed us (i) to further empirically ground the universals discovered on the basis of a narrow range of studied languages and, (ii) much more importantly, to refine these universals. So, for example, Ross discovers island phenomena in languages like English and proposes them as due to the inherent structure of FL/UG. Chomsky comes along and develops a theory of islands that proposes that FL/UG computations are bounded (i.e. must take place in bounded domains) and that apparent long distance dependencies are in fact the products of smaller successive cyclic dependencies that respect these bounds. C/T work then comes along and refines this basic idea further. So (i) Rizzi notes that wh-islands are variable (and multiple-wh languages like Romanian show that there is more than one way to apparently violate wh-islands), (ii) Huang suggests that islands need to include adjuncts and subjects, (iii) work on the East Asian languages suggests that we need to distinguish island effects from ECP effects despite their structural similarity, (iv) studies of wh-in-situ languages allow us to investigate the bounding requirements on overt and covert movement, and (v) C/T data from Irish, Chamorro, French and Spanish provide direct evidence for successive cyclic movement even absent islands.

There are many other examples of C/T thinking purifying candidate universals. Another favorite example of mine is how the anaphor agreement effect (investigated by Rizzi and Woolford) shows that Principle A cannot be the last word on anaphor binding (see Omer’s discussion here). This effect strongly argues that anaphor licensing is not just a matter of binding domain size, as the classical GB binding theory proposes.[4] So, finding that nominative anaphors cannot be bound in Icelandic changes the way we should think about the basic form of the binding theory. In other words, considering how binding operates in a language with different case and agreement profiles from English has proven to be very informative about our basic understanding of binding principles.[5]

However, though I think this work has been great (and a great resource at parties to impress friends and family), it is worth noting that the range of relevant languages needed for the refinements has been relatively small (what would we do without Icelandic!). This said, C/T work has made apparent the wide range of apparently different surface phenomena that fall into the same general underlying patterns (this is especially true of the rich investigations on case/agreement phenomena). It has also helped refine our understanding by investigating the properties of languages whose Gs make morpho-syntactically explicit what is less surface evident in other languages. So for example, the properties of inverse agreement (and hence defective intervention effects) are easier to study in languages like Icelandic, where one finds overt post verbal nominatives, than in English, where there is relatively little useful morphology to track.[6] The analogue of this work in (other) areas of biology is the use of big fat and easily manipulated squid axons (rather than dainty, small and smooshy mice axons) to study neuronal conduction.

Another instance of the same thing comes from the great benefits of C/T work in identifying languages where UG principles of interest leave deeper overt footprints than in others (sometimes very very deep (e.g. inverse control, IMO)). There is no question that the effects of some principles are hard to find in some languages (e.g. island effects in languages which don’t tend to move things around much, or binding effects in Malay-2 (see here)). And there is no doubt that sometimes languages give us extremely good evidence of what is largely theoretical inference in others. Thus, as mentioned, the morphological effects of successive cyclic movement in Irish or Chamorro or verb inversion in French and Spanish make evident at the surface the successive cyclic movement that FL/UG infers from, among other things, island effects. So, there is no question that C/T research has helped ground many FL/UG universals, and has even provided striking evidence for their truth. However (and maybe this is the theorist in me talking), it is surprising how many of these refinements and how much of this evidence build on proposals with a still very narrow C/T basis. What made the C-agreement data interesting, for example, is that it provided remarkably clear evidence for something that we already had pretty good indirect evidence for (e.g. islands are already pretty good evidence for successive cyclic movement in a subjacency account). However, I don’t want to downplay the contributions of C/T work here. It has been instrumental in grounding lots of conclusions motivated on pretty indirect theoretical grounds, and direct evidence is always a plus. What I want to emphasize is that more often than not, this additional evidence has buttressed conclusions reached on theoretical (rather than inductive) grounds, rather than challenging them.

This leaves the third way that C/T work can be useful: it may not propose but it can dispose. It can help identify the limits of universalist ambitions. I actually think that this is much harder to do than is often assumed. I have recently discussed an (IMO unsuccessful) attempt to do this for Binding Theory (here and here), and I have elsewhere discussed the C/T work on islands and their implications for a UG theory of bounding (here). Here too I have argued that standard attempts to discredit universal claims regarding islands have fallen short and that the (more “suspect”) POS reasoning has proven far more reliable. So, I don't believe that C/T work has, by and large, been successful at clearly debunking most of the standard universals.

However, it has been important in identifying the considerable distance that can lie between a universal underlying principle and its surface expressions. Individual Gs must map underlying principles to surface forms and Gs must reflect this possible variation. Consequently, finding relevant examples thereof sets up interesting acquisition problems (both real time and logical) to be solved. Or, to say this another way, one potential value of C/T work is in identifying something to explain given FL/UG. C/T work can provide the empirical groundwork for studying how FL/UG is used to build Gs, and this can have the effect of forcing us to revise our theories of FL/UG.[7]  Let me explain.

The working GG conceit is that the LAD uses FL and its UG principles to acquire Gs on the basis of PLD. To be empirically adequate an FL/UG must allow for the derivation of different Gs (ones that respect the observed surface properties). So, one way to study FL/UG is to investigate differing languages and ask how their Gs (i.e. ones with different surface properties) could be fixed on the basis of available PLD. On this view, the variation C/T discovers is not interesting in itself but is interesting because it empirically identifies an acquisition problem: how is this variation acquired? And this problem has direct bearing on the structure of FL/UG. Of course, this does not mean that any variation implies a difference in FL/UG. There is more to actual acquisition than FL/UG. However, the problem of understanding how variation arises given FL/UG clearly bears on what we take to be in FL/UG.[8]

And this is not merely a possibility. Lots of work on historical change from the mid 1980s onwards can be, and was, seen in this light (e.g. Lightfoot, Roberts, Berwick and Niyogi). Looking for concomitant changes in Gs was used to shed light on the structure of FL/UG parameter space. The variation, in other words, was understood to tell us something about the internal structure of FL/UG. It is unclear to me how many GGers still believe in this view of parameters (see here and here). However, the logic of using G change to probe the structure of FL/UG is impeccable. And there is no reason to limit the logic to historical variation. It can apply just as well to C/T work on synchronically different Gs, closely related but different dialects, and more.

This said, it is my impression that this is not what most C/T work actually aspires to anymore, and this is because most C/T research is not understood in the larger context of Plato’s Problem or how Gs are acquired by LADs in real time. In other words, C/T work is not understood as a first step towards the study of FL/UG. This is unfortunate, for this is an obvious way of using C/T results to study the structure of FL/UG. Why then is this not being done? In fact, why does it not even seem to be on the C/T research radar?

I have a hunch that will likely displease you. I believe that many C/T researchers either don’t actually care to study FL/UG and/or they understand universals in Greenbergian terms. Both are products of the same conception: the idea that linguistics studies languages, not FL. Given this view, C/T work is what linguists should do for the simple reason that C/T work investigates languages and that’s what linguistics studies. We should recognize that this is contrary to the founding conception of modern linguistics. Chomsky’s big idea was to shift the focus of study from languages to the underlying capacity for language (i.e. FL/UG). Languages on this conception are not the objects of inquiry. FL is. Nor are Greenberg universals what we are looking for. We are looking for Chomsky universals (i.e. the basic structural properties of FL). Of course, C/T work might advance this investigation. But the supposition that it obviously does so needs argumentation. So let’s have some, and to start the ball rolling let me ask you: how does C/T work illuminate the structure of FL/UG? What are its greatest successes? Should we expect further illumination? Given the prevalence of the activity, it should be easy to find convincing answers to these questions.

[1] I will treat the study of variation and typological study as effectively the same things. I also think that historical change falls into the same group. Why study any of these?
[2] Aside from the fact that induction over small Ns can be hazardous (and right now the actual number of Gs surveyed is pretty small given the class of possible Gs), most languages differ from English in having only a small number of investigators. Curiously, this was also a problem in early modern biology. Max Delbruck decreed that everyone would work on E. coli in order to make sure that the biology research talent did not spread itself too thin. This is also a problem within a small field like linguistics. It would be nice if as many people worked on any other language as work on English. But this is impossible. This is one reason why English appears to be so grammatically exotic; the more people work on a language the more idiosyncratic it appears to be. This is not to disparage C/T research, but only to observe the obvious, viz. that person-power matters.
[3] Why has the discovery of new universals slowed down (if it has, recall this is my impression)? One hopeful possibility is that we’ve found more or less all of them. This has important implications for theoretical work if it is true, something that I hope to discuss at some future point.

[4] Though, as everyone knows, the GB binding theory as revised in Knowledge of Language treats the unacceptability of *John thinks himself/heself is tall as not a binding effect but an ECP effect. The anaphor-agreement effect suggests that this too is incorrect, as does the acceptability of quirky anaphoric subjects in Icelandic.
[5] I proposed one possible reinterpretation of binding theory based in part on such data here.  I cannot claim that the proposal has met with wide acceptance and so I only mention it for the delectation of the morbidly curious.
[6] One great feature of overt morphology is that it often allows for crisp speaker acceptability judgments. As this has been syntax’s basic empirical fodder, crisp judgments rock.
[7] My colleague Jeff Lidz is a master of this. Take a look at some of his papers. Omer Preminger’s recent NELS invited address does something similar from a more analytical perspective. I have other favorite practitioners of this art including Bob Berwick, Charles Yang, Ken Wexler, Elan Dresher, Janet Fodor, Stephen Crain, Steve Pinker, and this does not exhaust the list. Though it does exhaust my powers of immediate short term recall.
[8] Things are, of course, more complex. FL/UG cannot explain acquisition all by its lonesome; we also need (at least) a learning theory. Charles Yang and Jeff Lidz provide good paradigms of how to combine FL/UG and learning theory to investigate each. I urge you to take a look.


  1. I can only speak for my own line of work, but typology is a lot more important to my current interests than what is going on in individual languages (including their acquisition). I think this point is just a variation of your third argument in defense of C/T, but since it comes from a different subfield with very different methodology it might be of interest nonetheless:

    The study of individual languages was very useful in establishing lower bounds on generative capacity (Swiss German for supra-CFL, Yoruba and a few others for supra-MCFL), but it is very unlikely that we will find any more constructions that would push us even higher up the Chomsky hierarchy. And that's not restricted to weak generative capacity, we're also on pretty safe ground with respect to strong generative capacity. The Minimalist grammar framework as it stands right now can handle pretty much anything linguists use in their analyses while keeping expressivity in check and having very interesting formal properties.

    To a certain extent, this trivializes the questions that dominate the analysis of individual languages. For example, Korean allows case endings to be dropped on the last DP in fragment answers. The standard view would be that this is a peculiar property that tells us something about how FL operates. From the computational perspective outlined above this phenomenon is entirely unsurprising because the formalism is already capable of generating such languages (any formalism with subcategorization can do this). Assuming free variation, some language is bound to display this behavior, so there's nothing to see here... if that were the end of the story.

    The interesting thing is that not all logical possibilities are attested cross-linguistically. I haven't been able to fully map out the space of attested options for case dropping yet, but it's already clear that there are several typological gaps: drop only the penultimate case marker, only the antepenultimate case marker, every case marker that is preceded by an even number of DPs, and so on. Those are also things that the formalism is capable of (once again, any formalism with subcategorization can do that), so their non-existence does require explanation. Those may take the form of certain algebraic properties, showing that the typological gaps are more complex in some sense, or deriving them from independently motivated substantive universals. But whatever the answer, it is only due to the typology that the construction becomes interesting.
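To make the case-dropping patterns above concrete, here is a minimal Python sketch. It is only a toy illustration under stated assumptions, not a Minimalist grammar and not an analysis of any language: the names `realize`, `drop_last`, `drop_penultimate`, `drop_antepenultimate` and `drop_even_preceded` are hypothetical, and the labels `DP1`, `k1` and so on are abstract placeholders rather than real data. Each pattern is just a function from the number of DPs to the set of positions whose case marker is dropped, which shows that the unattested patterns are no harder to state than the attested one.

```python
from typing import Callable, List, Set, Tuple

# A DP is a (noun, case marker) pair; both are abstract placeholders.
DP = Tuple[str, str]
# A drop pattern maps the number of DPs to the positions that lose their marker.
Pattern = Callable[[int], Set[int]]


def drop_last(n: int) -> Set[int]:
    """Drop the case marker on the last DP only."""
    return {n - 1} if n >= 1 else set()


def drop_penultimate(n: int) -> Set[int]:
    """Drop the case marker on the second-to-last DP only."""
    return {n - 2} if n >= 2 else set()


def drop_antepenultimate(n: int) -> Set[int]:
    """Drop the case marker on the third-to-last DP only."""
    return {n - 3} if n >= 3 else set()


def drop_even_preceded(n: int) -> Set[int]:
    """Drop the marker on every DP preceded by an even number of DPs."""
    return {i for i in range(n) if i % 2 == 0}


def realize(dps: List[DP], pattern: Pattern) -> str:
    """Spell out a sequence of DPs, omitting markers the pattern drops."""
    dropped = pattern(len(dps))
    return " ".join(
        noun if i in dropped else f"{noun}-{case}"
        for i, (noun, case) in enumerate(dps)
    )


if __name__ == "__main__":
    example = [("DP1", "k1"), ("DP2", "k2"), ("DP3", "k3")]
    for pattern in (drop_last, drop_penultimate,
                    drop_antepenultimate, drop_even_preceded):
        print(pattern.__name__, "->", realize(example, pattern))
```

All four functions are equally easy to write down, so nothing in this toy setup distinguishes the pattern described for Korean from the ones listed as gaps; that asymmetry is what the typology adds.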

    Bottom line: the study of individual languages reveals what's possible, but we also need to know what's impossible. Typology does that. POS arguments can do it too, but it is far from obvious that both cover the same ground. In addition, POS arguments are much more theory-laden than simply mapping out the space of typological (surface) variation.

    For my own research, work emphasizing typology (e.g. Inkelas' Interplay of Morphology and Phonology) has proven a lot more useful than the standard approach (e.g. Kramer's Morphosyntax of Gender; not a bad book, just not what I needed). So for my purposes it doesn't matter whether the C/T community is interested in universals --- as long as their work contains succinct tables contrasting which logically possible options are attested/unattested, I'm happy.

    1. I largely agree but:
      "The interesting thing is that not all logical possibilities are attested cross-linguistically"
      So here's the question: to what degree are these "gaps" principled? One way of deciding whether a gap in a single G is principled is by seeing if the facts hold in other Gs (hence the utility you note of C/T work). Say we find a gap across Gs. One possibility is that this is principled and that Gs will not tolerate these (or only tolerate them if there is lots of PLD indicating their presence). Another option is that the gap is "accidental." If I understand some of the work by Heinz and Idsardi, they are suggesting that many gaps are phonologically accidental. Is this not also a syntactic option? Now, conclusions of PoS reasoning cannot be accidental in this way. If something is not acquirable then it will not be acquired. Such gaps have principled explanations. That's why I like this kind of explanation.

      Second, as a matter of fact, we have not really surveyed that many languages. So, if we think that the universals we have identified are largely correct (and maybe even largely exhaust their number) then as a matter of fact the assumption that we need to survey a large number of languages to ground universals (or, more strongly, that methodologically speaking UG can ONLY be grounded if every language is surveyed) needs rethinking. And if one compares linguistics to other biological enterprises, it is not clear what drives this assumption. That was a main point of my ruminations.

      Last point: "the study of individual languages reveals what's possible, but we also need to know what's impossible. Typology does that." Does it? That's the point at issue. And the idea that one can limn the structure of FL/UG without getting theory laden very quickly is, IMO, a myth, and one with baleful consequences.

    2. There's a lot of minor points I could quibble with, but the real issue here, I believe, is factorization and how one thinks about it.

      The case Jeff and Bill make for phonology --- and which does indeed hold for syntax, too --- is that typological gaps may be due to computational properties but can also arise from other factors such as learnability, processing, diachronic change, or just quirks of human nature (the shape of the articulatory apparatus for phonology, our anthropocentric view of the world for animacy effects, and so on). If we were to lump that all together, we would have a hot mess on our hands. Factorization breaks it up into distinct properties each one of which can be given an appealing explanation. But typological data still is something worth explaining --- I've done a fair share of work in computational phonology and morphosyntax, and it has always been guided by typology, not POS.

      Also, factorization doesn't mean that the "non-core properties" rooted in, say, processing and learnability are in any sense less about language. This is a methodological distinction we make, but it doesn't necessarily correspond to an ontological one (cf. Marr's levels; and before somebody complains, that does not entail that FL is just a theoretical construct without cognitive reality). So I find the argument that typological gaps need not be FL-gaps and thus aren't interesting rather weak because it depends on one's definition of FL and the status one grants it in factorization.

      The fact of the matter is that there are typological gaps that are not predicted by the formalism and unlikely to be due to an insufficient sampling of the language space furnished by FL. In my book, finding explanations for these gaps is an interesting and rewarding program irrespective of whether the explanation draws on the grammar, learnability, processing, or even functional notions such as code optimization for communication.

      I interpret your interpreting my use of typology as an attempt to "limn the structure of FL/UG without getting theory laden" as evidence of this fundamental disagreement: I do not want to limn the structure of FL/UG, I want to limn the structure of the whole thing but choose to break it up into smaller parts for methodological reasons. That will sometimes lead to indeterminism --- a specific typological gap can have many explanations and thus may not uniquely inform the FL/UG component of the factorization, but that doesn't make them spurious or unprincipled.

      That said, I am not going to argue for the opposite extreme and regard linguistic proposals that aren't rooted in broad typological work as dubious or somehow deficient. I've already given examples of claims that don't need evidence from more than one language --- FL is capable of generating TALs and PMCFLs --- and the same can be done for other factors such as learnability, processing, and so on. And POS arguments can also shed some light on these. I understand that there are some sociological factors motivating your argument, but from a purely scientific perspective I see little point in pitting POS arguments and typology against each other. Just use whatever gets the job done.

    3. "I understand that there are some sociological factors motivating your argument, but from a purely scientific perspective I see little point in pitting POS arguments and typology against each other. Just use whatever gets the job done."

      Let's stipulate that in science one uses whatever is usable to get on with it. Let's stipulate that C/T work has proven to be a popular and useful way to investigate linguistic structure. Let's stipulate that Hornstein is wrong to suggest otherwise. Ok, now with that out of the way, let's discuss the relevant issue, or at least the one that I wanted to highlight.

      The general view is that C/T work is not one way of investigating UG but the best way. Why is it the best? Because UG is about grammatical universals, and so the most direct way of studying them is to investigate the structure of as many Gs as possible. Are all swans white? Only one way to tell: look at the swans. That, I believe, is the default view. I am suggesting that the logic is wanting. This does not imply that doing C/T work is worthless or that you should stop. I am suggesting that the logic is hardly airtight and that the presupposition that this is the best way to proceed needs justifying.

      Why do I think this? Well, because such C/T investigations cannot deliver on what is promised. They can't explain why Gs must have the structure they in fact have. They cannot do this for the reasons I outlined. POS arguments can deliver this if they are done right, for they do not rely on the logic of surveys but on the necessities associated with induction. So, POS arguments can deliver modal claims whereas surveys, even extensive ones (which we don't actually currently have), cannot. Does this make surveys useless? No. Does it highlight a limitation if one's interest is in FL/UG? I think so.

      The problem comes up with gaps in the paradigms. If all Gs have a property, or no Gs do, that suggests FL's influence. But the gap problem is not trivial given that we don't know at this moment what range of possible Gs FL/UG allows. And so we don't know how to evaluate gaps. This is not so for POS arguments. If done well they give principled accounts of gaps. The problem is doing them well.

      Again, this is not to claim that C/T work is pointless. It is meant to highlight a feature of the logic that is often unacknowledged. Why is it unacknowledged? Well, you will know the answer: Greenbergism! And I really do want us to be aware of its reach.

      So, if there was a hidden agenda it was to expose the insidious influence of Greenbergism within GG, not to disparage C/T work. The aim was to explore the logic, not make judgments.

    4. I must be particularly dense today, but there's still a few things that are unclear to me, and I think they once again have to do with whether we refer to the same thing by FL/UG.

      1) As far as I'm concerned, we do know the bounds of FL/UG. My working assumption is that the MG formalism is mostly on the right track, so every natural language has to be in the class of languages defined by that formalism. I can make this assumption because FL/UG to me refers to part of the factorization (fUG) and thus isn't necessarily the same as the cognitive object UG (cUG). Typological gaps with respect to fUG are perfectly well-defined objects. Gaps are indeed hard to interpret once one considers the full system, i.e. the intersection of fUG, learnability, processing requirements, and so on. That's why typology works well for fUG but does not neatly carry over to cUG. Which takes me to my second point.

      2) From the factorization perspective, a POS argument builds on two modules, fUG and the learning algorithm. Depending on the choice of algorithm, the intersection of fUG-definable languages and learnable languages can carve out very different language classes. A more restricted learning algorithm affords you more leeway for fUG, and the other way round, you can shift the workload between the two components to some extent. So POS arguments are more complicated for fUG. For the same reason, it's not clear to me why POS arguments should apply to cUG more cleanly than typological gaps --- the problem of reasoning from fUG to cUG stays the same.
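The trade-off between the two modules can be sketched with plain sets. This is a toy Python illustration under loud assumptions: languages are reduced to opaque labels (`L1` to `L5`), the function name `attested_space` and both factorizations are hypothetical, and real language classes are infinite and not enumerable like this. The point is only that two different ways of dividing the work between fUG and the learner can carve out the same intersection.

```python
from typing import FrozenSet

Language = str  # an opaque label standing in for a whole language


def attested_space(fug: FrozenSet[Language],
                   learnable: FrozenSet[Language]) -> FrozenSet[Language]:
    """Languages that are both fUG-definable and learnable."""
    return fug & learnable


# Factorization A (hypothetical): restrictive fUG, permissive learner.
fug_a = frozenset({"L1", "L2", "L3"})
learnable_a = frozenset({"L1", "L2", "L4", "L5"})

# Factorization B (hypothetical): permissive fUG, restrictive learner.
fug_b = frozenset({"L1", "L2", "L3", "L4", "L5"})
learnable_b = frozenset({"L1", "L2"})

space_a = attested_space(fug_a, learnable_a)
space_b = attested_space(fug_b, learnable_b)

if __name__ == "__main__":
    print(sorted(space_a), sorted(space_b), space_a == space_b)
```

Because both factorizations yield the same set, observing that set alone does not tell us which component excludes `L3`, `L4` or `L5`; that is the sense in which a POS argument depends on the choice of learning algorithm as well as on fUG.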

      I suppose both points are somewhat off-topic since the linguists entertaining the arguments you're questioning are unlikely to accept my fUG-cUG distinction. But I think there are good reasons to make that distinction, and if one does, the logic of the argument changes quite a bit imho.

    5. @Thomas: I'm not sure that I entirely understand your distinction between fUG and cUG, but, to the extent that I can infer what you mean, I'm guessing that this distinction (which I would indeed reject) was the basis for my (and Omer's) disagreement with you in the comment thread on Another Follow-Up on Athens.

      Anyway, don't mean to distract anyone here. I'm looking forward to Norbert's response, but I just thought I'd mention it, since I think it might be relevant, at least if I understand your distinction correctly.

    6. Yeah, I was thinking the same thing as I was typing my reply. That's one of the nice things about FoL discussions, they bring out differences in our underlying assumptions that we otherwise would be unaware of.

      I think the fUG-cUG distinction is an important one to make as it acknowledges that even though our theories describe a cognitively real object, the way they carve up that object may not directly match the cognitive reality.

      For example, it is perfectly fine to posit a specific fUG and a specific parsing model, with the intersection of the two carving out some superclass of all possible languages. But it is of course conceivable that cUG is an abstraction of the parser in the sense of Marr, which in turn is an abstraction of something even more complicated. In this case we cannot simply identify fUG with cUG and the parsing model with the human parser, for that would identify two formally distinct objects that carve out incomparable language classes with the same cognitive object.

      Maybe the following analogy by Andras Kornai is helpful: a dime is two nickels, but it does not consist of two nickels.

    7. @Thomas
      I'm not sure I understand it either. However, maybe the following might help. When it comes to theories of FL/UG I am a simple-minded realist. The aim is to describe THAT system that we IN FACT have. Now, this can be a complicated process with all the usual caveats, but that is the aim. Greenbergers do not have this aim, for their view lives in abstraction from the mental seat of linguistic capacity. Their aim is to describe regularities across languages, which they take to be real objects. There is a version of this, call it Greenberg at one remove, that wants to identify the regularities of Gs, what they all have in common. This gets closer to the GG enterprise as I view it, but not all the way. I am interested in the features of FL/UG. If all Gs display some regularity, then one reason for this is that they display it because they are all products of FL and FL has left overt fingerprints on these Gs. That's possible and it motivates the kind of C/T work I was discussing. However, it's worth noting that the argument is hardly airtight (which does not mean that we should not do it!). As an argument FORM, it has problems. I noted that POS arguments do not have THAT problem. You note, as have others, that they might have other ones. True. Now to your distinction:

      "I think the fUG-cUG distinction is an important one to make as it acknowledges that even though our theories describe a cognitively real object, the way they carve up that object may not directly match the cognitive reality."

      Here's my reaction. Given my realism wrt FL/UG I am not sure that I find the distinction relevant. If the carving does not match the reality then there is something wrong with the carving. This does not mean that it might not be useful to nonetheless investigate carvings we know to be false. It might be, it is done all the time, and it might be very helpful. But if it does not carve right then it is no place to stop. Two nickels are not a dime. They can often do what a dime can (buy .10 worth of something (can anything today be bought for .10?)) but they cannot buy other stuff. For example, only a quarter can buy 3 minutes of air at my gas station. Two dimes and a nickel will purchase you no time at all unless you use them to get a quarter. So, make the distinction. Maybe it is important. But in the end it is not the one that I was thinking about (I am actually not sure I get it, btw) for it appears to (happily) fail the realism constraint. I think that there is an FL with UG features. It's this thing we want to describe.

    8. Well, I think the C/T vs POS debate is mostly settled, but let me make one more attempt at clarifying my distinction between fUG and cUG. Here's yet another example (this time starting with the learning algorithm instead of the parser):

      A learning algorithm already has a concept space of possible languages baked into its description. So cUG is cognitively real in the sense that --- at the very least --- it is what is prebaked into the cognitively real learning algorithm. But there might not actually be anything like a cognitive carving into subcomponents: all that's encoded in your genome (through some mysterious means) is the full learning algorithm, and that's also what's computed in your brain (in fact, something even more complicated that also has the parser baked in).

      That does not mean that cUG has no cognitive reality; it is clearly a property of the system. But as a discrete object with sharp boundaries it only arises at a higher level of abstraction where we deliberately remove all parts of the learning algorithm that do not directly specify the shape of the target class. And now you run into a problem: you cannot uniquely identify A from the fact that A intersected with B yields C. There are infinitely many choices for A. Moreover, it might be methodologically preferable to pick some A' distinct from A that yields the same set but can be described much more succinctly.

      When the whole object is factored into several subcomponents by your theory, there is no guarantee that fUG is cUG. All you can hope to achieve is that fUG baked into a 100% perfect representation of the rest of the cognitive system will yield exactly the same behavior as cUG baked into the system. But since the other components are underspecified for the same reason, you have an awful amount of wiggle room.

      Taken together, this leaves as the only achievable goal of cognitive reality a factorized description such that the intersection of all components yields the real object. Since cUG is an abstract part of the real object, its structure is implicitly described in full. But we may have unwittingly distributed its description over several factors, foremost fUG, the parser, and the learning algorithm. So fUG is not necessarily the same as cUG, and I don't see a way of testing their identity.

      PS: Personally I prefer the grammar-parser example I gave earlier, but apparently that one isn't particularly elucidating.

  2. "However, I don’t want to downplay the contributions of C/T work here. It has been instrumental in grounding lots of conclusions motivated on pretty indirect theoretical grounds, and direct evidence is always a plus. What I want to emphasize is that more often than not, this additional evidence has buttressed conclusions reached on theoretical (rather than inductive) grounds, rather than challenging them."

    In other words, C/T work plays a big role in converting what would otherwise be conjectures into what can reasonably be claimed to be knowledge. We can claim to *know*, for example, that no language forms questions by moving the first word of some grammatical category into initial position, because if this were described in any grammar of any language, some C/T worker would have noticed it and brought it to our attention.

    1. I would agree but for the "otherwise would be conjectures." Conclusions from C/T explorations are no less conjectural. And the point I wanted to make was that whereas POS arguments carry necessary conclusions as regards what is and isn't possible, this is less so for G surveys, even extensive ones. Gaps in a C/T paradigm suggest universality. However, we really don't know how the Gs we "see" reflect the class of possible Gs that FL/UG makes available. This is not a criticism of C/T investigations. It is simply a fact, and one that means that C/T-discovered gaps are, IMO, MORE conjectural than POS-proposed gaps are. It is the assumption that the opposite is the case that I want to question.

    2. Avery wrote: We can claim to *know*, for example, that no language forms questions by moving the first word of some grammatical category into initial position

      To put Norbert's point a bit more boldly (now there's an expression you don't hear every day): if your goal is to reach conclusions of the form "no languages do X", then of course particularly direct evidence for those kinds of conclusions is going to come from doing typological work. But those are not the conclusions that Norbert is expressing interest in reaching.

    3. I admit to having exaggerated, but the point remains: ideas derived from POS arguments become much more solidly supported when the typology is also seen to work out. An example would be Kayne's Generalization from the early 80s: this seemed plausible on the basis of a small number of Romance languages, but was shown to be problematic by Romanian and River Plate Spanish, and it seems to me completely sunk without hope of salvage by Greek, once the Greek generative grammarians began to get going in the 90s (I haven't found a work where any of them attempt to defend it; perhaps somebody else has?).

      If GB-Generativists in the early 80s had been more attuned to typology, this embarrassing detour probably wouldn't have happened (in part because there was also Swahili, which also has what is probably clitic doubling without case marking, and whose basic properties were reasonably accessible in the 1970s (Nikki Keach has a 1980 UMass thesis which covers a lot of them), but were for some reason deemed irrelevant).

      The basic problem with PoS arguments is that we know very little about what kinds of learning are actually possible, and also about what The Stimulus is like and how much or what kind of stimulus is necessary or sufficient to learn any particular thing, so virtually any PoS argument can in principle be knocked over by the next language. In the case of Kayne's Generalization, it was the next big peninsula to the east in the northern Med.

      There is in fact an argument to the effect that the structure dependence of Aux-preposing is learnable from the input, but here the typology shows that this fact is highly irrelevant to the nature of UG.

  3. One important way in which work on diverse languages can advance our understanding of FL/UG is by challenging (and, in some cases, disproving) certain Chomsky Universals.

    Norbert is fond of pointing out that the naïve version of this line of argumentation is fallacious ("language X doesn't have internally-headed free relatives, therefore Chomsky is wrong!"), and he is of course correct about that. But there is another, less nonsensical version of this. Chomsky Universals sometimes take the form "no language has a rule with the formal property P" (think of the subj-aux inversion stuff: no language has an inversion rule based on linearly closest rather than structurally closest (except in examples where linearly closest means adjacent)). So if we found a language that could only be successfully modeled using a grammar G that resorts to rules that have the (allegedly unattested) formal property P, we would have disproven the particular Chomsky Universal under consideration.

    [DISCLAIMER: The following example relies on my own research; if you don't buy the argumentation therein, then it obviously doesn't exemplify what I'm taking it to exemplify.]

    One could envision the following Chomsky Universal:

    (1) No language has a rule whose application cannot be enforced exclusively via Interface Conditions (i.e., conditions statable at the interface of syntax with semantics or with morphophonology).

    I don't think this is a straw-man; many people read Chomsky's (2000) "Strong Minimalist Thesis" (SMT) to entail something like (1). What my 2014 book attempts to show is that agreement in the Kichean languages disproves (1). [And, if you think that (1) is entailed by the SMT, then it also disproves the SMT.] That argument simply could not be mounted using data from English, as far as I can tell. And so this is an example of data from languages that are considerably less studied than English informing our inventory of Chomsky Universals.