Monday, June 6, 2016

Theory, again

It’s the start of the summer so it’s time to return to some pet peeves. Here’s the fortune cookie version of the history of Generative Grammar (GG): we have moved from the study of Gs to the study of possible Gs to the study of possible FL/UGs. The (bulk of the) earliest work in GG (e.g. Syntactic Structures, LSLT, the Standard Theory) aimed to adumbrate the kinds of rules that Gs contain by studying the actual recursive mechanisms that specific Gs embody. The next stage aimed to adumbrate not only the rules that Gs actually contain but also the principles restricting the kinds of operations a G could contain (this is what UG in GB was all about). Minimalism builds on the results of all of this earlier research and aims to limn the contours of a possible human Faculty of Language (FL). It, in effect, addresses the question: why do we have the FL/UG we in fact have rather than some conceivable others?

As is obvious (but this won’t stop me from pressing the point), these research questions are closely inter-related, with connections in two directions. First, each later question starts from answers provided by the earlier one. It’s pointless to wonder about possible rules without some candidate actual ones, and it is futile to investigate the limits of FL/UG without some candidate principles of FL/UG. Second, answers to later questions limit the range of answers to earlier ones. If a rule is not FL/UG possible, then a particular G cannot contain such a rule; and if a principle is not a possible principle of FL/UG, then no FL/UG can contain that kind of principle.

So, two observations: first, the three kinds of questions above are importantly different even if closely related (as such, they must be kept logically and conceptually distinct). Second, the dialectic from answer to answer moves in both directions from “lower” level to “higher” and back again. “Lower” and “higher” are not intended as evaluative. They are just used to mark the conceptual flow noted above.

Here’s a third observation: despite their interconnections, the methods used to study each of these questions are partially autonomous from each other. People who study particular Gs can do useful work without resort to the accepted/proposed principles of FL/UG, and those interested in the universal properties of Gs (i.e. the structure of FL/UG) can get a good way into this problem without bothering too much with minimalist concerns. The methods used to investigate all three questions partially overlap, but the criteria for success are not the same, and even some of the detailed kinds of arguments advanced can have somewhat different flavors. So not only are the questions different, but progress in addressing each is somewhat independent of progress in addressing the others. Just as there is no discovery procedure for Gs (no reduction of later levels to earlier ones), there is none for theories of GG (no requirement that later questions uncritically respect the answers provided to earlier ones). The questions are related to one another in roughly the way that levels in a G are: they take in one another’s washing in complicated ways.

Why do I mention this? Because I believe that some of the unease in current syntax stems from misunderstanding what question is being addressed by a particular proposal and thus what counts as evidence for or against it. Or to put this another way: if the above is a roughly correct characterization of the conceptual GG landscape, then it is important to understand that many proposals, especially “higher” ones, are hidden conditionals. For example, minimalist proposals are of the form: Given that such and such is a plausible (better still, actual) principle of FL/UG then so and so is why this kind of principle obtains rather than others.

If this is so, then there are two ways to reject a specific proposal: (i) argue against the conditional as a whole or (ii) argue only against the antecedent. The former denies that the deductive link between premise and conclusion holds. The latter denies the relevance of the deductive link even if it does hold. As I see it, most critiques of minimalist proposals are of the second kind. They deny that what is taken as given should be so taken because the premise is empirically suspect. In other words, many objections are actually objections to the underlying “GB” principle being “explained” (and hence assumed) in minimalist terms rather than the explanation itself.[1]  These critiques deny the utility of the explanation rather than question its deductive validity. Thus they conclude that showing how to deduce the principle from more general considerations is valueless because the premise is false. IMO, this conclusion is unfortunate and it reflects a general disdain for theory characteristic of much work in contemporary “theoretical” syntax. Let me vent a bit (again).

In the real sciences, a lot of time is spent trying to find ways of tying together seemingly disparate principles. It really isn’t easy to show that two principles that look different are nonetheless fundamentally the same. And the problem is in large part conceptual. And one way that conceptual problems are investigated is by (often radically) simplifying them. Of course, the hope is that the simplification will preserve many of the core features of interest and so the simplification can “scale up” as we make the premises more realistic. Such simplifications often rest on “stylized” facts that are acknowledged to be (ahem) “incomplete” (aka: false). However, investigating such empirically inadequate simple problems based on stylized facts is often a vital step in advancing understanding even though the premises might be false (as simplifications almost always are). The same should hold true in syntax.

Btw, this sort of investigation (largely pencil and paper kind of stuff) is what is commonly called ‘theoretical.’ Theoretical work consists in investigating how simple concepts can be related to produce theories with rich deductive structure. Theory places a premium on (i) the reasonableness (rather than the truth) of the basic simplification (i.e. the rough accuracy of the stylized facts), (ii) the naturalness of the assumed basic concepts and (iii) the depth of the deductive structure that results.

A good example of this in GG is Chomsky’s recent proposals concerning Merge. It runs roughly as follows: if you assume that Merge is a very simple binary operation that takes two syntactic objects (SOs) and combines them into a set of those SOs (i.e. if A is an SO and B is an SO then {A,B} is an SO), then you can generate objects with unbounded hierarchical structure that have the following “nice” properties: Merge must be structure dependent (linear order is irrelevant to syntax) and cyclic (e.g. no lowering rules); phrase structure building and movement are two faces of the selfsame basic Merge operation (E- and I-Merge); movement (aka I-Merge) must target c-commanding positions (due to Extension); and the products of I-Merge necessarily produce copies (due to Inclusiveness), thereby yielding structures that support operator-variable relations and allow for reconstruction effects. So, from a simple idea concerning the recursive mechanism, Chomsky derives a bunch of plausible properties of Gs and UG that GGers have proposed over the last 50 years of research.
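The set-theoretic core of the proposal is simple enough to sketch directly. Here is a minimal illustration, in Python, purely for exposition; representing SOs as strings and frozensets (and the particular lexical items used) is my assumption, not Chomsky’s formalism. It shows Merge as bare set formation and shows how Internal Merge yields a copy without adding anything new:

```python
# A syntactic object (SO) is a lexical item (string) or a frozenset of SOs.

def merge(a, b):
    """Merge two SOs into the set {A, B}."""
    return frozenset({a, b})

def contains(so, sub):
    """True if `sub` occurs somewhere inside `so` (the term-of relation)."""
    if so == sub:
        return True
    if isinstance(so, frozenset):
        return any(contains(part, sub) for part in so)
    return False

# External Merge: combine two independent SOs.
vp = merge("eat", "what")   # {eat, what}
cp = merge("C", vp)         # {C, {eat, what}}

# Internal Merge: re-merge an SO already contained in the structure.
# Nothing new is introduced (Inclusiveness): "what" now occurs twice,
# so the "copy" is literally the same object appearing in two positions.
moved = merge("what", cp)

assert contains(cp, "what")   # the lower copy survives inside cp
assert "what" in moved        # the higher copy sits above (c-commands) cp
```

The point of the sketch is just that “movement” here is nothing over and above re-merging an object the structure already contains, so copies fall out for free rather than being stipulated.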

However, the generalizations deduced (cyclicity, c-command, copies etc.) are not perfect (e.g. tucking-in is not strictly speaking cyclic in the standard usage, there are many cases in which reconstruction is impossible, movement is not the only operation for which c-command is relevant). Does that mean that Chomsky’s unification of these properties in terms of Merge is a bad one? Not necessarily. Conceptually it is an achievement, for it shows how to link certain salient (stylized) features of Gs together. Empirically, it is a step forward, for it links properties that have non-negligible empirical backing and that are plausibly descriptive of our FL. Is it “true”? Well, that depends on how we eventually handle the (apparent) problems for the (lower level) principles that it has unified. Should these prove to be false, then this unification will not be what we ultimately want. However, and this is important, Chomsky’s unification provides a strong (explanatory) incentive for going back and reanalyzing the (empirical) “problems” for the lower level principles, and it provides a nice example of the kind of theory we want. We really do want to have our cake and eat it too, and this is what the dialectic between empirical “coverage” and theoretical “explanation” aims to provide. The problem is that for this dialectic to gain a foothold we need to appreciate both sides of the to-ing and fro-ing. We need to concretely understand the tension between explanatory force and empirical coverage and understand that the right theory needs both. Right now, IMO, our attitudes over-prize (apparent) empirical coverage. We very seldom count (or even address) the cost of lost explanation when we evaluate our proposals.

This is not a new complaint, at least from me. I make it again because in my experience GGers have a low tolerance for theoretical ambition. I suspect that this is so for several reasons. First, we tend to confuse formal work with theoretical work and this muddies our sensitivity to the explanatory oomph of different approaches. Second, linguistics is a data rich field and so supporting theory means tolerating some empirical slack at least for a while. But, last, I think that we don’t actually spend enough time teaching and touting the explanatory virtues of our best accounts. We seldom go back and ask what we have lost or try to theoretically motivate the new principles we adopt to “capture” the data. Indeed, the whole idea that data is something that needs capturing (rather than explaining) is, to my mind, quite odd.

Does this mean that theory does not need empirical support? Nope. Theories need to be justified by facts. But facts also need to be justified by theories. One of the original hopes of the minimalist program was that it would sensitize us to what a good explanation was. It would make us aware that our “explanations” (and these are scare quotes) are often as complex as the data they address. And this is not good. IMO, this appreciation is less vivid today than it was in the earliest days of the minimalist program. And part of the problem is a lack of interest in theory and a misplaced belief that lots of data signifies empirical progress. In this regard, GG work has been disimproving.

[1] “GB” is in quotes because I do not mean to invidiously distinguish between GB proper and its many theoretical twins (many of them identical IMO for most of the questions I am interested in). These include LFG, RG, GPSG, HPSG a.o. From where I sit, most of these theories are intertranslatable and make effectively the same distinctions in the same theoretical places. They are more notationally than notionally distinct.


  1. Thanks for this, Norbert; I agree wholeheartedly with your critique. The contempt for theory runs deep in the field, whereas "capturing data" is considered an achievement. The absence of anything resembling a coherent theoretical framework vis-a-vis the number of published papers giving "an analysis of phenomenon P in language L" without ever telling us why we should care about P is a clear indication of this disparity.

  2. I agree that in GG today there is quite a lot of confusion between formal implementational details and theorising, and that people are less and less able to see what an elegant or explanatory account of a particular phenomenon is. But I don't agree with Norbert here that there is an overemphasis on capturing data in our field. There have always been good describers out there and everyone needs to be a good describer still, because there are a lot of linguistic phenomena and patterns that we simply have not even described yet. There are good and insightful descriptions that are couched in ways that allow the generalizations to emerge naturally from basic assumptions. And there are descriptions which are hacks that use the received wisdom toolbox and the kitchen sink in ugly and unmotivated ways. I'm not naming names. Still, to my mind there is an awful lot of data-free theory-massaging going on out there which operates with abstractions over generalisations and which ends up being notational game playing. If you look at Omer's latest post, it is actually directly relevant to this discussion. Because if Omer is right, then not understanding what the data actually is at this point gives rise to an awful lot of abstract discussion about explanations where the received wisdom on feature checking and agreement is taken as given. Higher level speculation (beyond explanatory adequacy) then proceeds from there. Wrongly, as it turns out. Sounds like an impediment to real progress to me.

    1. I don't believe that there is "an overemphasis on capturing data"; rather, I think that there is an underemphasis on explaining it. We do not really value explanations that leave empirical data points on the table but are otherwise tightly bound. We do value accounts that have little explanatory heft but cover lots of facts. If you disagree with this, then we do indeed disagree. Otherwise not.

      Second point: re your "there are a lot of linguistic phenomena and patterns that we simply have not even described yet."

      A theorist would argue that the main point of empirical investigation is not to describe phenomena or patterns but to uncover mechanisms. A way of doing this is via the phenomena and the patterns, and so these are worth describing just in case they bear on these mechanisms. Of course, we never know ahead of time when a pattern might be enlightening, so this is more a logical point than a suggestion of what research to actually do. However, I think that the logical point is worthwhile, for it leaves the following as a fair question to a describer: why should anyone care about your description? The tacit assumption is that this is never a fair question. But if your aim is to understand the basic structures of FL, then the details are interesting precisely if there is reason (hope?) that the description has implications for the mechanisms. Omer's stuff is interesting precisely because he agrees with this position. He argues that Filters are the wrong mechanism and that obligatory rules with SDs are the right ones. However, much descriptive work does not even nod in the direction of the basic issues. Is it worthwhile? Well, it can be. How? When someone looks at it and discovers its implications for the structure of FL. Were all work so sensitive, then I would be happy.

      Interest in theory is not opposed to interest in facts (though the two enterprises are somewhat autonomous). Interest in theory means an interest in basic mechanisms and an appreciation that description serves explanatory ends. Given how quick we are to dismiss theory when it runs into empirical flak, I doubt the field as a whole cares much about it. It hopes that, given enough careful description, theory will take care of itself. I doubt that this is so. I even have a name for this attitude.

    2. "It is the business of the theorist to inspect the tools and to ask that they be cleaner."

      --Rudolf Arnheim, 'Film as Art'

    3. I don't think we disagree that we are interested in facts not just for the sake of describing them, or getting a speech recognition/generation/translation device to work. We are interested in them in so far as they bear on an understanding of the mechanisms and structures that generate them. But I don't know how to quantify statements about 'what the field cares about' or measure how many people are interested in what. I certainly think that space needs to be made for both kinds of work. So I do think that responsible describing is important, and it may even be important for it to be generalisation based rather than implementation based in focus. This is because, ideally, we want a body of generalisations to be available as input to the next level of discussion and theorising. Most actual bodies of data or phenomena are amenable to a number of different _kinds_ of explanations, and the choice between them will feed into different sorts of theories about the mechanisms involved. We often need crucial data about the space of possibilities for certain phenomena to make a judgement about how to theorise.
      Bottom line is that we are a big field now and there is still a lot of groundwork to do. And also, people have different skills, proclivities and talents. In Biology you have the people who love to go out in the field and gather specimens, and those who do the stats on the data, and those who speculate about the meaning of life. Many people do all three but in different proportions. The important thing for the field is that it be cumulative. That the work of one researcher is available to and interpretable by others and feeds into a growing jointly better understanding of what is going on.
      I agree that Omer is also, like us, searching for explanations, but he seems to be complaining that the field (or at least the subset of it that he is engaged with) has remained fixated on a theoretical status quo that is blinding it to other types of mechanisms/explanations. We need to continually maintain the dialectic between empirical input and interpretation and theorising. Or else the latter just becomes a barren discussion about the beauty of some mathematical model severed from the thing it is purporting to explain. My feeling is that I see too much of this kind of thing.

    4. Ahh, a kumbaya moment. So we agree that good empirics and good theory both have a place in the field. Yup. I suspect we are not alone in this consensus. But we do disagree. We have a disagreement about a certain judgment concerning the current "respect" the discipline has for the two kinds of research. IMO, theory is an afterthought; the kind of activity you engage in after you get the data ducks lined up right. It's what you do when you get the right generalizations (which is never, btw, and so there is never time for it). You are somewhat coy about this point, claiming that you cannot quantify these issues. But who can? Not me. Just a judgment.

      Why do I so conclude? Because I almost always see pushback against theory and very little against empirics. I see very few papers in the journals on largely theoretical themes. I see that it is virtually impossible to get funding from agencies for a largely theoretical project. I see that it is very hard to publish a new idea that derives the same data out there in a new way. I see tolerance for theoretical proliferation that far surpasses the tolerance extended to those accounts unable to cover a data point. And I see this as ubiquitous. Maybe I live a sheltered life (I hope so), but this is how things look from where I sit.

      If this is correct, then the open-minded attitude that you proffer (which I could not agree with more) has no chance for success unless we learn to respect what theory can bring to the table. This means insisting that empirical papers turn their hand at explaining the implications of the work, AS WELL AS INSISTING THAT THEORY PAPERS PROVIDE THEIR EMPIRICAL BONA FIDES. We need to appreciate that providing principled derivations of (what appear to be) false generalizations is an argument that these generalizations might not actually be false and not merely an argument that the principles from which the account arose are untenable. We need to understand, in other words, that theory can regulate what counts as the data just as much as data can regulate what counts as the theory. And this means developing a tolerance for uncovered data points. If we do not do this, then description is the very best we can aim for. Need I add that this, IMO, is more or less the standard tacit assumption of the discipline, which is another way of saying that it has little patients for theory.

      Now, I bet you disagree. So, in the spirit of tolerance, this is my last word on the topic (at least in this post). I leave you with the last word. Thx for the discussion.

    5. “. . . it has little patients for theory.”

      I generally, for personal reasons, try not to comment on these discussions about the place of theoretical work in the discipline. But the above, from Norbert, was impossible to resist, as I could precisely see myself as a “little patient for theory” who, due to that condition, did not survive. So, while I cannot speak to the current situation, I can say that back when the diagnosis might have been given, it was in fact impossible to have papers on “largely [well, entirely and unrelentingly, but now we’re getting into quantity implicature territory] theoretical themes” published in leading journals. Indeed, two VERY leading journals (hint: “Language” and “NLLT”) had explicit policies against publishing such work (you think I’m making this up—trust me, I was there). If those journals made up, say, half of those in which a junior tenure-track faculty member was “expected” (i.e., required) to appear, the survival rate was likely to be quite low. There is, of course, little that is likely to be edifying in holding so belated a morbidity and mortality conference, except to suggest that even if there has been some significant change in the discipline—as well there might have been—the current scene could still be as Norbert describes, given where things stood in living (though no doubt increasingly fallible) memory.

  3. I think that where we perceive that the biggest pushback is coming from is sometimes fairly subjective. (I personally get more pushback against theory in reactions to my own work.) The judgement of where the balance is is also relative to the particular conversation group. Agreeing in principle is easy, as you point out, and there is almost no virtue in agreeing to agree with a very idealistic position. (We Agree!) The differences come in the actions and reactions to research that comes across our desk, and students in our offices and classrooms. I am completely happy signing up to the principle of insisting that 'empirical papers turn their hand at explaining the implications of the work, as well as insisting that theory papers provide their empirical bona fides'. As newly minted Associate Editor for NLLT, I consider that to be precisely my remit when in a judging capacity. I wonder whether we would agree on individual assessments of actual research? There would probably be a substantial overlap. Having said that, I would like to mention that pushback against pure theory is very necessary in some circles: some of the stuff masquerading as theory is implementational mysticism and faddishness (IMO) and gets disproportionate prestige. Note that prestige in some circles does not translate into prestige in the field as a whole or even majority opinion. In any case, thanks for letting me have the last word and I look forward to this one coming up again!