Monday, April 15, 2013

(Just One) More on Interpreting the SMT

One issue I left dangling (relegated to a footnote here actually) is whether to understand the SMT as a methodological or a metaphysical thesis.[1]  The difference lies in evaluating the SMT wrt its fecundity or wrt its truth.  I am partial to the first reading. In fact, I find it hard to see how the metaphysical version of the SMT could be true.  Let me elaborate.

Minimalists like to say that Minimalism is a program, not a theory.  I actually have some reservations about this claim, for programs are vindicated to the degree that they generate interesting and true theories, if not right away, then over a reasonable time span.  The Minimalist Program (MP) is now approximately 20 years old, so we should by now be evaluating it in terms of its theories.[2]  As I believe that the program has generated a lot of pretty good and empirically interesting (even true) proposals, I think that the “program” line can sometimes be a dodge.  However, whether this is true or not, there is something important about the ‘program’ vs ‘theory’ distinction that is relevant to what I want to say here. The main difference between programs and theories is that theories are things that can be true or false, whereas programs are things that are fertile or sterile.  As such, programs generate research methodologies, ways of approaching questions that can lead to theories that are truth evaluable.  And one important feature of a good program is that its methodologies actually function as productive hypothesis generators.  One of the features of minimalism from the get-go has been its ability to suggest interesting avenues of linguistic investigation.  It has done so in several ways.[3]

First, as Chomsky has liked to stress, it keeps us honest.  Often our explanations are of the same order of complexity as the phenomena they aim to explain.  This may or may not be useful (re-describing something can be a crucial step toward explaining it); however, it is unlikely to be explanatory.  It’s never a good idea to explain N data points with a theory that has N degrees of freedom.  At any rate, early minimalism stressed the methodological virtues of simplicity and elegance and tried to show how they might be operationalized in the context of late GB syntax.  Chomsky’s 1993 paper[4], the one that launched the enterprise, was a very good guide to how to do this, deploying Ockham’s razor to great effect in cutting away some GB underbrush.  Chomsky’s basic point was methodological, viz. if we want our explanations to explain, then they cannot be as convoluted as the data they care about.  And he observed that firmly keeping this in focus can lead to a significant explanatory boost, i.e. less can be more, a whole lot more.

Second, MP started people (like me) thinking about the virtues of reduction/unification.  Though never absent even in earlier work, this kind of project makes deep sense in the minimalist context. Why? Precisely because it urges that we go “beyond explanatory adequacy,” to use Chomsky’s terms. GB was mainly concerned with Plato’s problem. This problem is “solved” once something is put into UG, for by assumption things in UG need not be learned. This often has the effect (I am confessing here) of removing the incentive for developing svelte, streamlined, elegant accounts. So long as the principles can be shoved into UG, their ungainliness fails to generate much empirical friction. To put it crudely (something many of you might think I do all too well), GB theories were to elegance what the lunar module was to aerodynamics (Did you ever see the damn thing? It looks like a pile of mechanical garbage wrapped in tin foil, see here) and roughly for the same reason. The module didn’t need to be nicely shaped to move efficiently through space, since way out there space is a frictionless medium. In a sense, that was also true of GB; once inside UG, the shape of the principles didn’t much matter for Plato’s Problem.

Now I don’t want to overstate this. Linguists have always cared about elegance, simplicity, redundancy, etc. However, MP greatly raised the status of these virtues. These virtues fueled the impulse to unify the principles of UG and unification became empirically important when one worried not only about learnability issues but also about how FL/UG itself might have arisen. I’ve talked about this (no doubt too much, but as you can see I am obsessed by this) elsewhere (here) so I will drop the issue now. But I bring it up because it bears on the correct interpretation of the SMT. 

One way of thinking about the SMT is along the lines of these more general desiderata.  In other words, the SMT is an injunction to look for examples where interface properties reveal representational structure. The PLHH work shows that the ANS+visual system can tell us quite a bit about the nature of semantic representations (aka linguistic meaning), and work on parsing and acquisition can do so as well wrt syntactic representations.  When such things are found, they can be revealing, and the SMT, viewed as a methodological precept to look for such cases, can be, and has been, quite fecund, especially in forcing different kinds of linguists (syntacticians, phonologists, psycho types, and even neuro types) to ask how their projects and assumptions fit together. In short, as a guiding methodological principle, the SMT is a winner: fecund? Check.

What about a metaphysical thesis?  Here, things get a whole lot murkier.  Recall that the SMT is supposed to be the thesis that the grammar is the optimal solution to interface conditions.  One way of reading this is that the interfaces cause linguistic representations to have the properties they do. But what would it mean for this to be true?  I really don’t know.

There is one possibility, the standard Darwinian one, in which over long periods of time the interfaces chisel away at the rough edges of FL/UG (and vice versa) till they fit snugly together (interface requirements accommodating themselves to features of FL/UG, and properties of FL/UG accommodating themselves to features of the interfaces).  Maybe, but recall that a good deal of Darwin’s Problem in MP rests on the premise that FL/UG popped up pretty quickly, so there was no time for Darwinian selectionist mutual accommodation to operate effectively.  Without this Darwinian solvent, any fit that exists between interfaces and FL/UG will be quite adventitious.  In fact, I would expect such perfect fit to be the exception rather than the rule, and I expect that there are (or will be) many, many interfaces with which the resources of FL don’t integrate at all well.  I can personally attest to the fact that my “dance module” is almost completely impervious to verbal instruction. So, as a metaphysical thesis, I see no reason to believe that the SMT is even roughly correct. Or if it is correct, it is a total mystery why it is or even could be.  It would be too damn amazing were FL/UG to be just what every interface ordered. This would be super-intelligent design! This is why I find the SMT to be a pretty poor metaphysical thesis: from where I sit, it has all the hallmarks of being obviously false (indeed, incredible).

Is there anything paradoxical about a principle being methodologically fecund though metaphysically false? Nope. Fecundity and truth are related but distinct evaluative dimensions.  To repeat: programs/methodologies fecund, theories/proposals true/false. So qua methodological precept (viz. look for this!) the SMT is a powerful injunction, but qua metaphysical thesis, not so much.

Let me put this another way by considering an analogy between the SMT and the Anthropic Principle (AP) (here).  The AP can be used to deduce the values of attested physical constants. How? Well, the values must lie within a certain range in order for (conscious) life to be possible. As the universe clearly contains (conscious) life (i.e. us, well on some days at least) this fact can be used to specify a narrow range of values for the attested physical constants (e.g. the fine structure constant).  As a methodological principle, AP seems unexceptionable. Given that we are here, of course the universe must be hospitable to us and this means that the physical constants must have hospitable-for-us values.  However, as a metaphysical principle AP has a decidedly mystical air (e.g. the universe is “compelled, in some sense, for conscious life to emerge” (Wikipedia). Note the “in some sense,” always a sign that things are getting weird) that has a distinct theistic odor suggesting intelligent design. The SMT is similar. If FL’s products fit an interface transparently there is a lot to learn about the fine structure of the representation.  However, this is not because the interface causes linguistic representations to have the features they do but because in the domains where the SMT holds features of the interface and features of the representations are very closely correlated.  Thus, knowing the properties of one can tell you a lot about the properties of the other. In other words, where the SMT holds features of the interface can be used to probe features of the linguistic representation. And just as our existence has implications for the values of the physical constants (at least in our universe) per AP, so too do properties of SMT compliant interfaces have implications for the properties of linguistic representations, even if metaphysically speaking both the AP and the SMT are false. [5]

In sum, even if the general metaphysical version of the SMT is false, there is reason to hope that some interfaces will fit with FL/UG tightly. The properties of these can then be used to plumb the internal details of FL/UG (and, of course, vice versa).  These domains of investigation will then be closely integrated, allowing for the development of richer theories of both FL/UG and the relevant interface. 

Methodologically, one can go a little further and elevate the SMT to a methodological ideal. In particular, we can take as a default assumption that, for any given interface, the SMT (viz. the Transparency Thesis) holds. It should be easyish to disconfirm this if false (and I suspect that it will be often false), so it is a good 0-th level assumption to make.  In the meantime, whether the SMT holds or not for a particular interface, we will find something interesting, and that’s what makes it an ideal methodological principle.

No doubt, there are other interpretations of the SMT that are more metaphysically charged (see Introduction of this for example).  There are times when Chomsky’s allusions to third factors and snowflakes can carry this kind of tinge (there are also times when he resiles from this interpretation and explicitly adopts a methodological stance wrt MP and its precepts).  For me, it is comforting to be able to interpret the various programmatic precepts in methodological terms. Why? I understand these and can see how to use them to generate research hypotheses. Seen from this perspective, the SMT is a very good way of framing linguistic questions, even if it is metaphysically very far fetched.[6]

[1] This post developed from conversations that I had with Paul (the ‘P’ in PLHH) about the Interface Transparency Thesis and the SMT. It goes without saying that he is completely responsible for any dumb ass thing that I say here. Don’t like it, complain to him.
[2] A possible counter is that it’s too early to engage minimalist themes. Perhaps. But if so, then it’s not really a program either, more like a vision or dream.
[3] I’ve discussed this here for those interested.
[4] Epstein’s paper on c-command (here) was also very good at making these points.
[5] Now for a mea culpa (footnotes are good for this): (here) I said that the features of the ANS+visual system explain the features of L.  This strongly suggests that they are the cause of those features in L. If the above is right, this is very misleading and I accept full responsibility for misleading you.  I am so contrite that I am sure you will all forgive me. Thx for your indulgence. What we can say is that given the ITT we can deduce some features of L by noting features of ANS+visual, but in this case deducing X does not amount to explaining L (think heights of flagpoles and the shadows they cast).
[6] Curiously, this is the converse with the most vociferous versions of Linguistics Platonism: whatever its metaphysical virtues (none in my view) the methodological consequences of adopting it are confusing at best and baneful at worst (see here).


  1. I also believe that the best way to understand the SMT (and several aspects of minimalism) is as a methodological assumption. If our job as grammarians consists in comparing theories (i.e. analyses of competence, typically), then we should have a systematic way to choose between two possible analyses A and B that make exactly the same predictions. Usually, we prefer the simplest alternative (Occam’s razor). But if we don’t have any stronger and deeper ontological assumption about the simplicity of language, then the systematic preference for the simplest theory doesn’t follow from anything. Thus, the application of Occam’s razor requires an ontological assumption such as the SMT (e.g. “language is an optimal device for connecting sounds and meanings”, or whichever definition you prefer). Therefore, it “doesn’t matter” if the SMT is false, since its postulation is a means and not an end.

    Very interesting blog, by the way.

  2. This is a very interesting blog indeed. I want to focus on two important points:

    1. In conceding that "the general metaphysical version of the SMT is [likely] false" Norbert admits that Minimalists have no rebuttal to Postal's ontological challenge that Minimalism is internally incoherent [NOT that Platonism is true, which is a different issue]. Considering that Chomsky was unable to either understand or admit this for decades, I think this candid statement demonstrates substantial progress for Minimalism, and I want to congratulate Norbert on it.

    2. It is probably unproblematic to use SMT 'as a means' and argue: "If FL’s products fit an interface transparently there is a lot to learn about the fine structure of the representation. However, this is not because the interface causes linguistic representations to have the features they do but because in the domains where the SMT holds features of the interface and features of the representations are very closely correlated."

    However, in admitting that there is good reason to question the existence of the innate structure postulated by SMT [SMT is false as an ontological thesis, we have no causation, just correlation] Minimalism loses its edge over empiricist views [e.g., Tomasello, MacWhinney, Christiansen & Chater, to name a few]. If FL is merely a convenient theoretical construct that does not actually exist [and the inability to give any biological evidence for its existence gestures in this direction], then the only reason to prefer Minimalism over empiricism would be if the former can better account for facts of acquisition or performance [no typo, I mean performance] than the latter.

    In case it is not obvious: Norbert was very clever when using the analogy between SMT and AP:

    "The AP can be used to deduce the values of attested physical constants. How? Well, the values must lie within a certain range in order for (conscious) life to be possible. As the universe clearly contains (conscious) life".

    Indeed, the values must lie in a certain range because conscious life exists. But whether FL exists, or whether the fact that only Chomsky's granddaughter and not her kitten can acquire language can be explained by mechanisms suggested by competing views, is still an open question. Admitting that SMT is likely false as an ontological thesis amounts to admitting that this question remains open - which again is considerable progress and I tip my hat to Norbert once more.

    1. What exactly do you take the SMT to be? I don't see how the truth of what Norbert takes it to be - "the claim that 'language is an optimal solution' to interface conditions" - has any bearing on whether or not there is something that deserves to be called FL.
      And I also fail to see how Postal's criticism is connected to this.

    2. Ontology is about existence. LF has always been defined as a part of the human brain that causes Chomsky's granddaughter but not rocks, kittens, or chimps [or the Boston telephone exchange] to 'grow' language. Now if the claim 'Language is an optimal solution to interface conditions' is false as a metaphysical claim, it follows that either language or the interface conditions of LF do not exist. You may want to deny that language exists, but I think it is less dramatic to assume 'interface conditions of LF' are a theoretical construct.

      Postal levels many criticisms against Chomsky's work. If you fail to see how admitting that SMT is false as metaphysical thesis relates to the ONTOLOGICAL criticism I had specified, I suggest you [re?]read: or, if that is too long for your taste

      If you are interested in how it relates to the criticism that SMT [like earlier versions of Chomsky's work] postulates constructs based on questionable evidence you may want to read

    3. I'm afraid we're talking past each other (again), but I don't see why "if the claim 'Language is an optimal solution to interface conditions' is false as a metaphysical claim it follows either language or the interface conditions of LF [I'd recommend using FL lest anybody confuses it with, well, LF] do not exist."
      It could just as well be that Language, construed as, say, I-language, is not an optimal solution to the interface conditions (whatever that is supposed to mean to begin with).
      No point in rehashing the Postal-thing, I was just puzzled by why you were so excited about Norbert's SMT "admission". I still don't understand that but it doesn't really matter.

  3. I'm a bit perplexed. Isn't the SMT meant to be a biological or physical thesis, not a metaphysical one? If we take it to be true, that would mean that, as a biological fact, the structure of the faculty of language is an optimal structure as far as the various systems that need to use it are concerned. If it were a metaphysical thesis, say about ontic commitments, I'm not sure what import it would have, but given that Chomsky takes metaphysical questions to be beside the point (perhaps that's why he declines to discuss Platonism, Christina?), and I must admit I'm in agreement with him, I doubt that when he proposed the SMT he intended it metaphysically.

    1. I do not know if Chomsky [or maybe Norbert] has by now also re-invented ontology but for us humble philosophers ontology [the branch of metaphysics relevant here] deals with questions concerning what entities exist or can be said to exist. So saying 'abstract [platonic] objects exist' is just as much an ontological claim as saying 'there exists a part of our brain that allows us to acquire language'. Now how we can confirm [or disconfirm] such claims is a question of epistemology - and only the second claim can be [in principle] dealt with by empirical science.

      Most scientists do not worry about ontology but simply assume that the entities they study exist [and in most cases they can provide good evidence for existence - unless you're a global skeptic and believe we *could* live in the Matrix]. But with the putative language faculty we're not so lucky: no one has produced any biological evidence for it yet. Still, that would not seem to be a serious problem - there are many things we do not know yet about the human brain - so who is to say that a language faculty cannot exist?

      But now Katz and Postal enter the story - they have argued [convincingly I think] that the putative LF has properties that no concrete [physical, that is biological] object COULD HAVE. So they claim LF could not exist because it would have to have at least one impossible property. Any responsible scientist faced with such a challenge has two options: either [i] showing that the challenger was wrong or [ii] abandoning his theory [program/hypothesis or whatever you would like to call that which postulates the existence of LF]. Responding to such a challenge instead by saying 'metaphysical questions are beside the point' is every bit as irresponsible as Postal documents here:

    2. This is just the way I conceive of what I'm engaged in, and why I think that the challenge is easily met, but I must admit I have little hope you'll find it convincing.

      Take dead reckoning in ants. An ant goes foraging and, once it finds food, it computes the fastest way to its nest. Each token of computation results in a behaviour. But the ant is endowed with a cognitive capacity to do the computation. We have a fairly good theory of what that capacity has to look like at a computational level, although not, as far as I know, at the neuronal level. And we can represent that computation via a formula, with the variables plugged in for the times, speeds, directions and distances. Now take me saying (or parsing) the sentence 'Most rabbits have big ears'. What underlies this token of behaviour is a computation done by a cognitive capacity that I have. We have some understanding of that capacity at a computational level, although very little at the neuronal level (unsurprising, as we have almost no understanding of, say, long-term memory or other cognitive capacities we surely possess, at the neuronal level). We can represent the computation via a formula (say a tree diagram, or an AVM, or a categorial grammar derivation), with relevant lexical items, features, etc. plugged in. The formula is a theoretical construct (hence an abstraction, hence not subject to gravity) of a computation in the brain (which presumably has a time course, involves chemicals interacting and energy being consumed etc). The theory is not a theory of sentences, it's a theory of mental computations in the same way that Dead Reckoning as an integral of velocity and distances is a theory of the ant's mental computation. Neither theory is capable of providing the instantiating neural mechanisms yet (but see below), but both are fairly successful in explaining the behaviour of the relevant organisms.
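      The computational-level claim here can be made concrete with a toy sketch. The following Python fragment is purely illustrative (the 2-D vector model and all names are my assumptions, not anything proposed in the comment): it integrates the ant's outbound heading/distance segments into a net displacement and returns its negation as the "home vector" that dead reckoning delivers.

```python
import math

def integrate_path(steps):
    """Toy path integration (dead reckoning).

    steps: list of (heading_in_radians, distance) segments walked outbound.
    Returns the straight-line vector pointing from the ant back to the nest.
    Illustrative only -- no claim about how ants implement this neurally.
    """
    x = y = 0.0
    for heading, dist in steps:
        # Accumulate the displacement contributed by each segment.
        x += dist * math.cos(heading)
        y += dist * math.sin(heading)
    # The home vector simply undoes the net displacement.
    return (-x, -y)

# An ant walks 3 units east, then 4 units north; the home vector is
# approximately (-3, -4), i.e. 5 units toward the south-west.
outbound = [(0.0, 3.0), (math.pi / 2, 4.0)]
home = integrate_path(outbound)
```

      The point of the sketch is the one made in the comment: the formula (here, a discrete integral over velocity segments) is a theoretical construct characterizing a mental computation, not a theory of the ant's individual trips.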

      Everyone agrees that we'd like to be able to map the properties of the formulae to the physical events, and there are some minor results in that direction (e.g. Pallier et al.'s fMRI study in PNAS in 2011). There are also some results at the algorithmic level of description (e.g. the work by Pylkkanen at NYU). But in the meantime, and for those of us interested in the vast range of interesting syntactic phenomena in language, it's still a worthwhile task to try to figure out a good theory of linguistic phenomena at the computational level. And here I think one can do good syntax to a great extent irrespective of metaphysical or theoretical persuasion. In fact, it's such a hard task that we should throw everything we've got at the problem. Even though I have deep skepticism about, say, construction grammar approaches to the problem, I'm still very glad that there are people trying that out and I'm sure we'll learn from it.

    3. Of course, how could I forget about the dead reckoning of ants?! Just what makes you biolinguists think that anything interesting related to human language can come from the study of ant or bee navigation? Why do you need to know how the bee represents the solar ephemeris or that jays store food in times of plenty in thousands of caches? No matter what is fascinating about these animals, we KNOW they do not have language. Chomsky never tires of ridiculing 'ape-language-studies'. So why do you spend so much time constructing analogies to ants and bees and nematodes and lately even bacteria? What do you hope to learn from them about human LANGUAGE?

      You are absolutely correct that results from fMRI studies are 'minor' and at this point open to multiple interpretations. So here would be an area for BIOlinguists to focus their mental energies on. Of course one does not turn into a brain expert overnight. So it is probably more promising to do solid old-fashioned work on syntax. Again, you are absolutely correct: "one can do good syntax to a great extent irrespective of metaphysical or theoretical persuasion" - the ongoing collaboration between Chris Collins and Paul Postal surely attests to that. Even more importantly: one can do such work without advanced knowledge about biology. I challenge you to find in Paul's work ANY reference to the biology of ants or bees or frogs or nematodes or... - yet everyone agrees he is a brilliant syntactician.

      Finally I completely agree with: "In fact, its such a hard task that we should throw everything we've got at the problem". For that reason it puzzles me no end that Chomsky would go around for decades and call people who disagree with him on some issues 'irrational dogmatists' or that Norbert would go out of his way to pour insults over 'empiricists' and 'Platonists' not to mention inept philosophers - this certainly is not the way to deal productively with disagreements about metaphysical or theoretical persuasion....

    4. I notice you haven't answered the substantive point. You basically said that I was irresponsible in not providing a response to the challenge that Postal has raised for biolinguistics. I gave a response: the proposal that humans have a cognitive capacity that underlies language use and that that capacity can be modelled by theoretical constructs with computational is perfectly consistent. This is what Chomsky has been saying for years, and the claims that it's inconsistent are based on an assumption that the theory is a theory of sentences. It's not. Given you have not responded to this substantive point, I feel my responsibilities are acquitted.

    5. My apologies, how could I forget the ingenious move by Chomsky: renaming sentences 'expressions' to avoid the Katz&Postal criticism? Just how ingenious this move was is shown by the fact that professional linguists [not silly philosophers!] have accepted that linguistics is NOT about sentences. Fortunately for the field, not all linguists have accepted that [I just name Jackendoff and Culicover as representative for many who haven't]. And of course when I look at the work Norbert has published, or the fine examples David Pesetsky mentioned in the LSA keynote I have slight doubts that biolinguists have really accepted that linguistics is not about sentences.

      Now if all you ever do is say "that humans have a cognitive capacity that underlies language use and that that capacity can be modelled by theoretical constructs with computational [? I think there was a noun missing]" then yes, this may be consistent. But why is that not straightforward psychology? If on the other hand you claim to do bioLINGUISTICS I have bad news for you: the output of the putative language organ continues to be something that could not be the output of a biological organ. Do not take my word for it but read [again?] what CHOMSKY wrote in 2012:

      There are a lot of promissory notes there when you talk about a generative grammar as being based on an operation of Merge that forms sets, and so on and so forth. That’s something metaphorical, and the metaphor has to be spelled out someday...if we want a productive theory-constructive [effort], we’re going to have to relax our stringent criteria and accept things that we know don’t make any sense, and hope that some day somebody will make some sense out of them – like sets. (Chomsky, 2012, p. 91)

      Yes, there is something going on in our brains when we use language and modelling THAT in the way you suggest is perfectly coherent. But unless you deny that Merge forms sets [which are abstract objects - for detailed discussion of why this is a problem read ] you are still stuck within a program that is based on an incoherent ontological foundation.

    6. Excellent. So we agree that a fundamental biolinguistic claim, which I put as follows "that humans have a cognitive capacity that underlies language use and that that capacity can be modelled by theoretical constructs with computational [properties, sorry for the missing noun]" may be consistent. Now, when I do linguistic analysis of some phenomenon, I'm seeking an explanation of that phenomenon in terms of such theoretical constructs and their properties. That current theory takes these constructs to be set theoretic is a hypothesis that set theoretic properties (say, the axiom of extension, or whatever) are the right kinds of properties to be put into a correspondence with aspects of the physical world. That is, one can in principle define a morphism between some properties of the constructs (not all, of course) and some properties of (our theory of) the relevant chemistry, physics, etc. That morphism would allow us to identify, for any token of behaviour, how the physical properties line up with the theoretical explanation. There is nothing inconsistent, ontologically, about this, even if it's something we are not in a position to do yet. (For what it's worth, I'm not committed to the idea that the computations are best modelled set theoretically, but it's a proposal that's worth exploring the consequences of.) As for Chomsky's quote, he's just saying what I said here: the theoretical constructs will have to be cashed out in other terms which allow us to connect our theories of neurophysiological function and structure with our theories of linguistic function and structure. It's all pretty straightforward. Pylyshyn 1984 is a good read on these issues, I think.

    7. I am not quite sure what caused your enthusiasm, since I doubt that anyone would deny that "humans have a cognitive capacity that underlies language use and that that capacity can be modelled by theoretical constructs with computational properties". Any old-fashioned nominalist can accept this, and depending on how you define 'language use' even Paul Postal might be happy with it.

      The ontological problems for biolinguistics arise from what you say next: In the early days of theory [hypothesis/program] construction you can be satisfied if the morphism holds only between some properties of the constructs and some properties of (your theory of) the relevant chemistry, physics, etc. But at the end of the day one needs to account for all properties and, more importantly, one needs to ensure that no properties are such that they could not exist in biological organs such as brains. And this is - as we say so poetically in German - wo der Hund in der Pfanne liegt - where the main problem of Chomsky's "constructive" efforts lies.

      One can put it no more succinctly than Paul Postal has done here: . Chomsky's biolinguistics requires him to be simultaneously on two branches that cannot grow on the same tree. So one has to go. In his attempts to reconcile biolinguistics with Platonism Watumull has effectively sawed off the biological branch of biolinguistics. Your strategy is different: you want to saw off the linguistic branch [more specifically that which the Chomskyan revolution ADDED to nominalist linguistics: the ability to 'generate sets', to explain how we can put finite means to infinite use, etc.].

      Now let's look at the results of your [pl] branch-cutting efforts. Watumull could do what he did and be quite proud of the result because he did not really cut anything of substance off. There is so far no actual biology in biolinguistics [as even Norbert admits when he is not rolling on the floor laughing about my comments]. I am not sure if Watumull intended it, but the analogy of 'using a biological ladder to climb to Platonic Heaven' has some merit - it's a different way to express what Katz said about the importance of Chomskyan linguistics as a stepping stone from nominalism to rational realism. Now you actually want to eliminate that entire branch by suggesting that eventually the set-theoretical constructs will have to be cashed out in other terms which are neurophysiologically realizable. This eliminates the ontological incoherence indeed, but it also eliminates the biggest advance of the Chomskyan revolution and brings you back to nominalism. And it also leaves you with the nontrivial task of providing SOME evidence that the constructs your theory postulates actually refer to something that exists in human brains. So you have some catching up to do to get where people like Tomasello or MacWhinney are by now...

      Thanks for the kind recommendation to read some vintage Pylyshyn. I read it years ago, but from that era I prefer Churchland and [unsurprisingly] Elman. Though my all-time favourite is Drew McDermott's "Artificial Intelligence Meets Natural Stupidity" - very insightful in dealing with the overblown claims of early AI, and strangely fitting to claims made in some other fields - I warmly recommend it.

    8. Is this really what the debate is about? If so, that's embarrassing. When Chomsky says that the faculty of language yields an infinite (or unbounded) number of expressions, he simply means that there is no bound to the number of computations. It is potentially infinite, although in actuality finite, for any organism. The infinitude claims come about just because, at the computational level, the relevant specification needed to capture the generalizations about the linguistic phenomena entails that there is no upper bound to the number of computations available. At the physical level, of course, we have physical symbols which can combine in such a way that there is no bound, in principle, to the number of computations. Each computation has, as a side effect, a token of a sentence (or noun phrase, or expression, or whatever you will). Of course there is a bound in actuality, given by the physical resources! I explain this to my first years, and it's entirely standard. Postal in his paper raises a response to his objections that is similar to this (saying that three unnamed linguists have proposed it) and criticises it in such a way as to make clear that he doesn't understand the consequences of adopting a computational theory of mind. In a CTM, you take symbols to be physical, you assume that they are not nominal but rather encoded (so their spatial and temporal structure is relevant to the information they encode), and you assume that procedures can apply to the symbols, taking advantage of that encoded structure, to create new complex data structures - much as in genetics you assume that there are physical carriers of information (nucleotides) arranged in a spatial and temporal topology that is informationally relevant (codons), and that procedures apply to these (via ribosomes) to create an, in principle, unbounded number of proteins (which in actuality are limited by physical law. Note, this is meant to be a helpful analogy, not a system that is the same as human language!) So we take there to be a finite repository of physically realised symbols, probably encoded rather than nominal so they have structure, and we take Merge to be a physical procedure that can combine these in an unbounded way to create new physical structures as needed. Crucially, the symbolic nature of these entities (i.e. the fact that their temporal and spatial structure matters, like, for example, the positions of 1 and 0 in a binary number) and the nature of the procedure that combines them is such that there is no limit, beyond physics, to the number of these: there is an unbounded number of them. You need one, you build it. That's what 'infinite use of finite means' is all about. Of course there's not an infinite number of them in actuality. As for the 'sets are abstract' issue, as I said before, the set-theoretical specification of Merge is a hypothesis about what properties Merge will have to have to explain the linguistic phenomena. Various people have used, for example, the Axiom of Extension plus this hypothesis to argue for particular views of the relevant computations (Kayne, and myself in my recent MIT Press book). This all may be wrong, but we can only find that out by pushing the proposal and seeing where it breaks.
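      The 'finite repository plus combinatory procedure' picture can be sketched in a few lines of code. This is purely a toy illustration of my own (the lexicon and the merge function here are invented for the example, not anyone's actual proposal): a finite specification, yet no grammar-internal bound on the number of distinct structures it can build.

```python
# Toy sketch: a finite lexicon of symbols and a single combinatory
# procedure ("merge") that forms unordered pairs, i.e. two-membered sets.
# The specification is finite; the number of buildable structures is
# bounded only by available resources (memory), not by the grammar itself.

def merge(a, b):
    """Combine two syntactic objects into the unordered set {a, b}."""
    return frozenset([a, b])

lexicon = {"the", "cat", "slept"}  # finite repository of symbols

# Each application of merge yields a new object that can itself be merged:
dp = merge("the", "cat")       # {the, cat}
tp = merge(dp, "slept")        # {{the, cat}, slept}
deeper = merge(tp, dp)         # ... and so on, without any internal limit

# By extensionality, the order of the arguments does not matter:
assert merge("cat", "the") == dp
```

Note that merge(a, b) and merge(b, a) come out as the very same object, which is one way the set-theoretic specification (via the Axiom of Extension) has concrete consequences for the computation.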

    9. I have said it before, but you may have missed it, so I repeat: whenever you find yourself assuming that Paul Postal is stupid, you can be sure that you have not correctly understood his argument. And, irrespective of what you say, you seem to agree with him [not sure why you did not notice?]. You say several times that what is infinite is not physical - exactly one of the points Postal makes. You seem somewhat confused, though, about the difference between a finite but unknown upper bound and no finite upper bound. This confusion allows you to navigate back and forth between systems that have the former [like human brains] and systems that have the latter [abstract Turing machines]. You sprinkle in a bit of genetic jargon for effect but admit this is merely an analogy [the helpfulness of which is not really evident]. However, you make the crucial point here:

      "Crucially, the symbolic nature of these entities (i.e the fact that their temporal and spatial structure matters, like, for example, the positions of 1 and 0 in a binary number) and the nature of the procedure that combines them is such that there is no limit, beyond physics, to the number of these: there is an unbounded number of them"

      As you say quite nicely, there is no limit BEYOND PHYSICS. And this is of course the problem for BIOlinguistics, isn't it? Brains are NOT beyond the limits of physics, and clearly they are finite. Now if your [and presumably Chomsky's?] theory / program / hypothesis is NOT limited by the confines of physics, may I ask WHAT makes it a naturalistic theory / program / hypothesis? All your nice talk about computations does not eliminate the fact that in order for a theory / program / hypothesis to qualify as naturalistic it must be about concrete physical [in the case of biolinguistics: biological] objects, not about abstract Turing machines or, heaven forbid, Platonic objects that exist beyond the limits of physics...

    10. You do understand the difference between a function in intension and extension, right? You can have a finite specification of a function in intension that places no bounds on the number of outputs that are possible to compute using a procedure that implements that function. You can physically implement a function in intension using finite resources. Such an implemented function can perform an unbounded number of computations, limited only by its physical capacity. The theory of the language faculty is such a function in intension capturing fundamental structural constraints on actual language faculties in human beings, which produce, of course, a finite number of outputs (here is where the analogy I drew with genetic coding should have been helpful). These are all, to people who have read the literature, trivialities. These trivialities have no worrisome ontological consequences, quite obviously.
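      For what it's worth, the intension/extension point can be made concrete with a toy sketch (my own illustration; the function names are invented for the example): a finitely specified rule whose extension is unbounded, run by an implementation with an explicit resource cap standing in for finite time and memory.

```python
# Toy sketch of the intension/extension distinction (my illustration, not
# a quotation from the literature). The *intension* is the finite rule;
# the *extension* is the totality of input/output pairs, which here is
# unbounded even though the rule itself takes two lines to state.

def successor(n):            # finite specification: a couple of lines...
    return n + 1             # ...yet it defines a value for every n

def run_with_resources(start, fuel):
    """A 'physical' implementation: the same intension, but the number of
    computations actually performed is capped by available resources."""
    outputs = []
    n = start
    for _ in range(fuel):    # 'fuel' stands in for finite time/memory
        n = successor(n)
        outputs.append(n)
    return outputs

# Unbounded in principle, finite in actuality:
print(run_with_resources(0, 5))  # -> [1, 2, 3, 4, 5]
```

Nothing about the finitude of the implementing device forces the intension itself to mention an upper bound, which is all the 'infinite use of finite means' slogan requires.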

      On that note, I find myself at the end of my short break from work and once again faced with the much less interesting trivialities of university administration, so unfortunately this potentially infinite conversational exchange will have to be cut short by my own physical limitations of time and energy.

    11. Thanks for this. It is clear by now that you attribute to me Chomsky's conflation and then make fun of that. But, you see, I am not the one who believes that there is no difference between language [a set of expressions] and knowledge of language [a neurophysiological state of human brains]. If you want to call the latter a physically implemented function that is fine by me since we agree it can "produce, of course, [only] a finite number of outputs".

      But it is Chomsky who claims that

      "a mentally represented grammar and UG are real objects, part of the physical world, where we understand mental states and representations to be physically encoded in some manner. Statements about particular grammars or about UG are true or false statements about steady states attained or the initial state (assumed fixed for the species), each of which is a definite real-world object, situated in space-time and entering into causal relations". (Chomsky, 1983, p 156–157)

      and once more:

      "We can take a language to be NOTHING OTHER than a state of the language faculty... And to say that somebody knows a language, or has a language, is simply to say their language faculty is in that state." (Chomsky (2000, p. 8, my emphasis)

      So you may want to give him this delightful miniature lecture next time you see him and remind him that such physical objects cannot generate "an infinite set of expressions" (Chomsky, 2000, p. 8). And please tell your first-years that they should not take Chomsky seriously when he makes such claims because - as we have discovered - he just speaks metaphorically. Dankeschoen.

    12. Thanks for this. It is clear by now that you attribute to Chomsky an incoherent position and then make fun of that. But, you see, he's not the one who believes that (E-)language [a set of expressions], as opposed to knowledge of language [a neurophysiological state of human brains] is the proper object of study for (bio-)linguistics. If you want to call the former the proper object of study for linguistics that is fine by me (and I guess Chomsky) but I for one am not the least bit interested in studying sets of expressions.

    13. It is of course entirely up to you to properly study internal steady states [of human brains] and I wish you good luck with that. You may want to send a memo to Norbert though that the study of the examples he used in “One Final Time into the Breach (I hope): More on the SMT” is beside the biolinguistic point.

      Given that you for one are not the least bit interested in studying sets of expressions [E-language], you do not really care about “fillers” or “gaps” (e.g. assigning a Wh a theta role by linking it to its theta-assigning predicate). Thus, in examples like (1a) (in contrast to (1b)), there is a clear and measurable slowdown in reading times at Bill, because it is a place where the who could have received a theta role.

      (1) a. Who did you tell Bill about
      b. Who did you tell about Bill

      You do not care about the expressions, just that reading times for (1a) are slower than for (1b)? This probably extends to any [a], [b] pair regardless of what causes the slowdown [as long as the cause is internal to the speaker [reader] - I-language]. So presumably for you one can just reduce biolinguistics to measuring reading times? Sounds a bit behaviourist to me, and is kind of what was done in the good old 40s and made fun of by Chomsky in his early work. But then who am I to judge? I believe outrageous things, like that Paul Postal does brilliant linguistic work - yet what he does is not of the least interest to you because he has never worked on I-language [aka brain states].

      Now according to Norbert, "Colin considers gaps within two kinds of complex subjects. Both prevent direct extraction of a Wh (2a/3a); however, sentences like (2b) license parasitic gaps while those like (3b) do not:

      (2) a. *What1 did the attempt to repair t1 ultimately damage the car
      b. What1 did the attempt to repair t1 ultimately damage t1
      (3) a. *What1 did the reporter that criticized t1 eventually praise the war
      b. *What1 did the reporter that criticized t1 eventually praise t1

      Colin concludes that the grammar allows gaps related to extracted Whs in (2b) but not (3b), but only if this is a parasitic gap. This is a very subtle set of grammatical facts" [and Norbert sounds as if he approves of the conclusion - so presumably he too thinks this is an interesting fact].

      But you’re not interested in such Schnickschnack [E-language] – what interests you are internal brain-states [I language]. Right now we can talk about them only indirectly by measuring reading times. So I am really looking forward to reading some of your publications informing us what is going on internally when human brains deal with the above expressions [and many others]. Because, you see, I AM very interested in that too. I was just convinced by Chomsky that currently such knowledge is beyond our scientific grasp...

    14. "Colin concludes that the grammar allows gaps related to extracted Whs in (2b) but not (3b), but only if this is a parasitic gap. This is a very subtle set of grammatical facts" [and Norbert sound as if he approves of the conclusion - so presumably he too thinks this is an interesting fact]."
      Colin concludes something about the grammar, doesn't he?

      "But you’re not interested in such Schnickschnack [E-language] – what interests you are internal brain-states [I language]."
      So, what exactly could these reading time experiments tell us about the Schnickschnack?

      "Right now we can talk about them only indirectly b[y] measuring reading times."
      Our only way? Since when?

      I'm afraid you've lost me (again), I can't see how anything you wrote connects up with my (admittedly snarky) comment. That's fine, I didn't expect a reply so I'm not surprised I didn't get any. I'll leave you and David to your discussion, sorry for interrupting.

    15. Well, dear Benjamin, the purpose of my [admittedly sarcastic] comment was to show you that it is not very wise to denigrate the hard work of linguists like Postal by claiming that E-language is of no interest to linguists. For that reason, I was showing you what one has to conclude if one takes your claims seriously. Now I think that, like all sane linguists, you do not take Chomsky at his word when he says irresponsible things such as that the study of E-language will “not reveal anything interesting about the nature of language” (Chomsky, 1986, p. 29). It is curious that Chomsky would make such remarks and then go ahead and use example after example from E-language to support his view. I can speculate about why he made them, but if you really want to know you have to ask him.

      My 'only' referred to the specific example, not to everything that is done in biolinguistics. But since, apparently, that was not clear, I apologize. Now returning to the specific example: right now the reading time experiments can tell us at best something about performance. I know that Chomsky [and others] HYPOTHESIZE that there IS an internal I-language. But at this point we do not have any theory-independent evidence confirming putative brain structures that could be implicated. Absence of evidence is of course not evidence of absence. BUT you cannot simply assume there is a 1:1 relationship between reading times and underlying I-language structure. You have to rule out that reading time is affected by any other performance-determining factors. The same holds true for currently still very crude methods like fMRI etc.: you cannot tell with certainty WHAT causes differences in blood flow to certain brain areas when certain tasks [like reading [or listening to] expressions - e.g. samples of E-language] are performed.

      No need to apologize for interrupting a conversation that was finished: we agree that human brains "produce, of course, [only] a finite number of outputs". The remaining disagreements may become the topic of a paper but discussing them in detail here would be boring to people who believe they are trivialities.

  4. One question: what do you mean by "need" in "need to use it"? I assume you intend that some systems use it and those that use it do so optimally. This does not imply that every interface system does use it, only that if they use it they do so optimally. Is this right?

    1. I think that the SMT would say that the computational system is optimal for the other systems of mind that actually use it. Surely not all systems use the structures delivered by the computational system (e.g. maybe the olfactory system doesn't), some because they just can't make use of that kind of data structure, and others perhaps because they have no actual physical access to it (just a guess though). I've always liked the idea that FL is a bit like a genetic algorithm, optimising itself for the interface tasks that it's used for during development (so allowing some plasticity, especially for the motor systems), but with some canalisation, especially for the systems of conceptual combination.

  5. I think I share some of David's confusion, so let me see if I can figure out what Norbert's getting at when he's talking about SMT as a metaphysical thesis.

    The thesis that the computational system is optimal in some specified sense is a robustly empirical thesis. However, the claim that the truth of this thesis would be sufficient to explain why language has the properties it does is metaphysically doubtful. By analogy, if it could be shown that the optimal number of eyes is three, that would not explain why I have three eyes (I don't!). Or to take a linguistic example, if it can be shown that phases permit optimally efficient computations in some sense, that does not explain why phases exist, since (1) is — to put it mildly — not a plausible metaphysical principle:

    (1) Everything works in the most optimal way possible.

    Without something like (1) (or some weaker principle which can do the job in the case at hand), we have no explanatory bridge from “X would be the optimal way for the computational system to work” to “X is how the computational system works.” The only obvious candidate for something like (1) in the case of the language faculty is evolution by natural selection. But Chomsky has always taken a dim view of the hypothesis that the language faculty is optimal in some adaptationist sense.

  6. I think I see, but I think it's a mistake to take the SMT metaphysically in this way in the first place (contra Martin and Uriagereka, whom I was never convinced by). Investigating the SMT, if the SMT is anywhere near correct, would itself provide a particular take on what optimal would mean in this domain (since the way it's stated is as 'an' optimal solution, presumably there are other possibilities), and that result may be further understandable from the perspective of theories of physical or computational optimality in general, outside of the linguistic or biological subdomains of science. But all of this is just normal science: attempting to understand particular findings in certain empirical domains in terms of more general theories. It's a bit like saying that evaluation measures (our old kind of simplicity!) need to be determined empirically, which was always the only way to do it, since we don't know in advance what counts as optimal (e.g. number of features, determinism of mapping, underspecification of order, ...). Same for optimality in the SMT. Still no metaphysics.

  7. I like Alex's gloss and I agree, I hope that was clear, that I am no fan of the metaphysical reading of the SMT. I also think that David and I are on the same page here: assume that any system that uses FL uses it optimally. I take PLHH to provide a gloss on what this could mean: the representations are used transparently, i.e. there is no need for a translation into a covering grammar, the actual grammar suffices for interface use. This is a nice methodological assumption to make for it allows for ready falsification. Moreover, where accurate, it gives one another empirical window on the structure of the representations, something we all desire. An interesting consequence of this, however, is that to so function (as an empirical window on the structure of FL) it need not be that all interfaces actually/in fact optimally engage with FL. It suffices for this purpose that some do. PLHH provide evidence that this possibility is actual. Good. So, SMT as regulative ideal and with no required metaphysical consequences. Great.

  8. I always have to translate Norbert's (or other bloggers') ideas into something simple to get them. In this case Norbert did it for me (perhaps in an earlier blog): it's an excellent idea to model (not too dense) gases as composed of non-interacting molecules, but no philosopher should deduce from it that molecules do not interact.

  9. I find it noteworthy that minimalists constantly need to translate ideas into something that has absolutely nothing to do with human language: 'gases composed of non-interacting molecules' here; neuron wiring in nematodes or comet trajectories in Chomsky's writings, etc. This raises the [admittedly naive] question: why do you so rarely use analogies from human brains to explain your ideas to each other and to the world? And why do even the analogies from human brains virtually always concern other systems, like the visual system? No one denies there are some similarities, but what is INTERESTING about language is DIFFERENT from the visual system. The great Noam Chomsky said [rightly IMHO] about connectionists: "They’re abstracting radically from the physical reality, and who knows if the abstractions are going in the right direction?" [Chomsky, 2012, p. 67] So let me ask you: how do you know your abstractions ARE going in the right direction?

    I also find it interesting that in the post of Norbert's we are commenting on right now, not a single linguistic example supports the bold assumption "that any system that uses FL uses it optimally". David says quite rightly: "since the way [SMT is] stated is as 'an' optimal solution, presumably there are other possibilities" - so where can I find a comparison between an SMT-based analysis of specific linguistic examples [say, 100 sentences that normal speakers of English would use] and a Platonist analysis? Since the analysis of the latter is not informed by the kind of optimality considerations discussed here, there ought to be differences. And then we can judge whether the SMT-based analysis is superior.

    Finally, I am fairly sure that the people who have e-mailed me with comments on and questions about my postings here will be taken aback to read that a philosopher is being ridiculed by minimalists for insisting that the entities posited by the [program/thesis/theory...] need to be at least in principle capable of existing in human brains. That Norbert apparently finds it hysterically comical that someone would expect this will be news to many who consider minimalism part of the natural sciences.

    1. Well, I wrote the above because Norbert's rather simple idea had become the topic of an exceedingly "metaphysical" discussion.

      Now the examples are not just analogies; in fact, they are homologs to the SMT, for they all arose from the same scientific method.