Wednesday, September 30, 2015

What are theta roles for?

So here’s my question: What’s the point of theta theory? What does a theta role do? Here’s my impression: we want theta theory to do two different kinds of things and it is not clear to me that any theory can (or should) do both. What are these two things? They are an integral part of the semantic interpretation of a sentence and they are the means by which arguments are linked to syntactic positions in “D-structure.” I should point out that noting these dual desiderata is not original to me, but arises from what I recall were earlier important discussions of these matters by Dowty, Grimshaw, and others.  Nonetheless, I feel that these issues have become more obscure over time and I would like to engage in a rambling re-think. This is all in the way of excusing the shambolic nature of what follows. Hard as it is for me to present clear arguments in general, in this case I am not even going to try. I just want to sorta kinda survey the options and try to clear up my own confusion. Needless to say, I am relying on the kindness of others to clear up the mess. Here goes.

The literature seems to have two different (though possibly related, we shall see) desiderata for theta roles:

(1)  Theta roles are required for semantic interpretation
(2)  Theta roles are required to get the LAD from primary linguistic data to a G.

Let’s discuss each of these a little bit. The first view of theta roles treats them as essential semantic notions. Without theta roles, arguments would not have a semantic interpretation. Given that Gs map meanings and (“with,” if you are a thoroughly modern minimalist (TMM)) sounds, we need some conception of meaning that is the target of the mapping, and theta roles are taken to be one component of a well-formed meaning.

The second view treats theta roles as levers for getting a language acquisition device (aka: a child or LAD) from primary linguistic data (PLD) to a G, most particularly, from PLD to a “D”-structure. The ‘D’ is in scare quotes here, for as any TMM knows, we have dispensed with D-structure in the GB sense. Yet, so far as I know, every theory of GG has some analogue thereof, including current minimalist accounts. By ‘D-structure’ I just mean the G structure in which arguments are grammatically linked up to their predicates (or vice versa).[1] A G establishes thematic links before establishing any further dependencies that expressions grammatically enter into (e.g. agreement, case, binding, and especially, movement). Fixing this first relation, the one where arguments join predicates, is very important because it is very hard to study all the other dependencies, especially movement, if you have no idea where expressions begin their grammatical lives.

Is there a necessary relation between these two desiderata? Perhaps, perhaps not (though I suspect not). Here’s what I mean. It might be that the conception of theta role required for semantic interpretation is identical to the one that is used to get LADs from PLD to Gs. However, there is no obvious reason why this need be the case. In particular, the conception of theta role required for semantic interpretation seems to be at a different grain than the one useful for priming the G pump. Let me explain.

One conception of theta role is simply as a place-holder for the notion “argument.” For example, all we mean when we say that some DP has the “agent” theta role is that it is the external argument of some predicate.  The designation “agent” does not mean much, save indicating which of the ordered arguments of a predicate some DP is related to.[2]

The problem with this conception is that it is not clear how it helps with (2). In particular, it is quite unlikely that LADs know the meanings of the predicates they are being exposed to and so it is not clear how they could use this thin sense of theta role to acquire their G. Rather, what we would like is some good coarse rule of thumb that the LAD can use to vault into the G given some PLD. This is where notions like ‘agent’ and ‘patient’ gain their value. Being an agent or patient (a doer or done-to) is plausibly an observational feature of an event participant. In other words, the substantive interpretation of notions like agent and patient plausibly has what Chomsky called “epistemological priority” (EP). They are observable non-linguistic predicates that can be used to map PLD to (non-observable) grammatical dependencies. An example of such a useful mapping rule would be “Agents are always external arguments, patients always internal arguments.” If every D-structure corresponded to a set of (observable) theta roles with the right linking rules, then we could solve the problem of how an LAD gets from PLD to abstract Gish structures.

Now the problem is that it turns out to be hard to come up with such substantive thematic notions that are also plausibly semantically general. Another way of saying this is that though there are plausibly some clear cases of “agenthood,” it is not clear that the same notion extends usefully to all (or even most) verbal subjects. Thus, though kickers may be prototypical agents, lovers may not be. At any rate, one of the well-known problems is that such substantive theta roles have a problem extending to all predicates.

One important and influential solution to this is Dowty’s work (here). It defines super categories of theta roles, collapsing them into two “proto”-flavors: Proto-Agent (P-A) and Proto-Patient (P-P). Proto roles are defined over the full semantics of a predicate indexed to a particular argument position. Thus, P-As are “verbal entailments about the argument in question,” (i.e. those DPs that have more of the “agent” properties than any other DP in that argument structure).[3] On this conception, an argument is P-A if it has a preponderance of the following properties in a given proposition: it is volitional, sentient, a causer of events or changes of state in another participant, a mover, and exists independently of the event named by the verb ((27): 572). Indices of P-P are undergoing a change of state, being an incremental theme, being causally affected by another participant, being stationary relative to the movement of another participant, and not existing independently of the named event ((28): 572). These are among the contributing factors that Dowty suggests for classifying arguments into one of the proto categories (he is quite clear that these may not exhaust the relevant entailments). Note that on this conception, proto roles are defined in terms of the more articulated semantics of the sentence. In other words, given the meaning of a sentence we can compute a coarser grained classification of arguments into super categories that “average” over the differences. On this conception, proto-roles “lose” information that the actual meaning of the sentence contains.
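To make the “preponderance” idea concrete, here is a minimal sketch of how such a classification could be computed. It is an illustration only, not Dowty’s own formalization: the property names paraphrase the two lists just given, the simple counting rule is an assumption, and the entailments assigned to each argument of “bite” are hypothetical.

```python
# Property names paraphrase the P-A and P-P lists summarized above.
# The counting rule and the sample entailments are illustrative assumptions.
PROTO_AGENT = {"volitional", "sentient", "causer", "mover", "independent_existence"}
PROTO_PATIENT = {"change_of_state", "incremental_theme", "causally_affected",
                 "stationary", "dependent_existence"}


def proto_scores(entailments):
    """Return (proto-agent count, proto-patient count) for one argument."""
    props = set(entailments)
    return len(props & PROTO_AGENT), len(props & PROTO_PATIENT)


def classify(entailments):
    """Label an argument by whichever proto-role has the preponderance."""
    agent, patient = proto_scores(entailments)
    if agent > patient:
        return "P-A"
    if patient > agent:
        return "P-P"
    return "undetermined"


# Hypothetical entailments for the two arguments of "bite".
bite = {
    "biter": {"volitional", "sentient", "causer", "mover", "independent_existence"},
    "bitten": {"causally_affected", "change_of_state", "independent_existence"},
}
labels = {arg: classify(props) for arg, props in bite.items()}
print(labels)
```

Note that the input to `classify` is already a set of entailments, so the sketch presupposes that the meaning of the predicate is known. The coarse label is computed from the richer information and discards most of it.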

Not surprisingly, for Dowty proto-roles do not determine semantic interpretations for they presuppose them (i.e. proto-roles are defined in terms of the entailments of the argument in question in the specific proposition). Thus, on this view, proto-roles are not important for (1) above. Their special function (if they are important at all, which is something that Dowty often questions) is to provide an account of how arguments map to syntactic positions given that we know the verbal implications of that argument (i.e. what the proposition means).

IMO, the most interesting version of proto-role theory is Baker’s UTAH version (see here).[4] UTAH directly addresses the problem of how to get from pre-linguistic information into the syntax. The idea is that proto-roles mediate the mapping from PLD to “D-structure,” (e.g. P-As map to underlying subjects and P-Ps to underlying objects). Thus, proto-roles are understood to enjoy epistemological priority and are thus able to mediate a mapping to the linguistic system. What is less clear is that Baker’s understanding of proto-roles is really the same as Dowty’s. Why?

Well first, it seems unlikely, at least to me, that LADs compute proto-roles for a given predicate to see how they map onto the syntax. This presupposes that LADs have a rather rich understanding of the meaning of each predicate prior to having any linguistic analysis of the sentence. Some features of the “scene” may be evident (e.g. on hearing “Fido is biting the ball” it is evident that Fido is an “agent” and the ball a “patient”) but it seems to me unlikely that this is a consequence of a computation over the meaning of “bite” indexed to the subject and object positions. Rather, here the two notions are simple primitives applying more or less (im)perfectly to the scene at hand. To get from PLD to G, this kind of sloppy information may suffice (at least for a sufficient number of verbs) but it is unlikely to be based on a prior full understanding of the predicates involved. Rather the opposite. Of course, once the G is engaged, then there is more than theta theory available to guide the LAD. So, for the linking problem, all that UTAH must do is get the LAD into the G, then the G can offer other kinds of linguistic information useful for acquiring the G of interest.

Second, Baker also assumes that the theta roles that solve the linking problem are also inputs to the semantic interpretation of the sentence. Note that this is very different from Dowty. For Dowty, proto-roles are too coarse to provide a semantic interpretation. Baker’s suggestion that theta roles are critical to meaning (rather than notions derived from the meaning) assumes a different conception of linguistic meaning than Dowty’s conception. It is unclear to me whether this conception has been fully articulated.

There is a second influential view of theta roles, one that aims to tie it more tightly to a natural semantics. This eschews proto-roles and develops a more articulated inventory of thematic functions.  So, here we get not just two or three roles but a myriad of these. Agents, causers, experiencers, instruments, goals, sources, beneficiaries, targets of emotion, etc.  This richer conception allows theta structure to explicate argument structure. Theta roles don’t just reflect meaning. They determine it. Here, theta roles are cut thinly enough so that they can support intuitive differences in the meanings of different predicates. Not all agents/causers/experiencers are the same. We need hyphenated versions of these to get the full range of mappings that all the different predicates in a language manifest.

There are two main problems with this conception, I believe. First, as Dowty argues quite persuasively, we really don’t have an even approximately decent theory of what these richer roles are or how to specify them. In particular, there are many, many verbs where it is quite unclear what the theta roles of the relevant arguments are, let alone how they differ. The most obvious cases involve symmetrical predicates like ‘face’ (e.g. “Carnegie Hall faces the Carnegie Deli”) or ‘resemble’ (“Bill resembles Sam”). In such cases it is quite difficult to see what thematic difference might distinguish one argument from the other. And this problem generalizes. Why? Because there are many different ways of being an agent and it is not at all clear that a hugger is an agent in the exact same way that a lover is. But if these differences are semantically relevant, then it appears that we will need about as many theta roles as we have predicates. This is effectively Dowty’s point, but in the other direction. You can’t get from agents directly to huggers as the concept is intended to abstract away from what makes huggers different from lovers. But if you want to get all the way to the actual semantic role that subjects of these particular predicates play, then you will need a lot of hyphenated theta roles.

Second, it is not clear whether this conception will get you any purchase on (2). Again, as Dowty notes, for this end we want a coarser notion, one that will allow us to map arguments to syntactic structure in some general way. Cutting roles too finely will not yield a simple mapping from roles to structure.

It is worth considering for a minute how these two conceptions interact with the theta criterion. As Grimshaw, among others, noted a long time ago, the theta criterion can make do with a very thin conception of theta role. All it requires is that, whatever a theta role is, a DP must get one and no more than one of them. It does not matter how we distinguish roles, only that we have some way of tying roles to syntactic positions. The prohibition amounts to the claim that an argument must saturate some position and cannot saturate more than one. So far as the theta criterion goes, we don’t really need a general conception of theta role, only of something like “argument position.” The theta criterion restricts arguments to one and only one of these.

A substantive theory of theta roles, one where the kinds of theta roles we have matter, then only really arises with the linking problem. Here we need theta roles that enjoy EP because grammatical notions are not observables, and so to prime FL, to get us to Gs, we need some notions that can bridge the G/non-G divide (i.e. some observables that are (at least weakly) correlated to Gish concepts).

Let me be a little clearer. Subject-hood and object-hood are not observable except via an FL lens. Agent-hood and patient-hood likely are. My (ex) dog Sampson could parse many scenes into agents and patients (doers and done-tos), at least some of the time. If this is so (and I am certain that it is) then these sorts of notions have EP status (they are not parasitic on FL for their viability), and these notions can be used to prime FL via something like UTAH (i.e. agents are subjects, patients are objects). UTAH uses the EP thematic notions to access FL given some PLD.  But, if this is what one needs thematic notions for, then it is not at all clear that every argument in every sentence need have a theta role. All that is required is that enough PLD can be parsed in this way to get the G system off the ground. Once the LAD has accessed FL and started developing a G then these Gish notions can take over/supplement the analysis of the PLD. In other words, theta roles as EPs need not be very general (i.e. cover every conceivable predicate and argument), they just need to be general enough to cover enough PLD predicates to prime FL and get it going. Once FL is engaged then its resources are available for further linguistic analysis. And for this purpose, these notions can be (actually should be) quite coarse as their aim is not to provide an interpretation for the sentence but to just crack open the FL module and make it usable by the LAD, which, when on-line, is then able to provide (more) grammatical ways of analyzing the incoming PLD (e.g. this agrees with that so this is a subject, this is adjacent to the verb so this is the object, etc.).

One might go a step further here, I think. To solve the linking problem you want coarse roles that are not determined by calculating the verbal inferences of an argument. Why? Because this is just too fancy a procedure. You want very coarse indicators, those that Sampson could (and did) use. The problem with proto-roles as understood by Dowty is that they don’t seem to be EPish. They are not so much observables as inferables. What I mean is that to get proto-roles the LAD would need to compute inferences off of pretty sophisticated semantic representations. And these need not be very accessible. Better to have limited coarse-grained properties that fit a small number of available predicates than to have a sophisticated system that generalizes across all predicates. You just don’t need the latter if what you want to do is solve the linking problem.

I’ve rambled on long enough and repeated myself way too much (as if repetition and clarity go hand in hand!). Here is what appears to be the main conclusion: we seem to have been asking theta roles to do two things that don’t obviously pull in the same direction. We want them to provide an interpretation for the sentence and to solve the linking problem. However, the kind of roles we want for the first appear to be different from the kinds of roles we need for the second. IMO, the linking problem is the important one for GG. But if this is right, then having a theory of roles that applies to every DP in every sentence is unnecessary (or at least not obviously required). We need a few gross observational roles that apply to enough PLD predicates to get a G up and running. Once engaged, an LAD gets immediate access to a whole slew of linguistic features that the LAD can effectively use to continue acquiring its G. On this conception, we just don’t need a general theory of theta roles (one that assigns each argument an interpretive role). Which seems like a good thing given that one does not appear to be currently available or likely to be forthcoming.

[1] Everyone (including advocates of the movement theory of control, e.g. me) assumes that at least the following is accurate: every (contentful (i.e. non pleonastic)) DP enters the derivation through a thematic door. Thus the first relation that any such DP grammatically enters into is a thematic relation. This is also true of every version of minimalism that I am aware of.
[2] Even for neo-Davidsonians like Schein and Pietroski where theta roles serve an important type-lifting role (they are relational predicates that tie a DP to an event variable), all that is generally required is a distinction between internal vs external argument. What flavor these are (whether they are agents or experiencers or causes or…) does not really matter much. The same is even truer for standard conceptions where arguments are effectively related to their predicates by saturating a variable position of the predicate via lambda conversion.
[3] Arguments actually, for it is not defined syntactically but over the propositional structure. We can say that a DP has the proto-role in virtue of representing the relevant argument. I will leave such niceties aside here.
[4] This is an online version of the paper that appeared in Haegeman’s edited volume Elements of Grammar. It is a great paper. One of those that I wish that I had written.


  1. "In other words, the substantive interpretation of notions like agent and patient plausibly have what Chomsky called “epistemological priority” (EP). They are observable non-linguistic predicates that can be used to map PLD to (non-observable) grammatical dependencies."

    Could you say why you believe this? I don't think it is plausible at all that coarse-grained thematic role concepts are prior in any respect. If anything is plausibly "prior", it is non-linguistic correlates of the finer-grained properties Dowty looked at like causation, volition, animacy, change, movement, etc.

  2. I have the same question as Kyle. I don't see why "proto-roles as understood by Dowty...don’t seem to be EPish." If anything, they seem much more EPish. Is AGENT really more of a natural class than the sorts of properties often associated with AGENTs---e.g. VOLITIONAL, SENTIENT, etc.?

    (I'm actually unclear here whether you mean to have more specific roles here or not---e.g. AGENT-OF-KICKING---presumably for Fodor&Leporian reasons. In that case, do we really want UG---or at least the learner's initial linking algorithm---to come prebuilt with all the linking roles you'd need to make that work---e.g. AGENT-OF-KICKING -> SUBJECT or a rule that respects the hierarchy AGENT-OF-KICKING > PATIENT-OF-KICKING?)

    So I think your response is, give me whatever list of EP roles/properties you want, Dowty-style linking rules are just "too fancy a procedure" for a learner to employ. I could imagine this response coming from two sources: (i) Dowty's linking rules are fundamentally relational (on arguments), while you might be able to have a simpler set of nonrelational linking rules like AGENT -> SUBJECT, PATIENT -> OBJECT, at least to get the learner started; or (ii) Dowty's linking rules are globally relational (you have to take into account all of the relevant entailments), while the simple rules are relational in a much simpler way (they make reference to a thematic hierarchy like AGENT > PATIENT).

    That characterization seems accurate (though correct me if you wouldn't give such a characterization), but I doubt it tells us what kinds of learners are more or less plausible. The stupid learner in (i) is going to make a lot of stupid mistakes on really simple verbs---e.g. "NP break" v. "NP break NP"---that could well show up downstream. The non-Dowty relational learner in (ii) will do better, but at the cost of relationality.

    But why should I believe that comparing a few more plausibly EPish properties a la Dowty is going to be harder than comparing single roles (especially as the number of EPish roles expand)? The big jump seems to be granting relationality in the linking rules. And once I've gone the relational route, it's straightforward to state the two alternatives in the same learning model by slightly altering the representational constraints (as is true of all problems that contrast a categorial system with the analogous featural system). And at that point, it's an empirical question.

    1. Reasonable points. I guess I think that the more sophisticated the roles the harder they will be to "see." I am not against subbing "causer" for "agent." What I am not eager to do is collapse causer and experiencer and both with mover. The more proto-roles average over these basic concepts the less language independent they strike me as being. Of course I could be wrong. The question I always get back to is what would Sampson perceive. I can imagine him perceiving agents/causers and things done-to (patients) but the proto-roles seem too fancy and require the kind of rich linguistic knowledge that we are trying to explain. But like I said, this could be wrong. What is not wrong is the need to worry about just this question and to explore whether, when we answer it, we need assume that whatever it is that links roles to syntax is also what interprets arguments at CI. I think we want to decouple these, or at least an argument is needed to identify them.

      As for relationality, I guess this is what I would like to dispense with. I want coarse EPish properties for these strike me as plausibly being language independent. The coarser the better.

    2. Right. So there's sort of two major dimensions we might worry about. One is whether the linking rules are categorial (as in many, but not all, approaches prior to Dowty) or featural; the other is whether the linking rules (or the algorithms by which learners take advantages of such rules) are relational or not. All four combinations are possible, though as far as I know the featural-nonrelational combination isn't instantiated in the literature. (I could be wrong about this and would be grateful for references.)

      (As a side note, we have a followup in progress to the recent Reisinger et al. paper on computational approaches to proto-roles---which can be found here---that explicitly compares the relational and nonrelational approaches. Full disclosure on that paper: I am now working with this group.)

      As a matter of learning though, I think there is a natural affinity between featurality and relationality. The featural system, unless further constrained, implies complexity in learning since the number of possible feature configurations that a particular argument might have will be exponential in the number of features (or worse, depending on the nature of the featural system). In some sense, relationality actually helps this problem since each argument mutually constrains the possible featural combinations of the other arguments.

    3. The combinatoric problem arises, however, only if you care to cover a lot of the predicates. If your interest is in covering only a small salient subset that then vaults you into the G then it is not clear that you need anything very complex. Or, if your interest is the linking problem then it is not obvious that you need solve it in the general case. All you need do is solve it in the clear cases and then use other linguistic information to map arguments to their syntactic structure positions. So, here is my question: why should I have Dowty's ambitions? What goes wrong if some predicates (or even most) have no theta roles in the substantive sense or even in the proto-role sense? What do I lose?

      Note, btw, the coarse theory of theta roles makes a kind of prediction: it suggests that certain types of predicates will be freely available in the PLD: those where intuitive notions of cause or agent are easy to spot. On a proto-role theory no obvious distribution of verbs in the PLD is demanded, so far as I can see. If this is so, then there is an empirical difference here that is testable in a pretty direct way.

    4. Here's why I'm uneasy: AGENT seems like a pretty high-level concept that is detectable via (but not necessarily defined in terms of) lower level concepts like the ones Dowty entailments are related to. Now, suppose that Dowty is a good mentalistic characterization of the adult system---i.e. linking rules are specified featurally, not categorially, and linking rules operate over them. Why shouldn't I just adopt a learner-is-the-grammar hypothesis (a form of continuity hypothesis)?

      Again assuming that Dowty is a good characterization of the eventual system, it's logically possible that learners have a completely separate mechanism that throws away useful information, but I'm not convinced that we have to posit a separate kind of learning mechanism to get a learner that does this. For instance, you could imagine a learner that does something similar to what's known as dropout in the deep learning literature (but which has analogues in various traditions), wherein the learner samples features to pay attention to over the course of learning.

      With respect to the empirical predictions, that may be true modulo the "observability" or "concreteness" of the events denoted by a particular predicate. (I scare quote observability because I have no idea what it means for an event to be either observable or concrete.) For instance, I seriously doubt you're learning that "think" requires SENTIENCE of its subject from observing that a particular object out in the world is sentient and thinking.

      In any case, I think the categorial nonrelational theory will make some further testable predictions: kids should be terrible at the inchoative versions of verbs that undergo the causative-inchoative alternation while they're in the stage where they use any sort of nonrelational mechanism, unless that mechanism says something like "all bets are off for intransitives."

    5. I have no idea of what it might mean that events are not observable or concrete. The question is can non-humans perceive 'chasings' or 'bitings'? If they can perceive these, which I assume they can, then these enjoy EP. Can they perceive causers (I prefer this to agents actually)? Again, I assume they can and do. Can they perceive subjects? Not without a G. Why? Because these notions are defined in terms of Gs. Can they perceive an argument? I again doubt it. What we want, I think, are notions that are like causer/agent that are perceptible independent of G knowledge and that can get you into a G. I also happen to believe that it must be coarse if it is to do the work, for LADs do not know the meanings of most predicates and so computing implications off these meanings would be unrealistic. Last, I don't see that coarse meanings won't do just fine. They organize the PLD just enough to get the job done (or could, I think). If this is correct, who needs the kind of full coverage that proto-roles are designed to deliver? Why care if arguments of arbitrary predicates can be assigned roles? And if you don't care about this, who needs proto-roles?

    6. Let me try and make the relevant point another way. I understand why someone might want a very articulate (non proto-role) theory of theta roles if theta roles are intended to provide a semantic interpretation of a sentence. These are needed because sentences have interpretations and if theta roles determine how arguments are interpreted then we need a theta role for every argument or that argument will not have an interpretation.

      Now, Dowty rejects this view. He does not think that theta roles interpret arguments. In fact, he doubts that the desired conceptions are definable even if one wanted them, which he does not. Why? Because he doesn't think that's how arguments and predicates get interpretations. However, he thinks that IF one wanted something like theta roles then proto-roles can be defined that would cover all predicates and arguments assigning a proto-role to each. But this has nothing to do with giving a meaning to the arguments and predicates as it is defined in terms of the meanings they independently have. Nonetheless, should you want each argument to have a theta role for any given predicate then the kinds of theta roles you want are proto-roles. Proto-roles can cover all arguments for all predicates. The question then is why would you want such a general definition? To solve the linking problem: explain how LADs get from data analyzed in EP terms to descriptions that are G rich.

      My point is that IF this is the problem you want solved it is not obvious you need proto-roles. More specifically, it is not clear that you need a notion that will deliver a role for every argument for every predicate. You might be able to get away with a far coarser notion that only covers "enough" PLD predicates, leaving many predicate arguments with no roles at all.

      If this is right, then my observation/problem/question is why try and define proto-roles? Why assume that we need a notion that covers all arguments of all predicates? It's not clearly to provide a semantic interpretation, nor is it required to get the LAD into the G. So why bother? What's it give us? And if it gives us what we don't need, why assume that such a general notion of theta role is necessary?

    7. Hmm yes, it seems to me that the 'ARG_i' positions in certain versions of linking theory in LFG (eg Kibort's, and as developed by Asudeh et al), I, II, III in Relational Grammar, or the various positions available in VP shell analyses all do pretty much the same jobs as 'macroroles', and that the justification for their existence is syntactic.

      Otoh 'Epistemological/Semantic Primes' such as the 'X is doing something (bad) to Y' that dogs seem to grasp can be justified typologically, by the fact that events and situations that fit them seem to be treated consistently in language after language (with differences between languages), but languages can express an immense range of situations that don't fit any of them.

  3. It seems to me that, at least on a broad conceptual level, the tension between Dowty's and Baker's views is much less severe, perhaps non-existent, if theta-roles and the Theta Criterion are construed as belonging to the external interpretive ("C-I") system(s) -- which seems to me to be the only meaningful interpretation of these notions within the current standard model (and the only sense in which they can be said to have epistemological priority). UTAH, or whatever variant of it we want to adopt, is then a principle regulating how those systems read the structures generated by G (more specifically, the "D-structure", now called "vP phase"); this seems to be the view adopted by Baker towards the end of the paper you cited. Then G need not know anything about theta properties, and ideally just merges away blindly.

    1. Actually, I don't believe that having EP is in any way related to theta roles being part of CI broadly construed. The two functions are unrelated. More specifically, the coarse roles that Dowty defines are not serviceable for semantic interpretation if you understand this in the conventional way (as Dowty does and as he insists). Why? Because these roles cut across significant semantic differences. So, I actually do not think that his kinds of proto-roles are useful for semantic interpretation, again as he himself insists.

      But, and this is the good news, they are useful for the linking/UTAH problem if restricted to the few verbs where a fine (rather than coarse) interpretation of roles is adopted. This can leverage you into the system even if these roles, unlike proto-roles, are not useful for defining external argument roles in general.

      So, here's where I think the tension lies: coarse proto-roles are needed to assign every DP a role but they are inadequate for semantic interpretation (and, I believe, for UTAH given that they presuppose knowledge of the semantics) while fine roles are useful for linking and UTAH but are not general. My suggestion is that we decouple these functions and forget about theta theory at CI.

  4. I think it quite plausible that a set of limited linking rules can guide the child to certain rudimentary syntax-concept mappings. And if this requires (for one thing) that prelinguistic infants parse the world in terms of causal events, and perceive the participants in those events as doers and done-tos, it seems we have some evidence that they do.

    For instance, Leslie and Keeble 1987 showed that 2 month olds strongly distinguished spatio-temporal reversals of Michotte-type stimuli (one ball's movement is perceived to have caused the movement of another ball). By merely flipping the direction of action, these reversals also reversed who caused whom to move. Infants didn't so strongly distinguish spatially-similar reversals of stimuli that failed to support a perceptual causal relation (i.e. one ball moves, there's a substantial temporal delay, and then the other ball moves). In the latter case, the lack of increased attention to the reversals can be attributed to the fact that, since perception supported no causal relation between the objects' movements, the balls played no particular roles, and there was consequently no (interesting!) perception that the roles were reversed.

    Other research (here) shows that 12 month olds inferred the existence and location of a causal agent that they had no perceptual evidence for.

    Still other work (here and here) suggests that infants as young as 6 months old evaluate characters positively/negatively depending on the roles they play in helping and hindering events.

    (h/t Brent Strickland's paper on how language reflects core cognition)

    1. Nice points. Something like the Michotte stuff is what I had in the back of my mind (thx for bringing it forward and updated). I suspect that this kind of causal "perception" is not limited to humans and so is part of our mammalian endowment more generally. It would not suffice to provide thematic structure to every verb, but then it need not do so, as you note. Thx.

    2. Jean Mandler at UCSD also has a conceptual primitive set, at least some versions of which are, interestingly to me, for the most part a subset of Anna Wierzbicka's (eg this one), except that her LINK looks to me like a confused amalgam of BECAUSE OF and WANT.