Faculty of Language: Right sizing ling papers

Tuesday, October 18, 2016

Right sizing ling papers

I have a question: what’s the “natural” size of a publishable linguistics paper? I ask because after indulging in a reading binge of papers I had agreed to look at for various reasons, it seems that 50 pages is the assumed magic number. And this number, IMO, is too high. If it really takes 50 pages for you to make your point, then either you are having trouble locating the point that you want to make, or you are trying to make too many of them in a single paper. Why care?

I care about this for two reasons. First, I think that the size of the “natural” paper is a fair indicator of the theoretical sophistication of a field. Second, I believe that if the “natural” size is, say, 50 pages, then 50 pages will be the benchmark of a “serious” paper and people will aim to produce 50-page papers even if this means taking a 20-page idea and blowing it up to 50 pages. And we all know where this leads. To bloated papers that make it harder than it should be (and given the explosion of new (and excellent) research, it’s already harder than it used to be) to stay current with the new ideas in the field. Let me expand on these two points just a bit.

There are several kinds of linguistics papers. The ones that I am talking about would be classified as theoretical linguistics, specifically syntax. The aim of such a paper is to make a theoretical point. Data and argument are marshaled in service of making this point. Now, in a field with well-developed theory, this can usually be done economically. Why? Because the theoretical question/point of interest can be crisply stated and identified. Thus, the data and arguments of interest can be efficiently deployed wrt this identified theoretical question/point. The less theoretically firm the discipline, the harder it is to do this well and the longer (more pages) it takes to identify the relevant point etc. This is what I mean by saying that the size of the “natural” paper can be taken as a (rough) indicator of how theoretically successful a field is. In the “real” sciences, only review papers go on for 50 pages. Most are under 10 and many are less than that (it is called “Phys Rev Letters” for a reason). In the “real” sciences, one does not extensively review earlier results. One cites them, takes what is needed and moves on. Put another way, in “real” sciences one builds on earlier results, one does not rehearse them and re-litigate them. They are there to be built on and your contribution is one more brick in a pretty well specified wall of interlocking assumptions, principles and empirical results.

This is less true in theoretical syntax. Most likely it is because practitioners do not agree as widely about the theoretical results in syntax as people in physics agree about the results there. But I suspect that there is another reason as well. In many of the real sciences, papers don’t locally aim for truth (of course, every scientific endeavor globally does). Here’s what I mean.

Many theoretical papers are explorations of what you get by combining ideas in a certain way. The point of interest is that some combinations lead to interesting empirical, theoretical or conceptual consequences. The hope is that these consequences are also true (evaluated over a longer run), but the immediate assumption of many papers is that the assumptions are (or look) true enough (or are interesting enough even if recognizably false) to explore even if there are (acknowledged) problems with them. My impression is that this is not the accepted practice in syntax. Here if you start with assumptions that have “problems” (in syntax, usually, (apparent) empirical difficulties) then it is thought illegitimate to use these assumptions or further explore their consequences. And this has two baleful influences on paper writing: it creates an incentive to fudge one’s assumptions and/or creates a requirement to (re)defend them. In either case, we get pressure to bloat.

A detour: I have never really understood why exploring problematic assumptions (PA) is so regularly dismissed.[1] Actually, I do understand. It is a reflex of theoretical syntax’s general anti-theoretical stance. IMO, theory is that activity that explores how assumptions connect to lead to interesting consequences. That’s what theoretical exploration is. If done correctly, it leads to a modicum of explanation.

This activity is different from how theory is often described in the syntax literature. There it is (often) characterized as a way of “capturing” data. On this view, the data are unruly and wild and need to be corralled and tamed. Theory is that instrument used to pen it in. But if your aim is to “capture” the data, then capturing some, while losing others is not a win. This is why problematic assumptions (PA) are non grata. Empirically leaky PAs are not interesting precisely because they are leaky. Note, then, that the difference between “capturing” and “explaining” is critical. Leaky PAs might be explanatorily rich even if empirically problematic. Explanation and data coverage are two different dimensions of evaluation. The aim, of course, is to get to those accounts that both explain and are empirically justified. The goal of “capture” blurs these two dimensions. It is also, IMO, very counterproductive. Here’s why.

Say that one takes a PA and finds that it leads to a nice result, be it empirical or theoretical or conceptual. Then shouldn’t this be seen as an argument for PA regardless of its other problems? And shouldn’t this also be an argument that the antecedent problems the PA suffers from might possibly be apparent rather than real? All we really can (and should) do as theorists is explore the consequences of sets of assumptions. One hopes that over time the consequences as a whole favor one set over others. Hence, there is nothing methodologically inapposite in assuming some PA if it fits the bill. In fact, it is a virtue theoretically speaking for it allows us to more fully explore that idea and see if we can understand why even if false it seems to be doing useful work.

Let’s now turn to the second more pragmatic point. There has been an explosion of research in syntax. It used to be possible to keep up with, by reading, everything. I don’t believe that this is still possible. However, it would make it easier to stay tuned to the important issues if papers were more succinct. I think I’ve said this on FOL before (though I can’t recall where), but I have often found it to be the case that a short form version of a later published paper (say a NELS or WCCFL version) is more useful than the longer more elaborated descendant.[2] Why? Because the longer version is generally more “careful,” and not always in a good way. By this I mean that there are replies to reviewers that require elaboration but that often obscure the main idea. Not always, but often enough.

So as not to end on too grumpy a note, let me suggest the following template for syntax papers. It answers three questions: What’s the problem? Why is it interesting? How to solve it?

The first section should be short and to the point. A paper that cannot identify a crisp problem is one that should likely be rewritten.

The second section should also be short, but it is important. Not all problems are equally interesting. It’s the job of a paper to indicate why the reader should care. In linguistics this means identifying how the results bear on the structure of FL/UG. What light does your question, if answered, hope to shed on the central question of modern GG, the fine structure of FL?

The last section is the meat, generally. Only tell the reader enough to understand the explanation to the question being offered. For a theory paper, raw data should be offered but the discussion should proceed by discussing the structures that these data imply. GGers truck in grammars, which truck in rules and structures and derivations. A theory paper that is not careful and explicit about these is not written correctly. Many papers in very good journals take great care to get the morphological diacritics right in the glosses but often eschew providing explicit derivations and phrase markers that exhibit the purported theoretical point. For GG, God is not in the data points, but in the derivations etc. that these data points are in service of illuminating.

Let me go a bit over the top here. IMO, journals would do well to stop publishing most data, reserving this for methods addenda available online. The raw data is important, and the exposition should rely on it and make it available, but the exposition should advert to it, not present it. This is now standard practice in journals like Science and there is no reason why it should not be standard practice in ling journals too. It would immediately cut down the size of most articles by at least a third (try this for a typical NLLT paper for example).

Only after the paper has offered its novelties should one compare what’s been offered to other approaches in the field. I agree that this suggestion should not be elevated to a hard and fast rule. Sometimes a proposal is usefully advanced by demonstrating the shortcomings in others that it will repair. However, more often than not comparisons of old and new are hard to make without some advance glimpse of the new. In my experience, comparison is most useful after the fact.

Delaying comparison will also have another positive feature, I believe. A proposal might be interesting even if it does no better than earlier approaches. I suspect that we upfront “problems” with extant hypotheses because it is considered illicit to offer an alternative unless the current favorite is shown to be in some way defective. There is a founder prejudice operative that requires that the reigning champion not be discomfited unless proven to be inferior. But this is false. It is useful to know that there are many routes to a common conclusion (see here for discussion). It is often even useful to have an alternative that does less well.

So, What, Why, How with a 15-20 page limit, with the hopes of lowering this to 10-15. If that were to happen I would feel a whole lot guiltier for being so far behind in my reading.

[1] Actually, I do understand. It is a reflex of theoretical syntax’s general anti-theory stance.

[2] This might be showing my age for I think that it is well nigh impossible nowadays to publish a short version of a paper in a NELS or WCCFL proceedings and then an elaborated version in a more prestigious journal. If so, take it from me!

26 comments:

Unknown, October 18, 2016 at 3:08 PM
I think I agree that there probably is some causal relation between the "general anti-theory stance" and the length of papers, but I wonder if there are other reasons for this as well. One thing that comes to mind is that in linguistics, we have squibs. I think I'm too young to have a clear idea of whether this really puts pressure on non-squibs to be longer, but I imagine that it might at least to some extent.

Dunno, but I'd be curious to hear others' thoughts.

And there might be other things at play, too.
Unknown, October 18, 2016 at 3:58 PM
There's several aspects of linguistic writing that lead to bloat:

1) The insufficiency of existing notation
I cringe every time I see a prose explanation of how a given structure is built. You know, the usual "XP moves to Spec,YP after which ZP extracts from XP, which triggers bla bla bla, yadda yadda yadda". These kinds of explanations often take up half a page and are much harder to follow than simply drawing up a derivation tree or writing down the sequence of rules. It's very similar to early 19th century math, which is painful to read nowadays. Efficient notation is essential for a field, and syntax doesn't use much beyond labeled bracketing and traces.

2) Data Presentation
Linguistic writing has no techniques for compressing acceptability judgments for presentation. That's why the discussion of the data alone can already take up half a paper. Rather than dumping all data into an online supplement, I'd like to see a system to present and talk about data more succinctly. No idea how to do it, though.

3) Derivational writing
Instead of presenting a solution, linguistic papers tend to retell the history of how the researcher came up with the solution. It's particularly egregious with monographs, where in some cases everything is thrown away that was proposed before the last chapter. That's unheard of in math or CS: you define your analysis or model (again, good notation helps), then you apply it to data. The application to data is enough to show why you defined XYZ in a particular way, no reason to walk us through fifty deadends.

4) Completionist papers
This was already hinted at in your post, but the field's expectations for what constitutes a smallest publishable unit are very high. You can't just present a formalism, show that it solves one particular set of data, and then address problems in follow-up papers. Maybe the field doesn't sufficiently appreciate that an analysis of a given phenomenon, even if it cannot readily be extended to any other piece of data, still reveals interesting aspects of that phenomenon? Trying to make your account perfect before publishing is also detrimental to progress, you wanna follow the open source motto "release early, release often".

5) Reader handholding
In comparison to the papers I see in math, CS, computational linguistics, and molecular biology, it feels downright patronizing how much handholding is in linguistics papers. If there's a definition, it's followed by three examples of how it works (okay, that's actually a good thing because often the definitions aren't explicit enough to work on their own; so we have a case of two minuses yielding a plus).

For any relevant piece of data, you don't just get a specification of the analysis (e.g. via a list of the lexical items with their features), you get a two page explanation that walks you through it from the very first Merge step to the very end.

The discussion of previous literature is a special case of this type of handholding --- instead of presupposing that the reader knows the relevant works or is smart enough to read a survey first if they don't, you do a mini-review for everybody. I think this is part of the humanities heritage of linguistics.

Besides length, there's also several other aspects of linguistic writing that could be improved imho. Definitions should be clearly indicated as such rather than being lumped into the main text or being put into the example numbering scheme. Similarly, crucial assumptions should be highlighted typographically. And most importantly, papers should have an identifiable macro structure so that you can quickly find the specification of the analysis and the relevant data, rather than piecing it together from various paragraphs that are distributed over the whole 50 pages.
davidadger, October 19, 2016 at 12:36 AM
I agree with much of this, especially prose specifications of derivations, which also set my teeth on edge, but which I've had requested by reviewers. But I think I understand why there's a need for narrative presentation of data: the narrative presentation is actually the underpinning of the analysis. You can't just present the data in a compressed tabular format, as the various steps in the argument are the support for why you take the data to have the interpretation and import that it does. Then that analysis is what is generally connected to the theory. In fact, some of the most beautiful and impressive syntax is of exactly this sort: all the heavy work (and the long-lasting insights) are the marshalling of arguments for a particular view of the phenomenon, which is then shown to inform/follow from/ destroy some theoretical position.

This narrative presentation might also be behind Thomas's concern about `derivational writing'. Quite often such writing is not actually a reconstruction of the process of research (it certainly shouldn't be), it's a rational reconstruction of how the process of research should have gone. I actually agree that it'd be better not to use this (very venerable) technique of presentation (with Principle X', X'', X''', X(final form), etc), but in my own work, at least, I often get complaints about not doing this, and about having too technical an approach to simply specifying the system as a whole, without handholding.

Because of the data narrative underpinning the analysis, linguistics papers are most like philosophy papers, I think, rather than papers in the other humanities. Papers in literature, history, etc. tend to actually be very short (15pp or so), since the real work is saved for (and generally repeated in) monographs.
Peter Svenonius, October 19, 2016 at 2:43 AM
Does anybody feel like nominating some papers as good models of how a good linguistics paper should be?
mark, October 20, 2016 at 4:30 AM
Seems to me Peter's request for good models is impossible to answer without specifying the subfield, and even then the metric by which one might decide what's a good model is highly debatable.

For the record, the most cited paper ever in Language is Sacks, Schegloff & Jefferson (1974) (surpassing even Chomsky's review of Skinner). It has a highly condensed style where an elegant system of ordered rules for turn-taking is deduced on the basis of conversational data (i.e., competence inferred from performance). Samples of actual records of conversation are supplied along with the specification of the system, which makes the argument essentially replicable for any reader. It's an impressive paper in many ways and it has been massively influential in many fields, though it has too many footnotes for my taste and it is not particularly easy to read.
AveryAndrews, October 23, 2016 at 3:38 PM
I think the ideal length for a linguistics paper is 30 pages, but there are many reasons why it sometimes has to be exceeded, although very rarely beyond 50. One reason that linguists might need more formal handholding than computational people is simply that they are on the whole not as good at math ... possibly fundamentally as able in many cases, but practice makes a big difference. Linguistics also I suspect has a considerable population of people who do have significant mathematical ability, but were traumatized in various ways by bad or inappropriate teaching in K-12, so the need for handholding might be psychologically a bit deeper than just having to go at a slower pace due to less practice.
ewan, October 24, 2016 at 7:56 AM
Lit review sections are often bloated and meandering. I admit I may have a bias. I have had two papers come across my desk recently with bloated and meandering lit review sections, and only one was a linguistics paper, and only in that case did I blame the customs of the field for the bloat, rather than simply the author (the other was a psychology paper). That having been said, I have the genuine suspicion that linguistics authors feel they have a minimum page count.
mark, October 26, 2016 at 2:11 AM
Here's a call for abolishing word limits in the field of ecology & evolution, noting that longer papers tend to be more widely cited: http://retractionwatch.com/2016/10/25/should-journals-abolish-word-limits-for-papers/#more-45425

The dynamics are likely different in linguistics (let alone generative syntax), but I found these observations insightful:

"Longer papers are probably better cited because they contain both more and a greater diversity of data and ideas (Leimu & Koricheva, 2005b). We argue that the positive relationship between citations and both author number and references cited support this hypothesis. Studies that have more authors tend to draw on a greater diversity of expertise, whether practical or intellectual (Katz & Martin, 1997), and thus present a greater diversity of ideas and/or data types, especially when collaborations are interdisciplinary. Likewise, papers likely cite more references because they have a greater diversity of arguments to support or ideas to place into context."

In other words, the kind of super-compact format Norbert and others argue for may be useful within a narrow discipline, but it may also make it harder for a broader audience to engage with theory & data and build on results.
JAL, October 28, 2016 at 7:17 AM
Impressionistically at least, at NLLT we are indeed receiving longer and longer papers, despite maximum length limitations. It's an increasing burden on reviewers and editors. Part of the problem is that the existing literature is large and diverse. Authors and reviewers want previous work to be given its due, and want the particular choice of assumptions from the previous literature to be justified (since a different choice could have been made). Mark's point on replicability is quite important, hence I disagree strongly with the suggestion to eliminate data from the papers.
JAL, October 30, 2016 at 1:01 PM
The idea of archiving material related to papers is important. (I've been part of a working group on data citation in linguistics: https://sites.google.com/a/hawaii.edu/data-citation/. There'll be a panel session at the LSA.) It's unclear how that will help shorten papers, though. Right now when an author makes a claim, they provide representative data supporting that claim. The reviewers and readers can look at that data and evaluate it (e.g. there's a confound b/c of animacy/focus particles/verb class/etc/etc). For a review situation, it's then up to the authors to find data that doesn't have the confound. For a reader situation, later readers can also notice factors whose relevance was discovered long after publication (e.g. d-linking/etc). If we relegate all data to an appendix, it'll be like reviewing abstracts -- having to flip back and forth to see the data, which is often annoying. (Also think endnotes vs footnotes.) If we don't provide representative data, but just put all data in an archive, then the reviewer and reader will be more likely to just "trust" the authors, rather than sifting through much more data to see if there are any confounds. That's likely to reduce the quality of the papers in that more confounds will go unnoticed. So, improving the structure of papers is a worthwhile goal, but it strikes me as not so easy.
Norbert, October 31, 2016 at 4:09 PM
I screwed up and put Mark de Vries' comment on the wrong thread. I deleted it from there and am putting it up here. It seems that Google doesn't like him and is preventing him from posting. I am sorry. I am sorry that I don't know how to read. Here is Mark's comment:

When -- a long time ago -- I apologized to the late Hans den Besten for the many footnotes in a chapter of my PhD thesis, citing a writing advisor according to whom the use of footnotes means that the text is ill-structured, Hans just shrugged his shoulders and said reassuringly that those guys simply don’t understand the complexity of the matter.

Having said that, yes, we all agree that the average syntax paper tends to be too long and convoluted (obviously not one size fits all). This is counterproductive for all parties: readers, reviewers, and authors themselves. I suppose most of us, including me, are ‘guilty’ of participating in a culture where long papers are the norm. But there is no reason for individual blame or to state that the field is anti-theoretical or insufficiently sophisticated. It is not so strange that authors prefer to be visibly recognizable as experts on the topic rather than risk the accusation of being ignorant of various related matters. But clearly things have gotten out of hand. Reviewers and editors play a key role, here:
-- Reviewers are somehow tempted by the system to come up with every possible counterargument or potentially problematic data point (from any language) they can think of.
-- Authors are expected to cite and to some extent discuss every work ever published on the topic. Evidently, this is untenable in the long run.
-- Editors require authors to respond to everything reviewers say.

Now what? Encouraging individual authors to be more succinct is totally insufficient. We’d need an active Shorter and Clearer Paper Movement. Here are some simple suggestions for its party program:
-- EDITORS and REVIEWERS must be convinced that each review should consist of two parts. The first evaluates the soundness of the argument and the clarity of the core proposal of the paper at hand. The second may contain helpful further references, additional data, thoughts/etc., which are not supposed to play a crucial role in the overall assessment.
-- AUTHORS try to clearly highlight the core argument structure of the paper. If there is more to discuss, the first part of the paper deals with the essentials, and then there can be sections “more concerning x, additional thoughts about y, ...” which can be skipped by readers. Relevant additional data can be in appendices, etc. (So the total page number can still be high if necessary, but the core of the paper is shorter.)
-- THE FIELD should reach a new consensus concerning citation. Do we think it is fine for an author to tell his/her own story, simply refer to an overview article for further information, and only cite/discuss other work where it is really important? (Mind you, the linguistic citation index will go down over time.)
-- JOURNALS accept that there can be follow-up articles that do not elaborately summarize all the foundations.
-- READERS appreciate that papers are not all-encompassing.
-- EVERYONE bears in mind that “...the most worthwhile scientific books [papers] are those in which the author clearly indicates what he does not know; for an author most hurts his readers by concealing difficulties.” (Évariste Galois, 1811-1832)

One final remark. I don’t understand Norbert’s claim that the point of departure for every paper must be a clearly-defined problem. Yes, we tell this to undergrad students for didactic reasons, and it’s generally not a bad rule of thumb. But what happened to actual discoveries, new ideas, original deductions, explorations, ...? Of course one can always artificially construct a ‘problem’ with hindsight, but that may be entirely beside the point.