Dec 19, 2014

Closely connected

Photo by Sudhamshu Hebbar on Flickr [CC BY 2.0]
An article by the British linguist Vyvyan Evans entitled “Language Instinct is a Myth”, which I shared on Twitter the other day, triggered a lively discussion with my colleagues. One of the questions raised on Twitter was why the idea that we are born with a built-in language capacity (aka the innateness hypothesis) has prevailed for so long, and why Chomsky, its main promoter, is part of all Master's in TESOL programmes if the theory has largely been discredited (Scott Thornbury asks the same question on his blog in “X is for X-bar Theory”).

This was indeed the case on my MA programme: Behaviorism and Chomsky took up a large part of two of my Psycholinguistics courses, while a fascinating, more recent theory such as Connectionism received almost no attention. Now that I am on the giving end, i.e. giving rather than listening to lectures, I was also surprised at the lack of references to Connectionism in an SLA course syllabus which I inherited (and revamped completely). Chomsky, on the other hand, is such an enduring staple of TEFL/TESOL courses that even BA students come to the course already knowing him – or at least having heard his name.

For an overview of Chomsky's theory see THIS POST by Geoff Jordan or THIS POST by Kylie Barker.

Connectionism is also conspicuously absent from Vyvyan Evans’s criticism of Chomsky, although it provides a much stronger argument against the innateness hypothesis. What is Connectionism and how is it connected to this blog?

It doesn’t sound right but I can’t explain why…

The word “grammar” usually conjures up in the learner’s – and teacher’s – mind images of verb tables and rules which can be memorized and tested. Indeed, some rules – or rather rules of thumb – exist and can be called upon when punctuating a sentence (add an apostrophe after plural nouns ending with –s) or spelling a word (“i before e except after c”). Learners can be given useful rules such as: We add –s/-es to make the plural form of most nouns or add –d/-ed to make the past tense of regular verbs. The rules that can be formulated and verbalized are known as explicit rules. A greater number of linguistic rules though – probably much greater than explicit rules – are implicit rules, i.e. rules which competent language users (not only native speakers!) know but cannot express verbally. The knowledge of implicit rules is evident when we can tell if a sentence is correct or incorrect just because it sounds right or wrong. This intuitive feel for grammaticality, however, has nothing to do with a sixth sense; it has sound psychological foundations.

Artificial neural networks (explained as simply as possible)

Via Wikimedia Commons [CC BY-SA 3.0]
The human brain is made up of a massive number of neurons meshed into complex networks. In order to understand how information is stored across neural networks of the brain, artificial neural networks can be created on the computer. Artificial neural networks simulate how the brain functions when processing information and consist of many units (representing neurons) and their connections (representing synapses), hence the name “connectionism”.
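To make “units” and “connections” a little more concrete, here is a minimal sketch of a single artificial unit. The function name, the input values and the connection strengths are all made up for illustration, and the logistic squashing function is just one common choice, not a claim about any particular connectionist model.

```python
import math

def unit_activation(inputs, weights, bias=0.0):
    """Activation of one artificial unit: a weighted sum of its inputs,
    squashed into the range 0..1 by the logistic function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical values: three input units feeding one output unit.
inputs = [1.0, 0.0, 1.0]      # which input units are currently active
weights = [0.8, -0.5, 0.3]    # connection strengths (the "synapses")

activation = unit_activation(inputs, weights)
print(round(activation, 2))
```

Learning, in a network built from such units, amounts to adjusting the numbers in `weights`. Nothing resembling a written-down rule is stored anywhere.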

As early as the 1980s, connectionist experiments shed considerable light on the processes underlying human cognition and, specifically, language acquisition. For example, a computer simulation created by Rumelhart and McClelland (1986) was “trained” to predict irregular past forms of verbs it had not previously encountered. After the network “learned” that found is the past form of find, and bound is the past of bind, it would produce wound in response to wind.

Via Wikimedia Commons [CC BY 1.0]
Now on to something truly phenomenal. Neural networks have been shown to exhibit the same “faulty” behaviour as humans. The next set of verbs which was fed into Rumelhart and McClelland’s network consisted of both irregular and regular (-ed) verbs. As the number of regular verbs began to grow, the network started to produce errors such as *broked instead of broke or *taked instead of took. But after more training, regular and irregular verbs fell into place as the network recovered and started producing correct forms again.

As you probably know, this process is very similar to the developmental stages language acquirers – mainly native speakers but also L2 learners – go through. Initially they produce correct irregular forms, such as went and took, but, as encounters with regular verbs become more and more frequent (washed, cleaned, started, finished), they backslide to *goed and *taked. Chomskyans would, of course, explain this as learners starting to grapple with the rules of the past simple and overgeneralising them to irregular verbs. But Rumelhart and McClelland’s neural network displayed the same psycholinguistic phenomena, overgeneralisation and backsliding, in the absence of any grammar rules!

Rules or connections?

Connectionists offer a simple but compelling explanation of this phenomenon: the neural network learned the pattern on which the past tense is formed as a result of exposure to linguistic data. Regular verbs showed such a strongly consistent pattern in the input that the connections formed between units in response to regular verbs outweighed all the connections activated by irregular verbs.
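The idea of one set of connections outweighing another can be illustrated with a toy competition between connection strengths. This is a deliberately crude sketch of my own, not Rumelhart and McClelland’s actual network: the function names, the “one unit of strength per exposure” learning rule and the exposure counts are all assumptions made for illustration.

```python
from collections import Counter

def strengthen(strengths, exposures):
    """Add one unit of strength per exposure (a deliberately crude learning rule)."""
    for verb, past in exposures:
        strengths[(verb, past)] += 1        # verb-specific connection
        if past == verb + "ed":
            strengths["-ed pattern"] += 1   # connection shared by all regular verbs
    return strengths

def past_form(verb, strengths):
    """Produce whichever candidate past form is backed by the stronger connection."""
    candidates = {verb + "ed": strengths["-ed pattern"]}
    for key, strength in strengths.items():
        if isinstance(key, tuple) and key[0] == verb:
            candidates[key[1]] = max(candidates.get(key[1], 0), strength)
    return max(candidates, key=candidates.get)

strengths = Counter()

# Stage 1: the learner has only met the irregular form a few times.
strengthen(strengths, [("go", "went")] * 3)
early = past_form("go", strengths)

# Stage 2: regular verbs flood the input and the shared pattern takes over.
regulars = [("wash", "washed"), ("clean", "cleaned"),
            ("start", "started"), ("finish", "finished")] * 2
strengthen(strengths, regulars)
later = past_form("go", strengths)

print(early, later)  # went goed
```

No rule of the form “add -ed” is ever consulted as such. The form *goed wins at stage 2 simply because the shared connection has been strengthened more often than the one linking go to went.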

Posing a radical challenge to the classical rule view of language, connectionism posits that language learning has nothing to do with learning rules, although language behaviour may ultimately appear to be rule-governed. Just as the artificial network was not taught the rules of the past simple, mental representations of rules need not be present in explicit form anywhere in the brain. According to connectionism, language acquirers merely form mental associations between various elements (phonemes, morphemes, words etc) which frequently occur together in language input.

Connectionism and lexical chunking

You are probably wondering what all this has to do with this blog. In fact, connectionism and its younger brother “emergentism” are compatible with the idea of lexical chunking and provide a solid psycholinguistic base for the Lexical approach.

  • Chunks consist of words that often go together: of course, I hope so, to make things worse, it’s been a long time since…, i.e. they are frequently co-occurring elements in language
  • Learning a language is not a matter of mastering explicit grammar rules as I have argued HERE
  • Learning chunks can help establish important patterns which pave the way to grammar acquisition - see HERE
  • Repetition plays a crucial role – learners need multiple exposures to “strengthen the connections” between co-occurring elements
  • Although explicit teaching of grammar rules does not result in implicit knowledge, it may speed up the process of grammar acquisition through noticing
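
As a rough illustration of what “frequently co-occurring elements” means in practice, the sketch below counts adjacent word pairs in a handful of sentences and keeps the pairs that recur. The sample sentences, the function name and the threshold are invented for the example, and raw pair frequency is only a crude stand-in for whatever the brain actually tracks.

```python
from collections import Counter

def frequent_pairs(sentences, min_count=2):
    """Count adjacent word pairs and keep those seen at least min_count times."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical mini-corpus of learner input.
sample = [
    "of course I hope so",
    "I hope so too",
    "of course it has been a long time",
]

pairs = frequent_pairs(sample)
for pair, n in pairs.items():
    print(" ".join(pair), n)
```

Each repeated encounter pushes a pair’s count up, which is the sense in which multiple exposures “strengthen the connections” between the words in a chunk.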

So, to return to the discussion with my fellow professionals on Twitter: should Chomsky be banished from TESOL programmes’ syllabi? Certainly not. As a starting point in SLA research, his theory merits analysis and discussion. Without understanding Chomsky’s Universal Grammar hypothesis, it would be hard, for example, to explain Krashen* or to grasp the significance of the nature vs. nurture debate as regards language acquisition. But the question remains why Chomsky's view has dominated the field for so long. Is it because, as Vyvyan Evans claims, it's simple?

What do you think?


Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tense of English verbs. In J. L. McClelland, D. E. Rumelhart & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition, Vol. 2: Psychological and biological models (pp. 216–271). Cambridge, MA: MIT Press.

* Interestingly - and perhaps ironically - Michael Lewis draws heavily on the work of Krashen, who is clearly a Chomskyan


  1. Yes. Definitely. Exposure's the key. And since our brain is made up of neurons which are all connected to each other, it seems that this theory, Connectionism, makes perfect sense and can be the foundation of other theories.

    1. Thank you for your comment, Michele!

  2. This discussion of UG and connectionism seems to me to accomplish very little.

    1. The assertion that UG theory has been “largely discredited” is not supported by any evidence or arguments (and, BTW, is not what Thornbury suggests in “X is for X-bar theory”).

    2. The assertion that people learn languages by “forming mental associations between various elements (phonemes, morphemes, words etc) which frequently occur together in language input” is not supported by any evidence and rests on a completely inadequate sketch of connectionism.

    3. Nothing is said about the severe limitations of connectionist models. Gregg examined the Ellis and Schmidt model (see Gregg, 2003: 58 – 66) in order to emphasise just how little the model has learned and how much is left unexplained. Gregg emphasises the sheer implausibility of the enterprise and wonders how connectionists seriously propose that the complexity of language emerges from simple cognitive processes being exposed to frequently co-occurring items in the environment. What role do frequency effects have, and how do they interact with other aspects of the SLA process? We need to know how frequency effects fit into a theory of SLA, because frequency itself is no theory at all. As Gregg points out “connectionism itself is not a theory….. It is a method, and one that in principle is neutral as to the kind of theory to which it is applied.” (Gregg, 2003: 55)

    4. Connectionism, by adopting an associative learning model and an empiricist epistemology (where some kind of innate architecture is allowed, but not innate knowledge, and certainly not innate linguistic representations), can’t explain how children come to have the linguistic knowledge they do. How can general conceptual representations acting on stimuli from the environment explain the representational system of language that children demonstrate? How do children come to know which form-function pairings are possible in human-language grammars and which are not, regardless of exposure to input?

    Gregg, K. R. (2003) The state of emergentism in second language acquisition. Second Language Research, 19(2): 95-128

    1. Hello Geoff,

      I warned you on Twitter that you'd disagree so I am surprised that you bothered commenting :) But seeing as you have, thank you. Good to have you here!

      I know the article you're quoting from. And I can reply with my own barrage of quotes from Nick Ellis, Diane Larsen-Freeman, O'Grady, Tomasello, Bybee, all of whom address the points you raise (valid points, I concede) better than I ever could. But I have no doubt you've read them yourself. Come to think of it, I can even quote from your own book (2004) where you explicitly state that UG is of no use when describing the process of SLA since it's a property theory.


  3. Hi Leo,

    Thanks for the generous welcome. May I say that, while I don’t think your treatment of UG or connectionism is very good, I like the look and feel of your blog very much.

    In regard to your comment about my book, I’m afraid you don’t get it quite right. UG is not just a property theory, it’s a transition theory too: it explains both the “what” and the “how” of first language acquisition. I don’t say UG theory is of no use to those trying to construct a theory of SLA because it’s a property theory, I say it isn’t much use because, unlike connectionists, I think the processes of first and second language acquisition are different, and thus need different explanations.



    1. Hi again Geoff,

      Thank you for the compliment!

      Re UG & connectionism, if I understand correctly then, according to your view, both theories can live in perfect harmony as regards L2 / SLA. Since many nativists support the no access or partial access view and connectionists allow some kind of innate architecture, both theories can be seen as compatible.


  4. 2 quick P.S.s if I may.

    1. Sorry to be anonymous, but WordPress says I'm not the owner of canlloparot.

    2. Chomsky's UG theory is so strong because he deliberately and severely restricts its domain. It doesn't deal with most aspects of descriptive grammar and it says absolutely nothing about discourse competence or pragmatic competence. Its limited domain was the reason Scott gave for not finding the intricacies of X-bar theory worth the hassle of studying; a view which I have a lot of sympathy for. :-)


    1. Good to see that you and Scott have some common ground :)

  5. As a teacher, not a researcher, whose interest lies in understanding what is going on and how I can best help learners progress in their language acquisition/learning, I very much appreciate all the discussions going on around the blogs and Twitter. Although Twitter is not the best place to discuss this, as it limits how we can express ourselves, it can trigger interesting discussions like this one, which I think probably happen around M.A. programs.
    This is really great for showing teachers who have not done an M.A. yet that there are things we should be reflecting on and digging deeper into.

    Once again, thanks for everyone's contribution. As for me, Geoff knows that already, my pedagogy studies focused a lot on Interactionism/developmental perspective for language learning and I never really cared for the other two major perspectives (behaviorism and nativism) apart from my quick move to discard them as valid. When I moved from the PPP model it was very easy for me to embrace Dogme.

    From a lifelong learner and teacher educator,

    1. Hi Rose,
      Thank you for your comment.
      What you describe is the case on most TESOL programmes: behaviorism and nativism are in the curriculum while more recent theories are sidestepped. This is exactly the point I wanted to raise in my post.


  6. A lot I want to contribute, but Chomsky is a bit of a time sink and at Xmas I just can't get into it. Cop out, I know. I promise something in 2015 ;)

    I will say Evans' book is rather credulous in places (ape language, linguistic relativism).

    Loved this post btw


    1. Thanks, Russ.
      So you're reading Evans's book? Let me know what you think when you've finished it. I've seen a lot of criticism...

  7. Your description of connectionism seems entirely compatible with Chomsky's basic idea of an innate LAD. Sure, the brain does clever things, but the key question is whether the brains of any other species could perform the same linguistic function (actually, we know they can't).

    This seems to be part of the current trend of addressing psychological (and social) questions by looking to MRI scans and neural anatomy to provide answers. This is a deceptive approach, albeit interesting, because it is self-evident that whatever our mental processes and innate abilities may be, our brain must support them. The article seems to be arguing at cross-purposes.

    1. Thank you for your comment.

      Connectionists do not rule out the existence of innate architecture which, as you say, allows our brain to do clever things - it's not a completely blank slate. But unlike Chomsky, connectionism does not posit innate linguistic knowledge encapsulated in a separate linguistic module (LAD).

    2. Thank you, Leo. Yes, I would agree, but I'm not sure that a separate linguistic module is crucial to the Chomskian notion of innate language-learning ability. At any rate, the key point seems to be that an account of what the brain is (may be) doing during language acquisition is not a rival theory to the Chomskian idea that such acquisition is innate, just an account of it from another angle. It's rather like the beginnings of genetic theory at the turn of the last century, which many heralded as an alternative to Darwin's theory, but which we later came to see as just fleshing out what the theory meant in practice (the Fisher Synthesis).

