The Acquisition of Verbs by English Infants∗

Sunjing Ji

University of Arizona

This paper starts by introducing the debate between the nativist account and the learning account of language acquisition. It participates in the debate by addressing three questions concerning verb productivity. First, do young children have syntactic knowledge of the verb category? Second, is vocabulary size a good predictor of a child’s syntactic productivity? Third, is children’s speech correlated with adults’ speech with regard to verb productivity? If the limited scope learning account is right, the following are predicted: (1) frequent verbs and infrequent verbs differ in productivity in children’s speech; (2) verb productivity in child speech is significantly lower than in adult speech; (3) frequent verbs and infrequent verbs behave differently in terms of the correlation between verb productivity and an individual’s vocabulary size; (4) children and adults are correlated with regard to verb productivity. The analyses of large longitudinal data in this paper confirm all of the above predictions, supporting a learning approach to the acquisition of verb usage.

1. Introduction

This paper addresses the puzzle of infants’ early acquisition of syntactic structures. Chomsky (1975) noted that ‘creativity’ is a unique property of human language – the capacity to combine words in infinitely many ways to form well-formed sentences. The puzzle of language acquisition is that, despite such complexity, all infants are able to acquire their native language successfully within a period as short as a couple of years. The question that arises is what constitutes language users’ abstract knowledge of sentences and enables them to produce novel grammatical utterances. Two approaches have addressed this issue from very different angles. According to Generative Linguistics (Chomsky, 2002, 1965), linguistic competence is understood as a set of innate phrase structure rules. According to Cognitive Linguistics (Lakoff, 1987; Tomasello, 2003), the abstract representation of sentences is understood as constructions, that is, associations between forms and their corresponding meanings.

This paper participates in this debate by re-investigating verb usage in both child and adult production, based on data from the English CHILDES (Child Language Data Exchange System) corpus (MacWhinney, 2000). The following three questions are addressed. First, do children under three have more limited knowledge of sentence structure than adults? Second, for an individual adult or child speaker, is vocabulary size a good predictor of verb usage in production? Third, to what degree does child production reflect adult input? In the following sections, a brief introduction to different theories of language acquisition is presented, followed by two specific debates about whether innate syntactic categories are needed by young children during first language acquisition (for both determiner acquisition and verb-argument acquisition). Because findings on the two sides of the debate conflict, this paper provides a novel analysis based on several measurements of verb productivity. To distinguish the two sides of the debate, it is predicted that verb frequency should have an effect on verb productivity if young children’s acquisition of verb usage is based on a limited scope learning mechanism rather than on innate syntactic structure. The specific predictions can be found in section 4.

2. Theoretical Background

This section provides a brief introduction to three major approaches suggested for how a child may acquire his/her first language.

2.1 The Syntactic Approach

According to the syntactic approach to language acquisition, sentences bear underlying structures. Such underlying structures constitute the innate language faculty unique to humans, facilitate infants’ learning of a sophisticated language system, and allow language users to creatively generate a potentially infinite number of sentences (Chomsky 1975).

One piece of evidence for the underlying syntactic structures is the phenomenon of syntactic categorization. In other words, the usage of a verb is constrained by the idiosyncratic properties of that particular lexical item, as specified in the lexicon. For example, transitive verbs require a direct object, as shown by the contrast between examples (1) and (2), while intransitive verbs prohibit a direct object, as shown by the contrast between examples (3) and (4). So the verb ‘hit’ is a transitive verb and the verb ‘smile’ is an intransitive verb, whereas the verb ‘walk’ can be used either way, as in examples (5) and (6). We can refer to such constraints as the rule of transitivity and the rule of intransitivity.

Such rules apply productively. In other words, any nominal arguments (theoretically speaking) can be combined with a verb, as long as the rule of transitivity or the rule of intransitivity is fulfilled. If the rule is violated, it will lead to an ungrammatical/bad sentence.

(1) The boy hits the ball.

(2) *The boy hits.

(3) The boy smiles.

(4) *The boy smiles a joke.

(5) The boy walks.

(6) The boy walks a dog.
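
As an illustration only, the rule of transitivity and the rule of intransitivity can be sketched as a lookup in a toy lexicon. The lexicon entries and the names `FRAMES` and `is_grammatical` are hypothetical simplifications that cover only examples (1)-(6); they are not a claim about how any acquisition theory implements these constraints.

```python
# Toy lexicon (hypothetical): for each verb, the set of frames it licenses.
# True  = clause with a direct object
# False = clause without a direct object
FRAMES = {
    "hit": {True},           # transitive only
    "smile": {False},        # intransitive only
    "walk": {True, False},   # allows both frames
}


def is_grammatical(verb, has_direct_object):
    """Return True if the verb's lexical entry licenses the given frame."""
    return has_direct_object in FRAMES[verb]


# Examples (1)-(6), encoded as (verb, has_direct_object).
examples = [
    ("hit", True),     # (1) The boy hits the ball.
    ("hit", False),    # (2) *The boy hits.
    ("smile", False),  # (3) The boy smiles.
    ("smile", True),   # (4) *The boy smiles a joke.
    ("walk", False),   # (5) The boy walks.
    ("walk", True),    # (6) The boy walks a dog.
]

judgments = [is_grammatical(verb, obj) for verb, obj in examples]
for number, judgment in enumerate(judgments, start=1):
    print(f"({number})", "grammatical" if judgment else "ungrammatical")
```

In this sketch the rule is productive in the sense described above: the check never looks at which noun fills the object position, only at whether the frame is licensed by the verb.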

Theories supporting this approach to language acquisition suggest that such underlying structures are formal devices of Universal Grammar (UG), and such underlying structures facilitate children’s acquisition of their native language.

For example, the theory from Radford (1990) suggests that X-bar theory is part of UG. Another example framework is proposed by Valian (1991), in which the innate device includes knowledge of clause structure.

2.2 The Semantic-bootstrapping Hypothesis

The semantic-bootstrapping hypothesis is driven by the observation that the underlying syntactic structure of a sentence is consistent with its semantic structure. One way of representing the semantic structure of a sentence is referred to as the verb-argument structure. For example, the verb ‘hit’ in example (1) takes two arguments – the agent ‘the boy’ and the theme ‘the ball’.

The agent is the volitional actor of the activity denoted by the verb, whereas the theme is the object undergoing the activity. According to the semantic-bootstrapping hypothesis, children develop knowledge of real-world concepts, map objects onto the syntactic category ‘noun’ and map actions onto the syntactic category ‘verb’. Theories supporting the semantic basis of language acquisition suggest that children make use of conceptual knowledge in order to know whether a direct object is needed, and that observing the semantics of real-world events triggers the corresponding innate syntactic structure during language acquisition (Pinker, 1984).

This hypothesis, also called the Generalization Hypothesis in the literature, has been tested in comprehension experiments. For example, Naigles (1990) presented children with sentences containing novel verbs, such as those in (7) and (8), while they watched acted-out scenes. Naigles found that 25-month-olds hearing sentence (7) looked longer at a scene in which Big Bird performed a novel action on Cookie Monster than at one in which the two characters acted independently. The looking pattern was reversed when the children heard stimuli like (8). These results suggest that children as young as two can already distinguish transitive from intransitive verbs.

(7) Big Bird is kradding Cookie Monster.

(8) Big Bird and Cookie Monster are kradding.

2.3 The Limited-scope Formulae Learning Approach

The above two approaches are both nativist views, since both assume that a formal device is important for language acquisition. An alternative perspective is the usage-based piecemeal learning mechanism proposed by Tomasello (2003). This alternative suggests that there is no need to resort to an innate acquisition device. Instead, the abstract representation of sentences is given by construction grammar. According to construction grammar, sentences are understood as a set of form-meaning correspondences that exist at all levels of the lexicality-schematicity continuum. Furthermore, this approach suggests that patterns in the input provide sufficient cues for acquiring sentence structures as mappings between meanings and forms via intentional learning.

According to this approach, the learning process involves ‘pre-emption’ and ‘entrenchment’. Pre-emption refers to the indirect evidence provided by encountering a particular verb construction: if the sentence form ‘John makes the apple disappear’ occurs in the input instead of the form ‘John disappears the apple’, the learner infers that the latter must be ungrammatical, since otherwise it could have occurred in the input. Entrenchment, according to Braine and Brooks (1995), refers to the idea that encountering a verb frequently in the input ‘entrenches’ its usage, so that the verb resists being extended to novel constructions. For example, children were found to be less likely to overgeneralize frequent verbs (*He came me to school.) than infrequent ones (*He arrived me to school.).

Instead of having full competence from the very beginning, young children start with gradual learning of the usage of individual verbs. The abstract knowledge of sentence structures comes from gradual learning from concrete instances. Due to the piecemeal nature of this learning, it has been called the Verb-island Hypothesis.

One corpus study that supports the learning approach to language acquisition is Lieven, Pine, and Baldwin (1997). Lieven et al. (1997) applied a lexically based positional analysis to roughly the first 400 multiword utterances produced by 11 children aged 1 to 3. This analysis divides children’s multiword utterances into three categories – frozen phrases, intermediate utterances and constructed utterances. Lieven et al. (1997) found that the latter two categories accounted for 60% of children’s total production; the majority of the rest were accounted for by frozen phrases. Furthermore, it was found that correct and incorrect pronoun forms co-occur in the production of the same child during the same period. This undermines the account on which children use underlying syntactic rules for utterance production. In addition, both prototypical and non-prototypical verb-argument structures were found in early production, suggesting that semantic structures are unlikely to be the governing mechanism for the acquisition of syntactic categories. Lieven et al. (1997) concluded that children’s early development of sentence production is item-based and does not require underlying syntactic or semantic generalizations.

Coyote Papers – Proceedings of the Arizona Linguistics Circle 3 October 30 - November 1, 2009

3. Specific Debates

There have been heated debates based on each of the above theoretical accounts. This section will review two specific debates – one about determiners and the other about verb-argument structures.

3.1 The ‘Determiner’ Debate

Valian (1986) argued that young children under two and a half have abstract knowledge of determiners. In the speech production of the 6 English-speaking children she investigated, she found that determiners were positioned correctly (as opposed to incorrect noun-determiner or adjective-determiner sequences) with very few exceptions. There were no ungrammatical determiner-determiner sequences except for those attributable to a missing copula, speech hesitation or repetition, and no determiners were produced in isolation except for a few errors. Since all the criteria used in her study were distributional regularities with no semantic correlate, the semantic-bootstrapping hypothesis was not supported. This led to her conclusion that children must have innate syntactic categories from as young as a little over 2.

Pine and Martindale (1996) re-examined Valian’s (1986) adult-like account of children’s acquisition of determiners. First, Pine and Martindale (1996) pointed out that some utterances that are neither nouns nor noun + adjective sequences were not treated as errors by Valian (1986). Second, Valian’s (1986) criteria were too lax: children could pass them easily, and an alternative learning approach based on limited scope formulae of verb paradigms would also meet the criteria.

Finally, in order to distinguish the two opposing accounts of determiner acquisition, Pine and Martindale (1996) computed overlap measures. The prediction was that, if the syntactic account were right, knowledge of how to use the determiner ‘a’ should be immediately available for the other determiner ‘the’, resulting in a large overlap of contexts. In other words, a large number of different nouns would be observed in combination with both determiners. On the other hand, if the limited scope formula account were right, no large overlap would be expected in the child speech, and the overlap in adult speech, as a control, would be expected to be significantly larger than the overlap in child speech. For most of the measures of noun overlap (the proportion between the type and token number of the different nouns co-occurring with both determiners for an individual speaker) and predicate overlap (the proportion between the type and token number of the different predicates, including prepositions, verbs and copulas, co-occurring with both determiners for an individual speaker), they found that adults had significantly larger overlap than children, supporting the limited scope formula account.
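
To make the logic of an overlap measure concrete, the following is a minimal sketch under a simplifying assumption: overlap is taken to be the number of context types (for example, nouns) attested with both determiners, divided by the number of context types attested with at least one of them. This type-based ratio is an assumption made for illustration and may differ from the exact type and token proportions used by Pine and Martindale (1996). The function name `overlap` and the two small samples are invented; they are not data from any corpus.

```python
def overlap(pairs, first="a", second="the"):
    """Share of context types seen with either determiner that occur with both.

    `pairs` is an iterable of (determiner, context) tuples for one speaker.
    Assumed definition, for illustration only.
    """
    with_first = {context for det, context in pairs if det == first}
    with_second = {context for det, context in pairs if det == second}
    either = with_first | with_second
    if not either:
        return 0.0  # no determiner contexts observed for this speaker
    return len(with_first & with_second) / len(either)


# Invented toy samples (not corpus data).
child_sample = [
    ("a", "dog"), ("the", "dog"),
    ("a", "ball"),
    ("the", "car"),
    ("a", "book"),
]
adult_sample = [
    ("a", "dog"), ("the", "dog"),
    ("a", "ball"), ("the", "ball"),
    ("a", "car"), ("the", "car"),
    ("a", "book"),
]

child_overlap = overlap(child_sample)
adult_overlap = overlap(adult_sample)
print(child_overlap, adult_overlap)
```

On these invented samples the child's overlap is lower than the adult's, which is the direction of difference that the limited scope formula account predicts; the syntactic account would instead predict a large overlap for the child as well.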

Valian, Solt, and Stewart (2009) re-addressed the issue of ‘determiner’ acquisition by investigating 21 children and corresponding adult speech.

