From constituents to phrases
Identifying constituents is a major step towards describing the structure of sentences. Remember the following examples, which we discussed in Section 7.1:
Being able to identify Zoe, a student and a student with green hair as constituents allows us to state a generalization that we would not be able to state otherwise: all three sentences consist of a constituent, followed by a verb, followed by another constituent. However, if we were to turn this into a general rule about the structure of English sentences, we would predict that the following sentences should also be possible:
This is not the case, though. There seem to be different types of constituents, and in the cafeteria does not seem to be the right kind of constituent to occur before a verb in English.
So, what is the difference between the underlined constituents in (1a-c) and the one in (2)? This may not be easy to tell. All four constituents contain at least one noun ((1c) contains two); in (1a) it is a proper noun, while in the other cases they are common nouns. Three of them contain an article ((1b, c) and (2)), and two of them contain a preposition ((1c) and (2)). So it is not the words they contain that distinguish them, at least not by themselves.
Is it their structure, perhaps? It could be, but not in a straightforward way: all four constituents end in a noun, one of them consists of nothing else, two of them start with a determiner, one starts with a preposition — in other words, the linear order of words in the constituent does not help us figure out the structural difference between (1a-c) and (2).
However, the tests introduced in Section 7.2 do reveal a difference. For example, note that the underlined constituents in (1a-c) can all be replaced by a pronoun like she, but the one in (2) cannot. There, we can only replace the sequence the cafeteria with a pronoun, or the whole sequence in the cafeteria with there:
Similarly, if we were to turn the sentence into a headline and delete all superfluous words, the constituents in (1a-c) would consist of a single noun, while the constituent in (2) would consist of a preposition and a noun:
In other words, in the constituents in (1a-c), the noun seems to be the most important word, whereas in the one in (2) it seems to be the preposition. Additional evidence for the latter is that we can, under some circumstances, even delete the noun and leave behind the preposition:
In linguistics, constituents are labeled according to the word class of the most important word they contain — the one that cannot be deleted, the one that determines by what the constituent can be replaced. This word is referred to as the head of the phrase.
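The labeling principle can be written down as a small lookup. The following is only a minimal sketch: the function name `label_phrase`, the table `PHRASE_NAME` and the representation of a constituent as a list of (word, word class) pairs are our own assumptions for illustration, and the head is assumed to have been identified beforehand with the constituency tests.

```python
# Hypothetical table: word class of the head -> name of the phrase type.
PHRASE_NAME = {
    "noun": "noun phrase",
    "verb": "verb phrase",
    "preposition": "prepositional phrase",
    "adjective": "adjective phrase",
}


def label_phrase(words, head_index):
    """Name a constituent after the word class of its head.

    `words` is a list of (word, word_class) pairs. `head_index` is the
    position of the head, which we assume is already known.
    """
    _, head_class = words[head_index]
    return PHRASE_NAME[head_class]


# The head of "a student" is the noun; the head of "in the cafeteria"
# is the preposition.
print(label_phrase([("a", "article"), ("student", "noun")], 1))
print(label_phrase(
    [("in", "preposition"), ("the", "article"), ("cafeteria", "noun")], 0))
```

The sketch makes one point only: the label depends on the head alone, not on how many other words the constituent contains.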
The sentences we have looked at so far all contain the three most important types of phrases in English: noun phrases (typically abbreviated as NP), prepositional phrases (PP) and verb phrases (VP). Zoe, a student and a student with green hair (as well as green hair and the cafeteria) are noun phrases; in the cafeteria, in the apartment and with green hair are prepositional phrases. In the constituent watched a documentary, we can delete everything except for the verb, so it must be a verb phrase. There is one additional important phrase type that we have not discussed: the adjective phrase (AP). As its name suggests, it is a constituent in which the adjective is the head (the most important word, which is left behind when all other words are deleted). Some examples of adjective phrases are shown in (6):
A notation for constituent structure
In order to distinguish the different types of constituents when analysing the structure of a sentence, we could add labels to the kind of box notation that we used in Section 7.2. Let us also add labels for the individual words, as well as one for the sentence as a whole.
The result could look like this, using abbreviations for the phrase types:
![[S [NP [N Zoe]] [VP [VP [V watched] [NP [ART a] [N documentary]]] [PP [P in] [NP [ART the] [N cafeteria]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/box-zoe-cafeteria-1024x209.png)
Figure 7.3.1: Box diagram of the sentence Zoe watched a documentary in the cafeteria
![[S [NP [N Zoe]] [VP [V watched] [NP [ART a] [N documentary] [PP [P about] [NP [ADJ renewable] [N energy]]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/box-zoe-energy-1024x267.png)
Figure 7.3.2: Box diagram of the sentence Zoe watched a documentary about renewable energy.
You will occasionally see box diagrams similar to this in the linguistic research literature, but more typically, constituent structure is represented by bracketing structures like the following, which correspond to Figures 7.3.1 and 7.3.2:
[S [NP [N Zoe]] [VP [VP [V watched] [NP [ART a] [N documentary]]] [PP [P in] [NP [ART the] [N cafeteria]]]]]

[S [NP [N Zoe]] [VP [V watched] [NP [ART a] [N documentary] [PP [P about] [NP [ADJ renewable] [N energy]]]]]]
Each constituent is enclosed in a pair of square brackets (sometimes, parentheses are used instead), with a label (often subscripted) directly following the opening bracket. For experienced linguists, such bracketing structures are quite easy to read, but they can be confusing if they become too complex, so a more easily readable representation that is widely used is the tree diagram.
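Because every opening bracket has a matching closing bracket, a bracketing structure can be read mechanically. The following is a minimal sketch of how that could be done, not a standard tool: the function names `parse_brackets` and `leaves` and the representation of a constituent as a `(label, children)` tuple are our own assumptions.

```python
def parse_brackets(text):
    """Turn a labeled bracketing into nested (label, children) tuples.

    Words are kept as plain strings; every constituent becomes a tuple of
    its label and the list of its parts.
    """
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def parse_node():
        nonlocal pos
        if tokens[pos] != "[":
            raise ValueError("expected an opening bracket")
        label = tokens[pos + 1]  # the label directly follows the bracket
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(parse_node())  # an embedded constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # step over the closing bracket
        return (label, children)

    tree = parse_node()
    if pos != len(tokens):
        raise ValueError("unexpected material after the last bracket")
    return tree


def leaves(tree):
    """Return the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in leaves(child)]


sentence = parse_brackets(
    "[S [NP [N Zoe]] [VP [VP [V watched] [NP [ART a] [N documentary]]] "
    "[PP [P in] [NP [ART the] [N cafeteria]]]]]"
)
print(sentence[0])                 # S
print(" ".join(leaves(sentence)))  # Zoe watched a documentary in the cafeteria
```

Reading the words back out of the structure returns the original sentence, which shows that the brackets add information about grouping without changing the order of the words.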
In a tree diagram, words that form a constituent are connected to a common node by lines (so-called branches), and the node is labeled using an abbreviation of the word class or phrase type. The tree diagrams corresponding to the box diagrams in Figures 7.3.1 and 7.3.2 are shown in Figures 7.3.3 and 7.3.4.
![[S [NP [N Zoe]] [VP [VP [V watched] [NP [ART a] [N documentary]]] [PP [P in] [NP [ART the] [N cafeteria]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-zoe-cafeteria-1024x686.png)
Figure 7.3.3: Tree diagram of the sentence Zoe watched a documentary in the cafeteria
![[S [NP [N Zoe]] [VP [V watched] [NP [ART a] [N documentary] [PP [P about] [NP [ADJ renewable] [N energy]]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-zoe-energy-1024x916.png)
Figure 7.3.4: Tree diagram of the sentence Zoe watched a documentary about renewable energy.
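The same information can also be written as a nested data structure, with one entry per node. The sketch below is illustrative only: the tuple representation and the function name `outline` are our own assumptions. It encodes the tree in Figure 7.3.4 and prints it as an indented outline, in which each level of indentation corresponds to one step down a branch.

```python
# The tree in Figure 7.3.4 as nested (label, children) tuples.
# Words are plain strings.
tree = ("S", [
    ("NP", [("N", ["Zoe"])]),
    ("VP", [
        ("V", ["watched"]),
        ("NP", [
            ("ART", ["a"]),
            ("N", ["documentary"]),
            ("PP", [
                ("P", ["about"]),
                ("NP", [("ADJ", ["renewable"]), ("N", ["energy"])]),
            ]),
        ]),
    ]),
])


def outline(node, depth=0):
    """Return the lines of an indented outline, one node or word per line."""
    indent = "  " * depth
    if isinstance(node, str):
        return [indent + node]
    label, children = node
    lines = [indent + label]
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(tree)))
```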
Before we continue, an important note: While almost all linguists agree that constituents exist and that they can generally be labeled in the way described here, there are many controversies surrounding details of constituent structure, so the tree diagrams you will see here are not necessarily the same as the tree diagrams you will see elsewhere. For example, some linguists believe that every node in a tree should only have two branches splitting off, not three, as in the noun phrase in Figure 7.3.4. They have their reasons, but constituency tests do not support this idea, so we will not be concerned with it here. Some linguists believe that the article, not the noun, is the head of a phrase like the cafeteria. Again, they have their reasons, but these do not follow from the very straightforward definition of head that we use here, so, again, we will not be concerned with this idea. Finally, some linguists do not like the idea that verb phrases can contain other verb phrases, and so they use the label V’ or V̅ (“V bar”). We can see the advantages of this notation, but we also like to keep things simple, so we will stick with the label VP. Linguists disagree about details like this not because they are confused about constituency or because they don’t know what they are doing, but because syntactic structure is very complex and there is often more than one reasonable way to analyze it. We stick as closely as possible to the principles of constituency analysis outlined in Section 7.2, which should provide you with a good understanding of the general idea.
Regardless of the specific choices in labeling constituents and regardless of whether we represent them as box diagrams, as bracketing structures or as tree diagrams, constituency analysis provides a precise and systematic way of describing the structure of individual sentences. It also serves well as a basis for capturing the general rules for combining words into sentences in a given language. We will explore the first of these uses in the remainder of this section and return to the second in Section 7.4.
Coordination
The following sentences all contain a so-called coordinating conjunction — and, or, but:
If you use the tests introduced in Section 7.2, you will find that in each case, the conjunction and the two constituents to the left and to the right of it form a larger constituent. Here are some examples, but try to apply the other tests, too:
Inside the larger constituent, the two coordinated constituents no longer function as fully independent constituents: they cannot individually be deleted, treated as fragments or moved, and they cannot always be replaced. Again, here are some examples, but try to apply the other tests, too:
The combined constituent has the same phrase type as the two constituents it contains. Figure 7.3.5 shows the tree diagrams for the coordinated structures in (7a-d).
![[NP [NP [PDET Aylin's] [N tapas]] [CONJ and] [NP [AP [A renewable]] [N energy]]]
[PP [PP [P at] [NP [N home]]] [CONJ and] [PP [P in] [NP [ART the] [N cafeteria]]]]
[VP [VP [V watches] [NP [N documentaries]]] [CONJ or] [VP [V sleeps]]]
[AP [AP [ADJ fascinating]] [CONJ and] [AP [ADV very] [ADJ long]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/trees-coordinated-phrases-1024x836.png)
Figure 7.3.5: Tree diagrams of coordinated NPs, PPs, VPs and APs.
Coordinating conjunctions have an interesting property: they can only coordinate constituents of the same type, such as two noun phrases or two prepositional phrases. We can therefore use coordination to determine the phrase type of a constituent by testing whether it can be coordinated with a constituent whose type we already know. This will be useful in Section 7.4.
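The same-type restriction and the labeling of the combined constituent can be stated as a short procedure. This is a minimal sketch under our own assumptions: the function name `coordinate` and the `(label, children)` tuples are illustrative, and real coordination is of course judged by speakers, not by comparing labels.

```python
def coordinate(left, conjunction, right):
    """Combine two constituents of the same type into a larger one.

    Constituents are (label, children) tuples. The result carries the same
    label as its two parts. A ValueError is raised if the labels differ.
    """
    left_label, right_label = left[0], right[0]
    if left_label != right_label:
        raise ValueError(
            f"cannot coordinate {left_label} with {right_label}")
    return (left_label, [left, ("CONJ", [conjunction]), right])


at_home = ("PP", [("P", ["at"]), ("NP", [("N", ["home"])])])
in_the_cafeteria = ("PP", [
    ("P", ["in"]),
    ("NP", [("ART", ["the"]), ("N", ["cafeteria"])]),
])
sleeps = ("VP", [("V", ["sleeps"])])

# Two PPs combine into a larger PP.
print(coordinate(at_home, "and", in_the_cafeteria)[0])  # PP

# A PP and a VP do not combine.
try:
    coordinate(at_home, "and", sleeps)
except ValueError as error:
    print(error)  # cannot coordinate PP with VP
```

Used as a test, the procedure runs the other way round: if a constituent of unknown type can be coordinated with a known PP, we conclude that it is a PP, too.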
Structural ambiguity
You are already familiar with the concept of ambiguous sentences — like the following one from Chapter 6:
The ambiguity is due to the fact that the word papers is polysemous, with (at least) two meanings — it could refer to Zoe’s passport and visa, or to the newspapers that she brought to read on her journey.
But what about the following sentence, which is also ambiguous:
It could refer (a) to a situation where Zoe and (former US vice president) Al Gore watched a documentary together, or (b) to a situation where Zoe watched a documentary starring Al Gore (for example, An Inconvenient Truth from 2006).
The ambiguity is not due to any polysemies in the words watch, documentary or with. Instead, it is due to the fact that there are two possible syntactic structures that the sentence could have — this is referred to as structural ambiguity. The PP with Al Gore could be part of a verb phrase together with watched a documentary — in which case, Zoe and Al Gore would be watching a documentary together. The corresponding structure is shown in Figure 7.3.6.
![[S [NP [N Zoe]] [VP [VP [V watched] [NP [ART a] [N documentary]]] [PP [P with] [NP [N Al Gore]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-zoe-docgore-1024x748.png)
Figure 7.3.6: One possible tree diagram of the sentence Zoe watched a documentary with Al Gore.
Or the PP could be part of a noun phrase, together with a documentary — in which case, Al Gore would appear in the documentary, while Zoe would be watching it alone. The corresponding structure is shown in Figure 7.3.7.
![[S [NP [N Zoe]] [VP [V watched] [NP [ART a] [N documentary] [PP [P with] [NP [N Al Gore]]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-zoegore-doc-1.png)
Figure 7.3.7: Another possible tree diagram of the sentence Zoe watched a documentary with Al Gore.
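The difference between the two analyses can be made explicit by comparing them as data. The sketch below is illustrative only: the tuple representation and the function names `leaves` and `parent_label` are our own assumptions, and the second tree follows the pattern of Figure 7.3.4. Both trees contain the same words in the same order; they differ only in which node directly contains the PP.

```python
P_WITH_GORE = ("PP", [("P", ["with"]), ("NP", [("N", ["Al Gore"])])])

# PP inside the verb phrase: Zoe and Al Gore watch together.
verb_attachment = ("S", [
    ("NP", [("N", ["Zoe"])]),
    ("VP", [
        ("VP", [("V", ["watched"]),
                ("NP", [("ART", ["a"]), ("N", ["documentary"])])]),
        P_WITH_GORE,
    ]),
])

# PP inside the noun phrase: Al Gore appears in the documentary.
noun_attachment = ("S", [
    ("NP", [("N", ["Zoe"])]),
    ("VP", [
        ("V", ["watched"]),
        ("NP", [("ART", ["a"]), ("N", ["documentary"]), P_WITH_GORE]),
    ]),
])


def leaves(tree):
    """Return the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


def parent_label(tree, target, parent=None):
    """Return the label of the node directly containing the first
    node labeled `target`, or None if there is no such node."""
    if isinstance(tree, str):
        return None
    label, children = tree
    if label == target:
        return parent
    for child in children:
        found = parent_label(child, target, label)
        if found is not None:
            return found
    return None


print(leaves(verb_attachment) == leaves(noun_attachment))  # True
print(parent_label(verb_attachment, "PP"))                 # VP
print(parent_label(noun_attachment, "PP"))                 # NP
```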
Apply the replacement, deletion and fragment tests to the two tree structures just given, to make sure you understand why they have these two different interpretations. Then, determine which of the two tree structures each of the following sentences has (some may be ambiguous and have both).
(i) Aylin watered the tree with a garden hose.
(ii) Aylin watered the tree with the yellow leaves.
(iii) Aylin surprised the student with green hair.
(iv) Aylin knows a student with green hair.
(v) Aylin rolled the sushi with avocado.
(vi) Aylin rolled the sushi with a rolling mat.
Ambiguities that arise from different positions of a prepositional phrase in a tree are so common that they have their own name: PP-attachment ambiguities. But there are many other types of structural ambiguity, typically involving more complex sentences than we can describe at this point.
Let us look at another example, involving adjective phrases:
There are two possible readings of this sentence: either Aylin’s term paper has enough arguments, but they need to be more convincing, or Aylin’s term paper has convincing arguments, but there need to be more of them. Again, the ambiguity is not due to the polysemy of one of the words used, but to the fact that there are two possible structures: more could be a determiner in the noun phrase more convincing arguments, or it could be an adverb modifying convincing in the adjective phrase more convincing. The two structures are shown in Figures 7.3.8 and 7.3.9.
![[S [NP [PDET Aylin's] [N term paper]] [VP [V needs] [NP [DET more] [AP [ADJ convincing]] [N arguments]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-morearguments-1024x747.png)
Figure 7.3.8. A possible analysis of the sentence Aylin’s term paper needs more convincing arguments.
![[S [NP [PDET Aylin's] [N term paper]] [VP [V needs] [NP [AP [ADV more] [ADJ convincing]] [N arguments]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-moreconvincing-1024x801.png)
Figure 7.3.9. Another possible analysis of the sentence Aylin’s term paper needs more convincing arguments.
Finally, coordinating conjunctions are also frequently involved in structural ambiguity. Consider the following sentence:
It tells us that Aylin likes not telenovelas in general, but Chilean telenovelas in particular. But what about the novels? Here, two interpretations are available: (a) that Aylin likes novels in general, or (b) that she likes Chilean novels in particular. This is due to the fact that the adjective Chilean can occupy two different positions in the structure of the sentence.
First, it could form an NP with the noun telenovelas, and this NP would be coordinated with a second NP containing the noun novels — in this case, the novels are not specifically Chilean. The corresponding structure is shown in Figure 7.3.10.
![[S [NP [N Aylin]] [VP [V likes] [NP [NP [AP [ADJ Chilean]] [N telenovelas]] [CONJ and] [NP [N novels]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-allnovels-1024x1010.png)
Figure 7.3.10. A possible analysis of the sentence Aylin likes Chilean telenovelas and novels
Second, the nouns telenovelas and novels could be coordinated, and the adjective Chilean could form an NP with this coordinated noun — in this case, both the telenovelas and the novels would be specifically Chilean. The corresponding structure is shown in Figure 7.3.11.
![[S [NP [N Aylin]] [VP [V likes] [NP [AP [ADJ Chilean]] [N [N telenovelas] [CONJ and] [N novels]]]]]](https://linguistica.info/b/lei/wp-content/uploads/2025/01/tree-chileannovels-1024x846.png)
Figure 7.3.11. Another possible analysis of the sentence Aylin likes Chilean telenovelas and novels
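The two analyses differ in what the adjective phrase combines with, and this can be read off the structures directly. The following sketch is illustrative only: the tuple representation and the function names `nouns` and `modified_nouns` are our own assumptions, and "the nouns in the sister constituents of the AP" is a simplification of what it means for an adjective to modify a noun.

```python
# The object noun phrases of Figures 7.3.10 and 7.3.11.
chilean = ("AP", [("ADJ", ["Chilean"])])

coordinated_nps = ("NP", [
    ("NP", [chilean, ("N", ["telenovelas"])]),
    ("CONJ", ["and"]),
    ("NP", [("N", ["novels"])]),
])

coordinated_nouns = ("NP", [
    chilean,
    ("N", [("N", ["telenovelas"]), ("CONJ", ["and"]), ("N", ["novels"])]),
])


def nouns(tree):
    """Return all nouns in a tree, i.e. words directly under an N node."""
    if isinstance(tree, str):
        return []
    label, children = tree
    if label == "N" and all(isinstance(c, str) for c in children):
        return [" ".join(children)]
    return [noun for child in children for noun in nouns(child)]


def modified_nouns(tree):
    """Return the nouns in the sister constituents of the first AP."""
    if isinstance(tree, str):
        return None
    _, children = tree
    for i, child in enumerate(children):
        if not isinstance(child, str) and child[0] == "AP":
            sisters = children[:i] + children[i + 1:]
            return [noun for s in sisters for noun in nouns(s)]
    for child in children:
        found = modified_nouns(child)
        if found is not None:
            return found
    return None


print(modified_nouns(coordinated_nps))    # ['telenovelas']
print(modified_nouns(coordinated_nouns))  # ['telenovelas', 'novels']
```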
Determine the structural ambiguities in each of the following sentences and draw tree diagrams corresponding to the two interpretations:
(i) Zoe ate vegan pizza and sushi.
(ii) Zoe ate pizza and sushi with tuna.
(iii) Zoe watched documentaries and slept in the cafeteria.
(iv) The documentary was very long and interesting.
CC-BY-NC-SA 4.0, Written by Anatol Stefanowitsch