How do words get in the dictionary?

“How does a word get into the dictionary?” That's one of the questions Merriam-Webster editors are most often asked.

The answer is simple: usage.

To decide which words to include in the dictionary and to determine what they mean, Merriam-Webster editors study the language to determine which words people use most often and how they use them.

Each day most Merriam-Webster editors devote an hour or two to reading books, newspapers, magazines, and electronic publications, in fact a cross-section of all kinds of published materials; in our office this activity is called “reading and marking.”

The editors are looking for new words, new meanings of existing words, and evidence of variant spellings or inflected forms; in short, anything that might help in deciding if a word belongs in the dictionary, understanding what it means, and determining typical usage.

Any word of interest is marked, along with surrounding context that offers insight into its form and use.

The marked passages are then input into a computer system and stored both in machine-readable form and on 3″ x 5″ slips of paper to create citations.

Each citation has the following elements:

  1. the word itself
  2. an example of the word used in context
  3. bibliographic information about the source from which the word and example were taken
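
For readers who think in data structures, the three elements above could be modeled as a simple record. This is only an illustrative sketch: the `Citation` class, its field names, and the sample values are hypothetical and do not describe Merriam-Webster's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One citation slip (hypothetical model, not Merriam-Webster's real schema)."""
    word: str     # 1. the word itself
    context: str  # 2. an example of the word used in context
    source: str   # 3. bibliographic information about the source


# The sentence and the source below are invented for illustration only.
example = Citation(
    word="dad joke",
    context="He answered with a dad joke that made everyone groan.",
    source="Example Gazette, 12 March 2019, p. 4",
)

print(f"{example.word}: {example.context} ({example.source})")
```

Making the record immutable (`frozen=True`) reflects the idea that a citation is a fixed piece of evidence: once marked and stored, it is read, not edited.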

Merriam-Webster's citation files, which were begun in the 1880s, now contain nearly 15 million examples of words used in context and cover all aspects of the English vocabulary. Citations are also available to editors in a searchable text database (linguists call it a corpus) that includes 50,000,000 words drawn from a great variety of sources.

How does a word make the jump from the citation file to the dictionary?

The process begins with dictionary editors reviewing groups of citations. Definers start by looking at citations covering a relatively small segment of the alphabet — for example gri- to gro- — along with the entries from the dictionary being reedited that are included within that alphabetical section.

It is the definer's job to determine which existing entries can remain essentially unchanged, which entries need to be revised, which entries can be dropped, and which new entries should be added.

In each case, the definer decides on the best course of action by reading through the citations and using the evidence in them to adjust entries or create new ones.

Before a new word can be added to the dictionary, it must have enough citations to show that it is widely used.

But having a lot of citations is not enough; in fact, a large number of citations might even make a word more difficult to define, because many citations show too little about the meaning of a word to be helpful.

A word may be rejected for entry into a general dictionary if all of its citations come from a single source or if they are all from highly specialized publications that reflect the jargon of experts within a single field.

To be included in a Merriam-Webster dictionary, a word must be used in a substantial number of citations that come from a wide range of publications over a considerable period of time. Specifically, the word must have enough citations to allow accurate judgments about its establishment, currency, and meaning.
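
The criteria above (many citations, a wide range of publications, a considerable period of time, and not just one source or one specialist field) can be sketched as a rough check. Everything here is an assumption made for illustration: the `Citation` fields, the `looks_established` function, and especially the numeric thresholds, since the text gives no numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record; the field names are illustrative only."""
    word: str
    source: str  # publication the example was taken from
    year: int    # year the source was published
    field: str   # "general", or a specialist field such as "medicine"


def looks_established(citations, min_count=20, min_sources=5, min_span_years=5):
    """Rough screen mirroring the criteria in the text.

    All three thresholds are invented defaults, not Merriam-Webster's rules.
    """
    # Enough citations to suggest the word is widely used.
    if len(citations) < min_count:
        return False
    # Citations must come from a range of publications, not a single source.
    if len({c.source for c in citations}) < min_sources:
        return False
    # Reject words attested only in the jargon of one specialist field.
    fields = {c.field for c in citations}
    if len(fields) == 1 and "general" not in fields:
        return False
    # Citations must span a considerable period of time.
    years = [c.year for c in citations]
    return max(years) - min(years) >= min_span_years


# Invented sample data: 24 citations across 6 publications and 8 years.
wide = [Citation("blog", f"Paper {i % 6}", 2000 + i % 8, "general") for i in range(24)]
# Same count, but every citation comes from one publication.
narrow = [Citation("blog", "One Zine", 2000 + i % 8, "general") for i in range(24)]

print(looks_established(wide), looks_established(narrow))
```

A check like this could only ever be a first filter. As the text notes, a large pile of citations may still say too little about what a word means, and that judgment stays with the definer.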

How Are Words Added to the Dictionary?

Last month, Merriam-Webster announced that 530 new words had been added to the dictionary. Among them are new meanings of existing words, such as “they” as a nonbinary pronoun, and new entries such as “free solo” (as a climber, I find this one particularly exciting), “Bechdel Test,” and “dad joke.”

How, exactly, are these new words added? What criteria does a word have to meet in order to find its way into the Book of English Words?

When Webster’s Dictionary was first published in 1828 to standardize American English, it contained 70,000 words. Its best-known predecessor, the dictionary put together by Samuel Johnson in 1755, contained 40,000 words.

Today, a standard English-language dictionary such as the Oxford English Dictionary contains over half a million words. That’s…quite a jump.

Did the number of words in the English language really grow roughly sevenfold? Yes and no. There has never been an exact count. Language is fluid and constantly changing, and the gathering of words became more robust, especially once the process was digitized.

The point of a dictionary, after all, is to give the public a handy guidebook to find the meaning of words they come across, and learn how to use them in context. (Readers do this with books, too; we just learn the wrong pronunciations for new words we have never heard out loud.)

I had a lot of questions about this process, so I reached out to Merriam-Webster to see if they could help me understand how far a word has to travel before the dictionary gives it a real home.

Peter Sokolowski, a lexicographer (noun; a person who compiles dictionaries) at Merriam-Webster, chatted with me about the journey of a word, the research process for identifying new words, and how it is perfectly fine that words such as “wonderful” and “literally” gain a new meaning, no really.

How are words added to the Oxford English Dictionary?

Words come into the English language in all manner of ways. The Oxford English Dictionary’s mission is to record all of these word stories, capturing their development as they continue to unfold.

For a word to be considered for inclusion in the OED, it must first be added to the dictionary’s ‘watch list’ database. Contributions to this watch list come from an enormous variety of sources – from the OED’s own reading programmes to crowdsourcing appeals with the general public, and increasingly from automated monitoring and analysis of massive databases of language in use.

The OED’s editors consider thousands of word suggestions from these sources every year, reviewing each and every one. Words that have not yet accumulated enough evidence for permanent record in the OED remain on the watch list for continued monitoring, while suggestions for words with sufficiently sustained and widespread use are assigned to an editor.

Editors begin by reviewing the information gathered so far for their assigned word before embarking on their own research to trace the word’s development. This research might lead them to search newspaper archives, online forums, academic studies, magazines, law tracts, recipe books, or social media for published evidence of the word. If a key example is available in a library or archive beyond digital access, editors also have the opportunity to enlist the help of the OED’s network of researchers, who are based at institutions around the world, to track down the elusive example.

Once an editor has pieced together a detailed picture of the word, they begin to draft the dictionary entry to record it in the OED. For words without an existing OED entry, this begins with the word itself, called the headword, and includes its pronunciation, forms, etymology, definition, example quotations, and any other senses or associated phrases it may have.

New senses of existing words are placed in their chronological position in the entry, along with the definition and example quotations.

This work involves several specialist teams at the OED, such as the pronunciation editors, who create the audio files and transcriptions that reflect a word’s most common pronunciations, and the bibliographers, who review the quotations to ensure that sources are cited accurately.

Once the dictionary entry has been signed off by each team, it is passed on to the finalization team, which includes the dictionary’s Chief and Deputy Chief Editors, for the final stamp of approval before it takes its place in the OED.

Completed entries are published in quarterly updates on OED Online, and each update is accompanied by release notes looking at key themes and notable new additions from the latest crop of words.

Find out more about the latest updates to the OED, and contribute your suggestions for words you’ve seen or heard to be included in the world’s definitive record of the English language.

How we create language content | Oxford Languages

All of Oxford Languages’ content aims to describe, rather than prescribe, the way languages are used by people around the world. We take an evidence-based approach to language content creation, looking at real examples of the ways words are used in context to provide an accurate picture of a language.

  • To gather this evidence, we have a world-class language research programme that utilizes big data technologies to continuously monitor language development in real time.
  • Our corpora – massive collections of spoken and written language data – track and record the very latest language developments across an enormous variety of publications, covering everything from specialist journals to newspapers to social media posts.
  • We have large corpora in English, Arabic, Indonesian, and many other languages in development, enabling the lexicographers and language technologists who create our dictionaries, datasets, and language resources to identify new and emerging words in context and spot trends and patterns in usage, spelling, regional varieties, and more.

Our expert team of lexicographers source all of our descriptive sentence examples from our vast language databases to provide accurate and meaningful descriptions of words in use. The team analyses the corpus data to select examples that support a word in the correct grammatical and semantic context without distracting from the essential information the definition conveys.

We do our best to eliminate sentence examples that repeat factually incorrect, prejudiced, or offensive statements from the source and are always grateful when readers inform us of cases that do not meet our rigorous quality standards – whether due to human error or changing cultural sensitivities – so that we can review and update our content.

All of our content is the result of continuous research and review as we seek to document and describe new language developments as they unfold, providing the world’s most trusted language content.

How new words are born

As dictionary publishers never tire of reminding us, our language is growing. Not content with the million or so words they already have at their disposal, English speakers are adding new ones at the rate of around 1,000 a year. Recent dictionary debutants include blog, grok, crowdfunding, hackathon, airball, e-marketing, sudoku, twerk and Brexit.

But these represent just a sliver of the tip of the iceberg. According to Global Language Monitor, around 5,400 new words are created every year; it’s only the 1,000 or so deemed to be in sufficiently widespread use that make it into print. Who invents these words, and how? What rules govern their formation? And what determines whether they catch on?

Shakespeare is often held up as a master neologist, because at least 500 words (including critic, swagger, lonely and hint) first appear in his works – but we have no way of knowing whether he personally invented them or was just transcribing things he’d picked up elsewhere.

It’s generally agreed that the most prolific minter of words was John Milton, who gave us 630 coinages, including lovelorn, fragrance and pandemonium.

Geoffrey Chaucer (universe, approach), Ben Jonson (rant, petulant), John Donne (self-preservation, valediction) and Sir Thomas More (atonement, anticipate) lag behind.

It should come as no great surprise that writers are behind many of our lexical innovations. But the fact is, we have no idea who to credit for most of our lexicon.

If our knowledge of the who is limited, we have a rather fuller understanding of the how. All new words are created by one of 13 mechanisms:

1 Derivation

The commonest method of creating a new word is to add a prefix or suffix to an existing one. Hence realisation (1610s), democratise (1798), detonator (1822), preteen (1926), hyperlink (1987) and monogamish (2011).

2 Back formation

The inverse of the above: the creation of a new root word by the removal of a phantom affix. The noun sleaze, for example, was back-formed from “sleazy” in about 1967.

A similar process brought about pea, liaise, enthuse, aggress and donate.

Some linguists propose a separate category for lexicalisation, the turning of an affix into a word (ism, ology, teen), but it’s really just a type of back formation.

3 Compounding

The juxtaposition of two existing words. Typically, compound words begin life as separate entities, then get hitched with a hyphen, and eventually become a single unit.

It’s mostly nouns that are formed this way (fiddlestick, claptrap, carbon dating, bailout), but words from other classes can be smooshed together too: into (preposition), nobody (pronoun), daydream (verb), awe-inspiring, environmentally friendly (adjectives).

4 Repurposing

Taking a word from one context and applying it to another. Thus the crane, meaning lifting machine, got its name from the long-necked bird, and the computer mouse was named after the long-tailed animal.

5 Conversion

Taking a word from one word class and transplanting it to another.

The word giant was for a long time just a noun, meaning a creature of enormous size, until the early 15th century, when people began using it as an adjective.

Thanks to social media, a similar fate has recently befallen friend, which can now serve as a verb as well as a noun (“Why didn’t you friend me?”).

6 Eponyms

Words named after a person or place.

You may recognise Alzheimer’s, atlas, cheddar, alsatian, diesel, sandwich, mentor, svengali, wellington and boycott as eponyms – but did you know that gun, dunce, bigot, bugger, cretin, currant, hooligan, marmalade, maudlin, maverick, panic, silhouette, syphilis, tawdry, doggerel, doily and sideburns are too? (The issue of whether, and for how long, to retain the capital letters on eponyms is a thorny one.)

7 Abbreviations

An increasingly popular method. There are three main subtypes: clippings, acronyms and initialisms.

Some words that you might not have known started out longer are pram (perambulator), taxi/cab (both from taximeter cabriolet), mob (mobile vulgus), goodbye (God be with you), berk (Berkshire Hunt), rifle (rifled pistol), canter (Canterbury gallop), curio (curiosity), van (caravan), sport (disport), wig (periwig), laser (light amplification by stimulated emission of radiation), scuba (self-contained underwater breathing apparatus), and trump (triumph; there is also another, unrelated sense of trump: to fabricate, as in “trumped-up charge”).

8 Loanwords

Foreign speakers often complain that their language is being overrun with borrowings from English. But the fact is, English itself is a voracious word thief; linguist David Crystal reckons it’s half-inched words from at least 350 languages.

Most words are borrowed from French, Latin and Greek; some of the more exotic provenances are Flemish (hunk), Romany (cushty), Portuguese (fetish), Nahuatl (tomato – via Spanish), Tahitian (tattoo), Russian (mammoth), Mayan (shark), Gaelic (slogan), Japanese (tycoon), West Turkic (horde), Walloon (rabbit) and Polynesian (taboo).

Calques (flea market, brainwashing, loan word) are translations of borrowings.

9 Onomatopoeia

The creation of a word by imitation of the sound it is supposed to make. Plop, ow, barf, cuckoo, bunch, bump and midge all originated this way.

10 Reduplication

How Do Words Get Added to the Dictionary?

New words, phrases, and definitions are added to the Oxford English Dictionary four times a year, and this month’s revision includes over 1,200 changes and updates, from a new “sense” of the word thing to the “well-established, but newly-prominent usage of woke,” as Head of U.S. Dictionaries Katherine Connor Martin writes on the OED’s blog.

Martin, one of the people who decide which new words and “senses” get added to the OED, agreed to answer a few questions for us about how that process works, and whether dictionary rivalries exist. (We’re looking at you, Merriam-Webster.)

How did you come to be Head of U.S. Dictionaries for OED?

There are dozens of editors who work as lexicographers on the OED, but only a few in the US office.

I was lucky enough to become one of them 14 years ago, when I saw an advertisement on the Oxford University Press website and applied for the job — not a very romantic origin story! My background was in history, not linguistics, but that’s not unusual. OED editors come from many different academic backgrounds.

How do you decide which words and definitions to add throughout the year?

We have a large database of potential new words and senses, compiled from a lot of different sources: personal observation, suggestions from the public, our reading program, and through computational analysis.

In order to be entered into the OED, a word or definition must satisfy certain evidentiary requirements. For example, there must be widespread evidence in a variety of sources, attested over a significant period of time.

The OED is a historical dictionary which aims to cover the full thousand-year history of English, so it tends to wait a bit before adding neologisms, to ensure that they have staying power.

What’s your favorite new word or “sense” from this latest addition?

“How does a word get into the dictionary?” This is one of the questions Merriam-Webster editors are most often asked.

The answer is simple: usage.

Tracking Word Usage

To decide which words to include in the dictionary and to determine what they mean, Merriam-Webster editors study the language as it's used. They carefully monitor which words people use most often and how they use them.

Each day most Merriam-Webster editors devote an hour or two to reading a cross section of published material, including books, newspapers, magazines, and electronic publications; in our office this activity is called “reading and marking.”

The editors scour the texts in search of new words, new usages of existing words, variant spellings, and inflected forms; in short, anything that might help in deciding if a word belongs in the dictionary, understanding what it means, and determining typical usage.

Any word of interest is marked, along with surrounding context that offers insight into its form and use.

Citations

The marked passages are then input into a computer system and stored both in machine-readable form and on 3″ x 5″ slips of paper to create citations.

Each citation has the following elements:

  1. the word itself
  2. an example of the word used in context
  3. bibliographic information about the source from which the word and example were taken

Merriam-Webster's citation files, which were begun in the 1880s, now contain 15.7 million examples of words used in context and cover all aspects of the English vocabulary. Citations are also available to editors in a searchable text database (linguists call it a corpus) that includes more than 70 million words drawn from a great variety of sources.

From Citation to Entry

How does a word make the jump from the citation file to the dictionary?

The process begins with dictionary editors reviewing groups of citations. Definers start by looking at citations covering a relatively small segment of the alphabet – for example gri- to gro- – along with the entries from the dictionary being reedited that are included within that alphabetical section.

It is the definer's job to determine which existing entries can remain essentially unchanged, which entries need to be revised, which entries can be dropped, and which new entries should be added.

In each case, the definer decides on the best course of action by reading through the citations and using the evidence in them to adjust entries or create new ones.

Before a new word can be added to the dictionary, it must have enough citations to show that it is widely used.

But having a lot of citations is not enough; in fact, a large number of citations might even make a word more difficult to define, because many citations show too little about the meaning of a word to be helpful.

A word may be rejected for entry into a general dictionary if all of its citations come from a single source or if they are all from highly specialized publications that reflect the jargon of experts within a single field.

To be included in a Merriam-Webster dictionary, a word must be used in a substantial number of citations that come from a wide range of publications over a considerable period of time. Specifically, the word must have enough citations to allow accurate judgments about its establishment, currency, and meaning.
