# How to think about astronomically large numbers

Last Updated on August 8, 2019

We have an intuition that more observations are better. This is the same intuition behind the idea that if we collect more data, our sample of data will be more representative of the problem domain.

There is a theorem in statistics and probability that supports this intuition that is a pillar of both of these fields and has important implications in applied machine learning. The name of this theorem is the law of large numbers.

In this tutorial, you will discover the law of large numbers and why it is important in applied machine learning.

After completing this tutorial, you will know:

• The law of large numbers supports the intuition that the sample becomes more representative of the population as its size is increased.
• How to develop a small example in Python to demonstrate the decrease in error from the increase in sample size.
• The law of large numbers is critical for understanding the selection of training datasets, test datasets, and in the evaluation of model skill in machine learning.


Let’s get started.

A Gentle Introduction to the Law of Large Numbers in Machine Learning. Photo by Raheel Shahid, some rights reserved.

## Tutorial Overview

This tutorial is divided into 3 parts; they are:

1. Law of Large Numbers
2. Worked Example
3. Implications in Machine Learning

### Law of Large Numbers

The law of large numbers is a theorem from probability and statistics that states that the more times an experiment is repeated, the better the average of the results will approximate the true or expected underlying result.

The law of large numbers explains why casinos always make money in the long run.

— Page 79, Naked Statistics: Stripping the Dread from the Data, 2014.

We can think of a trial of an experiment as one observation. Repeating the experiment independently yields multiple trials and, in turn, multiple observations. All sample observations for an experiment are drawn from an idealized population of observations.

• Observation: Result from one trial of an experiment.
• Sample: Group of results gathered from separate independent trials.
• Population: Space of all possible observations that could be seen from a trial.

Using these terms from statistics, we can say that as the size of the sample increases, the mean value of the sample will better approximate the mean or expected value in the population. As the sample size goes to infinity, the sample mean will converge to the population mean.

… a crowning achievement in probability, the law of large numbers. This theorem says that the mean of a large sample is close to the mean of the distribution.

— Page 76, All of Statistics: A Concise Course in Statistical Inference, 2004.

This is an important theoretical finding for statistics and probability, as well as for applied machine learning.
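The convergence described above can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own worked example: the Gaussian population with mean 50 and standard deviation 5, the sample sizes, and the helper names `sample_mean_error` and `average_error` are all assumptions chosen for the demonstration.

```python
import random

def sample_mean_error(sample_size, population_mean=50.0, population_std=5.0, seed=1):
    """Absolute error between one sample mean and the known population mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    sample = [rng.gauss(population_mean, population_std) for _ in range(sample_size)]
    return abs(sum(sample) / sample_size - population_mean)

def average_error(sample_size, repeats=100):
    """Average the error over several independent samples of the same size."""
    total = sum(sample_mean_error(sample_size, seed=s) for s in range(repeats))
    return total / repeats

# Hypothetical sample sizes; the error should shrink as the size grows.
for size in (10, 100, 1_000, 10_000):
    print(f"n={size:>6}: mean absolute error = {average_error(size):.4f}")
```

Averaging over several samples per size smooths out the luck of any single draw, so the downward trend in error is easier to see.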

### Independent and Identically Distributed

It is important to be clear that the observations in the sample must be independent.

This means that each trial is run in an identical manner and does not depend on the results of any other trial. This is often reasonable and easy to achieve in computers, although it can be difficult elsewhere (e.g. how do you achieve identically random rolls of a die?).

In statistics, this expectation is called “independent and identically distributed,” or IID, iid, or i.i.d. for short. This is to ensure that the samples are indeed drawn from the same underlying population distribution.

### Regression to the Mean

The law of large numbers helps us understand why we cannot trust a single observation from an experiment in isolation.

## How to Understand Extreme Numbers

We live among vast orders of magnitude, but they don’t have to be off-putting. Photo Illustration by NASA Goddard Space Flight Center / Flickr

The late statistics wizard Hans Rosling, who died in February of 2017 at age 68, brought at least 10 toilet paper rolls to some of his beloved presentations. He would stack them into a tower on a table, each roll representing one billion people.

In a 2012 talk at the Skoll World Forum, he used the rolls to show how, as the number of children in the world—2 billion—holds steady, the global population will rise from 7 billion to the (also indefinitely stable) figure of about 10 billion.

“We are debating peak oil,” he remarked, “but we know that we have reached peak child.”

With his whimsical props and other colorful visualizations, Rosling was renowned as a translator between large, almost unfathomable numbers and the language of everyday experience.

He understood that most of us need visualizations or analogies to mentally work with millions, billions, and other big powers of 10 that help define our world. They’re important for decisions that affect daily life, such as how money gets invested and which government policies you support.

But just how well—or poorly—do people understand quantities in powers of 10? And how can we better equip ourselves to navigate them?

In thinking about these questions, I started reading about a niche branch of science called numerical cognition. Researchers in this field study not just how we think, but how we think about, and with, numbers.

Not a lot of research has been done on how our minds perceive and comprehend large orders of magnitude—big differences between the size of, say, a cell and our sun.

But one academic outwardly passionate about it is David Landy, assistant professor of psychological and brain sciences at Indiana University in Bloomington.

Landy has done several studies on how well people understand very large numbers—namely, the millions, billions, and trillions—in relation to each other on a number line.

“To me, the impressive thing is that people succeed at all.”

## How to Think About Astronomically Large Numbers

But there are ways to make sense of these astronomically large numbers and build an intuition for just how big they really are. What's the best way to do it?

That's exactly what we'll be talking about today.

### How to Think About Millions

In my opinion, the best way to develop intuition for large numbers is to come up with simple visualizations that help put things in perspective. One common way to do this is to think about stacks of paper money. A typical \$1 bill has a thickness of about 0.004 inches—which is also the approximate thickness of a normal sheet of printer paper.

If we stack 1,000 dollar bills, we'll have a pile of money that's about 1,000 x 0.004 inches = 4 inches high.

Let's use this as our reference point to think about just how large the number 1,000,000 is. Namely, \$1,000,000 worth of \$1 bills would make a stack of money about 1,000,000 x 0.004 inches = 4,000 inches high. Which turns out to be just a little more than the length of a football field!

1,000,000 is to 1 as the length of a football field is to the thickness of a sheet of paper.

So 1,000,000 is to 1,000 as the length of a football field is to the approximate length of a credit card. And 1,000,000 is to 1 as the length of a football field is to the thickness of a sheet of paper. Which, for me at least, is a much more tangible way to think about what “millions” really means.
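The stack-of-bills arithmetic is easy to reproduce. The sketch below assumes the article's round figure of 0.004 inches per \$1 bill; the function name `stack_height_inches` is made up for the illustration, and the same calculation carries on to a billion bills.

```python
BILL_THICKNESS_IN = 0.004  # inches per bill, the article's round figure
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280

def stack_height_inches(bill_count):
    """Height of a stack of bills, in inches."""
    return bill_count * BILL_THICKNESS_IN

for count in (1_000, 1_000_000, 1_000_000_000):
    inches = stack_height_inches(count)
    feet = inches / INCHES_PER_FOOT
    miles = feet / FEET_PER_MILE
    print(f"{count:>13,} bills: {inches:>11,.0f} in = {feet:>11,.1f} ft = {miles:>6.2f} mi")
```

A million bills comes out at roughly 333 feet, which is where the football field comparison comes from.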

### How to Think About Billions

How about billions? If we play the same game with stacks of dollar bills we find that one billion \$1 bills makes a pile of money that's 1,000,000,000 x 0.004 inches = 4,000,000 inches high.

How high is that? Pretty high—about 60 miles.

For comparison, that's approximately the thickness of the Earth's atmosphere! Which means that 1,000,000,000 is to 1,000,000 as the thickness of Earth's atmosphere is to the length of a football field.

But let's not just play this dollar bill game, let's come up with another visualization that you can use to think about the magnitudes of these numbers. In particular, let's think about time.

How long is one billion seconds? It's just under 32 years! Or, to stick with our “astronomical” theme, it's about the same amount of time that Saturn takes to complete one orbit around the Sun.

For comparison, one million seconds is about a measly 11.5 days and one thousand seconds is about 16.7 minutes. No matter what comparison you use, the difference between thousands, millions, and billions is huge!
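To check the time comparisons, here is a small conversion sketch. It assumes a 365.25-day year, which is a convention chosen here rather than something the article specifies, and the helper name `seconds_in_units` is hypothetical.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumed year length

def seconds_in_units(seconds):
    """Express a duration in minutes, days, and years."""
    return {
        "minutes": seconds / SECONDS_PER_MINUTE,
        "days": seconds / SECONDS_PER_DAY,
        "years": seconds / SECONDS_PER_YEAR,
    }

for label, seconds in (("thousand", 1e3), ("million", 1e6), ("billion", 1e9)):
    u = seconds_in_units(seconds)
    print(f"one {label} seconds: {u['minutes']:,.1f} min, "
          f"{u['days']:,.2f} days, {u['years']:.2f} years")
```

Each step up multiplies the duration by a thousand: minutes become days, and days become decades.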

## Large numbers


Large numbers are numbers that are significantly larger than those ordinarily used in everyday life, for instance in simple counting or in monetary transactions. The term typically refers to large positive integers, or more generally, large positive real numbers, but it may also be used in other contexts. The study of nomenclature and properties of large numbers is sometimes called googology.[1][2]

Very large numbers often occur in fields such as mathematics, cosmology, cryptography, and statistical mechanics. Sometimes people refer to numbers as being “astronomically large”. However, it is easy to mathematically define numbers that are much larger even than those used in astronomy.

### Large numbers in the everyday world

Scientific notation was created to handle the wide range of values that occur in scientific study. $1.0 \times 10^{9}$, for example, means one billion, a 1 followed by nine zeros: 1 000 000 000, and $1.0 \times 10^{-9}$ means one billionth, or 0.000 000 001.

Writing $10^{9}$ instead of nine zeros saves readers the effort and hazard of counting a long series of zeros to see how large the number is.
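As a small illustration of the same idea in code, Python accepts scientific notation directly and can convert between the compact and the written-out forms. The use of `Decimal.adjusted()` to read off the power of ten is just one convenient option, not something the article prescribes.

```python
from decimal import Decimal

billion = 1.0e9      # 1.0 x 10^9
billionth = 1.0e-9   # 1.0 x 10^-9

print(f"{billion:,.0f}")        # written out in full: 1,000,000,000
print(f"{billionth:.9f}")       # written out in full: 0.000000001
print(f"{1_000_000_000:.1e}")   # back to compact form: 1.0e+09

def power_of_ten(text):
    """Exponent of the leading digit, e.g. '6.022E23' -> 23."""
    return Decimal(text).adjusted()

for value in ("1.0E9", "1.0E-9", "6.022E23"):
    print(value, "has power of ten", power_of_ten(value))
```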

Examples of large numbers describing everyday real-world objects include:

• The number of bits on a computer hard disk (as of 2010, typically about $10^{13}$, 500–1000 GB)
• The estimated number of atoms in the observable universe ($10^{80}$)
• The mass of Earth consists of about $4 \times 10^{51}$ nucleons
• The number of cells in the human body (estimated at $3.72 \times 10^{13}$)[3]
• The number of neuronal connections in the human brain (estimated at $10^{14}$)
• The lower bound on the game-tree complexity of chess, also known as the “Shannon number” (estimated at around $10^{120}$)[4]
• The Avogadro constant is the number of “elementary entities” (usually atoms or molecules) in one mole; the number of atoms in 12 grams of carbon-12 – approximately $6.022 \times 10^{23}$.

### Large numbers and governments

Large numbers have been central to “statistics-driven thinking”, which has become “ubiquitous in modern society.” Beginning with 17th-century probability theory, statistics have evolved and become integral to both governmental knowledge and power.

There is a complex “reciprocity between modern governments and the mathematical artifacts that both dictate the duties of the state and measure its successes”. These tools include economics, mathematical statistics, medical statistics, probability, psychology, sociology, and surveys.

These have led to applied econometrics in modern times.[5]

Illinois Senator Everett Dirksen is noted as saying, “A billion here, a billion there, pretty soon, you're talking real money.” Although there is no direct record of the remark,[6] he is believed to have made it during an appearance on The Tonight Show Starring Johnny Carson.

### Astronomically large numbers

Other large numbers, as regards length and time, are found in astronomy and cosmology. For example, the current Big Bang model suggests that the universe is 13.8 billion years ($4.355 \times 10^{17}$ seconds) old, and that the observable universe is 93 billion light years across ($8.8 \times 10^{26}$ metres), and contains about $5 \times 10^{22}$ stars, organized into around 125 billion ($1.25 \times 10^{11}$) galaxies, according to Hubble Space Telescope observations.

There are about $10^{80}$ atoms in the observable universe, by rough estimation.[7]
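The unit conversions behind these figures can be checked with a short sketch. The year length (365.25 days) and the light-year length (about $9.4607 \times 10^{15}$ metres) are standard approximations assumed here, not values taken from the article.

```python
SECONDS_PER_YEAR = 365.25 * 86_400   # assumed Julian year
METRES_PER_LIGHT_YEAR = 9.4607e15    # approximate length of one light year

age_years = 13.8e9
diameter_light_years = 93e9
stars = 5e22
galaxies = 1.25e11

age_seconds = age_years * SECONDS_PER_YEAR
diameter_metres = diameter_light_years * METRES_PER_LIGHT_YEAR
stars_per_galaxy = stars / galaxies  # a rough average, nothing more

print(f"age of the universe:  {age_seconds:.3e} seconds")
print(f"observable diameter:  {diameter_metres:.1e} metres")
print(f"stars per galaxy:     {stars_per_galaxy:.1e} on average")
```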

According to Don Page, physicist at the University of Alberta, Canada, the longest finite time that has so far been explicitly calculated by any physicist is

$$10^{10^{10^{10^{10^{1.1}}}}} \text{ years}$$

which corresponds to the scale of an estimated Poincaré recurrence time for the quantum state of a hypothetical box containing a black hole with the estimated mass of the entire universe, observable or not, assuming a certain inflationary model with an inflaton whose mass is $10^{-6}$ Planck masses.[8][9] This time assumes a statistical model subject to Poincaré recurrence.

A much simplified way of thinking about this time is in a model where the universe's history repeats itself arbitrarily many times due to properties of statistical mechanics; this is the time scale when it will first be somewhat similar (for a reasonable choice of “similar”) to its current state again.

Combinatorial processes rapidly generate even larger numbers. The factorial function, which defines the number of permutations on a set of fixed objects, grows very rapidly with the number of objects. Stirling's formula gives a precise asymptotic expression for this rate of growth.

Combinatorial processes generate very large numbers in statistical mechanics. These numbers are so large that they are typically only referred to using their logarithms.
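To see both points in practice (how fast the factorial grows, and why logarithms are the practical way to handle it), the sketch below compares $\log_{10}(n!)$ computed through the log-gamma function with the logarithm of Stirling's approximation $\sqrt{2\pi n}\,(n/e)^n$. The function names and the sample values of $n$ are arbitrary choices for illustration.

```python
import math

def log10_factorial(n):
    """log10(n!) using ln(n!) = lgamma(n + 1), without forming n! itself."""
    return math.lgamma(n + 1) / math.log(10)

def log10_stirling(n):
    """log10 of Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    natural_log = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
    return natural_log / math.log(10)

for n in (10, 52, 100, 10**6):
    exact = log10_factorial(n)
    approx = log10_stirling(n)
    print(f"n={n:>8}: log10(n!) = {exact:.4f}, Stirling = {approx:.4f}, "
          f"difference = {exact - approx:.2e}")
```

Working with the logarithm keeps the values in an ordinary floating-point range even when the number itself has millions of digits.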

Gödel numbers, and similar numbers used to represent bit-strings in algorithmic information theory, are very large, even for mathematical statements of reasonable length. However, some pathological numbers are even larger than the Gödel numbers of typical mathematical propositions.

Logician Harvey Friedman has done work related to very large numbers, such as with Kruskal's tree theorem and the Robertson–Seymour theorem.

### “Billions and billions”

To help viewers of Cosmos distinguish between “millions” and “billions”, astronomer Carl Sagan wrote a book and discoursed, stressing the “b”. Sagan never did, however, say “billions and billions”.

The public's association of the phrase and Sagan came from a Tonight Show skit. Parodying Sagan's affect, Johnny Carson quipped “billions and billions”.[10] The phrase has, however, now become a humorous fictitious number, the Sagan. Cf., Sagan Unit.

### Examples of large numbers

• googol = $10^{100}$.
• centillion = $10^{303}$ or $10^{600}$, depending on number naming system.
• millillion = $10^{3003}$ or $10^{6000}$, depending on number naming system.
• micrillion = $10^{3000003}$ or $10^{6000000}$, depending on number naming system.
• The largest known Smith number = $(10^{1031}-1) \times (10^{4594} + 3 \times 10^{2297} + 1)^{1476} \times 10^{3913210}$.
• The largest known Mersenne prime = $2^{82{,}589{,}933}-1$ (as of December 21, 2018).
• googolplex = $10^{\text{googol}} = 10^{10^{100}}$.
• Skewes' numbers: the first is approximately $10^{10^{10^{34}}}$, the second $10^{10^{10^{964}}}$.
• googolplexian = $10^{\text{googolplex}} = 10^{10^{\text{googol}}} = 10^{10^{10^{100}}}$.

• Graham's number, larger than what can be represented even using power towers (tetration). However, it can be represented using Knuth's up-arrow notation.
• Rayo's number is a large number named after Agustín Rayo which has been claimed to be the largest named number. It was originally defined in a “big number duel” at MIT on 26 January 2007.

### Systematically creating ever-faster-increasing sequences


Given a strictly increasing integer sequence/function $f_0(n)$ ($n \ge 1$) we can produce a faster-growing sequence $f_1(n) = f_0^n(n)$ (where the superscript $n$ denotes the $n$th functional power). This can be repeated any number of times by letting $f_k(n) = f_{k-1}^n(n)$, each sequence growing much faster than the one before it. Then we could define $f_\omega(n) = f_n(n)$, which grows much faster than any $f_k$ for finite $k$ (here $\omega$ is the first infinite ordinal number, representing the limit of all finite numbers $k$). This is the basis for the fast-growing hierarchy of functions, in which the indexing subscript is extended to ever-larger ordinals.

For example, starting with $f_0(n) = n + 1$:

• $f_1(n) = f_0^n(n) = n + n = 2n$
• $f_2(n) = f_1^n(n) = 2^n n > (2 \uparrow) n$ for $n \ge 2$ (using Knuth up-arrow notation)
• $f_3(n) = f_2^n(n) > (2 \uparrow)^n n \ge 2 \uparrow^2 n$ for $n \ge 2$.
• $f_{k+1}(n) > 2 \uparrow^k n$ for $n \ge 2$, $k < \omega$.
• $f_\omega(n) = f_n(n) > 2 \uparrow^{n-1} n > 2 \uparrow^{n-2} (n + 3) - 3 = A(n, n)$ for $n \ge 2$, where $A$ is the Ackermann function (of which $f_\omega$ is a unary version).
• $f_{\omega+1}(64) > f_\omega^{64}(6) >$ Graham's number ($= g_{64}$ in the sequence defined by $g_0 = 4$, $g_{k+1} = 3 \uparrow^{g_k} 3$).
• This follows by noting $f_\omega(n) > 2 \uparrow^{n-1} n > 3 \uparrow^{n-2} 3 + 2$, and hence $f_\omega(g_k + 2) > g_{k+1} + 2$.
• $f_\omega(n) > 2 \uparrow^{n-1} n = (2 \to n \to n-1) = (2 \to n \to n-1 \to 1)$ (using Conway chained arrow notation)
• $f_{\omega+1}(n) = f_\omega^n(n) > (2 \to n \to n-1 \to 2)$ (because if $g_k(n) = X \to n \to k$ then $X \to n \to k+1 = g_k^n(1)$)
• $f_{\omega+k}(n) > (2 \to n \to n-1 \to k+1) > (n \to n \to k)$
• $f_{\omega 2}(n) = f_{\omega+n}(n) > (n \to n \to n) = (n \to n \to n \to 1)$
• $f_{\omega 2+k}(n) > (n \to n \to n \to k)$
• $f_{\omega 3}(n) > (n \to n \to n \to n)$
• $f_{\omega k}(n) > (n \to n \to \dots \to n \to n)$ (chain of $k+1$ $n$'s)
• $f_{\omega^2}(n) = f_{\omega n}(n) > (n \to n \to \dots \to n \to n)$ (chain of $n+1$ $n$'s)
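The finite levels of this hierarchy can be written down directly from the recurrence: $f_0(n) = n + 1$, and $f_k(n)$ applies $f_{k-1}$ to $n$ a total of $n$ times. The sketch below is a naive illustration under that definition, with a hypothetical function name `f`; it is only usable for very small $k$ and $n$, since the values (and the running time) explode almost immediately.

```python
def f(k, n):
    """f_k(n) with f_0(n) = n + 1 and f_k(n) = f_{k-1} applied n times to n."""
    if k == 0:
        return n + 1
    value = n
    for _ in range(n):      # n-fold functional power of f_{k-1}
        value = f(k - 1, value)
    return value

# Keep the arguments tiny: f(3, 3) is already far out of reach for this code.
for n in range(1, 6):
    print(f"n={n}: f_0={f(0, n)}, f_1={f(1, n)}, f_2={f(2, n)}")
print("f_3(2) =", f(3, 2))
```

The printed values agree with the closed forms $f_1(n) = 2n$ and $f_2(n) = 2^n n$ for the small arguments tried here.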

### Notations

Some notations for extremely large numbers:

• Knuth's up-arrow notation/hyperoperators/Ackermann function, including tetration
• Conway chained arrow notation

## Ginormous Numbers Could Create a Mental Black Hole

A trillion. A googol. A centillion. TREE(3). Somewhere between zero and infinity is a host of finite, but mind-bogglingly huge numbers.

But while mathematicians have dreamed up large numbers for ages and they pop up frequently in physics and math, gigantic numbers are bafflingly hard to fathom.

“Really big numbers are in every piece of technology we look at, in every organism we look at,” said John Borwein, an applied mathematician at the University of Newcastle in Australia. And yet, “These are enormous numbers that no human being is intuitively equipped to understand.”

Some of these numbers are so huge that even writing them down involves completely different math notation. And thinking about them may give you more than a headache: according to mathematicians, in theory, storing ginormous numbers in one's head could create a black hole.

Even so, such numbers are inherent in our understanding of quantum physics and probability, and even make an appearance in mathematics proofs.

Ancient fascination

From the start, people have been fascinated with the question of how big numbers could get.

Ancient Babylonian texts, for instance, mention incredibly large numbers, which students used to practice multiplication or division, said Henry Mendell, a historian of mathematics.

And in a text called the Sand Reckoner, Greek philosopher Archimedes calculated the number of grains of sand that would fit into the universe, which came to about 10 raised to the 63 (or 1 with 63 zeroes after it) grains of sand, Mendell told LiveScience.

The physical world

In the physical world, most of the big numbers can still be expressed using scientific notation. For instance, the universe contains about 10 to the 80 atoms (not that far from Archimedes' estimate).

And in the realm of extremely tiny numbers — which, after all, are simply the inverse of extremely gigantic numbers — the mysterious acceleration of the universe due to dark energy is described by a cosmological constant, which is 10 to the minus 122, said Scott Aaronson, a computer scientist at MIT, who has written about very large numbers.

## Dealing with Big Numbers: Representation and Understanding of Magnitudes Outside of Human Experience

Being able to estimate quantity is important in everyday life and for success in the STEM disciplines. However, people have difficulty reasoning about magnitudes outside of human perception (e.g., nanoseconds, geologic time).

This study examines patterns of estimation errors across temporal and spatial magnitudes at large scales. We evaluated the effectiveness of hierarchical alignment in improving estimations, and transfer across dimensions.

The activity was successful in increasing accuracy for temporal and spatial magnitudes, and learning transferred to the estimation of numeric magnitudes associated with events and objects.

However, there were also a number of informative differences in performance on temporal, spatial, and numeric magnitude measures, suggesting that participants possess different categorical information for these scales. Educational implications are discussed.

Being able to estimate quantity is central to everyday life, as when someone cooking dinner estimates how much food is required to serve four guests and how much time dinner will take to prepare.

Estimation is also important for success in science, technology, engineering, and mathematics (STEM).

For example, accurate estimation of the position of numerals on a number line is strongly predictive of mathematical achievement (Booth & Siegler, 2006; Siegler & Booth, 2004), as well as causally related to arithmetic learning (Booth & Siegler, 2008).

Unfortunately, students consistently have difficulty understanding and comparing the magnitudes of scientific phenomena at very small scales (e.g., Swarat, Light, Park, & Drane, 2011) and at very large scales (e.g., Jones, Tretter, Taylor, & Oppewal, 2008; Libarkin, Anderson, Dahl, Beilfuss, & Boone, 2005; Tretter, Jones, Andre, Negishi, & Minogue, 2006).

Even students in STEM majors have difficulty reasoning about magnitudes outside of human perception (Drane, Swarat, Hersam, Light, & Mason, 2009). In particular, students have difficulty in identifying and comparing absolute magnitudes (Jones et al., 2008; Tretter et al., 2006).

For example, while most students are able to place major geologic events in the correct order, they do not understand the magnitude of time between these events (Libarkin, Kurdziel, & Anderson, 2007).

Developing a clear understanding of how people reason about magnitudes outside human perception is both theoretically and practically important. Many scientific concepts (e.g., geologic time), technologies (e.g., nanotechnology), and global issues (e.g., global warming) occur at scales that cannot be directly perceived by humans. In fact, “size and scale” have been identified as fundamental to science education (e.g., Hawkins, 1978; Peterson & Parker, 1998; Schneider, 1994; Tretter et al., 2006), and are especially relevant for understanding many of the pressing global environmental issues of today (Buizer, Arts, & Kok, 2011).

Size and scale have been proposed as a unifying theme in science education by the NRC Framework for K‐12 Science Education (National Research Council, 2011), the Benchmarks for Science Literacy (American Association for the Advancement of Science (AAAS), 2008), and the Next Generation Science Standards (NGSS Lead States, 2013).

Most research investigating how people reason about magnitude has focused on magnitudes at scales within the range of human experience. For example, studies of magnitude representation along a mental number line have concentrated on estimations of numbers less than 1,000 (e.g., Ebersbach, Luwel, Frick, Onghena, & Verschaffel, 2008; Izard & Dehaene, 2008; Opfer, Siegler, & Young, 2011). More recently, research is beginning to examine reasoning about larger scales.

Thompson and Opfer (2010) found that students as young as sixth grade are accurate at estimating numbers on a number line from zero to 100,000, and second graders can be trained to be just as accurate. In larger ranges, however, even adults may have difficulties.

Approximately 50% of undergraduate students were inaccurate when estimating numbers on a 1,000–1 billion number line (Landy, Silbert, & Goldin, 2013).

Resnick, Davatzes, Newcombe, and Shipley (in press) suggest that people use categorical information when making estimations, including at the billion scale, and that variation in estimation at large magnitudes is due to imprecision of categories. However, given the limited number of studies, more research is needed to characterize how people reason about magnitudes outside human perception.

A review of how people reason about magnitudes within human scales may be useful in thinking about how one might reason about larger magnitudes. Converging evidence from cognitive, neurocognitive, developmental, and comparative fields suggests that reasoning about any type of magnitude (e.g., temporal, spatial, abstract) uses the same neural and conceptual resources (e.g., Bueti & Walsh, 2009; Cantlon, Platt, & Brannon, 2009; Lourenco & Longo, 2011; Walsh, 2003; for a review, see Cohen Kadosh, Lammertyn, & Izard, 2008).

This view is formalized as a general magnitude system (Lourenco & Longo, 2011) or a theory of magnitude (Walsh, 2003), and argues that the inferior parietal cortex is responsible for processing all more/less judgments required for action.

Processing involves the automatic extraction of magnitude information, which is spatially organized (e.g., Dehaene, Bossini, & Giraux, 1993; Ishihara, Keller, Rossetti, & Prinz, 2008; Moyer & Landauer, 1967).

The spatial organization of magnitude is often characterized as being structured along a mental number line (e.g., Hubbard, Pinel, Piazza, & Dehaene, 2005; Izard & Dehaene, 2008; Moyer & Landauer, 1967).

The exact distribution pattern of magnitude along the mental number line is debated (see Barth and Paladino (2011) and Opfer et al. (2011) for a discussion on mental models of magnitude representation).

However, all theories suggest people possess compressed representations of the relatively larger, unfamiliar magnitudes. Compression refers to the observed pattern of overestimation of relatively smaller magnitudes and underestimation of relatively larger magnitudes.

For example, on a number line from 0 to 1 billion, people may overestimate one million to be located roughly at midpoint (which is actually 500,000,000) and underestimate 750,000,000 to be closer to the midpoint than the end of the number line.
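To make the size of this distortion concrete, the sketch below computes where a value truly sits on a linear number line. For contrast it also computes the position on a logarithmic scale over the 1,000 to 1 billion range mentioned earlier; the logarithmic scale is purely an illustrative assumption of a "compressed" layout, not the model proposed by any of the cited studies, and both function names are hypothetical.

```python
import math

def linear_position(value, low, high):
    """Fraction of the way along a linear number line from low to high."""
    return (value - low) / (high - low)

def log_position(value, low, high):
    """Fraction of the way along a logarithmic line (requires low > 0)."""
    span = math.log10(high) - math.log10(low)
    return (math.log10(value) - math.log10(low)) / span

# True linear positions on a 0 to 1 billion line.
print(f"1 million:   {linear_position(1e6, 0, 1e9):.3%} of the way along")
print(f"750 million: {linear_position(750e6, 0, 1e9):.3%} of the way along")

# On an (assumed) log scale from 1,000 to 1 billion, one million lands mid-line.
print(f"1 million on a log scale: {log_position(1e6, 1e3, 1e9):.1%}")
```

One million belongs a tenth of a percent of the way along the linear line, so placing it near the midpoint overstates its position by a factor of about five hundred.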

Resnick, Shipley, Newcombe, Massey, and Wills (2012) hypothesize that compression of the mental number line can be accounted for by the category adjustment model (e.g., Huttenlocher, Hedges, & Prohaska, 1988; Huttenlocher, Hedges, & Vevea, 2000).

The category adjustment model argues that magnitude is stored as a hierarchical combination of metric and categorical information. In the absence of lower level information (e.g., precise metric information), people use higher level categories to aid in estimation.

Variation in estimation therefore occurs due to imprecision of category boundaries (Shipley & Zacks, 2008; Zacks & Tversky, 2001). Recall is biased toward the “prototype” of the respective category.

With regard to the mental number line, smaller and more familiar magnitudes may constitute individual categories, populating much of the mental number line, whereas a wide range of larger unfamiliar numbers may be encompassed by a few categories, such as “big” and “really big,” thus making these larger numbers harder to discriminate.

The integrated theory of numerical development suggests that number learning involves continually expanding the size and type of number whose magnitudes can be accurately represented (Siegler & Lortie‐Forgues, 2014).

While this may be a continuous process for magnitudes within human perception (Matthews, Lewis, & Hubbard, 2015; Siegler & Lortie‐Forgues, 2014; Siegler, Thompson, & Schneider, 2011), there is some evidence that for magnitudes outside of human perception, resources from small number processing are recycled, as opposed to extended (Landy, Charlesworth, & Ottmar, 2014).

At these larger scales, people divide numerical magnitudes into categories, apply linear responses to each subscale, and then connect each subscale to form one continuous range (Landy et al., 2014).

With regard to reasoning about scientific phenomena outside of human perception, there is some evidence from the science education literature to suggest that such phenomena are reasoned about using magnitude‐based categories, akin to the accounts described above.

Patterns of overestimation of relatively smaller magnitudes and underestimation of relatively larger magnitudes are also seen at extreme scales. For example, a majority of high school students overestimate how long ago dinosaurs first appeared on Earth (Libarkin et al., 2007; Petcovic & Ruhf, 2008; Resnick et al., 2012); and underestimate when life first appeared on Earth (Catley & Novick, 2008). Furthermore, estimations can vary by over 2 billion years in both cases. People also use conceptual categories to organize magnitudes at extreme scales (e.g., Swarat et al., 2011; Trend, 2001; Tretter et al., 2006); for example, geologic time is conceptualized as the Geologic Time Scale.

These similarities between estimations within and outside of human perception are present for different types of magnitudes, including both temporal (Catley & Novick, 2008) and spatial (Tretter, Jones, & Minogue, 2013b) magnitude estimations.

## 3. Mathematical Interlude: Very Large Numbers

Problems in statistical mechanics tend to involve large numbers. Very very large numbers. For example, we often
need to average over all the atoms or molecules in a system.

For any macroscopic system, that will be on the order of
Avogadro’s number: $10^{23}$, plus or minus a few orders of magnitude.

In other cases we average over microstates,
the number of which usually grows exponentially in the number of atoms. These are really astronomically large
numbers.

This has an interesting result. In many cases, we can simply treat averages as exact numbers. Any variation about the
average value is so small, we can completely ignore it. This is a great simplification, and is one of the reasons
statistical mechanics is so successful.

It also is the reason that, despite its name, statistical mechanics involves very little real statistics. A large part
of the field of statistics deals with the variations about averages: computing the probability of a distribution
producing values in a certain range, determining whether a measured value might plausibly have come from a certain
distribution, etc. But in statistical mechanics, we can usually ignore these questions. The chance of measuring any
value except the average one is negligible.

The main goal of this chapter is to demonstrate this fact. We will do that first through a simple example, and then
through a very general theorem.

Do you ever worry that all the oxygen molecules in the air around you might spontaneously migrate to the other side of
the room, leaving you unable to breathe? It could happen. Each molecule diffuses independently, so there is a small
but finite probability that just by chance, they will all end up in the same half of the room at the same time.

I hope I am not worrying you. While this is theoretically possible, it is so absurdly improbable that you really can
ignore it. We will first compute just how improbable it is, and then study the statistics of this problem in more
detail.

Let $N$ be the number of oxygen molecules in the room. We will describe the position of each one simply by
whether it is in the left or right half of the room. That gives $2^N$ possible arrangements of molecules, every
one of which is equally probable. Of all those arrangements, only one has all the oxygen in the opposite half of the
room from you, so the probability is $1/2^N$.
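For a handful of molecules the counting argument can be verified by brute force. The sketch below enumerates every left/right arrangement and counts the single one with all molecules on the right; treating the right half as "the opposite half from you" and the function name are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

def probability_all_on_right(n_molecules):
    """Enumerate all L/R arrangements and return P(every molecule is on the right)."""
    arrangements = list(product("LR", repeat=n_molecules))
    favourable = sum(1 for a in arrangements if all(side == "R" for side in a))
    return Fraction(favourable, len(arrangements))

for n in range(1, 6):
    p = probability_all_on_right(n)
    print(f"N={n}: {2**n} arrangements, probability = {p}")
```

Enumeration stops being feasible after a few dozen molecules, which is exactly why the closed form is needed for a real room.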

Depending on the size of the room you are in, it probably has somewhere in the vicinity of $10^{26}$ oxygen
molecules in it. So the probability of all the oxygen being on the same side of the room at any given moment is

$$\frac{1}{2^{10^{26}}} \approx \frac{1}{10^{10000000000000000000000000}}$$

That is a very, very large denominator. For comparison here are some much smaller numbers:

• The age of the universe is on the order of $10^{17}$ seconds.
• There are somewhere around $10^{80}$ atoms in the visible universe.
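The denominator cannot be computed directly, but its logarithm can. Since $\log_{10} 2^{N} = N \log_{10} 2$, the sketch below estimates the power of ten involved; the variable names are chosen for the illustration. The result is about $3 \times 10^{25}$, the same order of magnitude as the exponent written out above.

```python
import math

N = 10**26  # rough number of oxygen molecules in the room

# log10 of the denominator 2**N, without ever forming 2**N itself.
log10_denominator = N * math.log10(2)
print(f"2**N is roughly 10 to the power {log10_denominator:.3e}")

# The comparison numbers have tiny exponents by contrast.
comparisons = {"age of universe in seconds": 17, "atoms in visible universe": 80}
for name, exponent in comparisons.items():
    ratio = log10_denominator / exponent
    print(f"{name}: exponent {exponent}, smaller by a factor of {ratio:.1e}")
```

Note that the comparison is between exponents: the denominator's power of ten is itself a 26-digit number.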