The Case of
The Special Theory
describes a subtle but profound change to Newton's powerful model. To reconcile Newton's model with newly formulated physical laws and with the growing scale and precision of actual experiments, Albert Einstein replaced several of Newton's assumptions with others, while keeping its essence intact. Einstein's innovation sustained the power and applicability of the model of the inertial frame, which remains central to our modern understanding of the physical world.

In The Case of the Year 2000 and the Case of the Missing Calendar, we described a powerful innovation --- a "hand for the mind" --- which extended human understanding of time. By using an external calendar, humans can reason about events they have not experienced or have yet to experience. Underlying this invention was the possibility of coordination without communication: if on June 5 we make a date for a week later, our calendars will both correctly identify June 12th even if we never meet to coordinate them during the intervening days.

In the early 20th century, physicists abandoned this possibility in its most general form, dropping a powerful common sense idea for a confusing conception of time where we can no longer speak of things "happening at the same time" unless they occur in the same place. Physicists adopted this strange way of looking at the world in order to keep the powerful model of the inertial frame in the face of surprising and disturbing experimental results.

The Opening Conflict:
Maxwell's Laws

In the middle of the 1800's, the Scottish physicist James Clerk Maxwell came up with a concise set of laws describing the interaction of electricity and magnetism. Since the beginning of history, these two phenomena had been known separately: static electricity was a condition of materials which provoked either attraction (for instance between charged fabrics) or excitation (of shocked human and animal subjects); magnetism was a mysterious property which physically moved or steadied certain materials and allowed explorers and seafarers to keep their courses with no familiar landmarks other than the compass they could hold in their hands.

In the late 1700's and early 1800's, experiments had shown mysterious connections between electricity and magnetism: electrical phenomena could be shown to have magnetic effects and systems of magnets could induce electrical effects. These connections were robust and repeatable and scientists through the 1800's became able to describe and predict them quite precisely. However, there was no consistent account of either why they were different or why they were connected.

Maxwell changed all that by gathering together the experimental results and proposing a set of four laws which described electricity, magnetism, and their interaction. Though the details of these laws are too complex to describe here, it suffices to say that they linked the two phenomena through motion: motion of electrical charges produced magnetic effects while motion of magnets produced electrical effects. Simple experiments showed that this connection to motion was invariant between inertial frames: an electromagnetic experiment in uniform motion worked exactly the same as one at rest.

The problem was that the equations describing Maxwell's laws were not invariant in all of the same ways as Newton's laws. While they looked the same regardless of position and orientation, when they were rewritten for experiments in motion, they didn't look the same or give the same results as for systems at rest. Since the experimental evidence was that they still applied in exactly the same way, they could not really be described as accurate for systems in motion, even though they made perfect predictions if you ignored the motion!

This was a problem. It was as though a cashier had the books balance at the end of the day even though he knew he was shortchanged on every transaction. If he were superficially pragmatic (and a little foolish), he might say "I must be doing something right" and not worry about it. But the physicists at the end of the 1800's, while enthusiastic that they'd unified electricity and magnetism, were deeply concerned about this anomaly. In resolving it, they would end up abandoning some of our most basic common sense assumptions about space and time.

Adjusting the Clock

When I was a perpetually distracted youth in high school, I was often late for meals, events, and chores (of course, this never happens now). Occasionally, the consequences of being identified as late were tragic (for a teenager) and I contemplated the idea, never implemented, of setting my mother's watch back by five or ten minutes to ensure the appearance of personal timeliness. Of course, given that my knowledge of the extra minutes would affect my activities, it probably wouldn't have worked for long, but mathematicians and physicists in the late 1800's proposed a remarkably similar solution to the anomaly introduced by Maxwell's laws.

The idea was that rather than rewriting Maxwell's laws for moving inertial frames in the same way that Newton had rewritten his laws, we rewrite them differently. In particular, we rewrite them so that distances in space and intervals in time are both changed to make Maxwell's equations look exactly the same in a moving inertial frame (as they do experimentally). Note that we have to change distances and time together, so that Newton's laws (which connect distance and time, among other things) also remain the same. The resulting change preserves both the invariance of Newton's laws and the invariance of Maxwell's laws.

This particular way of rewriting both Newton's and Maxwell's laws, called the Lorentz transformation, was proposed independently by Hendrik Lorentz and George Fitzgerald in the late 1800's. In an ironic coincidence, considering the transformation's consequences, the two independent inventors at first each believed that the other had been the first to publish the transformation!

Though the mathematical details of the transformation are not particularly complicated, it won't be necessary to detail them for our purposes. However, three points are important. First, the new way of rewriting really did maintain the invariance of inertial frames for both Newton's laws and Maxwell's laws: both sets of laws looked the same no matter how fast you were going, providing you were looking at something (for instance, an experimental apparatus) moving along with you. Second, the rewritten versions of Newton's laws differed consistently when describing a situation or apparatus which was moving relative to the observer: distances or intervals of time within the moving frame might appear distorted. Though the laws themselves looked the same from a distance, the measurements of time and distance changed in ways consistent with the laws. Third, this distortion was negligible until the differences in movement approached the speed of light. That explained why the new theory didn't contradict any existing experiments.
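
For readers who would like to see these three points in numbers, here is a minimal sketch in Python. It is an illustration only, not part of the original argument: the function names, the choice of units (time in seconds and distance in light-seconds, so that the speed of light is 1), and the sample event are all assumptions. It compares the older, Newtonian way of rewriting an event's coordinates (the Galilean transformation) with the Lorentz transformation for a frame moving at a fraction beta of the speed of light.

```python
import math

def galilean(t, x, beta):
    """Newton-style rewrite: positions shift with the motion, time does not."""
    return t, x - beta * t

def lorentz(t, x, beta):
    """Lorentz rewrite: time and distance change together.

    Units are an assumption of this sketch: t in seconds, x in
    light-seconds, beta = velocity / speed of light (|beta| < 1).
    """
    if abs(beta) >= 1.0:
        raise ValueError("beta must be strictly between -1 and 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * x), gamma * (x - beta * t)

# Hypothetical event: 10 seconds after, and 4 light-seconds away from, the origin.
event = (10.0, 4.0)

# First roughly highway speed, then 60% of the speed of light.
for beta in (1e-7, 0.6):
    print(beta, galilean(*event, beta), lorentz(*event, beta))
```

At the first speed the two rewrites differ by less than a microsecond; at the second they disagree about both the time and the place of the event. In both cases the quantity t*t - x*x comes out the same before and after the Lorentz rewrite, which is one way of seeing that time and distance have to change together.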

If this were a book about physics, I would go into more detail about the transformation and its direct consequences. But it is a book about models, so we will instead look at how the transformation changed physics and in particular our understanding of time.

The Decisive Conflict:
The Earth's Speedometer is Broken

While the first challenge to Newton's laws came directly from the form of Maxwell's laws, the decisive challenge (for many observers) came from a physical experiment which was suggested by Maxwell's laws. As I mentioned above, Maxwell's laws connected electricity and magnetism through motion. This led scientists to wonder about the motion of the earth and how to factor it into their calculations.

How do you measure the speed of the earth? Let's start by taking a look at a more prosaic speed measuring device, the automobile speedometer. The speedometer of a car is driven by a device connected to one of the car's axles, which is in turn connected (by the friction of rubber tires) to the ground. As the ground speeds by beneath the car's wheels, the device measures the motion and presents it to the driver. Understanding this mechanism is a wonderful example of inertial frames "in action": the speedometer is actually measuring the speed of the ground and not the car, but because we can transform between inertial frames, we can use the speed of the ground (assuming that the ground is not a treadmill) to determine the speed of the car.

The speedometer has to work this way because it is trying to detect the speed of the car from inside the car. A similar trick is necessary to detect the speed of the earth from on the earth. (Even if we were in a spaceship and measuring the earth's velocity from outer space, we would still need to figure out our ship's actual velocity in order to know the earth's actual velocity, so it wouldn't help.)

But what do we use as the "ground" for measuring the velocity of the earth? What "stays still" as the earth moves and what is the "axle" with which we measure its relative motion? The answer, in the middle of the nineteenth century, had two components. The "ground" was the hypothetical "aether" which physicists had introduced as the "medium" through which light and other effects propagated in the same way that sound propagates through air. The "axle" for measuring the relative motion of the aether was the comparison of the speed of light in different directions; light moving in the opposite direction to the earth's motion should seem to move relatively faster than light moving in the same direction as the earth's motion. This was called the "aether wind" by analogy to the atmospheric wind one feels when moving quickly through still air (for instance, through the open window of a moving car).

Through an ingenious arrangement of mirrors and interacting beams of light, the physicist Albert Michelson attempted to measure the absolute speed of the earth by measuring differences in the speed of light through the aether in different directions. His first efforts, in 1881, failed to show any motion of the earth. This first failure was attributed to measurement errors, but six years later, in 1887, a much more precise experiment carried out together with Edward Morley still failed to show any apparent motion of the earth in any direction (including even the motion of the earth in its orbit!). Either the Earth really was at rest and everything else moving around it (an almost pre-Copernican coincidence) or something else was going on.
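
To get a feel for how small an effect the experimenters were looking for, here is a rough sketch in Python of what the aether model predicts. The arm length and the earth's speed below are assumed round figures chosen for illustration, and the function names are hypothetical. The sketch compares the round-trip time of light sent along the aether wind with that of light sent across it, using the ordinary pre-relativistic rule that speeds simply add and subtract.

```python
import math

C = 299_792_458.0    # speed of light in m/s
EARTH_SPEED = 3.0e4  # assumed speed of the earth through the aether, m/s
ARM_LENGTH = 11.0    # assumed light path along each arm, m

def time_along_wind(length, v, c=C):
    """Classical round trip: out against the wind at c - v, back with it at c + v."""
    return length / (c - v) + length / (c + v)

def time_across_wind(length, v, c=C):
    """Classical round trip at right angles to the wind, at sqrt(c^2 - v^2)."""
    return 2.0 * length / math.sqrt(c * c - v * v)

def predicted_delay(length, v, c=C):
    """Difference in arrival time between the two arms under the aether model."""
    return time_along_wind(length, v, c) - time_across_wind(length, v, c)

delay = predicted_delay(ARM_LENGTH, EARTH_SPEED)
print(f"aether model predicts a delay of {delay:.3e} s between the two arms")
```

With these assumed figures the predicted delay is on the order of 10^-16 seconds, far too small to time directly, which is why the measurement relied on the interference of the two beams rather than on any clock.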

In a half-page paper in 1889, George Fitzgerald made the proposal that the failure to detect the aether wind could be a consequence of space-time distortions associated with velocity. He proposed a set of transformations which would explain the failed experiment, but the proposal was largely ridiculed. Three years later, however, Hendrik Lorentz independently proposed the same transformation as the solution to two problems: the invariance of Maxwell's equations and the failed efforts to detect the earth's motion through the aether.

The Lorentz transformation took one of the basic assumptions of Newton's physics and relaxed it. Newton's physics relied on the invariance of distances and intervals in inertial frames and in particular on the notion of absolute time advancing consistently in different inertial frames. The Lorentz transformation drops these assumptions by attaching the progress of time and the measurement of distance in an inertial frame to the frame's velocity. However, the profound effects of dropping this assumption demonstrate how the assumptions in a model can interlock. Often, when one assumption (invariance of distance and interval) is dropped or changed, many unexpected things (absolute space and time) change as well.

Changing Times, Comparing Times

On the surface, the distortion of distances and time described by the Lorentz transformation might be surprising but not necessarily problematic. It says that when we look at a different inertial frame, our remote measurements of intervals and distances within the frame will be distorted so as to preserve the laws of physics. The notion of distorted measurements is not unusual in everyday experience: the rules of visual perspective tell us that our measurements of objects at a distance are affected by the distance. An elephant 100 meters away looks substantially smaller than an elephant only one meter away.

However, unlike perspective and other optical illusions, the distortion of time and space described by the Lorentz transformation is quite real. An apparently thumb-sized elephant (distorted by mere optical perspective) cannot fit into one's pocket, but the same full-size elephant could really fit in the millimeters between the spinning blades of a fan (if only for a moment, as it sped through at nearly 99.9998% of the speed of light).
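
The arithmetic behind the elephant can be sketched in a few lines of Python. The elephant's length at rest (5 meters) is an assumption made for this example, and the function name is hypothetical; the speed is the one quoted above. The sketch applies the standard contraction factor, the square root of 1 - beta*beta, where beta is the speed as a fraction of the speed of light.

```python
import math

def contracted_length(rest_length_m, beta):
    """Length measured by an observer who sees the object pass at beta * c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be at least 0 and below 1")
    return rest_length_m * math.sqrt(1.0 - beta * beta)

ELEPHANT_LENGTH_M = 5.0  # assumed rest length of the elephant, in meters
BETA = 0.999998          # 99.9998% of the speed of light

length_mm = contracted_length(ELEPHANT_LENGTH_M, BETA) * 1000.0
print(f"elephant length as measured from the fan: {length_mm:.2f} mm")
```

Under these assumptions the elephant measures about ten millimeters from nose to tail as it passes, while an elephant at rest (beta of zero) keeps its full five meters.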

First, the distortion of time is likely to cause problems for the "calendar logic" which helped us understand confused computers and forgetful children. However, before we start using relativistic excuses for tardiness, we need to recall that the effects only appear when velocities approach the speed of light. To talk about the problems, we'll introduce Seiko and Miriam, two interstellar truckers whose travels routinely bring them near the speed of light.

When Seiko and Miriam meet in their favorite bar on planet X to make an appointment to meet fifty years later on planet Y, it is not enough to specify a meeting place and an interval. They must also coordinate their flight plans, since time will pass differently depending on how fast they are going on the way to the meeting. But this is only the beginning of their problems.
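
A small sketch in Python shows why the flight plans matter. Everything specific here is an assumption made for illustration: the cruising speeds, the function name, and the simplification that each trucker travels at one constant speed for the whole fifty years as counted on the planets' clocks (with planets X and Y at rest relative to each other and the time spent speeding up, turning, and slowing down ignored).

```python
import math

def traveler_years(planet_years, beta):
    """Years elapsed on a traveler's own clock while planet_years pass on the planets.

    beta is the traveler's constant speed as a fraction of the speed of light.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be at least 0 and below 1")
    return planet_years * math.sqrt(1.0 - beta * beta)

APPOINTMENT_YEARS = 50.0                      # interval agreed on in the bar
flight_plans = {"Seiko": 0.6, "Miriam": 0.8}  # hypothetical cruising speeds

for name, beta in flight_plans.items():
    on_board = traveler_years(APPOINTMENT_YEARS, beta)
    print(f"{name}: {on_board:.1f} years pass on board")
```

With these assumed speeds, Seiko's own clock shows about forty years at the meeting and Miriam's about thirty, so "fifty years from now" means something different on each ship.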

Beyond confusing their social calendars, the effects of time distortion may make it complicated for Seiko and Miriam to compare events in their pasts. Because "time passed" depends to some degree on "path taken," they cannot agree on the time between or sometimes even the order of events which they observed separately.
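
The disagreement about order can be made concrete with a short Python sketch. The two events, the speeds, and the function name are hypothetical; units are years and light-years, so the speed of light is 1. The two events happen at the same moment, one light-year apart, according to someone at rest relative to planet X, and the sketch computes the times that observers moving in opposite directions would assign to them using the Lorentz transformation.

```python
import math

def time_in_moving_frame(t, x, beta):
    """Time assigned to event (t, x) by an observer moving at beta * c.

    t is in years, x in light-years, and |beta| must be below 1.
    """
    if abs(beta) >= 1.0:
        raise ValueError("beta must be strictly between -1 and 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * x)

# Two hypothetical events, simultaneous for an observer at rest on planet X.
event_a = (0.0, 0.0)
event_b = (0.0, 1.0)

# The truckers fly in opposite directions at half the speed of light.
for name, beta in (("Seiko", 0.5), ("Miriam", -0.5)):
    t_a = time_in_moving_frame(*event_a, beta)
    t_b = time_in_moving_frame(*event_b, beta)
    if t_b < t_a:
        order = "B before A"
    elif t_a < t_b:
        order = "A before B"
    else:
        order = "at the same time"
    print(f"{name}: {order}")
```

In this sketch Seiko places B before A while Miriam places A before B. The reversal is only possible for events that are too far apart for light to get from one to the other in the time between them, a restriction that becomes important below.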

Of course, in everyday life, these problems do not occur because our travels never even approach the speed of light. However, it is deeply disturbing that we cannot always describe or agree on absolute intervals or orderings in time. The common sense notions of absolute time and distance are so pervasive and important that, as more physicists accepted the distortions of the Lorentz transformation as real phenomena, they began to look for ways to tell time and compare events despite the effects of velocity.

Looking For Clocks

Strictly speaking, time distortions do not make it impossible for Seiko and Miriam to arrange meetings, they just make it more difficult. Sitting in a bar on planet X, they can still point at the distant star system Y (on the holographic star maps in their wallets) and say "let's meet there the next time that all of Y's major planets are aligned." Given good telescopes to observe its planets and some number crunching to plan their courses, they can each arrange to arrive at the appropriate moment, using the planets' motions as a clock. They can even separate the clock and meeting place, agreeing to meet at a system Z when system Y's planets (as seen from the system Z) reach a particular configuration.

But this does not restore absolute time, as Seiko and Miriam are just agreeing on some relative time to use as a standard. To regain absolute time and especially to be able to agree about durations and orders in the past, it is necessary to find a clock which everyone agrees on.

Obviously, since the Lorentz transformation distorts time based on velocity, one good candidate for a universal clock would be a clock with no velocity at all or (as physicists like to say) "at rest." Seiko and Miriam could then convert their "local time" to this clock's time by each taking their absolute velocity and applying the Lorentz transformation in the opposite direction. With this kind of adjusted clock, everything that we seemed to lose when time became distorted is recovered, since this "clock at rest" provides the basis for all of our timings.

But how do we figure out our absolute velocity if our measurements of space and time keep changing as we move? For more than a decade after the Lorentz transformation was first proposed, scientists (including Lorentz) tried to come up with ways of determining absolute velocity and saving absolute time. The ultimate but slowly accepted solution (if you want to call it that), published in Albert Einstein's 1905 paper "On the Electrodynamics of Moving Bodies," was to find a way to avoid having to define absolute time. The theory came to be called the "special theory" of relativity because it did not handle all types of motion, but only motion at a uniform velocity. In particular, it did not handle acceleration, which would have to wait for his formulation of the "general theory" a decade later.

Simplicity Regained

To understand Einstein's innovation, let's look at these theories as models. Newtonian physics presented a mathematical model of space and time and a way of defining groups of places and events (inertial frames) which made possible some very powerful generalizations about nature. The mathematical theory referred to points in space and moments in time and to sets of such points moving together at uniform velocity. Through these patterns of reference, the behavior of quite complex systems could be simplified and accurately predicted.

Generations after Newton formalized this powerful model, his successors discovered principles describing electromagnetic phenomena (Maxwell's Laws) whose combination with Newton's framework was inconsistent. In particular, the accuracy of the new laws was not consistent with the assumption that only positions --- not distances or intervals --- were changed by motion at a particular velocity. Fortunately, the original assumptions of the Newtonian model could be changed to include the frame's velocity in the determination of both distances in space and intervals in time. With this transformation, the description of time and space for Newtonian experiments became more complicated (but still accurate and self-consistent) in order to make the description of electromagnetic experiments consistent.

One particular complication was that the new model included two kinds of space and time: relative space and time (within uniformly moving inertial frames) and absolute space and time (within frames at absolute rest). Relative space and time was based on absolute space and time with a transformation based on absolute velocity, but observations and physical laws (other than the transformation itself) were all described in terms of relative time. As a model, this was more complicated than the Newtonian framework, requiring that descriptions and laws include both sorts of space and time.

In addition, the new model abandoned the strong claim of universal invariance possible with Newton's laws. Because the theory referred to these two kinds of space and time, there had to be ways to tell them apart, and those distinctions could not be invariant in the way that Newtonian mechanics had been. So even though the model was introduced to save the invariance of two known sets of laws (kinetic and electromagnetic), it implied the existence of other laws which could not be invariant. Logically, any phenomena which could be used to detect absolute uniform velocity could not be invariant under uniform velocity.

Einstein's slowly accepted solution, published in his landmark 1905 paper, was to introduce a model which returned to the simplicity of "one kind" of space and time by abandoning absolute space, time, and velocity in favor of a framework which included only relative velocities and where no one frame was "at rest". In the new model, it was possible to pick any inertial frame and describe all of the others with respect to it; the new model preserved invariance of the known laws within inertial frames but also preserved invariance under the choice of the observer's frame of reference. And it did all of this with only one sort of space and time.

A Different Invariance

In the same way that Newton's invariance relied on the fact that distances and intervals were not affected by velocity, the invariance between observers' frames in Einstein's theory relied on the assumption that light always had the same speed through vacuum, regardless of how one was moving (one's inertial frame) when measuring it. This was particularly important because otherwise events which were causally ordered --- for instance, the release of a ball above a surface and its eventual impact --- might be inverted. By adding the "speed of light" principle to Newton's inertial frames and introducing the Lorentz transformation based solely on relative velocities, Einstein could demonstrate a simpler model of space and time which preserved the invariance of causality and other laws in inertial frames.
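
One consequence of the Lorentz transformation makes the "speed of light" principle easy to check: the rule for combining velocities. The following Python sketch, with hypothetical function names and with speeds written as fractions of the speed of light, compares the common sense rule (speeds simply add) with the relativistic one.

```python
def galilean_sum(u, v):
    """Common sense rule: speeds simply add."""
    return u + v

def relativistic_sum(u, v):
    """Rule that follows from the Lorentz transformation.

    u and v are fractions of the speed of light with |u| <= 1 and |v| < 1.
    """
    return (u + v) / (1.0 + u * v)

# A beam of light (u = 1) measured by an observer rushing toward its source
# at half the speed of light.
print(galilean_sum(1.0, 0.5), relativistic_sum(1.0, 0.5))

# Two ordinary speeds of half the speed of light each.
print(galilean_sum(0.5, 0.5), relativistic_sum(0.5, 0.5))
```

Under the relativistic rule the light beam is still measured at exactly the speed of light, and two speeds of one half combine to 0.8 rather than to 1. Nothing that starts out slower than light can be pushed past it by a change of frame, which is what keeps causally ordered events in the same order for every observer.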

The model underlying this theory was quite different from the model underlying Newton's theory, even though they were "about" the same thing. In Newton's theory, the mathematical model referred to points and moments on an absolute stage where the laws of physics described their activity; in Einstein's theory, the points and moments were relative measurements which changed and bent to fit the laws he described. In Newton's theory, the speed of light was a phenomenon to be measured; in Einstein's theory, it was more like a law than a phenomenon, providing the basis by which other phenomena were interpreted.

This brings up an interesting sidenote on the social history of the idea of relativity. Thinkers who have not looked carefully at the physics and mathematics of Einstein's theory confuse relativity with certain philosophical brands of "relativism" which make the assertion that all points of view are equally valid. However, relativity actually says "all possible points of view are equally valid" where "possibility" is literally defined by physical law. What the theory actually says is that experience itself (our measurements of space and time) will change to preserve physical laws, which is very different from asserting that individual experience or judgment alone has priority!


The length of this last example reflects mostly the unfamiliarity of the model being described; as a model, little more is happening than in the previous cases we've discussed. Assumptions are made, dropped, or changed and the consequences are examined. However, we can draw a few lessons from this particular example.

Models have their own logic.
A model isn't just a "copy" of some world or set of phenomena. The model has its own consequences and it is the character of those consequences which determines both the power of the model (being able to make descriptions for moving frames) and the problems of the model (not being able to consistently describe electromagnetic phenomena in motion).

Models make assumptions.
In providing this simplicity of expression, models make assumptions which end up pervading all of the model's consequences. Newton's physics inherited common sense assumptions about absolute time which were part of what enabled him to make such powerful claims. Einstein invented a new assumption --- about measurements of the velocity of light --- which made it possible to create a similarly simple theory with profoundly different foundations.

Different assumptions fit different purposes.
Common sense assumptions about time are a good fit to common sense contexts and Newton's physics is also a good fit to those contexts. Only when those assumptions break down (for instance, when we have inertial frames moving at very high relative speeds) does the model stop being effective.

Models evolve.
Often, models change by dropping, making, or changing assumptions. This evolution tends to be gradual but its consequences --- the way laws and expectations change when the model is changed in a small way --- can be quite dramatic. But even though this evolution is gradual, that does not necessarily mean that it is "easy." Because our assumptions help us think more clearly, working with changed assumptions may often be immensely difficult. Toward the end of this book, I discuss why creatively changing models is so difficult.

Scientific Progress: Newton and Einstein

One compelling way to look at science is as a game of guessing how the world works. Theories are predictions and the goal of the game is to find theories whose predictions are the most accurate. In this view of science, Einstein has beaten Newton because when velocities are high or measurements are extraordinarily precise, the predictions of Newton's theories are just wrong. If we look at experiments in a speeding spaceship, we will see different things than Newton would have ever predicted.

Given this perspective on scientific progress, the most that the modern mind can give Newton is a consolation prize: he was right for everything he could observe, limited to mere planetary velocities no greater than a few tens of miles per second. But if we speak to scientists, this competitive perspective goes over uneasily: scientific cooperation is far more common than direct competition and even the competition occurs in a context of common goals. Indeed, international scientific cooperation has often continued at times when national moods or politics would have condemned it (or worse).

An alternative to this competitive model of science is a more gradual model, where the work of scientists builds upon the work of their predecessors. With this model, we tell a story of Newton specifying laws which Einstein modified to fit the new data, always coming closer to the perfect predictions which both Newton and Einstein were seeking.

But this gradual model has its own problems when we look at the actual history of science. Thomas Kuhn, in his seminal The Structure of Scientific Revolutions, points out that certain scientific advances have a more radical character than others. These "revolutions" involve changes in basic assumptions and criteria which cannot be described as simple additions or amendments to earlier theories. Special relativity, which we just discussed, is one classic example of such a revolution.

Unfortunately, the naive revolutionary view returns us to the competitive model of science where the winner is the new "dominant paradigm" and the loser becomes a part of history and not of science. This view has at least two tragic consequences: first, we risk losing figures like Newton as sources of inspiration and admiration; second, the day-to-day work of scientists loses respect as it is viewed as merely a future victim of the next scientific revolution. It is not surprising that working scientists, particularly in the "harder" sciences, consider Kuhn's notion of revolution with skepticism if not distaste.

Viewing scientific progress as the evolution of model and description, rather than just prediction, offers a more sophisticated notion of revolution. Einstein's special relativity, for example, includes two of the most important elements of Newton's model: inertial frames and the invariance of laws across them. What it changes is absolute time and distance, which it replaces with relative measurements, and the simple Galilean transformation, which it replaces with the more complex Lorentz transformation. Seen in this way, the revolution looks more gradual and incremental than when we consider the theory as a monolithic oracle of prediction.

Of course, from many other perspectives --- the mathematical form of the laws, the interpretation of experiments, and the construction of subsequent theories --- the small change in one part of the model has radical and revolutionary consequences for other parts. But if we understand the theory as an interface between our understanding and our perceptions, we see that past and current science are all parts of developing human models for and interfaces to a shared universe. Revolutions still exist as extraordinary events in scientific progress but they are more like the transformation of caterpillar to butterfly than the defeat of a vanquished foe. The change is radical and dramatic, but many of the original components remain in the new form.

Copyright (C) 1997, 1998 by Kenneth Haase
Draft, not for citation or circulation