Where we look at where models come from, distinguishing natural models from mental models and describing some of the ways in which mental models evolve or are developed.
The Origin of Models
Models make us geniuses. In the middle of the 1600s, Isaac Newton struggled for weeks to solve certain problems in celestial mechanics for the first time in human history. In the fall of 1980, freshmen all over the world (I was one of them) solved some of these same problems overnight. These teenage minds were not mutant geniuses who could surpass Newton in a few hours. Instead, their feats were possible because they were given the models developed by Newton and his successors.
Newton didn't do it all by himself either. He had a community of colleagues who each contributed pieces of the model which Newton tied together. Galileo introduced the separation of horizontal and vertical motion. Descartes linked the arithmetic of calculation and the geometry of space. Robert Hooke proposed how distance might diminish the effect of gravity. And the list goes on.
And I (and my fellow freshmen) didn't get our models directly from Newton. Generation after generation of physicists and teachers clarified and honed Newton's model so that it could fit in a relatively short program of lectures to audiences of high-school students. This lineage of teachers of teachers of teachers, mostly unsung, probably had as much to do with my and my fellow students' overnight brilliance as did Newton's original formulation.
The power of models complicates the assignment of credit. With the right model, one can solve a problem in moments rather than hours. Most of Newton's labors were in finding the right model and, in this search, he considered many incomplete or wrong models. Historians can reconstruct some of these models, but schools don't generally teach them. Indeed, if schools taught these models as convincingly as Newton understood them at the time, introductory physics would take decades to teach rather than months. Though this isn't practical, such a program's middle-aged graduates would certainly have tremendous respect for Newton's achievement!
So who gets the credit for the overnight brilliance of these college freshmen? Some of the credit should go to the bright students and some to their teachers, but most of it belongs to the models which they applied in forming their solutions. The question then is who gets the credit for the models? Where did the models come from?
This chapter considers the origin of the models we think with. Previous chapters presented micro-histories of particular models such as the inertial frame or the ecology of the coral reef. In this chapter, we abstract from stories like these to describe some of the ways that models --- with their components of reference, imagination, and convenience --- change.
In telling this story, I draw on the work of many researchers looking at how computers might learn. Because computers are largely unconscious and unaware, getting computers to learn requires programmers to be conscious "for" their computers. This consciousness illuminates many things about the processes by which our models change.
Because models are interfaces between systems, there need not be a designer or even an explicit intention for a model to be created. When connected systems are changing and evolving by themselves, the connection between them may naturally take on the systematicity and structure --- reference, imagination, and convenience --- which characterizes a model.
These "natural models" are typically the result of what is called co-evolution in biology. Co-evolution describes evolutionary changes in organisms based on their co-existence with other organisms. Natural symbiosis, like the relation between E. coli bacteria and our intestines, is one product of co-evolution, where organisms serve each other's purposes directly. Models, where interfaces between systems (which might be organisms) serve purposes, are another product.
In contrast to this natural evolution, some models are produced by the operations on and with other models. When a biologist creates a model of some new organism, she looks for patterns of growth, reproduction, and responsiveness based on her models of other organisms. When a programmer implements a computer model of (for instance) calendar dates, she is coordinating her own model of the calendar with her model of the computer's operations.
These "mental models" are usually based on the co-existence and interconnection of several models at the same time. They also require either the reflection or awareness we described in The Awareness of Models.
The line between natural and mental models is sometimes a gray one, much like the distinction between signs and symbols. The bug detectors and shore detectors in the frog's eye may appear to be natural models on one time scale (the life of the frog) because they do not change much during the frog's adult life. However, on a much longer scale (the evolution of the frog and its eye), we might see the distribution and organization of retinal cells changing to reflect genetic variations and their successful recurrence. But the distinction between natural and mental models is still an important one for focusing this discussion.
This chapter considers the origin of "mental models" from the interaction and variation among and between models. Even with this reduction, we will only scratch the surface of the processes that underlie these origins.
Just as models are useful in dealing with the world around us, models of our models are useful in dealing with our models. By using models of our models, we can change our models in more effective ways. And like other sorts of models, our models of our models hide and highlight different things in order to simplify our thoughts and actions.
For a simple example, suppose we are developing a model of animals in the natural world. Let's call this the animal-model. This model refers to these animals and what we can observe about them: size, shape, color, locomotion, diet, residence (where they sleep), means of reproduction, etc. The imaginative element of this model makes guesses about some of these properties based on others: based on the appearance of the animal, we might imagine where it sleeps or what it eats.
There is also a model of this model, which I'll call the model-model. In this model, we know what the features of the animal-model are and what their relations to each other might be. We can use the model-model to extend the imaginative element of the animal-model, considering that an animal with sharp front teeth might eat meat. We might consider this because our model-model suggests a connection: sharp teeth are good for tearing muscle. Or we might just consider it because we've seen a number of meat-eaters with such teeth. In either case, having a model of our model allows us to consider extensions to the model.
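The kind of extension described above can be sketched in code. The following is a toy illustration (all names and features are invented for this example, not taken from the text): a model-model proposes a candidate extension such as "sharp front teeth suggests a meat-eater," checks it against the animals observed so far, and, if no counterexample appears, uses it to guess a property of a new animal.

```python
# A toy model-model: observed animals as feature dictionaries.
animals = [
    {"name": "wolf",   "sharp_teeth": True,  "eats_meat": True},
    {"name": "lion",   "sharp_teeth": True,  "eats_meat": True},
    {"name": "rabbit", "sharp_teeth": False, "eats_meat": False},
]

def propose_rule(observations, feature, prediction):
    """Propose the extension 'feature implies prediction' and keep it
    only if it is consistent with every observation made so far."""
    relevant = [a for a in observations if a[feature]]
    if relevant and all(a[prediction] for a in relevant):
        # The rule guesses `prediction` for any animal with `feature`.
        return lambda animal: animal[feature]
    return None  # a counterexample has already burst this bubble

rule = propose_rule(animals, "sharp_teeth", "eats_meat")

# Imagining a property of a newcomer we have not fully observed:
newcomer = {"name": "fox", "sharp_teeth": True}
print(rule(newcomer))  # True: we imagine that the fox eats meat
```

As the text notes, a rule like this remains provisional: a single sharp-toothed plant-eater would force the model-model to withdraw or weaken it.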
Many of our possible extensions will fail. Seeing a series of black swans, we might imagine that all swans are black. But a single white swan would burst our bubble. Likewise, we might presume --- based on both experience and logic --- that all creatures with wings can fly and live in trees. But upon travelling far south, we might encounter flightless penguins, which would convince us otherwise. But even in these cases, the extension might still be useful: most winged creatures might fly and most swans might be black.
Just because the model-model can generate poor extensions doesn't mean it isn't doing its job. By picking out features which can be part of good extensions, the model-model is ruling out huge numbers of bad extensions, like "can fly if it has feathers on Friday" or "hides in crevices if it is green on one side". But we must be careful. As we saw in The Case of the Alien World, some outrageous extensions may be quite reasonable in a different context.
Extensions like these transform the imaginative component of the animal model. However, these extensions also make the model more complicated to work with. Another kind of extension can make models more convenient to work with.
As we observe more and more animals and distinguish more and more features and their combinations, the animal-model may get quite complicated. One way to simplify it is to use a process called "chunking." Chunking takes sets of features and puts them together. For instance, we might notice that the feet of animals fall into four broad categories.
Noticing this, we could use these categories to simplify the classification of animals. Rather than having to list all of the features of their feet, we would simply note the category and build our extensions using the category rather than the more primitive features.
Chunking enhances convenience by making models easier to deal with. Chunking is one of the ways in which symbols emerge in models. The arbitrariness of symbols is important here. It lets the model-model use something new to designate a combination of features which it is useful to think with.
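Chunking can be sketched as a simple lookup: a combination of raw features gets an arbitrary label, and later reasoning uses the label instead of the features. The foot features and category names below are invented for illustration; they are not the four categories the text alludes to.

```python
# Hypothetical chunks: combinations of foot features get a single label.
FOOT_CHUNKS = {
    ("hard", "split"):    "hoof",
    ("soft", "padded"):   "paw",
    ("thin", "gripping"): "talon",
}

def chunk_feet(texture, shape):
    """Replace a pair of raw features with a convenient category label.
    Unfamiliar combinations (say, webbed feet) fall outside the chunks."""
    return FOOT_CHUNKS.get((texture, shape), "unclassified")

print(chunk_feet("hard", "split"))   # hoof
print(chunk_feet("webbed", "flat"))  # unclassified
```

The second call illustrates the risk discussed below: a genuinely new kind of foot does not fit any existing chunk, and forcing it into one would mis-classify the creature.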
Chunking is not without risk. If we decide there are four categories of feet and a new kind of creature arrives with some very different kind of feet (say, webbed feet), we might mis-classify its feet and then mis-classify the creature itself. But if we are actively using our model-model together with our animal-model, we can avoid this by thinking about whether the categories we are using are still valid.
What makes models stick together? How can a model-model extend the model it is modelling? In How Models Work and the cases which followed it, I discussed the importance of systematicity: that interfaces used as models be consistent in their connections. But systematicity is not a very demanding constraint: there are many ways to be systematic. Because of this, different kinds of models "stick together" in different ways, all of which preserve some sort of systematicity.
The examples from the animal-model were stuck together by "conjunction": if a creature has claws and teeth, it probably eats meat. The animal-model might also stick together by a kind of voting: if a creature has at least two of the features teeth, claws, stingers, and barbs, then it probably can hurt me. There are many other ways of managing these combinations.
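These two kinds of "glue" can be written down directly. The sketch below (with an invented creature) shows conjunction, where every listed feature must hold, and voting, where enough of the listed features must hold:

```python
def conjunction(animal, features):
    """Conjunction: the conclusion holds only if all features are present."""
    return all(animal.get(f, False) for f in features)

def vote(animal, features, threshold):
    """Voting: the conclusion holds if at least `threshold` features are present."""
    return sum(animal.get(f, False) for f in features) >= threshold

creature = {"teeth": True, "claws": True, "stingers": False, "barbs": False}

eats_meat = conjunction(creature, ["claws", "teeth"])
can_hurt_me = vote(creature, ["teeth", "claws", "stingers", "barbs"], threshold=2)
print(eats_meat, can_hurt_me)  # True True
```

Both preserve systematicity: the same features always combine in the same way, even though the way they combine differs.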
We may not even need to know the means of combination to use a model-model. When learning a physical skill, like swimming or batting, we can have a model-model which tells us which things matter (lift your head to the side, don't look at the bat) without knowing how they are put together when we learn the skill. Some computers, likewise, can learn everyday tasks (like how to pronounce words or recognize shapes) by using complex mathematical combinations of input features. These mathematical combinations are unlike conventional programs, but they accomplish their task quite well. And they can be trained and tweaked (given inputs and outputs) automatically, without a human programmer. Many scientists actually think that our own brains use the same sorts of functions at their most basic levels, and they may be right. But having a partial model-model is still important and useful, even when we don't know the exact nature of our model's underlying "glue".
While model-models propose kinds of changes, our process of change may also need to determine which changes are good and worth keeping. Usually, in one way or another, these judgements involve still other models. In addition, since models always have purposes, these judgements must be tied to those purposes.
In science, one common criterion of judgement is whether a theory or model "fits the data". Here, "the data" provides an alternative model of what the model is supposedly describing. By comparing what the model says the data will be and what the data actually are, we can judge whether one model "fits the data" better, worse, or as well as another.
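The comparison of models against data can itself be sketched. In the toy example below (the data points and candidate models are invented), each model's predictions are compared to the observations, and the model with the smaller total squared error is judged to "fit the data" better:

```python
# Invented observations: pairs of (input, observed value).
data = [(0, 0.1), (1, 2.1), (2, 3.9), (3, 6.2)]

def fit_error(model, data):
    """Total squared difference between what the model says
    the data will be and what the data actually are."""
    return sum((model(x) - y) ** 2 for x, y in data)

model_a = lambda x: 2 * x   # a "doubling" model
model_b = lambda x: x + 1   # an "add one" model

better = min([model_a, model_b], key=lambda m: fit_error(m, data))
print(better is model_a)  # True: the doubling model fits this data better
```

As the following paragraphs note, such judgements can be ambiguous: a model might fit one region of the data better while fitting another region worse, and a single error number hides that distinction.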
A related criterion in science is that theories can be "experimentally validated" when some physical experiment gets at aspects of what a model is describing. The physical experiment is a model whose "reference relations" are very complicated, but by comparing it to a model, we may be able to judge whether one model improves on another.
But the results of judgement may be ambiguous. A transformed model may fit one part of the data better yet fit another part worse. An experiment may validate one model while proving another model --- which had been taken for granted --- invalid. In this case, we can question whether the original model or the experimental model is at fault. As we mentioned in The Case of the Special Theory, when Albert Michelson first failed to measure the speed of the Earth by ether drift, he attributed it to an experimental error rather than an erroneous model.
Discussing experimentation brings us to the problem of changing reference with a model-model. Because the model-model is about a model by itself and not about a model plus the world it describes, most model-models cannot change the way in which models refer. Chunking, which makes new convenient terms in models, changes convenience but does not change reference.
How would we change or extend reference in the animal-model? How would we add a new feature, like "has crest on head"? If this is a genuine feature, there would need to be some creatures which have crests on their heads and some which do not. And to add this feature to the animal-model, we would need to have some other model (maybe the what-it-looks-like-model) which already had crests or the means to describe them.
The problem is that a model-model which could change reference would need to be a model-model for the what-it-looks-like-model and for the animal-model. It would have to be a lot more complicated than a model for just the animal-model. And, as we saw in the previous chapters, models usually get their effectiveness from describing less rather than more. So a model-model for changing or inventing new sorts of reference relations would have to be less effective at making models more convenient or imaginative.
Good models and good model-models rule out many combinations by restricting reference. But a model which is trying to change reference cannot be as cavalier about ignoring possible combinations as a model which is just looking for new combinations.
Fortunately, there are other ways to think about changing reference. Because reference involves connections to the world, these methods are often very different from those used in increasing convenience or extending imagination.
In The Case of the Alien World, we briefly discussed how European expeditions of conquest and colonization incidentally revealed an unexpected natural diversity. This diversity challenged prevailing models of the natural world and led to a different way of thinking about species and their relations to one another.
Physically moving can force us to change reference relations simply because there are new things to refer to. Many of us have had the experience that a move or change in position --- from country to country or job to job or role to role --- brings about changes in our models of where we were before. This is the reason that we learn new things by sending probes to Luna or Mars or the depths of our oceans or the mantle of our own planet. When we have new things to explain, our explanations get exercised and improved.
Other forms of activity also change the kinds of things we can explain. Building artifacts, like snorkeling gear or microscopes or super-colliders, can reveal new worlds as surely as travelling a thousand miles. New kinds of activities, like listening carefully to children's thoughts or speaking to homeless people, can do the same. With new tools or in new activities, we learn that there are new kinds of distinctions, new kinds of features, new differences and similarities to be considered and explained.
The search for these new contexts is seldom blind. But it is not much like the use of a model-model to extend and enhance the animal-model. When we consider hypotheses about animals, we can list and imagine the alternatives. When we strike out into a new context of description and activity, we may not be able to imagine what we will find. But as explorers, we can share notes and exchange thoughts on where we might look.
In science, some of the motion into new contexts is driven by technology. A new material or device is developed that enables experiments or reveals phenomena which were impossible or invisible before. The goal of the material or device may not have been scientific advances, but the fact of the technological advance fueled the scientific advance.
Another part of the change in scientific models comes from a focus on anomaly. Though the rhetoric of science speaks of the centrality of prediction, the activity of science is largely driven by anomaly. A successful experiment produces satisfaction but not a lot of activity. But a scientific experiment which differs from prediction triggers furious activity.
Technological developments and the pursuit of anomaly are both sources of new experiences which lead to development and change in science. As scientists try to make sense of these new experiences, they develop and adjust the models they are thinking with. Some of these developments and adjustments are organized by model-models of the sort we saw above. But there must be a place to start and one important set of starting places comes from metaphor.
In my not-so-recent youth, I received a "paint by numbers" set whose first use occasioned a mix of frustration and awe. The set, like most such sets, had a line drawing broken into numbered regions and a chart associating those numbers with colors. By filling a region with the color which corresponded to its number, a colored image emerged from an original sketch. I recall feeling a mix of frustration at "painting in the lines" (I was young and clumsy) and awed surprise as the colored pattern (a house set in the woods) emerged from my actions despite my clumsiness.
We never arrive naked in the new worlds created by our travels, our technologies, or our experiments. We always come with certain models of different worlds and we proceed to try to use these models to make sense of our new experiences. This is much like "paint by numbers": the models we bring are like line sketches which we fill in with the colors of the world we are trying to understand.
For example, when I learn a new language (natural or artificial), my first understanding is always based on a language I already know: the French word "douce" is like the English word "sweet" or the "C" construct while is much like the "Lisp" construct do. These analogies are neither perfect nor complete, but they serve as a starting place for the models that will eventually develop in their place.
Analogy plays a similar role in science. Initial analogies link atoms to solar systems or circulatory systems to roadways. The value of these analogies is not their truth or even their consistency. Instead, their value is in the provisional reference relations they establish and the connections between these. The "solar system model" of the atom, for instance, captures a straightforward solution to the mystery of why the positive nuclei and negative electrons of atoms do not collapse into one another. Put in this way, the model can be explored, criticized, and transformed, until it looks completely unlike the solar system which inspired it.
Analogy is one of the chief ways in which the reference relations of models emerge and evolve. Once such a tentative model is created, it can be extended, reduced, and changed. These processes are often guided by the sort of model-models we've discussed, which experimentally extend the model's imagination and convenience based on reference relations initially established by analogy.
The origin of models involves many mistakes. Model-models may propose imaginative extensions or supposed conveniences which fail in their application. Analogical models start with absurdities. Any genuine story about the origin of models must be full of failures. And creativity, insofar as it involves the genesis of models, requires the safety to fail and recover.
The safety to fail and recover is the freedom to play. Play is as vital for creativity as breathing is for life. It is through play that we can take the risks and make the mistakes from which powerful models are born. It is through play that we can enter worlds which we do not know or only partially understand.
One of my life treasures is a handful of conversations about creativity with the physicist Richard Feynman. I was a graduate student at the time and our conversations revolved around my attempts to replicate a famous AI program called AM. AM, implemented by AI scientist Douglas Lenat, used computer programs as models for mathematical ideas and proposed new mathematical ideas by extension and transformation of these programs. In this way, a model of the computer programs served as a model-model for the mathematical ideas they described.
AM explored many avenues in its progress and many of these were dead-ends. In essence, it "played" with its programs and looked at the consequences of different changes and extensions. Feynman, in our discussions, thought this sort of "play" lay at the basis of human and scientific creativity as well. Trying different combinations and thinking about the consequences was, for him, the core of creative process.
Play of this sort may not be easy. When one is playing with complex physical or mathematical ideas, it certainly is not. But it may still be fun and this is important because it keeps us trying in the face of failure. The failure rate in the search for new models may be larger than in any other human activity. And the "fun" in play is so important because it keeps us going despite the necessary and inevitable failures along the way.