test

The habits of highly effective test-takers

The composition section of this year’s college entrance exam was held this morning. Though the exam’s contents were embargoed while the test was in session (under the state secrets law), the questions surfaced on the web and in the evening papers this afternoon.

In Beijing, where 118,000 students sat for the exam, the essay topic was an old self-help chestnut made popular by motivational author Stephen Covey. Here’s the Beijing gaokao version of the “big rocks first” approach to time management, as reported by the Mirror:

During class, the teacher said, “Today we’ll do a little experiment.” Then he put a glass bottle filled with rocks onto the podium, and asked, “Is it full?” The students answered, “Yes.” Then the teacher took a tub of sand out from under the table and slowly poured it into the bottle. Then he asked, “Is it full?” The students thought for a while. Then the teacher took out a pitcher of water and poured it until the water level had reached the top of the bottle. “What does this experiment demonstrate?” the teacher asked. Chatter filled the classroom.

One student said, “Lots of things may look like they’ve reached their limit when there really is lots more space.”

Another student said, “Order is important. If you put the sand in first, the rocks won’t fit at all.”

Another student said, “That’s right. You have to put the rocks in first. Heavy things take precedence.”

Then another student said, “Not necessarily. Is it really impossible if you put in the sand and water first?”

Students were instructed to write at least 800 characters using this text as a starting point.

The choice of text surprised many people, who had guessed that this year’s topic would be related to the recent earthquake in Sichuan (though the earthquake did appear in the national exam paper I; see below). Instead, it was a familiar story that has even appeared as a sample question in test preparation materials, the Beijing Evening News reported.

Big-name writers invited by the evening Mirror to discuss the exam question differed widely in how they saw the question’s difficulty. Mai Jia (Ansuan) thought it was an awful question:

To say that this question discusses a particular truth, then that truth is superficial, something that I think even a kindergartner would understand.

This topic seems to be telling us something, a truth about life; for example, that life choices, priorities, perspective, and making good use of your skills are all important.

But from this perspective, although students can choose the subject and viewpoint of their writing, the question itself is severely limited, leaving test-takers wth little room to elaborate.

Zhou Ruchang, the noted Redologist, had precisely the opposite reaction.

I just found out that this year’s composition question was vastly different from years past. It’s not a single, set sentence, but rather a reflection and explanation of something conceptual. I believe this is an excellent choice. In the past, composition questions were basically lifeless topics that appeared to be testing students’ intelligence and abilities when they were really limiting them. This topic is broad enough to allow students to strike out boldly and create freely. It is a true test of their wisdom and ability.

Historical novelist Eryuehe agreed with Zhou, while Zhang Kangkang came down on Mai Jia’s side.

In other regions:

  • Sichuan: “Resolve” (坚强)
  • Guangdong: “When facing something for the first time, don’t be so quick to say ‘no’.”
  • Liaoning: “A newspaper reported that a few young people were spitting on a bus. Some reactions: (1) No sense of social morality; (2) Too non-conformist; (3) Used to it.”
  • Chongqing: “Living in nature”
  • Tianjin: “Common sense”
  • Shandong: A zen-inspired topic: “The spring comes and the grass grows by itself” (春来草自青).
  • Fujian: “Three people went to a store to buy drinks, and each chose a different kind. The first person chose fruit juice, saying, ‘I like sweet things.’ The second person chose coffee, saying, ‘I like things that are sweet and bitter.’ The third person bought spring water, saying, ‘I like things bland and flavorless’.”
  • Shanghai: Most of the time we are only concerned with ourselves. Write an essay about “Them.”
  • Zhejiang: “Touching the city, experiencing the countryside”
  • Jiangsu: “Curiosity”
  • Hubei: “Passing a tree, a branch blocks your path. Do you snap it off and cast it aside, or do you bend around it? A mangy stray dog approaches. Do you take pity on it and avoid it, or do you give it a swift kick? The elevator door opens. Do you hold back and let others go first, or do you charge forward and push through? You are standing next to a blind person at an intersection. The light changes – do you offer them a hand? How do you brush past other people? How do you accept your change from a streetside vendor? How do you look down to tighten your loose belt? How do you live with yourself when you are alone?”
  • Hunan: A line from a Han Yu poem: “In light rain, the capital streets are slick like butter; The color of the grass is seen at a distance, but not close up.”
  • Jiangxi: “Write a letter from the perspective of a rat, or of a rat’s natural enemy.”
  • Anhui: “Start off with feeling”
  • National (1): “The people’s lives are of paramount importance!” This one has to do with the earthquake relief effort. It starts by noting that “Hu Jintao, Wen Jiabao, and other party, government, and military leaders hastened to the disaster zone to direct the relief effort,” and goes on to give a very brief summary of rescue work, aid, astonishing survival stories, CCTV’s live broadcast, and the national mourning period. It ends with “The same feelings expressed in different ways: monetary donations, blood donations, benefit performances, close attention….” This text was used in Hebei, Henan, Shanxi, Shaanxi, and Guangxi.

Starting trouble for new IITs

The government had announced that it would start six new IITs by July this year. But three of them at Rajasthan, Andhra and Bihar are facing starting trouble and three more are still on paper. With a tight deadline of less than two months, there is still nothing on-ground regarding the three new IITs in Gujarat, Orissa and Punjab.

Ashok Misra, Director, IIT Bombay which is mentoring the new IIT in Gujarat said, “With regard to the new IITs, no funds have been allocated so far. We will have to hire new faculty. It’s a greenfield project. Nothing really exists on ground there. So, we will have to see how we can attract new faculty to those areas, to that IIT.”

Arkss Srinivas, Director, Time Coaching classes said, “None of the three new IITs are ready to start off.”

Lack of infrastructure

That’s not all, IITs in Rajasthan, Andhra Pradesh and Bihar are facing hiccups as well. Despite support from exitsing IITs, there is still no independent infrastructure or lab facilities in these institutes.

Arkss Srinivas, Director, Time Coaching classes said, “The first batch guys are going to be taking a slight risk but in the long process it’s a fifty-fifty game. The fact that they are getting into IIT is always going to be overriding the fact that they don’t have a campus the first three years.”

Need for counselling

Students with ranks up to 8000 in the IIT JEE are currently waiting to be counselled. They will decide on which IITs and which engineering branch they can get into based on their ranks.

Mansi, student, IIT-JEE ranked 6178 in IIT-JEE said, “I’ll go for the older IITs as the brand of the institute matters more to me and the infrastructure and faculty is more established in the seven IITs.”

Ayush, student, IIT-JEE, ranked 3910 in IIT-JEE said, “I don’t mind opting for the new IITs as the branch is more important to me.”

There’s a huge demand for students of an IITians calibre in the Indian industry and the new IITs are a step to meet that need. But the biggest challenge for IITs would be to ensure that the quality is not compromised for quantity.

courtesy

Sruthi Gottipati, CNBC-TV18

OBC students’ performance in JEE-2008

there are  1134 OBC students among the 8652 rank-holders in this year’s JEE.

In the absence of detailed data from the IITs, all we have got are bits of information. Here’s a news report (supposed to have been sourced from DNA, but I could not locate it on the DNA site), which claims that all the 1,134 rank-holders from the OBC category are in the “open merit list”. It goes on to quote some IIT official:

This is 14.35% of the general category. 8,652 candidates have qualified to seek admission to 6,872 seats available. Last year, 13.74% OBC candidates had made it to the IITs without reservation.

“The OBC candidates have secured good ranks and all are in the common merit list,” said a JEE official.

“So, a relaxed criterion may not need to be invoked for them.”

Now, the figure of 14.35% is wrong; 1134 out of 8652 comes out to 13.1 percent. More substantively, the last statement is confusing because iffy statements (such as “may not need to”) are meaningless when the results are already known — either a relaxed criterion was used, or it was not!

Anyways, I think the following interpretation is reasonable: Since the overall quantum of OBC reservation (X percent — I think X is 9 — in existing IITs and 27 percent in the six new IITs) works out to a number smaller than 13.1 percent (and since OBC students already form 13.1 percent of the JEE rank-holders) there was no need to invoke a relaxation in the cut-off marks for OBC students.

Is this interpretation correct?

[Here's yet another bit: I found a blog post that says that someone with an overall rank of 2902 has an OBC rank of 367. Thus, about 12.5 percent of the top 3000 ranks belong to OBCS.]

The news report says that on the first of August, the IITs will make all the JEE-related information public on their website. We’ll have to wait until then …

Physics, Topology, Logic and Computation:

A Rosetta Stone
John C. Baez
Department of Mathematics, University of California
Riverside, California 92521, USA
Mike Stay
Google, 1600 Amphitheatre Pkwy
Mountain View, California 94043, USA
email: baez@math.ucr.edu, stay@google.com
March 15, 2008
1
Introduction
Category theory is a very general formalism, but there is a certain special way that physicists use
categories which turns out to have close analogues in topology, logic and computation. A category has
objects and morphisms, which represent things and ways to go between things. In physics, the objects
are often physical systems, and the morphisms are processes turning a state of one physical system
into a state of another system — perhaps the same one. In quantum physics we often formalize this
by taking Hilbert spaces as objects, and linear operators as morphisms.
Sometime around 1949, Feynman [52] realized that in quantum field theory it is useful to draw
linear operators as diagrams:
This lets us reason with them pictorially. We can warp a picture without changing the operator
it stands for: all that matters is the topology, not the geometry. In the 1970s, Penrose realized
1
background image
that generalizations of Feynman diagrams arise throughout quantum theory, and might even lead to
revisions in our understanding of spacetime [71]. In the 1980s, it became clear that underlying these
diagrams is a powerful analogy between quantum physics and topology! Namely, a linear operator
behaves very much like a `cobordism’ — that is, an n-dimensional manifold going between manifolds
of one dimension less:
String theory exploits this analogy by replacing the Feynman diagrams of ordinary quantum field
theory with 2-dimensional cobordisms, which represent the worldsheets traced out by strings with the
passage of time. The analogy between operators and cobordisms is also important in loop quantum
gravity and — most of all — the more purely mathematical discipline of `topological quantum field
theory’.
Meanwhile, quite separately, logicians had begun using categories where the objects represent propo-
sitions and the morphisms represent proofs. The idea is that a proof is a process going from one propo-
sition (the hypothesis) to another (the conclusion). Later, computer scientists started using categories
where the objects represent data types and the morphisms represent programs. They also started using
`flow charts’ to describe programs. Abstractly, these are very much like Feynman diagrams!
The logicians and computer scientists were never very far from each other. Indeed, the `Curry­
Howard correspondence’ relating proofs to programs has been well-known at least since the early 1970s,
with roots stretching back earlier [33, 48]. But, it is only in the 1990s that the logicians and computer
scientists bumped into the physicists and topologists. One reason is the rise of interest in quantum
cryptography and quantum computation [26]. With this, people begain to think of quantum processes
as forms of information processing, and apply ideas from computer science. It was then realized that
the loose analogy between between flow charts and Feynman diagrams could be made more precise
and powerful with the aid of category theory [3].
By now there is an extensive network of interlocking analogies between physics, topology, logic and
computer science. They suggest that research in the area of common overlap is actually trying to build
a new science: a general science of systems and processes. Building this science will be very difficult.
There are good reasons for this, but also bad ones. One bad reason is that different fields use different
terminology and notation.
The original Rosetta Stone, created in 196 BC, contains versions of the same text in three languages:
demotic Egyptian, hieroglyphic script and classical Greek. Its rediscovery by Napoleon’s soldiers
let modern Egyptologists decipher the hieroglyphs. Eventually this led to a vast increase in our
understanding of Egyptian culture.
At present, the deductive systems in mathematical logic look like hieroglyphs to most physicists.
Similarly, quantum field theory is Greek to most computer scientists, and so on. So, there is a need
2
background image
for a new Rosetta Stone to aid researchers attempting to translate between fields. Table 1 shows our
guess as to what this Rosetta Stone might look like.
Category Theory
Physics
Topology
Logic
Computation
object
system
manifold
proposition
data type
morphism
process
cobordism
proof
program
Table 1: The Rosetta Stone (pocket version)
The rest of this paper expands on this table by comparing how categories are used in physics,
topology, logic, and computation. Unfortunately, these different fields focus on slightly different kinds
of categories. Though most physicists don’t know it, quantum physics has long made use of `compact
symmetric monoidal categories’. Knot theory uses `compact braided monoidal categories’, which are
slightly more general. However, it became clear in the 1990’s that these more general gadgets are
useful in physics too. Logic and computer science used to focus on `cartesian closed categories’ –
where `cartesian’ can be seen, roughly, as an antonym of `quantum’. However, thanks to work on linear
logic and quantum computation, some logicians and computer scientists have dropped their insistence
on cartesianness: now they study more general sorts of `closed symmetric monoidal categories’.
In Section 2 we explain these concepts, how they illuminate the analogy between physics and
topology, and how to work with them using string diagrams. We assume no prior knowledge of category
theory, only a willingness to learn some. In Section 3 we explain how closed symmetric monoidal
categories correspond to a small fragment of ordinary propositional logic, which also happens to be
a fragment of Girard’s `linear logic’ [41]. In Section 4 we explain how closed symmetric monoidal
categories correspond to a simple model of computation. Each of these sections starts with some
background material. In Section 5, we conclude by presenting a larger version of the Rosetta Stone.
Our treatment of all four subjects — physics, topology, logic and computation — is bound to seem
sketchy, narrowly focused and idiosyncratic to practitioners of these subjects. Our excuse is that we
wish to emphasize certain analogies while saying no more than absolutely necessary. To make up for
this, we include many references for those who wish to dig deeper.
2
The Analogy Between Physics and Topology
2.1
Background
Currently our best theories of physics are general relativity and the Standard Model of particle physics.
The first describes gravity without taking quantum theory into account; the second describes all the
other forces taking quantum theory into account, but ignores gravity. So, our world-view is deeply
schizophrenic. The field where physicists struggle to solve this problem is called quantum gravity, since
it is widely believed that the solution requires treating gravity in a way that takes quantum theory
into account.
Nobody is sure how to do this, but there is a striking similarity between two of the main approaches:
string theory and loop quantum gravity. Both rely on the analogy between physics and topology shown
3
background image
Physics
Topology
Hilbert space
(n
- 1)-dimensional manifold
(system)
(space)
operator between
cobordism between
Hilbert spaces
(n
- 1)-dimensional manifolds
(process)
(spacetime)
composition of operators
composition of cobordisms
identity operator
identity cobordism
Table 2: Analogy between physics and topology
in Table 2. On the left we have a basic ingredient of quantum theory: the category Hilb whose objects
are Hilbert spaces, used to describe physical systems, and whose morphisms are linear operators, used
to describe physical processes. On the right we have a basic structure in differential topology: the
category nCob. Here the objects are (n
-1)-dimensional manifolds, used to describe space, and whose
morphisms are n-dimensional cobordisms, used to describe spacetime.
As we shall see, Hilb and nCob share many structural features. Moreover, both are very different
from the more familiar category Set, whose objects are sets and whose morphisms are functions.
Elsewhere we have argued at great length that this is important for better understanding quantum
gravity [8] and even the foundations of quantum theory [9]. The idea is that if Hilb is more like nCob
than Set, maybe we should stop thinking of a quantum process as a function from one set of states
to another. Instead, maybe we should think of it as resembling a `spacetime’ going between spaces of
dimension one less.
This idea sounds strange, but the simplest example is something very practical, used by physicists
every day: a Feynman diagram. This is a 1-dimensional graph going between 0-dimensional collections
of points, with edges and vertices labelled in certain ways. Feynman diagrams are topological entities,
but they describe linear operators. String theory uses 2-dimensional cobordisms equipped with extra
structure — string worldsheets — to do a similar job. Loop quantum gravity uses 2d generalizations of
Feynman diagrams called `spin foams’ [7]. Topological quantum field theory uses higher-dimensional
cobordisms [11]. In each case, processes are described by morphisms in a special sort of category: a
`compact symmetric monoidal category’.
In what follows, we shall not dwell on puzzles from quantum theory or quantum gravity. Instead
we take a different tack, simply explaining some basic concepts from category theory and showing how
Set, Hilb, nCob and categories of tangles give examples. A recurring theme, however, is that Set is
very different from the other examples.
4
background image
To help the reader safely navigate the sea of jargon, here is a chart of the concepts we shall explain
in this section:
categories
monoidal categories
S
S
S
S
S
S
S
S
S
S
S
S
S
S
braided
monoidal categories
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
closed
monoidal categories
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
symmetric
monoidal categories
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
closed braided
monoidal categories
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
compact
monoidal categories
cartesian categories
closed symmetric
monoidal categories
mmm
mmm
mmm
mmm
m
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
Q
compact braided
monoidal categories
cartesian
closed categories
compact symmetric
monoidal categories
The category Set is cartesian closed, while Hilb and nCob are compact symmetric monoidal.
2.2
Categories
Category theory was born around 1945, with Eilenberg and Mac Lane [38] defining `categories’, `func-
tors’ between categories, and `natural transformations’ between functors. By now there are many
introductions to the subject [32, 67, 68], including some available for free online [18, 44]. Nonetheless,
we begin at the beginning:
Definition 1 A category C consists of:
· a collection of objects, where if X is an object of C we write X C, and
· for every pair of objects (X,Y ), a set hom(X,Y ) of morphisms from X to Y . We call this set
hom(X, Y ) a homset. If f
hom(X,Y ), then we write f:X Y.
such that:
· for every object X there is an identity morphism 1
X
: X
X;
5
background image
· morphisms are composable: given f:X Y and g:Y Z, there is a composite morphism
gf : X
Z; sometimes also written g f.
· an identity morphism is both a left and a right unit for composition: if f:X Y, then
f 1
X
= f = 1
Y
f ; and
· composition is associative: (hg)f = h(gf) whenever either side is well-defined.
A category is the simplest framework where we can talk about systems (objects) and processes
(morphisms). To visualize these, we can use `Feynman diagrams’ of a very primitive sort. In appli-
cations to linear algebra, these diagrams are often called `spin networks’, but category theorists call
them `string diagrams’, and that is the term we will use. The term `string’ here has little to do with
string theory: instead, the idea is that objects of our category label `strings’ or `wires’:
X
and morphisms f : X
Y label `black boxes’ with an input wire of type X and an output wire of type
Y :
f
X
Y
We compose two morphisms by connecting the output of one black box to the input of the next. So,
the composite of f : X
Y and g:Y Z looks like this:
f
g
X
Y
Z
Associativity of composition is then implicit:
6
background image
f
g
h
X
Y
Z
W
is our notation for both h(gf ) and (hg)f . Similarly, if we draw the identity morphism 1
X
: X
X as
a piece of wire of type X:
X
then the left and right unit laws are also implicit.
There are countless examples of categories, but we will focus on four:
· Set: the category where objects are sets.
· Hilb: the category where objects are finite-dimensional Hilbert spaces.
· nCob: the category where morphisms are n-dimensional cobordisms.
· Tang
k
: the category where morphisms are k-codimensional tangles.
As we shall see, all four are closed symmetric monoidal categories, at least when k is big enough.
However, the most familiar of the lot, namely Set, is the odd man out: it is `cartesian’.
Traditionally, mathematics has been founded on the category Set, where the objects are sets and
the morphisms are functions. So, when we study systems and processes in physics, it is tempting to
specify a system by giving its set of states, and a process by giving a function from states of one system
to states of another.
However, in quantum physics we do something subtly different: we use categories where objects are
Hilbert spaces and morphisms are bounded linear operators. We specify a system by giving a Hilbert
space, but this Hilbert space is not really the set of states of the system: a state is actually a ray in
Hilbert space. Similarly, a bounded linear operator is not precisely a function from states of one system
to states of another.
In the day-to-day practice of quantum physics, what really matters is not sets of states and functions
between them, but Hilbert space and operators. One of the virtues of category theory is that it frees
us from the `Set-centric’ view of traditional mathematics and lets us view quantum physics on its
7
background image
own terms. As we shall see, this sheds new light on the quandaries that have always plagued our
understanding of the quantum realm [9].
To avoid technical issues that would take us far afield, we will take Hilb to be the category where ob-
jects are finite-dimensional Hilbert spaces and morphisms are linear operators (automatically bounded
in this case). Finite-dimensional Hilbert spaces suffice for some purporses; infinite-dimensional ones
are often important, but treating them correctly would require some significant extensions of the ideas
we want to explain here.
In physics we also use categories where the objects represent choices of space, and the morphisms
represent choices of spacetime. The simplest is nCob, where the objects are (n
- 1)-dimensional
manifolds, and the morphisms are n-dimensional cobordisms. Glossing over some subtleties that a
careful treatment would discuss [74], a cobordism f : X
Y is an n-dimensional manifold whose
boundary is the disjoint union of the (n
- 1)-dimensional manifolds X and Y . Here are a couple of
cobordisms in the case n = 2:
X
Y
f

Y
Z
g

We compose them by gluing the `output’ of one to the `input’ of the other. So, in the above example
gf : X
Z looks like this:
X
Z
gf

Another kind of category important in physics has objects representing collections of particles, and
morphisms representing their worldlines and interactions. Feynman diagrams are the classic example,
but in these diagrams the `edges’ are not taken literally as particle trajectories. An example with closer
ties to topology is Tang
k
.
Very roughly speaking, an object in Tang
k
is a collection of points in a k-dimensional cube, while
a morphism is a `tangle’: a collection of arcs and circles smoothly embedded in a (k + 1)-dimensional
cube, such that the circles lie in the interior of the cube, while the arcs touch the boundary of the cube
only at its top and bottom, and only at their endpoints. A bit more precisely, tangles are `isotopy
classes’ of such embedded arcs and circles: this equivalence relation means that only the topology of
the tangle matters, not its geometry. We compose tangles by attaching one cube to another top to
bottom.
More precise definitions can be found in many sources, at least for k = 2, which gives tangles in a
3-dimensional cube [40, 53, 74, 82, 90, 94]. But since a picture is worth a thousand words, here is a
8
background image
picture of a morphism in Tang
2
:
X
Y
f

Note that we can think of a morphism in Tang
k
as a 1-dimensional cobordism embedded in a k-
dimensional cube. This is why Tang
k
and nCob behave similarly in some respects.
Here are two composable morphisms in Tang
1
:
X
Y
f

Y
Z
g

and here is their composite:
X
Z
gf

Since only the tangle’s topology matters, we are free to squash this rectangle into a square if we want,
but we do not need to.
It is often useful to consider tangles that are decorated in various ways. For example, in an `oriented’
tangle, each arc and circle is equipped with an orientation. We can indicate this by drawing a little
9
background image
arrow on each curve in the tangle. In applications to physics, these curves represent worldlines of
particles, and the arrows say whether each particle is going forwards or backwards in time, following
Feynman’s idea that antiparticles are particles going backwards in time. We can also consider `framed’
tangles. Here each curve is replaced by a `ribbon’. In applications to physics, this keeps track of
how each particle twists. This is especially important for fermions, where a 2 twist acts nontrivially.
Mathematically, the best-behaved tangles are both framed and oriented [11, 82], and these are what
we should use to define Tang
k
. The category nCob also has a framed oriented version. However, these
details will barely matter in what is to come.
It is difficult to do much with categories without discussing the maps between them. A map between
categories is called a `functor’:
Definition 2 A functor F : C
D from a category C to a category D is map sending:
· any object X C to an object F(X) D,
· any morphism f:X Y in C to a morphism F(f):F(X) F(Y ) in D,
in such a way that:
· F preserves identities: for any object X C, F(1
X
) = 1
F (X)
;
· F preserves composition: for any pair of morphisms f:X Y , g:Y Z in C, F(gf) =
F (g)F (f ).
In the sections to come, we will see that functors and natural transformations are useful for putting
extra structure on categories. Here is a rather different use for functors: we can think of a functor
F : C
D as giving a picture, or `representation’, of C in D. The idea is that F can map objects and
morphisms of some `abstract’ category C to objects and morphisms of a more `concrete’ category D.
For example, consider an abstract group G. This is the same as a category with one object and with
all morphisms invertible. The object is uninteresting, so we can just call it
·, but the morphisms are the
elements of G, and we compose them by multiplying them. From this perspective, a representation
of G on a finite-dimensional Hilbert space is the same as a functor F : G
Hilb. Similarly, an action
of G on a set is the same as a functor F : G
Set. Both notions are ways of making an abstract group
more concrete.
Ever since Lawvere’s 1963 thesis on functorial semantics [64], the idea of functors as representations
has become pervasive. However, the terminology varies from field to field. Following Lawvere, logicians
often call the category C a `theory’, and call the functor F : C
D a `model’ of this theory. Other
mathematicians might call F an `algebra’ of the theory. In this work, the default choice of D is usually
the category Set.
In physics, it is the functor F : C
D that is called the `theory’. Here the default choice of D is
either the category we are calling Hilb or a similar category of infinite-dimensional Hilbert spaces. For
example, both `conformal field theories’ [78] and topological quantum field theories [6] can be seen as
functors of this sort.
If we think of functors as models, natural transformations are maps between models:
10
background image
Definition 3 Given two functors F, F : C
D, a natural transformation :F F assigns to
every object X in C a morphism
X
: F (X)
F (X) such that for any morphism f:X Y in C, the
equation
Y
F (f ) = F (f )
X
holds in D. In other words, this square commutes:
F (X)
F (Y )
F (X)
F (Y )
E
F (f )
c
X
c
Y
E
F (f )
(Going across and then down equals going down and then across.)
Definition 4 A natural isomorphism between functors F, F : C
D is a natural transformation
: F
F such that
X
is an isomorphism for every X
C.
For example, suppose F, F : G
Hilb are functors where G is a group, thought of as a category
with one object, say
·. Then, as already mentioned, F and F are secretly just representations of G
on the Hilbert spaces F (
·) and F (·). A natural transformation :F F is then the same as an
intertwining operator from one representation to another: that is, a linear operator
A: F (
·) F (·)
satisfying
AF (g) = F (g)A
for all group elements g.
2.3
Monoidal Categories
In physics, it is often useful to think of two systems sitting side by side as forming a single system.
In topology, the disjoint union of two manifolds is again a manifold in its own right. In logic, the
conjunction of two statement is again a statement. In programming we can combine two data types
into a single `product type’. The concept of `monoidal category’ unifies all these examples in a single
framework.
A monoidal category C has a functor
:C × C C that takes two objects X and Y and puts
them together to give a new object X
Y . To make this precise, we need the cartesian product of
categories:
Definition 5 The cartesian product C
× C of categories C and C is the category where:
· an object is a pair (X,X ) consisting of an object X C and an object X C ;
· a morphism from (X,X ) to (Y,Y ) is a pair (f,f ) consisting of a morphism f:X Y and a
morphism f : X
Y ;
11
background image
· composition is done componentwise: (g,g )(f,f ) = (gf,g f );
· identity morphisms are defined componentwise: 1
(X,X )
= (1
X
, 1
X
).
Mac Lane [66] defined monoidal categories in 1963. The subtlety of the definition lies in the fact
that (X
Y ) Z and X (Y Z) are not usually equal. Instead, we should specify an isomorphism
between them, called the `associator’. Similarly, while a monoidal category has a `unit object’ I, it is
not usually true that I
X and X I equal X. Instead, we should specify isomorphisms I X = X
and X
I = X. To be manageable, all these isomorphisms must then satisfy certain equations:
Definition 6 A monoidal category consists of:
· a category C,
· a tensor product functor :C × C C,
· a unit object I C,
· a natural isomorphism called the associator, assigning to each triple of objects X,Y,Z C an
isomorphism
a
X,Y,Z
: (X
Y ) Z
X (Y Z),
· natural isomorphisms called the left and right unitors, assigning to each object X C isomor-
phisms
l
X
: I
X
X
r
X
: X
I
X,
such that:
· for all X,Y C the triangle equation holds:
(X
I) Y
X
(I Y )
X
Y
E
a
X,I,Y
rr
r
j
r
X
1
Y
¨
¨
¨
%
1
X
l
Y
12
background image
· for all W,X,Y,Z C, the pentagon equation holds:
((W
X) Y ) Z
(W
(X Y )) Z
(W
X) (Y Z)
W
((X Y ) Z)
W
(X (Y Z))
©
a
W
X,Y,Z
rr
rr
rr
r
r
j
a
W,X,Y
1
Z
c
a
W,X
Y,Z
d
d
d
d
d
d
d
d
d
d
a
W,X,Y
Z
¨
¨
¨
¨
¨
¨
¨
¨
%
1
W
a
X,Y,Z
When we have a tensor product of four objects, there are five ways to parenthesize it, and at first
glance the associator lets us build two isomorphisms from W
(X (Y Z)) to ((W X) Y ) Z.
But, the pentagon equation says these isomorphisms are equal. When we have tensor products of even
more objects there are even more ways to parenthesize them, and even more isomorphisms between
them built from the associator. However, Mac Lane showed that the pentagon identity implies these
isomorphisms are all the same. Similarly, if we also assume the triangle equation, all isomorphisms
with the same source and target built from the associator, left and right unit laws are equal.
In a monoidal category we can do processes in `parallel’ as well as in `series’. Doing processes in series
is just composition of morphisms, which works in any category. But in a monoidal category we can also
tensor morphisms f : X
Y and f :X Y and obtain a `parallel process’ f f :X X Y Y .
We can draw this in various ways:
f
X
Y
f
X
Y
=
f
f
X
Y
X
Y
=
f
f
X
X
Y
Y
More generally, we can draw any morphism
f : X
1
··· X
n
Y
1
··· Y
m
13
background image
as a black box with n input wires and m output wires:
f
X
1
X
2
X
3
Y
1
Y
2
We draw the unit object I as a blank space. So, for example, we draw a morphism f : I
X as follows:
f
X
By composing and tensoring morphisms, we can build up elaborate pictures resembling Feynman
diagrams:
f
g
h
j
X
1
X
2
X
3
X
4
Y
1
Y
2
Y
3
Y
4
Z
The laws governing a monoidal category allow us to neglect associators and unitors when drawing
such pictures, without getting in trouble. The reason is that Mac Lane’s Coherence Theorem says
any monoidal category is `equivalent’, in a suitable sense, to one where all associators and unitors are
identity morphisms [66].
We can also deform the picture in a wide variety of ways without changing the morphism it describes.
For example, the above morphism equals this one:
f
g
h
j
X
1
X
2
X
3
X
4
Y
1
Y
2
Y
3
Y
4
Z
14
background image
Everyone who uses string diagrams for calculations in monoidal categories starts by worrying about
the rules of the game: precisely how can we deform these pictures without changing the morphisms
they describe? Instead of stating the rules precisely — which gets a bit technical — we urge you to
explore for yourself what is allowed and what is not. For example, show that we can slide black boxes
up and down like this:
f
g
X
1
Y
1
X
2
Y
2
=
f
g
X
1
Y
1
X
2
Y
2
=
f
g
X
1
Y
1
X
2
Y
2
For a formal treatment of the rules governing string diagrams, try the original papers by Joyal and
Street [50] and the book by Yetter [94].
Now let us turn to examples. Here it is crucial to realize that the same category can often be
equipped with different tensor products, resulting in different monoidal categories:
· There is a way to make Set into a monoidal category where X Y is the cartesian product X ×Y
and the unit object is any one-element set. Note that this tensor product is not strictly associative,
since (x, (y, z)) = ((x, y), z), but there’s a natural isomorphism (X
× Y ) × Z = X × (Y × Z),
and this is our associator. Similar considerations give the left and right unitors. In this monoidal
category, the tensor product of f : X
Y and f :X Y is the function
f
× f :X × X Y × Y
(x, x )
(f(x),f (x )).
There is also a way to make Set into a monoidal category where X
Y is the disjoint union of X
and Y , which we shall denote by X + Y . Here the unit object is the empty set. Again, as indeed
with all these examples, the associative law and left/right unit laws hold only up to natural
isomorphism. In this monoidal category, the tensor product of f : X
Y and f :X Y is the
function
f + f :
X + X
Y + Y
x
f (x)
if x
X,
f (x)
if x
X .
However, in what follows, when we speak of Set as a monoidal category, we always use the
cartesian product!
· There is a way to make Hilb into a monoidal category with the usual tensor product of Hilbert
spaces:
C
n
C
m

=
C
nm
. In this case the unit object I can be taken to be an 1-dimensional
Hilbert space, for example
C.
There is also way to make Hilb into a monoidal category where the tensor product is the direct
sum:
C
n
C
m

=
C
n+m
. In this case the unit object is the zero-dimensional Hilbert space,
{0}.
However, in what follows, when we speak of Hilb as a monoidal category, we always use the usual
tensor product!
15
background image
· The tensor product of objects and morphisms in nCob is given by disjoint union. For example,
the tensor product of these two morphisms:
X
Y
f

X
Y
f

is this:
X
X
Y
Y
f
f

· The category Tang
k
is monoidal when k
1, where the the tensor product is given by disjoint
union. For example, given these two tangles:
X
Y
f

X
Y
f

their tensor product looks like this:
X
X
Y
Y
f
f

The example of Set with its cartesian product is different from our other three main examples,
because the cartesian product of sets X
×X comes equipped with functions called `projections’ to the
sets X and X :
X
X
× X
X

p
E
p
16
background image
Our other main examples lack this feature — though Hilb made into a monoidal category using
has
projections. Also, every set has a unique function to the the one-element set:
!
X
: X
I.
Again, our other main examples lack this feature, though Hilb made into a monoidal category using
has it. A fascinating feature of quantum mechanics is that we make Hilb into a monoidal category
using
instead of , even though the latter approach would lead to a category more like Set.
We can isolate the special features of the cartesian product of sets and its projections, obtaining a
definition that applies to any category:
Definition 7 Given objects X and X in some category, we say an object X
× X equipped with
morphisms
X
X
× X
X

p
E
p
is a cartesian product (or simply product) of X and X if for any object Q and morphisms
Q
X
X
©
f
dd
f
there exists a unique morphism g: Q
X × X making the following diagram commute:
Q
X
X
× X
X
©
f
d
d
d
d
d
f
c
g

p
E
p
(That is, f = pg and f = p g.) We say a category has binary products if every pair of objects has
a product.
The product may not exist, and it may not be unique, but when it exists it is unique up to a canonical
isomorphism. This justifies our speaking of `the’ product of objects X and Y when it exists, and
denoting it as X
× Y .
The definition of cartesian product, while absolutely fundamental, is a bit scary at first sight. To
illustrate its power, let us do something with it: combine two morphisms f : X
Y and f :X Y
into a single morphism
f
× f :X × X Y × Y .
The definition of cartesian product says how to build a morphism of this sort out of a pair of morphisms:
17
background image
namely, morphisms from X
× X to Y and Y . If we take these to be fp and f p , we obtain f × f :
X
× X
Y
Y
× Y
Y
©
f p
d
d
d
d
f p
c
f
×f

p
E
p
Next, let us isolate the special features of the one-element set:
Definition 8 An object 1 in a category C is terminal if for any object Q
C there exists a unique
morphism from Q to 1, which we denote as !
Q
: Q
1.
Again, a terminal object may not exist and may not be unique, but it is unique up to a canonical
isomorphism. This is why we can speak of `the’ terminal object of a category, and denote it by a
specific symbol, 1.
We have introduced the concept of binary products. One can also talk about n-ary products for
other values of n, but a category with binary products has n-ary products for all n
1, since we can
construct these as iterated binary products. The case n = 1 is trivial, since the product of one object
is just that object itself (up to canonical isomorphism). The remaining case is n = 0. The zero-ary
product of objects, if it exists, is just the terminal object. So, we make the following definition:
Definition 9 A category has finite products if it has binary products and a terminal object.
A category with finite products can always be made into a monoidal category by choosing a specific
product X
× Y to be to the tensor product X Y , and choosing a specific terminal object to be the
unit object. It takes a bit of work to show this! A monoidal category of this form is called cartesian.
In a cartesian category, we can `duplicate and delete information’. In general, the definition of
cartesian products gives a way to take two morphisms f
1
: Q
X and f
2
: Q
Y and combine them
into a single morphism from Q to X
×Y . If we take Q = X = Y and take f
1
and f
2
to be the identity,
we obtain the diagonal or duplication morphism:
X
: X
X × X.
In the category Set one can check that this maps any element x
X to the pair (x,x). In general, we
can draw the diagonal as follows:
X
X
X
18
background image
Similarly, we call the unique map to the terminal object
!
X
: X
1
the deletion morphism, and draw it as follows:
!
X
Note that we draw the unit object as an empty space.
A fundamental fact about cartesian categories is that duplicating something and then deleting
either copy is the same as doing nothing at all! In string diagrams, this says:
!
X
X
X
=
X
=
!
X
X
X
We leave the proof as an exercise for the reader.
Many of the puzzling features of quantum theory come from the noncartesianness of the usual
tensor product in Hilb. For example, in a cartesian category, every morphism
g
X
X
is actually of the form
f
X
f
X
In the case of Set, this says that every point of the set X
×X comes from a point of X and a point of
X . In physics, this would say that every state g of the combined system X
X is built by combining
19
background image
states of the systems X and X . Bell’s theorem [17] says that is not true in quantum theory. The
reason is that quantum theory uses the noncartesian monoidal category Hilb!
Also, in quantum theory we cannot freely duplicate or delete information. Wootters and Zurek [93]
proved a precise theorem to this effect, focused on duplication: the `no-cloning theorem’. One can also
prove a `no-deletion theorem’. Again, these results rely on the noncartesian tensor product in Hilb.
2.4
Braided Monoidal Categories
In physics, there is often a process that lets us `switch’ two systems by moving them around each other.
In topology, there is a tangle that describes the process of switching two points:
In logic, we can switch the order of two statements in a conjunction: the statement `X and Y ‘ is
isomorphic to `Y and X’. In computation, there is a simple program that switches the order of two
pieces of data. A monoidal category in which we can do this sort of thing is called `braided’:
Definition 10 A braided monoidal category consists of:
· a monoidal category C,
· a natural isomorphism called the braiding that assigns to every pair of objects X,Y C an
isomorphism
b
X,Y
: X
Y Y X,
such that the hexagon equations hold:
X
(Y Z)
(X
Y ) Z
(Y
X) Z
(Y
Z) X
Y
(Z X)
Y
(X Z)
(X
Y ) Z
X
(Y Z)
X
(Z Y )
Z
(X Y )
(Z
X) Y
(X
Z) Y
E
a
-1
X,Y,Z
c
b
X,Y
Z
E
b
X,Y
1
Z
c
a
Y,X,Z

a
-1
Y,Z,X

1
Y
b
X,Z
E
a
X,Y,Z
c
b
X
Y,Z
E
1
X
b
Y,Z
c
a
-1
X,Z,Y

a
Z,X,Y

b
Z,X
1
Y
20
background image
The first hexagon equation says that switching the object X past Y
Z all at once is the same as
switching it past Y and then past Z (with some associators thrown in to move the parentheses). The
second one is similar: it says switching X
Y past Z all at once is the same as doing it in two steps.
In string diagrams, we draw the braiding b
X,Y
: X
Y Y X like this:
X
Y
We draw its inverse b
-1
X,Y
like this:
Y
X
This is a nice notation, because it makes the equations saying that b
X,Y
and b
-1
X,Y
are inverses `topo-
logically true’:
X
X
Y
Y
=
X
Y
=
Y
Y
X
X
Here are the hexagon equations as string diagrams:
X
X
Y
Z
Y
Z
=
X
Y
Z
Y
X
Z
21
background image
X
Y
X
Y
Z
Z
=
Y
Z
X
Y
X
Z
For practice, we urge you to prove the following equations:
f
g
X
X
Y
Y
=
g
f
X
X
Y
Y
Z
X
Y
X
Y
Z
=
Y
Z
X
X
Y
Z
If you get stuck, here are some hints. The first equation follows from the naturality of the braiding.
The second is called the Yang­Baxter equation and follows from a combination of naturality and
the hexagon equations [51].
Next, here are some examples. There can be many different ways to give a monoidal category a
braiding, or none. However, most of our favorite examples come with well-known `standard’ braidings:
· Any cartesian category automatically becomes braided, and in Set with its cartesian product,
this standard braiding is given by:
b
X,Y
: X
× Y Y × X
(x, y)
(y,x).
22
background image
· In Hilb with its usual tensor product, the standard braiding is given by:
b
X,Y
: X
Y Y X
x
y y x.
· The monoidal category nCob has a standard braiding where b
X,Y
is diffeomorphic to the disjoint
union of cylinders X
× [0,1] and Y × [0,1]. For 2Cob this braiding looks as follows when X and
Y are circles:
X
Y
Y
X
b
X,Y

· The monoidal category Tang
k
has a standard braiding when k
2. For k = 2 this looks as
follows when X and Y are each a single point:
X
Y
Y
X
b
X,Y

The example of Tang
k
illustrates an important pattern. Tang
0
is just a category, because in
0-dimensional space we can only do processes in `series’: that is, compose morphisms. Tang
1
is a
monoidal category, because in 1-dimensional space we can also do processes in `parallel’: that is, tensor
morphisms. Tang
2
is a braided monoidal category, because in 2-dimensional space there is room to
move one object around another. Next we shall see what happens when space has 3 or more dimensions!
2.5
Symmetric Monoidal Categories
Sometimes switching two objects and switching them again is the same as doing nothing at all. In-
deed, this situation is very familiar. So, the first braided monoidal categories to be discovered were
`symmetric’ ones [66]:
Definition 11 A symmetric monoidal category is a braided monoidal category where the braiding
satisfies b
X,Y
= b
-1
Y,X
.
23
background image
So, in a symmetric monoidal category,
X
Y
Y
X
=
X
Y
or equivalently:
X
Y
=
Y
X
Any cartesian category automatically becomes a symmetric monoidal category, so Set is symmetric.
It is also easy to check that Hilb, nCob are symmetric monoidal categories. So is Tang
k
for k
3.
Interestingly, Tang
k
`stabilizes’ at k = 3: increasing the value of k beyond this value merely gives
a category equivalent to Tang
3
. The reason is that we can already untie all knots in 4-dimensional
space; adding extra dimensions has no real effect. In fact, Tang
k
for k
3 is equivalent to 1Cob. This
is part of a conjectured larger pattern called the `Periodic Table’ of n-categories [11]. A piece of this
is shown in Table 3.
An n-category has not only morphisms going between objects, but 2-morphisms going between
morphisms, 3-morphisms going between 2-morphisms and so on up to n-morphisms. In topology we
can use n-categories to describe tangled higher-dimensional surfaces [12], and in physics we can use
them to describe not just particles but also strings and higher-dimensional membranes [11, 13]. The
Rosetta Stone we are describing concerns only the n = 1 column of the Periodic Table. So, it is
probably just a fragment of a larger, still buried n-categorical Rosetta Stone.
2.6
Closed Categories
In quantum mechanics, one can encode a linear operator f : X
Y into a quantum state using a
technique called `gate teleportation’ [45]. In topology, there is a way to take any tangle f : X
Y and
bend the input back around to make it part of the output. In logic, we can take a proof that goes from
some assumption X to some conclusion Y and turn it into a proof that goes from no assumptions to
the conclusion `X implies Y ‘. In computer science, we can take any program that takes input of type
X and produces output of type Y , and think of it as a piece of data of a new type: a `function type’.
The underlying concept that unifies all these examples is the concept of a `closed category’.
Given objects X and Y in any category C, there is a set of morphisms from X to Y , denoted
hom(X, Y ). In a closed category there is also an object of morphisms from X to Y , which we denote
24
background image
n = 0
n = 1
n = 2
k = 0
sets
categories
2-categories
k = 1
monoids
monoidal
monoidal
categories
2-categories
k = 2
commutative
braided
braided
monoids
monoidal
monoidal
categories
2-categories
k = 3
`’
symmetric
sylleptic
monoidal
monoidal
categories
2-categories
k = 4
`’
`’
symmetric
monoidal
2-categories
k = 5
`’
`’
`’
k = 6
`’
`’
`’
Table 3: The Periodic Table: conjectured descriptions of (n + k)-categories with only one j-morphism
for j < k.
by X
Y . (Many other notations are also used.) In this situation we speak of an `internal hom’,
since the object X
Y lives inside C, instead of `outside’, in the category of sets.
Closed categories were introduced in 1966, by Eilenberg and Kelly [37]. While these authors were
able to define a closed structure for any category, it turns out that the internal hom is most easily
understood for monoidal categories. The reason is that when our category has a tensor product, it
is closed precisely when morphisms from X
Y to Z are in natural one-to-one correspondence with
morphisms from Y to X
Z. In other words, it is closed when we have a natural isomorphism
hom(X
Y,Z) = hom(Y,X Z)
f
~
f
For example, in the category Set, if we take X
Y to be the cartesian product X × Y , then X
Z
is just the set of functions from Y to Z, and we have a one-to-one correspondence between
· functions f that eat elements of X × Y and spit out elements of Z
and
· functions
~
f that eat elements of Y and spit out functions from X to Z.
This correspondence goes as follows:
~
f (x)(y) = f (x, y).
25
background image
Before considering other examples, we should make the definition of `closed monoidal category’
completely precise. For this we must note that for any category C, there is a functor
hom: C
op
× C Set.
Definition 12 The opposite category C
op
of a category C has the same objects as C, but a mor-
phism f : x
y in C
op
is a morphism f : y
x in C, and the composite gf in C
op
is the composite f g
in C.
Definition 13 For any category C, the hom functor
hom: C
op
× C Set
sends any object (X, Y )
C
op
× C to the set hom(X,Y ), and sends any morphism (f,g) C
op
× C
to the function
hom(f, g): hom(X, Y )
hom(X ,Y )
h
ghf
when f : X
X and g:Y Y as morphisms in C.
Definition 14 A monoidal category C is left closed if there is an internal hom functor
: C
op
× C C
together with a natural isomorphism c called currying that assigns to any objects X, Y, Z
C a
bijection
c
X,Y,Z
: hom(X
Y,Z)
hom(X,Y
Z)
It is right closed if there is an internal hom functor as above and a natural isomorphism
c
X,Y,Z
: hom(X
Y,Z)
hom(Y,X Z).
The term `currying’ is mainly used in computer science, after the work of Curry [33]. In the rest
of this section we only consider right closed monoidal categories. Luckily, there is no real difference
between left and right closed for a braided monoidal category, as the braiding gives an isomorphism
X
Y = Y X.
All our examples of monoidal categories are closed, but we shall see that, yet again, Set is different
from the rest:
· The cartesian category Set is closed, where X
Y is just the set of functions from X to Y . In
Set or any other cartesian closed category, the internal hom X
Y is usually denoted Y
X
. To
minimize the number of different notations and emphasize analogies between different contexts,
we shall not do this: we shall always use X
Y . To treat Set as left closed, we define the curried
version of f : X
× Y Z as above:
~
f(x)(y) = f (x, y).
To treat it as right closed, we instead define it by
~
f(y)(x) = f (x, y).
This looks a bit awkward, but it will be nice for string diagrams.
26
background image
· The symmetric monoidal category Hilb with its usual tensor product is closed, where X Y is
the set of linear operators from X to Y , made into a Hilbert space in a standard way. In this
case we have an isomorphism
X
Y

= X
Y
where X
is the dual of the Hilbert space X, that is, the set of linear operators f : X
C, made
into a Hilbert space in the usual way.
· The monoidal category Tang
k
(k
1) is closed. As with Hilb, we have
X
Y

= X
Y
where X
is the orientation-reversed version of X.
· The symmetric monoidal category nCob is also closed; again
X
Y

= X
Y
where X
is the (n
- 1)-manifold X with its orientation reversed.
Except for Set, all these examples are actually `compact’. This basically means that X
Y is
isomorphic to X
Y , where X
is some object called the `dual’ of X. But to make this precise, we
need to define the `dual’ of an object in an arbitrary monoidal category.
To do this, let us generalize from the case of Hilb. As already mentioned, each object X
Hilb has
a dual X
consisting of all linear operators f : X
I, where the unit object I is just C. There is thus
a linear operator
e
X
: X
X
I
x
f f(x)
called the counit of X. Furthermore, the space of all linear operators from X to Y
Hilb can be
identified with X
Y . So, there is also a linear operator called the unit of X:
i
x
: I
X
X
c
c 1
X
sending any complex number c to the corresponding multiple of the identity operator.
The significance of the unit and counit become clearer if we borrow some ideas from Feynman.
In physics, if X is the Hilbert space of internal states of some particle, X
is the Hilbert space for
the corresponding antiparticle. Feynman realized that it is enlightening to think of antiparticles as
particles going backwards in time. So, we draw a wire labelled by X
as a wire labelled by X, but
with an arrow pointing `backwards in time’: that is, up instead of down:
X
=
X
(Here we should admit that most physicists use the opposite convention, where time marches up the
page. Since we read from top to bottom, we prefer to let time run down the page.)
27
background image
If we draw X
as X going backwards in time, we can draw the unit as a cap:
X
X
and the counit as a cup:
X
X
In Feynman diagrams, these describe the creation and annihilation of virtual particle-antiparticle pairs!
It then turns out that the unit and counit satisfy two equations, the zig-zag equations:
X
X
=
X
X
X
=
X
Verifying these is a fun exercise in linear algebra, which we leave to the reader. If we write these
equations as commutative diagrams, we need to include some associators and unitors, and they become
a bit intimidating:
X
I
X
(X
X)
(X
X
)
X
X
I
X
I
X
(X
X) X
X
(X X
)
X
X
I
E
1
X
i
X
c
r
X
E
a
-1
X,X
,X
c
e
X
1
X

l
X
E
i
X
1
X
c
l
X
E
a
X
,X,X
c
1
X
e
X

r
X
28
background image
But, they really just say that zig-zags in string diagrams can be straightened out.
This is particularly vivid in examples like Tang
k
and nCob. For example, in 2Cob, taking X to be
the circle, the unit looks like this:
I
X
X
i
X

while the counit looks like this:
X
X
I
e
X

In this case, the zig-zag identities say we can straighten a wiggly piece of pipe.
Now we are ready for some definitions:
Definition 15 Given objects X
and X in a monoidal category, we call X
a right dual of X, and
X a left dual of X
, if there are morphisms
i
X
: I
X
X
and
e
X
: X
X
I,
called the unit and counit respectively, satisfying the zig-zag equations.
One can show that the left or right dual of an object is unique up to canonical isomorphism. So, we
usually speak of `the’ right or left dual of an object, when it exists.
Definition 16 A monoidal category C is compact if every object X
C has both a left dual and a
right dual.
Often the term `autonomous’ is used instead of `compact’ here. Many authors reserve the term `com-
pact’ for the case where C is symmetric or at least braided; then left duals are the same as right duals,
and things simplify [40]. To add to the confusion, compact symmetric monoidal categories are often
called simply `compact closed categories’.
A partial explanation for the last piece of terminology is that any compact monoidal category is
automatically closed! For this, we define the internal hom on objects by
X
Y = X
Y.
29
background image
We must then show that the
operation extends naturally to a functor :C C, so that is actually
a functor. Finally, we must check that there is a natural isomorphism
hom(X
Y,Z) = hom(Y,X
Z)
In terms of string diagrams, this isomorphism takes any morphism
f
X
Y
Z
and bends back the input wire labelled X to make it an output:
f
X
Y
Z
Now, in a compact monoidal category, we have:
X
Z
=
X
Z
But in general, closed monoidal categories don’t allow arrows pointing up! So for these, drawing the
internal hom is more of a challenge. We can use the same style of notation as long as we add a
decoration — a clasp — that binds two strings together:
X
Z
:=
X
Z
Only when our closed monoidal category happens to be compact can we eliminate the clasp.
Suppose we are working in a closed monoidal category. Since we draw a morphism f : X
Y Z
like this:
f
X
Y
Z
30
background image
we can draw its curried version ~
f : Y
X
Z by bending down the input wire labelled X to make
it part of the output:
f
X
Y
Z
Note that where we bent back the wire labelled X, a cap like this appeared:
X
X
Closed monoidal categories don’t really have a cap unless they are compact. So, we drew a bubble
enclosing f and the cap, to keep us from doing any illegal manipulations. In the compact case, both
the bubble and the clasp are unnecessary, so we can draw ~
f like this:
f
X
Y
Z
An important special case of currying gives the name of a morphism f : X
Y ,
f : I
X
Y.
This is obtained by currying the morphism
f r
x
: I
X Y.
In string diagrams, we draw f as follows:
f
X
Y
In the category Set, the unit object I is the one-element set. So, a morphism from I to any set Q
picks out a point of Q. In particular, the name f : I
X
Y picks out the element of X
Y
corresponding to the function f : X
Y . More generally, in any cartesian closed category, a morphism
from 1 to an object Q is called a point of Q. So, even in this case, we can say the name of a morphism
f : X
Y is a point of X
Y .
Something similar works for Hilb, though this example is compact rather than cartesian. In Hilb,
the unit object I is just
C. So, a nonzero morphism from I to any Hilbert space Q picks out a nonzero
31
background image
vector in Q, which we can normalize to obtain a state in Q: that is, a unit vector. In particular,
the the name of a nonzero morphism f : X
Y gives a state of X
Y . This method of encoding
operators as states is the basis of `gate teleportation’ [45].
Currying is a bijection, so we can also uncurry:
c
-1
X,Y,Z
: hom(Y, X
Z)
hom(X Y,Z)
g
g.
Since we draw a morphism g: Y
X Z like this:
g
X
Y
Z
we draw its `uncurried’ version g: X
Y Z by bending the output X up to become an input:
g
X
Y
Z
Again, we must put a bubble around the `cup’ formed when we bend down the wire labelled Y , unless
we are in a compact monoidal category.
A good example of uncurrying is the evaluation morphism:
ev
X,Y
: X
(X
Y )
Y.
This is obtained by uncurrying the identity
1
X
Y
: (X
Y )
(X
Y ).
In Set, ev
X,Y
takes any function from X to Y and evaluates it at any element of X to give an element
of Y . In terms of string diagrams, the evaluation morphism looks like this:
ev
X
X
Y
Y
=
X
X
Y
Y
32
background image
In any closed monoidal category, we can recover a morphism from its name using evaluation. More
precisely, this diagram commutes:
X
I
X
X
(X
Y )
Y
c
1
X
f

r
-1
c
f
E
ev
X,Y
Or, in terms of string diagrams:
f
X
X
Y
Y
=
f
X
Y
We leave the proof of this as an exercise. In general, one must use the naturality of currying. In the
special case of a compact monoidal category, there is a nice picture proof! Simply pop the bubbles and
remove the clasp:
f
X
X
Y
Y
=
f
X
Y
The result then follows from one of the zig-zag identities.
In our rapid introduction to string diagrams, we have not had time to illustrate how these diagrams
become a powerful tool for solving concrete problems. So, here are some starting points for further
study:
33
background image
· Representations of Lie groups play a fundamental role in quantum physics, especially gauge
field theory. Every Lie group has a compact symmetric monoidal category of finite-dimensional
representations. In his book Group Theory, Cvitanovic [34] develops detailed string diagram
descriptions of these representation categories for the classical Lie groups SU(n), SO(n), SU(n)
and also the more exotic `exceptional’ Lie groups. His book also illustrates how this technology
can be used to simplify difficult calculations in gauge field theory.
· Quantum groups are a generalization of groups which show up in 2d and 3d physics. The big
difference is that a quantum group has compact braided monoidal category of finite-dimensional
representation. Kauffman’s Knots and Physics [54] is an excellent introduction to how quantum
groups show up in knot theory and physics; it is packed with string diagrams. For more details
on quantum groups and braided monoidal categories, see the book by Kassel [53].
· Kauffman and Lins [55] have written a beautiful string diagram treatment of the category of
representations of the simplest quantum group, SU
q
(2). They also use it to construct some
famous 3-manifold invariants associated to 3d and 4d topological quantum field theories: the
Witten­Reshetikhin­Turaev, Turaev­Viro and Crane­Yetter invariants. In this example, string
diagrams are often called `q-deformed spin networks’ [84]. For generalizations to other quantum
groups, see the more advanced texts by Turaev [90] and by Bakalov and Kirillov [14]. The
key ingredient is a special class of compact braided monoidal categories called `modular tensor
categories’.
· Kock [59] has written a nice introduction to 2d topological quantum field theories which uses
diagrammatic methods to work with 2Cob.
· Abramsky, Coecke and collaborators [2, 3, 4, 28, 30, 31] have developed string diagrams as a tool
for understanding quantum computation. The easiest introduction is Coecke’s `Kindergarten
quantum mechanics’ [29].
2.7
Dagger Categories
Our discussion would be sadly incomplete without an important admission: nothing we have done so
far with Hilbert spaces used the inner product! So, we have not yet touched on the essence of quantum
theory.
In fact, everything we have said about Hilb applies equally well to Vect: the category of finite-
dimensional vector spaces and linear operators. Both Hilb and Vect are compact symmetric monoidal
categories. In fact, these compact symmetric monoidal categories are `equivalent’ in a certain precise
sense [67].
So, what makes Hilb different? In terms of category theory, the special thing is that we can take
the Hilbert space adjoint of any linear operator f : X
Y between finite-dimensional Hilbert spaces,
getting an operator f
: Y
X. This ability to `reverse’ morphisms makes Hilb into a `dagger category’:
Definition 17 A dagger category is a category C such that for any morphism f : X
Y in C there
is a specified morphism f
: Y
X such that
(gf )
= f
g
34
background image
for every pair of composable morphisms f and g, and
(f
)
= f
for every morphism f .
Equivalently, a dagger category is one equipped with a functor
:C C
op
that is the identity on
objects and satisfies (f
)
= f for every morphism.
In fact, all our favorite examples of categories can be made into dagger categories, except for Set:
· There is no way to make Set into a dagger category, since there is a function from the empty set
to the 1-element set, but none the other way around.
· The category Hilb becomes a dagger category as follows. Given any morphism f:X Y in Hilb,
there is a morphism f
: Y
X, the Hilbert space adjoint of f, defined by
f
, = , f
for all
X, Y .
· For any k, the category Tang
k
becomes a dagger category where we obtain f
: Y
X by
reflecting f : X
Y in the vertical direction, and then switching the direction of the little arrows
denoting the orientations of arcs and circles.
· For any n, the category nCob becomes a dagger category where we obtain f
: Y
X by switch-
ing the input and output of f : X
Y , and then switching the orientation of each connected
component of f . Again, a picture speaks a thousand words:
X
Y
f

Y
X
f

In applications to physics, this dagger operation amounts to `switching the future and the past’.
In all the dagger categories above, the dagger structure interacts in a nice way with the monoidal
structure and also, when it exists, the braiding. One can write a list of axioms characterizing how this
works [2, 3, 80]. So, it seems that the ability to `reverse’ morphisms is another way in which categories
of a quantum flavor differ from the category of sets and functions. This has important implications for
the foundations of quantum theory [9] and also for topological quantum field theory [11], where dagger
categories seem to be part of larger story involving `n-categories with duals’ [12]. However, this story
is still poorly understood — there is much more work to be done.
35
background image
3
Logic
3.1
Background
Symmetric monoidal closed categories show up not only in physics and topology, but also in logic. We
would like to explain how. To set the stage, it seems worthwhile to sketch a few ideas from 20th-century
logic.
Modern logicians study many systems of reasoning beside ordinary classical logic. Of course, even
classical logic comes in various degrees of strength. First there is the `propositional calculus’, which
allows us to reason with abstract propositions X, Y, Z, . . . and these logical connectives:
and
or
implies
not
¬
true
false
Then there is the `predicate calculus’, which also allows variables like x, y, z, . . ., predicates like P (x)
and Q(x, y, z), and the symbols `for all’ (
) and `there exists’ (), which allow us to quantify over
variables. There are also higher-order systems that allow us to quantify over predicates, and so on.
To keep things simple, we mainly confine ourselves to the propositional calculus in what follows. But
even here, there are many alternatives to the `classical’ version!
The most-studied of these alternative systems are weaker than classical logic: they make it harder
or even impossible to prove things we normally take for granted. One reason is that some logicians
deny that certain familiar principles are actually valid. But there are also subtler reasons. One is
that studying systems with rules of lesser strength allows for a fine-grained study of precisely which
methods of reasoning are needed to prove which results. Another reason — the one that concerns us
most here — is that dropping familiar rules and then adding them back in one at at time sheds light
on the connection between logic and category theory.
For example, around 1907 Brouwer [47] began advocating `intuitionism’. As part of this, he raised
doubts about the law of excluded middle, which amounts to a rule saying that from
¬¬X we can deduce
X. One problem with this principle is that proofs using it are not `constructive’. For example, we may
prove by contradiction that some equation has a solution, but still have no clue how to construct the
solution. For Brouwer, this meant the principle was invalid.
Anyone who feels the law of excluded middle is invalid is duty-bound to study intuitionistic logic.
But, there is another reason for studying this system. Namely: we do not really lose anything by
dropping the law of excluded middle! Instead, we gain a fine-grained distinction: the distinction
between a direct proof of X and a proof by contradiction, which yields merely
¬¬X. If we do not care
about this distinction we are free to ignore it, but there is no harm in having it around.
In the 1930’s, this idea was made precise by G¨
odel [43] and Gentzen [87]. They showed that we
can embed classical logic in intuitionistic logic. In fact, they found a map sending any formula X of
the propositional calculus to a new formula X
, such that X is provable classically if and only if X
is
provable intuitionistically. (More impressively, this map also works for the predicate calculus.)
36
background image
Later, yet another reason for being interested in intuitionistic logic became apparent: its connection
to category theory. In its very simplest form, this connection works as follows. Suppose we have a set
of propositions X, Y, Z, . . . obeying the laws of the intuitionistic propositional calculus. We can create
a category C where these propositions are objects and there is at most one morphism from any object
X to any object Y : a single morphism when X implies Y , and none otherwise!
A category with at most one morphism from any object to any other is called a preorder. In the
propositional calculus, we often treat two propositions as equal when they both imply each other. If
we do this, we get a special sort of preorder: one where isomorphic objects are automatically equal.
This special sort of preorder is called a partially ordered set, or poset for short. Posets abound in
logic, precisely because they offer a simple framework for understanding implication.
If we start from a set of propositions obeying the intuitionistic propositional calculus, the resulting
category C is better than a mere poset. It is also cartesian, with X
Y as the product of X and Y , and
as the terminal object! To see this, note that any proposition Q has a unique morphism to X
Y
whenever it has morphisms to X and to Y . This is simply a fancy way of saying that Q implies X
Y
when it implies X and implies Y . It is also easy to see that
is terminal: anything implies the truth.
Even better, the category C is cartesian closed, with X
Y as the internal hom. The reason is
that
X
Y implies Z iff Y implies X Z.
This automatically yields the basic property of the internal hom:
hom(X
Y,Z) = hom(Y,X
Z).
Indeed, if the reader is puzzled by the difference between `X implies Y ‘ and X
Y , we can now
explain this more clearly: the former involves the homset hom(X, Y ) (which has one element when X
implies Y and none otherwise), while the latter is the internal hom, an object in C.
So, C is a cartesian closed poset. But, it also has one more nice property, thanks to the presence of
and We have seen that and make the category C cartesian; and satisfy exactly analogous
rules, but with the implications turned around, so they make C
op
cartesian.
And that is all! In particular, negation gives nothing more, since we can define
¬X to be X F,
and all its intuitionistically valid properties then follow. So, the kind of category we get from the
intuitionistic propositional calculus by taking propositions as objects and implications as morphisms
is precisely a Heyting algebra: a cartesian closed poset C such that C
op
is also cartesian.
Heyting, a student of Brouwer, introduced Heyting algebras in intuitionistic logic before categories
were even invented. So, he used very different language to define them. But, the category-theoretic
approach to Heyting algebras illustrates the connection between cartesian closed categories and logic.
It also gives more evidence that dropping the law of excluded middle is an interesting thing to try.
Since we have explained the basics of cartesian closed categories, but not said what happens when
the opposite of such a category is also cartesian, in the sections to come we will take a drastic step
and limit our discussion of logic even further. We will neglect
and , and concentrate only on the
fragment of the propositional calculus involving
, and .
Even here, it turns out, there are interesting things to say — and interesting ways to modify the
usual rules. This will be the main subject of the sections to come. But to set the stage, we need to
say a bit about proof theory.
37
background image
Proof theory is the branch of mathematical logic that treats proofs as mathematical entities worthy
of study in their own right. It lets us dig deeper into the propositional calculus by studying not merely
whether or not some assumption X implies some conclusion Y , but the whole set of proofs leading
from X to Y . This amounts to studying not just posets (or preorders), but categories that allow many
morphisms from one object to another.
In Hilbert’s approach to proof, there were many axioms and just one rule to deduce new theorems:
modus ponens, which says that from X and `X implies Y ‘ we can deduce Y . Most of modern proof
theory focuses on another approach, the `sequent calculus’, due to Gentzen [87]. In this approach there
are few axioms but many inference rules.
An excellent introduction to the sequent calculus is the book Proofs and Types by Girard, Lafont
and Taylor, freely available online [42]. Here we shall content ourselves with some sketchy remarks. A
`sequent’ is something like this:
X
1
, . . . , X
m
Y
1
, . . . , Y
n
where X
i
and Y
i
are propositions. We read this sequent as saying that all the propositions X
i
, taken
together, can be used to prove at least one of the propositions Y
i
. This strange-sounding convention
gives the sequent calculus a nice symmetry, as we shall soon see.
In the sequent calculus, an `inference rule’ is something that produces new sequents from old. For
example, here is the left weakening rule:
X
1
, . . . , X
m
Y
1
, . . . , Y
n
X
1
, . . . , X
m
, A
Y
1
, . . . , Y
n
This says that from the sequent above the line we can get the sequent below the line: we can throw
in the extra assumption A without harm. Thanks to the strange-sounding convention we mentioned,
this rule has a mirror-image version called right weakening:
X
1
, . . . , X
m
Y
1
, . . . , Y
n
X
1
, . . . , X
m
Y
1
, . . . , Y
n
, A
In fact, Gentzen’s whole setup has this mirror symmetry! For example, his rule called left contraction:
X
1
, . . . , X
m
, A, A
Y
1
, . . . , Y
n
X
1
, . . . , X
m
, A
Y
1
, . . . , Y
n
has a mirror partner called right contraction:
X
1
, . . . , X
m
Y
1
, . . . , Y
n
, A, A
X
1
, . . . , X
m
Y
1
, . . . , Y
n
, A
Similarly, this rule for `and’
X
1
, . . . , X
m
, A
Y
1
, . . . , Y
n
X
1
, . . . , X
m
, A
B Y
1
, . . . , Y
n
has a mirror partner for `or’:
38
background image
X
1
, . . . , X
m
,
Y
1
, . . . , Y
n
, A
X
1
, . . . , X
m
Y
1
, . . . , Y
n
, A
B
Logicians now realize that this mirror symmetry arises from the duality between a category and its
opposite. Unfortunately, since we have decided to study
and but not their mirror partners and
, this duality will not be visible in the sections to come.
Gentzen used sequents to write inference rules for the classical propositional calculus, and also the
classical predicate calculus. Now, in these forms of logic we have
X
1
, . . . , X
m
Y
1
, . . . , Y
n
if and only if we have
X
1
··· X
m
Y
1
··· Y
n
.
So, why did Gentzen use sequents with a list of propositions on each side of the
symbol, instead just
a single proposition? The reason is that this let him use only inference rules having the `subformula
property’. This says that every proposition in the sequent above the line appears as part of some
proposition in the sequent below the line. So, a proof built from such inference rules becomes a `tree’
where all the propositions further up the tree are subformulas of those below.
This idea has powerful consequences. For example, in 1936 Gentzen was able prove the consistency
of Peano’s axioms of arithmetic! His proof essentially used induction on trees (Readers familiar with

odel’s second incompleteness theorem should be reassured that this sort of induction cannot itself be
carried out in Peano arithmetic.)
The most famous rule lacking the subformula property is the `cut rule’:
X
1
, . . . , X
m
Y
1
, . . . , Y
k
, A
X
m+1
, . . . , X
n
, A
Y
k+1
, . . . , Y
X
1
, . . . , X
n
Y
1
, . . . , Y
From the two sequents on top, the cut rule gives us the sequent below. Note that the intermediate step
A does not appear in the sequent below. It is `cut out’. So, the cut rule lacks the subformula property.
But, one of Gentzen’s great achievements was to show that any proof in the classical propositional (or
even predicate) calculus that can be done with the cut rule can also be done without it. This is called
`cut elimination’.
Gentzen also wrote down inference rules suitable for the intuitionistic propositional and predicate
calculi. These rules lack the mirror symmetry of the classical case. But in the 1980s, this symmetry
was restored by Girard’s invention of `linear logic’ [41].
Linear logic lets us keep track of how many times we use a given premise to reach a given conclusion.
To accomplish this, Girard introduced some new logical connectives! For starters, he introduced `linear’
connectives called
and , and a logical constant called I. These act a bit like , and . However,
they satisfy rules corresponding to a symmetric monoidal category instead of a cartesian closed category.
In particular, from X we can prove neither X
X nor I. So, we cannot freely `duplicate’ and `delete’
propositions using these new connectives. This is reflected in the fact that linear logic drops Gentzen’s
contraction and weakening rules.
By itself, this might seem unbearably restrictive. However, Girard also kept the connectives
,
and in his system, still satisfying the usual rules. And, he introduced an operation called the
39
background image
`exponential’, !, which takes a proposition X and turns it into an `arbitrary stock of copies of X’. So,
for example, from !X we can prove 1, and X, and X
X, and X X X, and so on.
Full-fledged linear logic has even more connectives than we have described here. It seems baroque
and peculiar at first glance. It also comes in both classical and intuitionistic versions! But, just as
classical logic can be embedded in intuitionistic logic, intuitionistic logic can be embedded in intuitionistic
linear logic [41]. So, we do not lose any deductive power. Instead, we gain the ability to make even
more fine-grained distinctions.
In what follows, we discuss the fragment of intuitionistic linear logic involving only
, and I.
This is called `multiplicative intuititionistic linear logic’ [46, 75]. It turns out to be the system of logic
suitable for closed symmetric monoidal categories — nothing more or less.
3.2
Proofs as Morphisms
In Section 2 we described categories with various amounts of extra structure, starting from categories
pure and simple, and working our way up to monoidal categories, braided monoidal categories, sym-
metric monoidal categories, and so on. Our treatment only scratched the surface of an enormously rich
taxonomy. In fact, each kind of category with extra structure corresponds to a system of logic with its
own inference rules!
To see this, we will think of propositions as objects in some category, and proofs as giving morphisms.
Suppose X and Y are propositions. Then, we can think of a proof starting from the assumption X
and leading to the conclusion Y as giving a morphism f : X
Y . (In Section 3.3 we shall see that a
morphism is actually an equivalence class of proofs — but for now let us gloss over this issue.)
Let us write X
Y when, starting from the assumption X, there is a proof leading to the conclusion
Y . An inference rule is a way to get new proofs from old. For example, in almost every system of logic,
if there is a proof leading from X to Y , and a proof leading from Y to Z, then there is a proof leading
from X to Z. We write this inference rule as follows:
X
Y
Y
Z
X
Z
We can call this cut rule, since it lets us `cut out’ the intermediate step Y . It is a special case of
Gentzen’s cut rule, mentioned in the previous section. It should remind us of composition of morphisms
in a category: if we have a morphism f : X
Y and a morphism g:Y Z, we get a morphism
gf : X
Z.
Also, in almost every system of logic there is a proof leading from X to X. We can write this as
an inference rule that starts with nothing and concludes the existence of a proof of X from X:
X
X
This rule should remind us of how every object in category has an identity morphism: for any object
X, we automatically get a morphism 1
X
: X
X. Indeed, this rule is sometimes called the identity
rule.
If we pursue this line of thought, we can take the definition of a closed symmetric monoidal category
and extract a collection of inference rules. Each rule is a way to get new morphisms from old in a
40
background image
closed symmetric monoidal category. There are various superficially different but ultimately equivalent
ways to list these rules. Here is one:
X
X
(i)
X
Y
Y
Z
X
Z
(
)
W
X
Y
Z
W
Y X Z
(
)
W
(X
Y ) Z
W
X
(Y Z)
(a)
X
I
Y
X
Y
(l)
X
Y
I
X
Y
(r)
W
X
Y
W
Y
X
(b)
X
Y Z
Y
X
Z
(c)
Double lines mean that the inverse rule also holds. We have given each rule a name, written to the right
in parentheses. As already explained, rules (i) and (
) come from the presence of identity morphisms
and composition in any category. Rules (
), (a), (l), and (r) come from tensoring, the associator,
and the left and right unitors in a monoidal category. Rule (b) comes from the braiding in a braided
monoidal category, and rule (c) comes from currying in a closed monoidal category.
Now for the big question: what does all this mean in terms of logic? These rules describe a small
fragment of the propositional calculus. To see this, we should read the connective
as `and’, the
connective
as `implies’, and the proposition I as `true’.
In this interpretation, rule (c) says we can turn a proof leading from the assumption `Y and X’ to
the conclusion Z into a proof leading from X to `Y implies Z’. It also says we can do the reverse. This
is true in classical, intuitionistic and linear logic, and so are all the other rules. Rules (a) and (b) say
that `and’ is associative and commutative. Rule (l) says that any proof leading from the assumption
X to the conclusion `true and Y ‘ can be converted to a proof leading from X to Y , and vice versa.
Rule (r) is similar.
What do we do with these rules? We use them to build `deductions’. Here is an easy example:
(i)
X
Y
X
Y
(c
-1
)
X
(X
Y )
Y
First we use the identity rule, and then the inverse of the currying rule. At the end, we obtain
X
(X
Y )
Y.
This should remind us of the evaluation morphisms we have in a closed monoidal category:
ev
X,Y
: X
(X
Y )
Y.
In terms of logic, the point is that we can prove Y from X and `X implies Y ‘. This fact comes in
handy so often that we may wish to abbreviate the above deduction as an extra inference rule — a
rule derived from our basic list:
(ev)
X
(X
Y )
Y
41
background image
This rule is called modus ponens.
In general, a deduction is a tree built from inference rules. Branches arise when we use the (
) or
(
) rules. Here is an example:
(i)
(A
B) C (A B) C
(a)
(A
B) C A (B C)
A
(B C) D
(
)
(A
B) C D
Again we can abbreviate this deduction as a derived rule. In fact, this rule is reversible:
A
(B C) D
()
(A
B) C D
For a more substantial example, suppose we want to show
(X
Y )
(Y
Z)
X
Z.
The deduction leading to this will not even fit on the page unless we use our abbreviations:
(ev)
X
(X
Y )
Y
(i)
Y
Z
Y
Z
(
)
(X
(X
Y ))
(Y
Z)
Y
(Y
Z)
(ev)
Y
(Y
Z)
Z
(X
(X
Y ))
(Y
Z)
Z
(
-1
)
X
((X
Y )
(Y
Z))
Z
(c)
(X
Y )
(Y
Z)
X
Z
Since each of the rules used in this deduction came from a way to get new morphisms from old in a
closed monoidal category (we never used the braiding), it follows that in every such category we have
internal composition morphisms:
·
X,Y,Z
: (X
Y )
(Y
Z)
X
Z.
These play the same role for the internal hom that ordinary composition
:hom(X,Y ) × hom(Y,Z) hom(X,Z)
plays for the ordinary hom.
We can go ahead making further deductions in this system of logic, but the really interesting thing
is what it omits. For starters, it omits the connective `or’ and the proposition `false’. It also omits two
inference rules we normally take for granted — namely, contraction:
X
Y
()
X
Y
Y
and weakening:
42
background image
X
Y
(!)
X
I
which are closely related to duplication and deletion in a cartesian category. Omitting these rules is
a distinctive feature of linear logic [41]. The word `linear’ should remind us of the category Hilb. As
noted in Section 2.3, this category with its usual tensor product is noncartesian, so it does not permit
duplication and deletion. But, what does omitting these rules mean in terms of logic?
Ordinary logic deals with propositions, so we have been thinking of the above system of logic in
the same way. Linear logic deals not just with propositions, but also other resources — for example,
physical things! Unlike propositions in ordinary logic, we typically can’t duplicate or delete these other
resources. In classical logic, if we know that a proposition X is true, we can use X as many or as few
times as we like when trying to prove some proposition Y . But if we have a cup of milk, we can’t use
it to make cake and then use it again to make butter. Nor can we make it disappear without a trace:
even if we pour it down the drain, it must go somewhere.
In fact, these ideas are familiar in chemistry. Consider the following resources:
H
2
= one molecule of hydrogen
O
2
= one molecule of oxygen
H
2
O
= one molecule of water
We can burn hydrogen, combining one molecule of oxygen with two of hydrogen to obtain two molecules
of water. A category theorist might describe this reaction as a morphism:
f : O
2
(H
2
H
2
)
H
2
O
H
2
O.
A linear logician might write:
O
2
(H
2
H
2
)
H
2
O
H
2
O
to indicate the existence of such a morphism. But, we cannot duplicate or delete molecules, so for
example
H
2
H
2
H
2
and
H
2
I
where I is the unit for the tensor product: not iodine, but `no molecules at all’.
In short, ordinary chemical reactions are morphisms in a symmetric monoidal category where objects
are collections of molecules. As chemists normally conceive of it, this category is not closed. So, it
obeys an even more limited system of logic than the one we have been discussing, a system lacking the
connective
. To get a closed category — in fact a compact one — we need to remember one of the
great discoveries of 20th-century physics: antimatter. This lets us define Y
Z to be `anti-Y and Z’:
Y
Z = Y
Z
Then the currying rule holds:
Y
X Z
X
Y
Z
43
background image
Most chemists don’t think about antimatter very often — but particle physicists do. They don’t use
the notation of linear logic or category theory, but they know perfectly well that since a neutrino and
a neutron can collide and turn into a proton and an electron:
n p e,
then a neutron can turn into a neutrino together with a proton and an electron:
n
(p e).
This is an instance of the currying rule, rule (c).
3.3
Logical Theories from Categories
We have sketched how different systems of logic naturally arise from different types of categories. To
illustrate this idea, we introduced a system of logic with inference rules coming from ways to get new
morphisms from old in a closed symmetric monoidal category. One could substitute many other types
of categories here, and get other systems of logic.
To tighten the connection between proof theory and category theory, we shall now describe a recipe
to get a logical theory from any closed symmetric monoidal category. For this, we shall now use X
Y
to denote the set of proofs — or actually, equivalence classes of proofs — leading from the assumption
X to the conclusion Y . This is a change of viewpoint. Previously we would write X
Y when this set
of proofs was nonempty; otherwise we would write X
Y . The advantage of treating X
Y as a set
is that this set is precisely what a category theorist would call hom(X, Y ): a homset in a category.
If we let X
Y stand for a homset, an inference rule becomes a function from a product of homsets
to a single homset. For example, the cut rule
X
Y
Y
Z
(
)
X
Z
becomes another way of talking about the composition function
X,Y,Z
: hom(X, Y )
× hom(Y,Z) hom(X,Z),
while the identity rule
(i)
X
X
becomes another way of talking about the function
i
X
: 1
hom(X,X)
that sends the single element of the set 1 to the identity morphism of X. (Note: the set 1 is a zero-fold
product of homsets.)
Next, if we let inference rules be certain functions from products of homsets to homsets, deductions
become more complicated functions of the same sort built from these basic ones. For example, this
deduction:
44
background image
(i)
X
I X I
(r)
X
I X
(i)
Y
Y
(
)
(X
I) Y X Y
specifies a function from 1 to hom((X
I) Y,X Y ), built from the basic functions indicated by
the labels at each step. This deduction:
(i)
(X
I) Y (X I) Y
(a)
(X
I) Y X (I Y )
(i)
I
Y I Y
(r)
I
Y Y
(i)
X
X
(
)
X
(I Y ) X Y
(
)
(X
I) Y X Y
gives another function from 1 to hom((X
I) Y,X Y ).
If we think of deductions as giving functions this way, the question arises when two such functions
are equal. In the example just mentioned, the triangle equation in the definition of monoidal category
(Definition 6):
(X
I) Y
X
(I Y )
X
Y
E
a
X,I,Y
rr
r
j
r
X
1
Y
¨
¨
¨
%
1
X
l
Y
says these two functions are equal. Indeed, the triangle equation is precisely the statement that these
two functions agree! (We leave this as an exercise for the reader.)
So: even though two deductions may look quite different, they may give the same function from a
product of homsets to a homset if we demand that these are homsets in a closed symmetric monoidal
category. This is why we think of X
Y as a set of equivalence classes of proofs, rather than proofs:
it is forced on us by our desire to use category theory. We could get around this by using a 2-category
with proofs as morphisms and `equivalences between proofs’ as 2-morphisms [77]. This would lead
us further to the right in the Periodic Table (Table 3). But let us restrain ourselves and make some
definitions formalizing what we have done so far.
From now on we shall call the objects X, Y, . . . `propositions’, even though we have seen they may
represent more general resources. Also, purely for the sake of brevity, we use the term `proof’ to mean
`equivalence class of proofs’. The equivalence relation must be coarse enough to make the equations in
the following definitions hold:
Definition 18 A closed monoidal theory consists of the following:
· A collection of propositions. The collection must contain a proposition I, and if X and Y are
propositions, then so are X
Y and X
Y .
· For every pair of propositions X,Y, a set X Y of proofs leading from X to Y . If f X Y,
then we write f : X
Y .
· Certain functions, written as inference rules:
45
background image
X
X
(i)
X
Y
Y
Z
X
Z
(
)
W
X
Y
Z
W
Y X Z
(
)
W
(X
Y ) Z
W
X
(Y Z)
(a)
X
I
Y
X
Y
(l)
X
Y
I
X
Y
(r)
X
Y Z
Y
X
Z
(c)
A double line means that the function is invertible. So, for example, for each triple X, Y, Z we
have a function
X,Y,Z
: (X
Y )
× (Y Z) (X Z)
and a bijection
c
X,Y,Z
: (X
Y Z) (Y X
Z).
· Certain equations that must be obeyed by the inference rules. The inference rules () and (i) must
obey equations describing associativity and the left and right unit laws. Rule (
) must obey an
equation saying it is a functor. Rules (a), (l), (r), and (c) must obey equations saying they are
natural transformations. Rules (a), (l), (r) and (
) must also obey the triangle and pentagon
equations.
Definition 19 A closed braided monoidal theory is a closed monoidal theory with this additional
inference rule:
W
X
Y
W
Y
X
(b)
We demand that this rule give a natural transformation satisfying the hexagon equations.
Definition 20 A closed symmetric monoidal theory is a closed braided monoidal theory where
the rule (b) is its own inverse.
These are just the usual definitions of various kinds of closed category — monoidal, braided monoidal
and symmetric monoidal — written in a new style. This new style lets us build such categories from
logical systems. To do this, we take the objects to be propositions and the morphisms to be equivalence
classes of proofs, where the equivalence relation is generated by the equations listed in the definitions
above.
However, the full advantages of this style only appear when we dig deeper into proof theory, and
generalize the expressions we have been considering:
X
Y
to `sequents’ like this:
X
1
, . . . , X
n
Y.
46
background image
Loosely, we can think of such a sequent as meaning
X
1
··· X
n
Y.
The advantage of sequents is that they let us use inference rules that — except for the cut rule and
the identity rule — have the `subformula property’ mentioned near the end of Section 3.1.
Formulated in terms of these inference rules, the logic of closed symmetric monoidal categories goes
by the name of `multiplicative intuitionistic linear logic’, or MILL for short [46, 75]. There is a `cut
elimination’ theorem for MILL, which says that with a suitable choice of other inference rules, the cut
rule becomes redundant: any proof that can be done with it can be done without it. This is remarkable,
since the cut rule corresponds to composition of morphisms in a category. One consequence is that in
the free symmetric monoidal closed category on any set of objects, the set of morphisms between any
two objects is finite. There is also a decision procedure to tell when two morphisms are equal. For
details, see Trimble’s thesis [88] and the papers by Jay [49] and Soloviev [83]. Also see Kelly and Mac
Lane’s coherence theorem for closed symmetric monoidal categories [56], and the related theorem for
compact symmetric monoidal categories [57].
MILL is just one of many closely related systems of logic. Most include extra features, but some
subtract features. Here are just a few examples:
· Algebraic theories. In his famous thesis, Lawvere [64] defined an algebraic theory to be a
cartesian category where every object is an n-fold cartesian power X
× ··· × X (n 0) of a
specific object X. He showed how such categories regarded as logical theories of a simple sort
– the sort that had previously been studied in `universal algebra’ [23]. This work initiated the
categorical approach to logic which we have been sketching here. Crole’s book [32] gives a gentle
introduction to algebraic theories as well as some richer logical systems. More generally, we can
think of any cartesian category as a generalized algebraic theory.
· Intuitionistic linear logic (ILL). ILL supplements MILL with the operations familiar from intu-
itionistic logic, as well as an operation ! turning any proposition (or resource) X into an `indefinite
stock of copies of X’. Again there is a nice category-theoretic interpretation. Bierman’s thesis
[21] gives a good overview, including a proof of cut elimination for ILL and a proof of the result,
originally due to Girard, that intuitionistic logic can be be embedded in ILL.
· Linear logic (LL). For full-fledged linear logic, the online review article by Di Cosmo and Miller
[35] is a good place to start. For more, try the original paper by Girard [41] and the book by
Troelstra [89]. Blute and Scott’s review article [22] serves as a Rosetta Stone for linear logic and
category theory, and so do the lectures notes by Schalk [75].
· Intuitionistic Logic (IL). Lambek and Scott’s classic book [62] is still an excellent introduction
to intuitionistic logic and cartesian closed categories. The online review article by Moschovakis
[70] contains many suggestions for further reading.
To conclude, let us say precisely what an `inference rule’ amounts to in the setup we have described.
We have said it gives a function from a product of homsets to a homset. While true, this is not the last
word on the subject. After all, instead of treating the propositions appearing in an inference rule as
fixed, we can treat them as variable. Then an inference rule is really a `schema’ for getting new proofs
from old. How do we formalize this idea?
47
background image
First we must realize that X
Y is not just a set: it is a set depending in a functorial way on X
and Y . As noted in Definition 13, there is a functor, the `hom functor’
hom: C
op
× C Set,
sending (X, Y ) to the homset hom(X, Y ) = X
Y . To look like logicians, let us write this functor as
.
Viewed in this light, most of our inference rules are natural transformations. For example, rule (a)
is a natural transformation between two functors from C
op
× C
3
to Set, namely the functors
(W, X, Y, Z)
W (X Y ) Z)
and
(W, X, Y, Z)
W X (Y Z).
This natural transformation turns any proof
f : W
(X Y ) Z)
into the proof
a
X,Y,Z
f : W
X (Y Z).
The fact that this transformation is natural means that it changes in a systematic way as we vary
W, X, Y and Z. The commuting square in the definition of natural transformation, Definition 3, makes
this precise.
Rules (l), (r), (b) and (c) give natural transformations in a very similar way. The (
) rule gives a
natural transformation between two functors from C
op
× C × C
op
× C to Set, namely
(W, X, Y, Z)
(W X) × (Y Z)
and
(W, X, Y, Z)
W Y X Z
This natural transformation sends any element (f, g)
hom(W,X) × hom(Y,Z) to f g.
The identity and cut rules are different: they do not give natural transformations, because the top
line of these rules has a different number of variables than the bottom line! Rule (i) says that for each
X
C there is a function
i
X
: 1
X X
picking out the identity morphism 1
X
. What would it mean for this to be natural in X? Rule (
) says
that for each triple X, Y, Z
C there is a function
:( X Y ) × (Y Z) X Z.
What would it mean for this to be natural in X, Y and Z? The answer to both questions involves a
generalization of natural transformations called `dinatural’ transformations [66].
As noted in Definition 3, a natural transformation : F
G between two functors F,G:C D
makes certain squares in D commute. If in fact C = C
op
1
× C
2
, then we actually obtain commuting
cubes in D. Namely, the natural transformation assigns to each object (X
1
, X
2
) a morphism
X
1
,X
2
such that for any morphism (f
1
: Y
1
X
1
, f
2
: X
2
Y
2
) in C, this cube commutes:
48
background image
G(Y
1
, X
2
)
G(Y
1
, Y
2
)
F (Y
1
, X
2
)
F (Y
1
, Y
2
)
G(X
1
, X
2
)
G(X
1
, Y
2
)
F (X
1
, X
2
)
F (X
1
, Y
2
)
E
G(1
Y1
,f
2
)
c
G(f
1
,1
Y2
)
c
F (f
1
,1
X2
)
E
F (1
Y1
,f
2
)
Y1,X2
G(f
1
,1
X2
)
c
c
F (f
1
,1
Y2
)
Y1,Y2
G(1
X1
,f
2
)
E
E
F (1
X1
,f
2
)
X1,X2
X1,Y2
If C
1
= C
2
, we can choose a single object X and a single morphism f : X
Y and use it in both
slots. As shown in Figure 1, there are then two paths from one corner of the cube to the antipodal
corner that only involve for repeated arguments: that is,
X,X
and
Y,Y
, but not
X,Y
or
Y,X
.
These paths give a commuting hexagon.
This motivates the following:
Definition 21 A dinatural transformation : F
G between functors F,G:C
op
×C D assigns
to every object X in C a morphism
X
: F (X, X)
G(X,X) in D such that for every morphism
f : X
Y in C, the hexagon in Figure 1 commutes.
In the case of the identity rule, this commuting hexagon follows from the fact that the identity
morphism is a left and right unit for composition: see Figure 2. For the cut rule, this commuting
hexagon says that composition is associative: see Figure 3.
So, in general, the sort of logical theory we are discussing involves:
· A category C of propositions and proofs.
· A functor :C
op
× C Set sending any pair of propositions to the set of proofs leading from
one to the other.
· A set of dinatural transformations describing inference rules.
49
background image
G(Y, X)
G(Y, Y )
F (Y, X)
F (Y, Y )
G(X, X)
G(X, Y )
F (X, X)
F (X, Y )
c
G(f,1
Y
)
c
F (f,1
X
)
E
F (1
Y
,f )
Y,Y
E
G(1
X
,f )
X,X
Figure 1: A natural transformation between functors F, G: C
op
× C D gives a commuting cube in D
for any morphism f : X
Y , and there are two paths around the cube that only involve for repeated
arguments.
50
background image
Y
Y
1
Y
1
·
1
·
X
X
1
X
X
Y
f
1
X
= 1
Y
f
1
·
c
-f
c
1
1
E
1
1
i
Y
E
f
-
i
X
Figure 2: Dinaturality of the (i) rule, where f : X
Y . Here · 1 denotes the one element of the
one-element set.
X
Z
h
(f g)
(X
W )
× (Y Z)
(g, h)
(X
Y )
× (Y Z)
(f
g,h)
X
Z
(h
f) g
X
Z
(h
f) g = h (f g)
(X
W )
× (W Z)
(g, h
f)
c
1
X Z
c
(1
X W
,
-f)
E
(f
-,1
X W
)
¨¨
¨¨
¨
B
E
1
X Z
¨¨
¨¨
¨
B
Figure 3: Dinaturality of the cut rule, where f : W
Y, g:X W, h:Y Z.
51
background image
4
Computation
4.1
Background
In the 1930s, while Turing was developing what are now called `Turing machines’ as a model for
computation, Church and his student Kleene were developing a different model, called the `lambda
calculus’ [27, 58]. While a Turing machine can be seen as an idealized, simplified model of computer
hardware, the lambda calculus is more like a simple model of software.
By now the are many careful treatments of the lambda calculus in the literature, from Barendregt’s
magesterial tome [15] to the classic category-theoretic treatment of Lambek and Scott [62], to Selinger’s
elegant online notes [79]. So, we shall content ourselves with a quick sketch.
Poetically speaking, the lambda calculus describes a universe where everything is a program and
everything is data — programs are data — and we can take any program and apply it to any piece of
data to get a new piece of data. More prosaically, everything is a `-term’, or `term’ for short. These
are defined inductively:
· Variables: there is a countable set of `variables’, which are all terms.
· Application: if f and g are terms, we can `apply’ f to g and obtain a term (fg).
· Lambda-abstraction: if x is a variable and t is a term, there is a term (x.t).
We think of (x.t) as the program that, given x as input, returns t as output. For example, if x is a
variable and f is a term, the term
(x.(f x))
stands for the program that, given x as input, returns (f x) as output. But this is just a fancy way of
talking about f ! So, the lambda calculus has a rule saying
(x.(f x)) = f.
But beware: this rule is not an equation in the usual mathematical sense. Instead, it is a `rewrite
rule’: given the term on the left, we are allowed to rewrite it and get the term on the right. There are
also other rewrite rules. Starting with a term and repeatedly applying rewrite rules is how we take a
program and letting it run.
The lambda calculus is a very simple formalism. However, starting from just this, Church and
Kleene were able to build up Boolean logic, the natural numbers and their usual operations, and so
on. For example, they defined `Church numerals’ as follows:
0 = (x.(y.y))
1 = (x.(y.(xy)))
2 = (x.(y.(x(xy))))
3 = (x.(y.(x(x(xy)))))
and so on. Thus, the Church numeral n is the program that `takes any program to the nth power’: if
you give it any program x as input, it returns the program that applies x n times to whatever input y
it receives. This makes it easy to define terms for addition, multiplication, and so on, and recover their
52
background image
usual properties. With more cleverness, Church and Kleene were able to write terms corresponding to
more complicated functions. They eventually came to believe that all computable functions f :
N N
can be defined in the lambda calculus.
Meanwhile, G¨
odel was developing another approach to computability, the theory of `recursive func-
tions’. Around 1936, Kleene proved that the functions definable in the lambda calculus were the same
as G¨
odel’s recursive functions. In 1937 Turing described his `Turing machines’, and used these to give
yet another definition of computable functions. This definition was later shown to agree with the other
two. Thanks to this and other evidence, it is now widely accepted that the lambda calculus can define
any function that can be computed by any systematic method. We say it is `Turing complete’.
It took a while for computer scientists to profit from Church and Kleene’s insights. However, in 1965
Landin pointed out a powerful analogy between the lambda calculus and the programming language
ALGOL [63]. Landin’s paper was very influential. It led to a renewed burst of work on the lambda
calculus which continues to this day. By now, a number of computer languages are explicitly based
on ideas from the lambda calculus. The most famous of these include Lisp, ML and Haskell. These
languages, called `functional programming languages’, are beloved by theoretical computer scientists
for their conceptual clarity. In fact, for many years, everyone majoring in computer science at MIT
has been required to take an introductory course that involves programming in Scheme, a dialect of
LISP. The cover of the textbook for this course [1] even has a big on the cover!
Languages of a different sort — `imperative programming languages’ — are more popular among
working programmers. Examples include FORTRAN, BASIC, and C. In imperative programming,
a program consists of statements that tell the computer what to do. In functional programming, a
program describes a function, and running it evaluates this function.
Here we are mainly interested in `typed’ functional programming languages. These are more regi-
mented than the original lambda calculus. In the original lambda calculus, any term can be applied to
any other. In real-world programming, such an unstructured framework easily leads to mistakes. It is
better to treat data as coming in various `types’, such as integers, floating-point numbers, alphanumeric
strings, and so on. Then, we can demand that each program only accept input of a specified type and
produce output of a specified type.
This idea is formalized by the `typed’ lambda calculus, where every term has a type. It also
corresponds to a basic idea in category theory, where every morphism is like a black box with input
and output wires of specified types:
f
X
Y
and it makes no sense to hook two black boxes together unless the output of the first has the same
type as the input of the next:
53
background image
f
g
X
Y
Z
However, in the lambda calculus the basic operation is not composition, but application. As we shall
see, this fits nicely into the framework of closed categories.
Indeed, in 1980 Lambek found a way to interpret the typed lambda calculus in terms of cartesian
closed categories [61]. This idea led to a productive line of research blending category theory and
computer science. There is no way we can summarize the resulting enormous body of work, though it
constitutes a crucial aspect of the Rosetta Stone. Two good starting points for further reading are the
textbook by Crole [32] and the online review article by Scott [73].
Our goal is more limited: we want to explain how morphisms in a closed symmetric monoidal
category can be seen as programs written in a very limited sort of programming language. Unlike the
lambda calculus, this language forbids duplication and deletion of data except when expressly permit-
ted. The reason is that while every object in a cartesian category comes equipped with `duplication’
morphisms, a monoidal category typically lacks these. As we saw in Section 2.3, a great example is the
category Hilb with its usual tensor product. So, in quantum computation we cannot freely duplicate
or delete data. This makes it interesting to envisage programming languages with this constraint built
in.
Various flavors of `linear’ or `quantum’ lambda calculus have already been studied, for example by
Benton, Bierman de Paiva and Hyland [19], Dorca and van Tonder [91], and Selinger and Valiron [81].
In the sections that follow we shall take a slightly different tack and consider a linear version, not of
the lambda calculus, but of `combinatory logic’.
Combinatory logic was born in a 1924 paper by Sch¨
onfinkel [76], and extensively developed by Curry
[33]. In retrospect, we can see it as a stripped-down version of the lambda calculus that completely
avoids the use of variables. Starting from a basic stock of terms called `combinators’, the only way to
build new ones is application: we can apply any term a to any term b and get a term (ab).
To build a Turing-complete programming language in such a impoverished setup, we need a sufficient
stock of combinators. In fact, three are enough. The first, called I, acts like the identity, since it comes
with the rewrite rule:
(Ia) = a
for every term a. The second, called K, gives a constant function (Ka) for each term a. In other
words, it comes with a rewrite rule saying
(Ka)b = a.
54
background image
for every term b. The third, called S, is the tricky one. It takes three terms, applies the first to the
third, and applies the result to the second applied to the third:
(((Sa)b)c) = ((ac)(bc)).
As an illustration of how these rules work, let us apply ((SK)K) to any term a:
(((SK)K)a) = ((Ka)(Ka)) = a.
This is the same as (Ia). So, we say ((SK)K) and I are `extensionally equivalent’. This means that I
is redundant! We included it just to make this point. Or, consider ((SI)I):
(((SI)I)a) = ((Ia)(Ia)) = (aa)
So, ((SI)I) takes any term a and applies it to itself. This is a very powerful feature, but it leads to
an infinite loop when we apply ((SI)I) to itself, since we get… nothing but ((SI)I)) applied to itself!
We can avoid infinite loops in a `typed’ combinatory logic, but there is a price to pay, since any Turing
complete programming langage must allow nonterminating programs. A good compromise is to use a
typed system with extra features that permit looping.
We can embed combinatory logic into the untyped lambda calculus by defining I, K, and S as
follows:
I
= (x.x)
K
= (x.(y.x))
S
= (x.(y.(z.((xz)(yz))))).
The rewrite rules for these combinators then follow from rewrite rules in the lambda calculus. The
more suprising fact is that any function computable using the lambda calculus can also be computed
using just I, K and S!
We can make this a bit more precise as follows. All the variables in the lambda calculus formulas
for I, K, and S are dummy variables. More generally, in the lambda calculus we define a `combinator’
to be a term in which all variables are dummy variables. Two combinators c and d are `extensionally
equivalent’ if they give the same result on any input: that is, for any term t, we can apply lambda
calculus rewrite rules to (ct) and (dt) in a way that leads to the same term. It is then a theorem that
any combinator in the lambda calculus is extensionally equivalent to one built from I, K, and S using
just application.
In the sections to come, we shall describe a linear version of typed combinatory logic, suitable for
closed symmetric monoidal categories. It is a bit irksome to avoid duplication or deletion of data in
the lambda calculus, since we need to count how many times each variable gets used. It is easier in
combinatory logic: we just need to avoid combinators that duplicate or delete data. For example, the
combinator I is okay:
(Ia) = a.
The combinator K is not, since it deletes b here:
((Ka)b) = a.
The combinator S is not okay either, since it duplicates c here:
(((Sa)b)c) = ((ac)(bc)).
55
background image
Luckily, we can read off a list of acceptable combinators directly from the definition of `closed symmetric
monoidal category’. The resulting system is not very powerful — but it will let us show how all the
key concepts from previous sections have analogues in the world of computation.
4.2
Categories in Functional Programming
In functional programming, the fundamental operation is application. Mathematicians apply a function
f to an argument x and write the result as f (x); in functional programming we apply a program f to
a piece of data x and write the result as (f x).
To actually evaluate expressions like (f x), we need `rewrite rules’. For example, suppose we have
programs double and increment, which compute the obvious functions from integers to integers. Then
we might have rewrite rules
(increment 3) = 4
and
(double 4) = 8
We can simplify more complex expressions by repeatedly using these rules:
(double (increment 3)) = (double 4) = 8
We can handle functions that take more than one argument using a trick discussed in Section 2.6:
`currying’. This turns a function of several arguments into a function that takes the first argument
and returns a function of the remaining arguments. So, for example, we could have a program plus
that adds integers as follows:
((plus 3) 5) = 8
where (plus 3) stands for a program that adds 3 to its input.
With currying, we can define increment and double in terms of addition and multiplication. In
other words, we can have rewrite rules
increment = (plus 1)
double = (times 2)
Then, for example, we have
(double (increment 3))
= (double ((plus 1) 3))
= ((times 2) ((plus 1) 3))
= ((times 2) 4)
= 8
56
background image
So far we have focused on programs that take integers as input. In reality, programming involves
many other kinds of data. For example, suppose we are writing a program that also involves days of
the week. It would not make sense to write (double Tuesday), because Tuesday is not a number. We
might choose to represent Tuesday by a number in some program, but doubling that number doesn’t
have a good interpretation: is the first day of the week Sunday or Monday? Is the week indexed from
zero or one? These are arbitrary choices that affect the result, so the expression (double Tuesday)
has no well-defined meaning.
We can avoid ill-defined expressions of this sort by `type checking.’ To do this, every piece of data
should have a `type,’ and our type checker (usually part of a compiler) should complain if we try to
apply a function to a piece of data of the wrong type. For example, the number 248 is an integer, while
Tueday
is a day of the week; in programming we denote this by something like:
248:integer
Tuesday:day
(This notation is used in Ada, Pascal and some other languages. Other notations are also in widespread
use.)
Given some types, we can build up other types in a variety of ways. For example, given types X
and Y, there is a new type for functions that take data of type X as input and return data of type Y as
output. This sort of type is called a function type. In the computer science literature, it would be
denoted X -> Y. However, to be consistent with the rest of our paper, we will write this function type
as X
Y
. The functions increment and double map integers to integers, so we write
increment:integer
integer
double:integer
integer
Addition and multiplication both have two inputs and one output:
plus:integer
(integer
integer)
times:integer
(integer
integer)
(The parentheses here don’t stand for application; they are used for grouping.)
In fact, everything we have done so far can be formalized in terms of a closed monoidal category.
Given such a category, we can call its objects data types. We can call a morphism f : X
Y a
program that takes data of type X as input and produces data of type Y as output. The simplest
sort of program is a piece of data: we call x: I
X is a datum of type X, and indicate this by writing
x:X
.
So, when we wrote Tuesday:day above, we can think of this as another way of saying
Tuesday: I
day.
This funny-looking morphism has an easy interpretation in the category Set: here I would be the
one-element set, `day’ would be the 7-element set consisting of days of the week, and `Tuesday’ would
be the function picking out this particular day of the week. But, we could equally well be working in
some category of data types and programs, and that is the interpretation we are trying to stress here.
57
background image
We explained in Section 2.6 that in a closed monoidal category, any morphism f : X
Y has a
`name’:
f : I
X
Y.
Conversely, any morphism g: I
X
Y is the name of a unique morphism f : X
Y . This makes
precise the idea that `programs are data’. Namely: any program that takes data of type X as input
and outputs data of type Y can be reinterpreted as a datum of type X
Y
, and conversely.
In functional programming, we use this fact to systematically work with the names of morphisms
rather than the morphisms themselves. This is why `application’ becomes more fundamental than
`composition’. For example, suppose we have a program:
f
X
Y
and this datum x of type X:
x
X
In category theory, we can feed the datum x into the program by composing them:
x
f
X
Y
In functional programming, we instead `apply’ the name of f to x, as follows:
x
f
X
X
Y
Y
58
background image
However, the results agree:
x
f
X
X
Y
Y
=
x
f
X
Y
The functional programming approach looks awkward when drawn using string diagrams, but simple
when described on its own terms, especially if we write something short for f , say g. Then we are
simply applying g to x and obtaining (g x). In the language of category theory, this amounts to using
a natural transformation called application:
app
X,Y
: hom(1, X
Y )
× hom(1,X) hom(1,Y )
( f , x)
fx
which can be defined in any closed monoidal category.
Computer scientists call ways of getting new types from old `type constructors’. From the category-
theoretic viewpoint these correspond to functors (though in practice, often just their value on objects
is given). For example, we have already seen that for any pair of types X and Y there is a function type
X
Y
. In category theory, this corresponds to the internal hom functor
: C
op
× C C.
But there is also another functor inherent in the definition of `closed monoidal category’, the tensor
product:
:C × C C.
This gives another type constructor: for any types X and Y there is a product type, denoted X
Y.
Product types give another way to deal with programs that take more than one input. We have
already mentioned a program
plus:integer
(integer
integer)
With the help of product types, we could also have a program
+:
(integer
integer)
integer
Even more importantly, product types let us deal with programs that produce more than one output.
So, for example, we might have a program that takes an integer and duplicates it:
59
background image
duplicate:
integer
(integer
integer)
Indeed, we would necessarily have such a program in any cartesian closed category — but not necessarily
in quantum computation [26, 93].
In computer science, `polymorphism’ refers to the ability of certain programs to accept data, not
just of a single fixed type, but of a variable type. The definition of closed monoidal category leads
us naturally to polymorphism. For example, given data x:X and y:Y, we would like a program that
combines them into an ordered pair: a datum x
y: X Y. But, it would be tiresome to use
a separate program to do this for each choice of types X and Y. So, we will use just one program: a
`polymorphic combinator’. We call it
, and write:
:( X
(Y
(X
Y))
Here X and Y are not fixed types, but variables standing for arbitrary types. To define how
behaves,
we use the following rewrite rule:
((
x) y) = x y
All this seems reasonable, but how was it forced upon us by the the definition of `closed monoidal
category’ ? To answer this, we need to think about dinatural transformations. In Section 3.3 we
saw that in logic, dinatural transformations give us inference schemas. In computation, they give us
polymorphic combinators!
For example,
comes from a dinatural transformation which can be defined in any closed monoidal
category. To see how this works, it will be slightly prettier if we henceforth switch to using left closed
monoidal categories, where currying goes like this:
c
X,Y,Z
: hom(X
Y,Z)
hom(X,Y
Z).
With this convention, if we curry the identity
1
X
Y
: X
Y X Y
we get a morphism called coevaluation:
coev
X,Y
: X
(Y
(X
Y )).
Currying once more, we get a morphism
X,Y
: I
(X (Y
(X
Y ))).
We urge the reader to check that this extends to a dinatural transformation! Translating into the
language of computer science, this gives the polymorphic combinator
:( X
(Y
(X
Y))
We can continue along these lines, transcribing all the dinatural transformations inherent in the
definition of `closed monoidal category’ into polymorphic combinators. We can also work out rewrite
rules for these. For example, the dinatural transformation for composition gives us this:
60
background image
compose:(Y
Z)
((X
Y)
(X
Z))
(((compose g) f) x) = (g (f x))
while the one for identity morphisms gives us this:
id:X
X
(id x) = x
We should also have a polymorphic combinator called curry:
curry:((X
Y)
Z)
(X
(Y
Z))
(((curry f) x) y) = (f (x
y))
Since currying is a bijection, this should have an inverse, uncurry:
uncurry:(X
(Y
Z))
((X
Y)
Z)
((uncurry f) (x
y)) = ((f x) y)
This lets us start with a program
plus:integer
(integer
integer)
and define a new program
+:
(integer
integer)
integer
using the rewrite rule
+ = (uncurry plus)
Then we have
(+ (1
3))
= ((uncurry plus) (1
3))
= ((plus 1) 3)
= 4
The associator also gives a polymorphic combinator:
assoc:((X
Y) Z)
(X
(Y Z))
(assoc (x
y) z)) = x (y z)
And, since the associator is invertible, we should also have
unassoc:(X
(Y Z))
((X
Y) Z)
(assoc x
(y z)) = (x y) z
61
background image
Furthermore, corresponding to the unit object in a monoidal category, we should have a type I called
the unit type. This type comes with a datum unit:I, and we have combinators corresponding to the
right and left unit laws, along with the obvious rewrite rules:
left:(I
X)
X
(left (unit
x)) = x
right:(X
I)
X
(right (X
unit)) = x
These too need inverses.
So far we have not mentioned the braiding. But at this point, we hope the reader sees the pattern
and is ready for a more formal treatment.
4.3
Combinator Calculi from Categories
In Section 3.3 we translated the concept of `closed symmetric monoidal category’ into the language
of logic. We obtained a system of logic that works in any such category. Like pidgin English, this
system is not very expressive. Its only virtue is that being so rudimentary, it works in a vast variety of
situations: not just `classical’ ones, but also intuitionistic, linear and quantum ones — and even purely
topological ones, like the category nCob.
A similar translation into the language of computer science will give us a rudimentary programming
language. This will let us see any morphism in any closed symmetric monoidal category as a kind of
`program’.
We have already sketched how this should work. Suppose C is a closed symmetric monoidal
category. Then any object X
C gives a type X. Any morphism x:I X gives a datum of type X. All
the functors inherent in the definition of `symmetric closed monoidal category’ give type constructors,
and all the dinatural transformations give polymorphic combinators. Finally, all the equations between
morphisms in C give rewrite rules.
We begin by doing this translation for just a closed monoidal category. A datum will be an
equivalence class of `terms’:
Definition 22 A closed monoidal combinator calculus consists of:
· a collection of types closed under the following type constructors:
­ I is a type;
­ if X and Y are types, then X
Y
is a type;
­ if X and Y are types, then X
Y is a type.
· a collection of terms, each of some type, such that:
­ if X and Y are types and f:X
Y
and x:X are terms, then (f x):Y is a term;
­ if X and Y are types and x:X and y:Y are terms, then x
y:X Y is a term;
62
background image
­ unit:I is a term.
­ if X, Y, and Z are types, then the following polymorphic combinators give terms:
id:X
X
compose:(Y
Z)
((X
Y)
(X
Z))
curry:((X Y)
Z)
(X
(Y
Z))
uncurry:((X
Y)
Z)
((X
Y)
Z)
left:(I X)
X
unleft:X
(I
X)
right:(X I)
X
unright:X
(X
I)
· a collection of type-preserving rewrite rules, including:
­ (id t) = t
­ (((compose t) u) v) = (t (u v))
­ (((curry t) u) v) = (t (u
v))
­ ((uncurry t) (u
v)) = ((t u) v)
­ (left (unit
t)) = t
­ (unleft t) = unit
t
­ (right (t
unit)) = t
­ (unright t) = t
unit
where the variables t, u, v are understood to refer to the subterms in those positions.
We can freely pass back and forth between closed monoidal categories and closed monoidal combi-
nator calculi. However, the details deserve a little explanation, since morphisms come from equivalence
classes of terms.
First, given a closed monoidal category C, there is a way to build a closed monoidal combinator
calculus. The types in this calculus are the objects of C and all expressions generated from these
using the type constructors
and
. When X is an object of C, the terms x:X are the morphisms
x: I
X. Terms of other types are generated following the rules listed above. In addition to the
rewrite rules listed above, we need rules that say how application and tensoring work in C. Given
morphisms x: I
X and g:I X
Y in C, we can apply g to x and get a morphism y: I
Y .
These give terms x:X, g:X
Y
, and y:Y, and we include a rewrite rule:
(g x) = y
Similarly, given morphisms x: I
X and y:I Y , tensoring gives an object Z = X Y and a
morphism z: I
Z. These give terms x:X, y:Y, and z:Z, and we include a rewrite rule:
x
y = z
63
background image
Second, from a closed monoidal combinator calculus, there is a way to build a closed monoidal
category C. An object X
C is just a type X. A morphism of the form x:X Y is an equivalence
class of terms of type X
Y
. The equivalence relation is generated as follows:
· if the terms x:X and x :X are related by a rewrite rule, they are equivalent;
· if the terms x:X and x :X are equivalent and the terms f:X
Y
and f :X
Y
are equivalent,
then (f x):Y and (f
x ):Y
are equivalent;
· if the terms x:X and x :X are equivalent and the terms y:Y and y :Y are equivalent, then x
y:
X
Y and x y : X Y are equivalent.
We define composition of morphisms using compose, define identity morphisms using id, and make C
into a closed monoidal category using the other features of the calculus: the term constructors
and
, and the various polymorphic combinators and rewrite rules.
Next we include the braiding. For this, we follow an approach going back to Sch¨
onfinkel. Namely,
we use the braiding to construct a natural transformation
((X
Y ) Z) ((Y X) Z)
which in turn gives a natural transformation
(X
(Y
Z))
(Y
(X
Z))
which we can curry to give a dinatural transformation:
braid
X,Y,Z
: I
(X
(Y
Z))
(Y
(X
Z))
This gives a polymorphic combinator which Sch¨
onfinkel called C. We instead call it braid:
braid:(X
(Y
Z))
(Y
(X
Z))
At this point, a subtlety arises. The obvious rewrite rule for this combinator is
(((braid f) x) y) = ((f y) x)
But this is only suited to symmetric monoidal categories, since the terms (braid (braid f)) and f
are equivalent in the sense described above. To describe nonsymmetric braided monoidal categories
using combinatory logic, it seems we need to specify rewrite rules in a way that depends explicitly on
the desired braiding:
Definition 23 A closed monoidal combinator calculus is braided if whenever X, Y, and Z are types,
then
braid:(X
(Y
Z))
(Y
(X
Z))
is a term, and braid obeys rewrite rules corresponding to the hexagon identities in the definition of
`braided monoidal category’.
64
background image
Definition 24 A closed braided monoidal combinator calculus is symmetric if it is equipped with the
rewrite rule
(braid (braid f)) = f
Such calculi give closed braided or symmetric monoidal categories, and conversely. For more details,
see the work of Abramsky, Haghverdi and Scott [5] on `linear combinatory algebra’, and Bierman’s
reformulation of intuitionistic linear logic in terms of a `linear combinator calculus’ [21]. Our setup is
just a piece of what they have done.
By some adding extra features to a closed monoidal combinator calculus, we can ensure that the
corresponding closed monoidal category is cartesian. The resulting formalism is just another way of
thinking about the simply typed lambda calculus. If we also have just one generating type X, and a
combinator
:X
(X
X)
with an inverse
:( X
X)
X
we recover the original untyped lambda calculus, since now any term x:X can be `applied’ to any other
term y:X as follows:
((
x) y)
So, these more familiar frameworks for studying computation can be seen as special cases of the one
described here.
5
Conclusions
In this paper we sketched how category theory can serve to clarify the analogies between physics,
topology, logic and computation. Each field has its own concept of `thing’ (object) and `process’
(morphism) — and these things and processes are organized into categories that share many common
features. To keep our task manageable, we focused on those features that are present in every closed
symmetric monoidal category. Table 4, an expanded version of the Rosetta Stone, shows some of the
analogies we found.
However, we only scratched the surface! There is much more to say about categories equipped with
extra structure, and how we can use them to strengthen the ties between physics, topology, logic and
computation — not to mention what happens when we go from categories to n-categories. But the
real fun starts when we exploit these analogies to come up with new ideas and surprising connections.
Here is an example.
In the late 1980s, Witten [92] realized that string theory was deeply connected to a 3d topological
quantum field theory and thus the theory of knots and tangles [60]. This led to a huge explosion
65
background image
Category Theory
Physics
Topology
Logic
Computation
object X
Hilbert space X
manifold X
proposition X
data type X
morphism
operator
cobordism
proof
program
f : X
Y
f : X
Y
f : X
Y
f : X
Y
f:
X -> Y
tensor product
Hilbert space
disjoint union
conjunction
product
of objects:
of joint system:
of manifolds:
of propositions:
of data types:
X
Y
X
Y
X
Y
X
Y
X
Y
tensor product of
parallel
disjoint union of
proofs carried out
programs executing
morphisms: f
g processes: f g cobordisms: f g
in parallel: f
g in parallel: f g
internal hom:
Hilbert space of
disjoint union of
conditional
function type:
X
Y
`anti-X and Y ‘:
orientation-reversed
proposition:
X -> Y
X
Y
X and Y : X
Y
X
Y
Table 4: The Rosetta Stone (larger version)
of work, which was ultimately distilled into a beautiful body of results focused on a certain class of
compact braided monoidal categories called `modular tensor categories’ [14, 90].
All this might seem of purely theoretical interest, were it not for the fact that superconducting thin
films in magnetic fields seem to display an effect — the `fractional quantum Hall effect’ — that can be
nicely modelled with the help of such categories [85, 86]. In a nutshell, the idea is that excitations of
these films can act like particles, called `anyons’. When two anyons trade places, the result depends
on how they go about it:
=
So, collections of anyons are described by objects in a braided monoidal category! The details depend
on things like the strength of the magnetic field; the range of possibilities can be worked out with the
help of modular tensor categories [69, 72].
So far this is all about physics and topology. Computation entered the game around 2000, when
Freedman, Kitaev, Larsen and Wang [39] showed that certain systems of anyons could function as
`universal quantum computers’. This means that, in principle, arbitrary computations can be carried
out by moving anyons around. Doing this in practice will be far from easy. However, Microsoft has set
up a research unit called Project Q attempting to do just this. After all, a working quantum computer
could have huge practical consequences.
But regardless of whether topological quantum computation ever becomes practical, the implica-
66
background image
tions are marvelous. A simple diagram like this:
can now be seen as a quantum process, a tangle, a computation — or an abstract morphism in any
braided monoidal category! This is just the sort of thing one would hope for in a general science of
systems and processes.
Acknowledgements
We owe a lot to participants of the seminar at UCR where some of this material was first presented:
especially David Ellerman, Larry Harper, Tom Payne — and Derek Wise, who took notes [10]. MS
would like to thank Google for letting him devote 20% of his time to this research, and Ken Shirriff for
helpful corrections. This paper was also vastly improved by comments by Andrej Bauer, Tim Chevalier,
Derek Elkins, Matt Hellige, Robin Houston, Theo Johnson­Freyd, J¨
urgen Koslowski, Todd Trimble,
Dave Tweed, and other regulars at the n-Category Caf´e.
References
[1] H. Abelson, G. J. Sussman and J. Sussman, Structure and Interpretation of Computer Programs,
MIT Press, 1996. Available at http://mitpress.mit.edu/sicp/.
[2] S. Abramsky, Abstract scalars, loops, and free traced and strongly compact closed categories, in
Proceedings of CALCO 2005, Lecture Notes in Computer Science 3629, Springer, Berlin, 2005,
1­31. Also available at
http://web.comlab.ox.ac.uk/oucl/work/samson.abramsky/calco05.pdf
.
[3] S. Abramsky and B. Coecke, A categorical semantics of quantum protocols, available at
arXiv:quant-ph/0402130
.
[4] S. Abramsky and R. Duncan, A categorical quantum logic, to appear in Mathematical Structures
in Computer Science, 2006. Also available as arXiv:quant-ph/0512114.
[5] S. Abramsky, E. Haghverdi and P. Scott, Geometry of interaction and linear combinatory algebras,
Math. Struct. Comp. Sci. 12 (2002), 625­665. Also available at
http://citeseer.ist.psu.edu/491623.html
.
[6] M. F. Atiyah, Topological quantum field theories, Publ. Math. IHES Paris 68 (1989), 175­186.
M. F. Atiyah, The Geometry and Physics of Knots, Cambridge U. Press, Cambridge, 1990.
67
background image
[7] J. Baez, An introduction to spin foam models of quantum gravity and BF theory, in Geometry
and Quantum Physics, eds. H. Gausterer and H. Grosse, Springer, Berlin, 2000, pp. 25­93. Also
available at arXiv:gr-qc/9905087.
[8] J. Baez, Higher-dimensional algebra and Planck-scale physics, in Physics Meets Philosophy at the
Planck Length, eds. C. Callender and N. Huggett, Cambridge U. Press, Cambridge, 2001, pp.
177­195. Also available as arXiv:gr-qc/9902017.
[9] J. Baez, Quantum quandaries: a category-theoretic perspective, in Structural Foundations of
Quantum Gravity, eds. S. French, D. Rickles and J. Saatsi, Oxford U. Press, Oxford, 2006, pp.
240-265. Also available as arXiv:quant-ph/0404040.
[10] J. Baez, Classical versus quantum computation. Seminar notes by D. Wise available at
http://math.ucr.edu/home/baez/qg-fall2006/
and
http://math.ucr.edu/home/baez/qg-winter2007
.
[11] J. Baez and J. Dolan, Higher-dimensional algebra and topological quantum field theory, Jour.
Math. Phys. 36 (1995), 6073­6105. Also available as arXiv:q-alg/9503002.
[12] J. Baez and L. Langford, Higher-dimensional algebra IV: 2-tangles, Adv. Math. 180 (2003), 705­
764. Also available as arXiv:q-alg/9703033.
[13] J. Baez and A. Lauda, A prehistory of n-categorical physics, to appear in proceedings of
Deep Beauty: Mathematical Innovation and the Search for an Underlying Intelligibility of
the Quantum World, Princeton, October 3, 2007, ed. Hans Halvorson. Also available at
http://math.ucr.edu/home/baez/history.pdf
.
[14] B. Bakalov and A. Kirillov, Jr., Lectures on Tensor Categories and Modular Functors, Amer-
ican Mathematical Society, Providence, Rhode Island, 2001. Preliminary version available at
http://www.math.sunysb.edu/
kirillov/tensor/tensor.html.
[15] H. Barendregt, The Lambda Calculus, its Syntax and Semantics, North­Holland, Amsterdam,
1984.
[16] M. Barr and C. Wells, Toposes, Triples and Theories, Springer Verlag, Berlin, 1983. Revised and
corrected version available at http://www.cwru.edu/artsci/math/wells/pub/ttt.html.
[17] J. S. Bell, On the Einstein-Podolsky-Rosen paradox, Physics 1 (1964), 195­200.
[18] J. L. Bell, The Development of Categorical Logic, available at
http://publish.uwo.ca/
jbell/catlogprime.pdf.
[19] N. Benton, G. M. Bierman, V. de Paiva and J. M. E. Hyland, Linear lambda-calculus
and categorical models revisited, in Computer Science Logic (CSL’92), Selected Papers, Lec-
ture Notes in Computer Science 702, Springer, Berlin, 1992, pp. 61­84. Also available at
http://citeseer.ist.psu.edu/benton92linear.html
[20] N. Benton, G. Bierman, V. de Paiva and M. Hyland, Term Assignment for Intuitionistic Linear
Logic, Technical Report 262, University of Cambridge Computer Laboratory, August 1992. Also
available at http://citeseer.ist.psu.edu/1273.html.
68
background image
[21] G. Bierman, On Intuitionistic Linear Logic, PhD Thesis, Cambridge University. Available at
http://research.microsoft.com/
gmb/Papers/thesis.pdf.
[22] R. Blute and P. Scott, Category theory for linear logicians, in Linear Logic in Computer Science,
eds. T. Ehrhard, J.-Y. Girard, P. Ruet, P. Scott, Cambridge U. Press, Cambridge, 2004, pp. 3­64.
Also available as http://www.site.uottawa.ca/
phil/papers/catsurv.web.pdf.
[23] S. N. Burris and H. P. Sankappanavar, A Course in Universal Algebra, Springer, Berlin, 1981.
Also available as http://www.math.uwaterloo.ca/
snburris/htdocs/ualg.html.
[24] V. Chari and A. Pressley, A Guide to Quantum Groups, Cambridge University Press, Cambridge,
1995.
[25] E. Cheng and A. Lauda, Higher-Dimensional Categories: an Illustrated Guidebook. Available at
http://www.dpmms.cam.ac.uk/
elgc2/guidebook/.
[26] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge
U. Press, Cambridge, 2000.
[27] A. Church, An unsolvable problem of elementary number theory, Amer. Jour. Math. 58 (1936),
345­363.
[28] B. Coecke, De-linearizing linearity: projective quantum axiomatics from strong compact closure,
Proceedings of the 3rd International Workshop on Quantum Programming Languages (QPL 2005),
Elsevier, 2007, pp. 49­72. Also available as arXiv:quant-ph/0506134.
[29] B. Coecke, Kindergarten quantum mechanics, to appear in Proceedings of QTRF-III. Also avail-
able as arXiv:quant-ph/0510032.
[30] B. Coecke and E. O. Paquette, POVMs and Naimark’s theorem without sums, to appear in Pro-
ceedings of the 4th International Workshop on Quantum Programming Languages. Also available
as arXiv:quant-ph/0608072.
[31] B. Coecke and D. Pavlovic, Quantum measurements without sums, to appear in The Mathematics
of Quantum Computation and Technology, eds. Chen, Kauffman and Lomonaco, Taylor and
Francis. Also available as arXiv:quant-ph/0608035.
[32] R. L. Crole, Categories for Types, Cambridge U. Press, Cambridge, 1993.
[33] H. B. Curry and R. Feys, Combinatory Logic Vol. I, North­Holland, Amsterdam, 1958.
[34] P.
Cvitanovic,
Group Theory,
Princeton U.
Press,
Princeton,
2003. Available at
http://www.nbi.dk/GroupTheory/
.
[35] R. Di Cosmo and D. Miller, Linear logic, Stanford Encyclopedia of Philosophy, available at
http://plato.stanford.edu/entries/logic-linear/
.
[36] M. Dorca and A. van Tonder, Quantum computation, categorical semantics and linear logic,
available as arXiv:quant-ph/0312174.
[37] S. Eilenberg and G. M. Kelly, Closed categories, in Proceedings of the Conference on Categorical
Algebra (La Jolla, 1965), Springer, Berlin, 1966, pp. 421­562.
69
background image
[38] S. Eilenberg and S. Mac Lane, General theory of natural equivalences, Trans. Amer. Math. Soc.
58 (1945), 231­294.
[39] M. Freedman, A. Kitaev, M. Larsen and Z. Wang, Topological quantum computation, available
as arXiv:quant-ph/0101025.
M. Freedman, A. Kitaev and Z. Wang, Simulation of topological field theories by quantum com-
puters, Comm. Math. Phys. 227 (2002), 587­603. Also available as arXiv:quant-ph/0001071.
M. Freedman, A. Kitaev and Z. Wang, A modular functor which is universal for quantum com-
putation, Comm. Math. Phys. 227 (2002), 605­622. Also available as arXiv:quant-ph/0001108.
[40] P. Freyd and D. Yetter, Braided compact monoidal categories with applications to low dimensional
topology, Adv. Math. 77 (1989), 156­182.
[41] J.-Y. Girard,
Linear logic,
Theor. Comp. Sci. 50 (1987),
1­102. Also available at
http://iml.univ-mrs.fr/ girard/linear.pdf
.
[42] J.-Y. Girard, Y. Lafont and P. Taylor, Proofs and Types, Cambridge U. Press, Cambridge, 1990.
Also available at http://www.monad.me.uk/stable/Proofs%2BTypes.html.
[43] K. G¨
odel, Zur intuitionistischen Arithmetik und Zahlentheorie, Ergebnisse eines mathematischen
Kolloquiums 4 (1933), 34­38.
[44] R. Goldblatt, Topoi: the Categorial Analysis of Logic, North­Holland, New York, 1984. Also
available at http://cdl.library.cornell.edu/cgi-bin/cul.math/docviewer?did=Gold010.
[45] D. Gottesman and I. L. Chuang, Quantum teleportation is a universal computational primitive,
Nature 402 (1999), 390­393. Also available as arXiv:quant-ph/9908010.
[46] M. Hasegawa, Logical predicates for intuitionistic linear type theories, Typed Lambda
Calculi and Applications:
4th International Conference, TLCA ‘99, ed. J.-Y. Girard,
Lecture Notes in Computer Science 1581, Springer, Berlin, 1999. Also available at
http://citeseer.ist.psu.edu/187161.html
.
[47] A. Heyting, ed., L. E. J. Brouwer: Collected Works 1: Philosophy and Foundations of Mathemat-
ics, Elsevier, Amsterdam, 1975.
[48] W. A. Howard, The formulae-as-types notion of constructions, in To H. B. Curry: Essays on Com-
binatory Logic, Lambda Calculus and Formalism, eds. J. P. Seldin and J. R. Hindley, Academic
Press, New York, 1980, pp. 479­490.
[49] C. B. Jay, The structure of free closed categories, Jour. Pure Appl. Alg. 66 (1990), 271­285.
[50] A. Joyal and R. Street, The geometry of tensor calculus I, Adv. Math. 88 (1991), 55­113.
A. Joyal and R. Street, The geometry of tensor calculus II. Available at
http://www.math.mq.edu.au/ street/GTCII.pdf
.
[51] A. Joyal and R. Street, Braided monoidal categories, Macquarie Math Reports 860081 (1986).
Available at http://rutherglen.ics.mq.edu.au/
street/JS86.pdf.
A. Joyal and R. Street, Braided tensor categories, Adv. Math. 102 (1993), 20­78.
70
background image
[52] D. Kaiser, Drawing Theories Apart: The Dispersion of Feynman Diagrams in Postwar Physics,
U. Chicago Press, Chicago, 2005.
[53] C. Kassel, Quantum Groups, Springer, Berlin, 1995.
[54] L. H. Kauffman, Knots and Physics, World Scientific, Singapore, 1991.
[55] L. H. Kauffman and S. Lins, Temperley­Lieb Recoupling Theory and Invariants of 3-Manifolds,
Princeton U. Press, Princeton, 1994.
[56] G. M. Kelly and S. Mac Lane, Coherence in closed categories, Jour. Pure Appl. Alg. 1 (1971),
97­140 and 219.
[57] G. M. Kelly and M. L. Laplaza, Coherence for compact closed categories, Jour. Pure Appl. Alg.
19 (1980), 193­213.
[58] S. Kleene, -definability and recursiveness, Duke Math. Jour. 2 (1936), 340­353.
[59] J. Kock, Frobenius Algebras and 2D Topological Quantum Field Theories, London Mathematical
Society Student Texts 59, Cambridge U. Press, Cambridge, 2004.
[60] T. Kohno, ed., New Developments in the Theory of Knots, World Scientific, Singapore, 1990.
[61] J. Lambek, From -calculus to cartesian closed categories, in To H. B. Curry: Essays on Com-
binatory Logic, Lambda Calculus and Formalism, eds. J. P. Seldin and J. R. Hindley, Academic
Press, New York, 1980, pp. 375­402.
[62] J. Lambek and P. J. Scott, Introduction to Higher-order Categorical Logic, Cambridge U. Press,
Cambridge, 1986.
[63] P. Landin, A correspondence between ALGOL 60 and Church’s lambda-notation, Comm. ACM 8
(1965), 89­101, 158­165.
[64] F. W. Lawvere, Functorial Semantics of Algebraic Theories, Ph.D. Dissertation, Columbia Uni-
versity, 1963. Also available at
http://www.tac.mta.ca/tac/reprints/articles/5/tr5abs.html
.
[65] T. Leinster, A survey of definitions of n-category, Th. Appl. Cat. 10 (2002), 1­70. Also available
as arXiv:math/0107188.
[66] S. Mac Lane, Natural associativity and commutativity, Rice Univ. Stud. 49 (1963) 28­46.
[67] S. Mac Lane, Categories for the Working Mathematician, Springer, Berlin, 1998.
[68] C. McLarty, Elementary Categories, Elementary Toposes, Clarendon Press, Oxford, 1995.
[69] G. Moore and N. Read, Nonabelions in the the fractional quantum Hall effect, Nucl. Phys. B 360
(1991), 362­396.
[70] J. Moschovakis, Intuitionistic logic Stanford Encyclopedia of Philosophy, available at
http://plato.stanford.edu/entries/logic-intuitionistic/
.
71
background image
[71] R. Penrose, Applications of negative dimensional tensors, in Combinatorial Mathematics and its
Applications, ed. D. Welsh. Academic Press, 1971, pp. 221­244.
R. Penrose, Angular momentum: an approach to combinatorial space-time, in Quantum Theory
and Beyond, ed. T. Bastin. Cambridge U. Press, 1971, pp. 151­180.
R. Penrose, On the nature of quantum geometry, in Magic Without Magic, ed. J. Klauder. Free-
man, 1972, pp. 333­354.
R. Penrose, Combinatorial quantum theory and quantized directions, in Advances in Twistor
Theory, eds. L. Hughston and R. Ward. Pitman Advanced Publishing Program, 1979, pp. 301­
317.
[72] E. Rowell, R. Stong and Z. Wang, On classification of modular tensor categories, available as
arXiv:0712.1377
.
[73] P. Scott, Some aspects of categories in computer science, in Handbook of Algebra, Vol. 2, ed. M.
Hazewinkel, Elsevier, Amsterdam, 2000. Also available at
http://www.site.uottawa.ca/ phil/papers/handbook.ps
.
[74] S. Sawin, Links, quantum groups and TQFTs, Bull. Amer. Math. Soc. 33 (1996), 413-445. Also
available as arXiv:q-alg/9506002.
[75] A. Schalk, What is a categorical model for linear logic? Available at
http://www.cs.man.ac.uk/
schalk/notes/llmodel.pdf.
[76] M. Sch¨
onfinkel, ¨
Uber die Bausteine der mathematischen Logik, Math. Ann. 92 (1924), 305­316.
Also available as On the building blocks of mathematical logic, trans. S. Bauer-Mengelberg, in A
Source Book in Mathematical Logic, 1879-1931, ed. J. van Heijenoort, Harvard U. Press, Cam-
bridge, Massachusetts, 1967, pp. 355­366.
[77] R. A. G. Seely, Weak adjointness in proof theory, Applications of Sheaves, Lecture Notes in
Mathematics 753, Springer, Berlin, 697­701. Also available at
http://www.math.mcgill.ca/
rags/WkAdj/adj.pdf.
[78] G. Segal, The definition of a conformal field theory, in Topology, Geometry and Quantum Field
Theory: Proceedings of the 2002 Oxford Symposium in Honour of the 60th Birthday of Graeme
Segal, ed. U. L. Tillmann, Cambridge U. Press, 2004.
[79] P. Selinger, Lecture notes on the lambda calculus, available at
http://www.mscs.dal.ca/
selinger/papers/#lambdanotes.
[80] P. Selinger, Dagger compact closed categories and completely positive maps, Proceedings of the
3rd International Workshop on Quantum Programming Languages (QPL 2005), Elsevier, 2007,
pp. 139­163. Also available at http://www.mscs.dal.ca/
selinger/papers/#dagger.
[81] P. Selinger and B. Valiron, A lambda calculus for quantum computation with classical control,
Math. Struct. Comp. Sci. 13 (2006), 527­552. Also available at
http://www.mathstat.dal.ca/
selinger/papers/#qlambda.
[82] M.-C. Shum, Tortile tensor categories, Jour. Pure Appl. Alg. 93 (1994), 57­110.
72
background image
[83] S. Soloviev, Proof of a conjecture of S. Mac Lane, Ann. Pure Appl. Logic 90 (1997), 101­162.
[84] L. Smolin, The future of spin networks, The Geometric Universe: Science, Geometry, and the
Work of Roger Penrose, eds. S. Hugget, P. Tod, and L. J. Mason, Oxford U. Press, Oxford, 1998.
Also available as arXiv:gr-qc/9702030.
[85] A. Stern, Anyons and the quantum Hall effect ­ a pedagogical review, Ann. Phys. 323 (2008),
204­249. available as arXiv:0711.4697.
[86] M. Stone, ed., Quantum Hall Effect, World Scientific, Singapore, 1992.
[87] M. E. Szabo, ed., Collected Papers of Gerhard Gentzen, North­Holland, Amsterdam, 1969.
[88] T. Trimble, Linear Logic, Bimodules, and Full Coherence for Autonomous Categories, Ph.D. thesis,
Rutgers University, 1994.
[89] A. S. Troelstra, Lectures on Linear Logic, Center for the Study of Language and Information,
Stanford, California, 1992.
[90] V. G. Turaev, Quantum Invariants of Knots and 3-Manifolds, de Gruyter, Berlin, 1994.
[91] A. van Tonder, A lambda calculus for quantum computation, SIAM Jour. Comput. 33 (2004),
1109­1135. Also available as arXiv:quant-ph/0307150.
[92] E. Witten, Quantum field theory and the Jones polynomial, Comm. Math. Phys. 121 (1989),
351­399.
[93] W. K. Wootters and W. H. Zurek, A single quantum cannot be cloned, Nature 299 (1982), 802­
803.
[94] D. N. Yetter, Functorial Knot Theory: Categories of Tangles, Coherence, Categorical Deforma-
tions, and Topological Invariants, World Scientific, Singapore, 2001.
73
Document Outline

* Introduction
* The Analogy Between Physics and Topology
o Background
o Categories
o Monoidal Categories
o Braided Monoidal Categories
o Symmetric Monoidal Categories
o Closed Categories
o Dagger Categories
* Logic
o Background
o Proofs as Morphisms
o Logical Theories from Categories
* Computation
o Background
o Categories in Functional Programming
o Combinator Calculi from Categories
* Conclusions

logic ,phisics and history ,lecture by some punk

old man resting his forehead on his hand like he’s got a headache after working that he owes a million dollars back taxes

Karl Popper (1966)
Objective Knowledge
A Realist View of Logic, Physics, and History (1966)

Source: Objective Knowledge (1972) publ. Clarendon Press. The second last chapter is reproduced here.

MAN, some modern philosophers tell us, is alienated from his world: he is a stranger and afraid in a world he never made. Perhaps he is; yet so are animals, and even plants. They too were born, long ago, into a physico-chemical world, a world they never made. But although they did not make their world, these living things changed it beyond all recognition and, indeed, remade the small corner of the universe into which they were born. Perhaps the greatest of these changes was made by the plants. They radically transformed the chemical composition of the earth’s whole atmosphere. Next in magnitude are perhaps the achievements of some marine animals which built coral reefs and islands and mountain ranges of limestone. Last came man, who for a long time did not change his environment in any remarkable way, apart from contributing, by deforestation, to the spread of the desert. Of course, he did build a few pyramids; but only during the last century or so , did he begin to compete with the reef-building corals. Still more recently he began to undo the work of the plants by slightly, though significantly, raising the carbon dioxide content of the atmosphere.

Thus, we have not made our world. So far we have not even changed it much, compared with the changes achieved by animals and plants. Yet we have created a new kind of product or artefact which promises in time to work changes in our corner of the world as great as those worked by our predecessors, the oxygen-producing plants, or the island-building corals. These new products, which are decidedly of our own making, are our myths, our ideas, and especially our scientific theories: theories about the world we live in.

I suggest that we may look upon these myths, these ideas and theories, as some of the most characteristic products of human activity. Like tools, they are organs evolving outside our skins. They are exosomatic artefacts. Thus we may count among these characteristic products especially what is called ‘human knowledge’; where we take the word ‘knowledge’ in the objective or impersonal sense, in which it may be said to be contained in a book; or stored in a library; or taught in a university.

When referring to human knowledge, I shall usually have this objective sense of the word ‘knowledge’ in mind. This allows us to think of knowledge produced by men as analogous to the honey produced by bees: the honey is made by bees, stored by bees, and consumed by bees; and the individual bee which consumes honey will not, in general, consume only the bit it has produced itself: honey is also consumed by the drones which have not produced any at all (not to mention that stored treasure of honey which the bees may lose to bears or beekeepers). It is also interesting to note that, in order to keep up its powers to produce more honey, each working bee has to consume honey, some of it usually produced by other bees.

All this holds, by and large, with slight differences, for oxygen-producing plants and for theory-producing men: we, too, are not only producers but consumers of theories; and we have to consume other people’s theories, and sometimes perhaps our own, if we are to go on producing.

‘To consume’ means here, first of all, ‘to digest’, as in the case of the bees. But it means more: our consumption of theories, whether those produced by other people or by ourselves, also means criticising them, changing them, and often even demolishing them, in order to replace them by better ones.

All these are operations which are necessary for the growth of our knowledge; and I again mean here, of course, knowledge in the objective sense.

I suggest that it looks at present as if it is this growth of human knowledge, the growth of our theories, which turns out, human history into a chapter so radically new in the history of the universe, and also in the history of life on earth.

All three of these histories-the history of the universe, the history of life on earth, and the history of man and of the growth of his knowledge-are, of course, themselves chapters of our knowledge. Consequently, the last of these chapters-that is, the history of knowledge-will consist of knowledge about knowledge. It will have to contain, at least implicitly, theories about theories, and especially theories about the way in which theories grow.

I shall, therefore, before going any further into my topic, present a general tetradic schema which I have found more and more useful as a description of the growth of theories. It is as follows:

P1 » TT » EE » P2.

Here ‘P’ stands for ‘problem’; ‘TT’ stands for ‘tentative theory’; and ‘EE’ stands for ‘(attempted) error-elimination’, especially by way of critical discussion. My tetradic schema is an attempt to show that the result of criticism, or of error-elimination, applied to a tentative theory, is as a rule the emergence of a new problem; or, indeed, of several new problems. Problems, after they have been solved and their solutions properly examined, tend to beget problem-children: new problems, often of greater depth and ever greater fertility than the old ones. ‘This can be seen especially in the physical sciences; and I suggest that we can best gauge the progress made in any science by the distance in depth and expectedness between P1 and P2: the best tentative theories (and all theories are tentative are those which give rise to the deepest and most unexpected problems.

My tetradic schema can be elaborated in various ways; for example, by writing it as follows:

» TTa » EEa » P2a

P1 » TTb » EEb » P2b

» TTn » EEn » P2n

In this form the schema would indicate that, if we can, we should propose many theories as attempts to solve some given problem, and that we should critically examine each of our tentative solutions. We then find that each gives rise to new problems; and we may follow up those which promise the most novel and most interesting new problem: if the new problem, P2b, say, turns out to be merely the old P, in disguise, then we say that our theory only manages to shift the problem a little; and in some cases we may take this as a decisive objection to the tentative theory, TTb.

This shows that error-elimination is only part of our critical discussion: our critical discussion of the competing tentative theories may compare them, and assess them, from many different points of view. The decisive point is, of course, always: how well does our theory solve its problems; that is, P1.

At any rate, one of the things we wish to achieve is to learn something new. According to our schema, progressiveness is one of the things we demand of a good tentative theory: and it is brought out by the critical discussion of it: the theory is progressive if our discussion shows that it has really made a deference to the problem we wanted to solve; that is, if the newly emerging problems are different from the old ones.

If the newly emerging problems are different then we can hope to learn a great many new things when we proceed to solve them in turn.

Thus my tetradic schema can be used to describe the emergence of new problems and, consequently, the emergence of new solutions-that is, new theories; and I even want to present it as an attempt to make sense of the admittedly vague idea of emergence-as an attempt to speak of emergence in a rational manner. I should like to mention that it can be applied not only to the emergence of new scientific problems and, consequently, new scientific theories, but to the emergence of new forms of behaviour, and even new forms of living organisms.

Let me give you an example. P1 may be, say, a certain problem concerning the survival of a species, such as the problem of reproduction, of producing offspring. According to Darwin, this survival problem has found a good solution if the species survives; any other tentative solution will be eliminated by the disappearance of both the solution and the species.

According to my schema, the attempted error-elimination, that is, the struggle for survival-will bring out the inherent weakness of each of the proposed solutions in the form of a new problem. For example, the new problem may be that the parent organisms and their offspring are threatening to suffocate one another. This new problem may in turn be solved; for example, the organisms may develop a method of scattering or disseminating their offspring; or else the new problem may be solved by the establishment of a common economy, comprising several organisms. Perhaps the transition from unicellular to multicellular organisms proceeded in this way.

However this may be, my schema shows that there may be more than Darwin’s alternative, ’survive or perish’, inherent in the process of error-elimination: error-elimination may bring out new emerging problems, specifically related to the old problem and to the tentative solution.

In what follows I shall use my schema, sometimes only implicitly; and I shall refer to emergence, assuming that my schema makes this idea sufficiently respectable within what I hope will be a rational discussion. I propose to deal with some aspects of the growth of knowledge under four headings:

1. Realism and Pluralism: Reduction versus Emergence.
2. Pluralism and Emergence in History.
3. Realism and Subjectivism in Physics.
4. Realism in Logic.

Berkeley – Quine – Carnap
1. Realism and Pluralism: Reduction versus Emergence

Man produces not only scientific theories but many other ideas – for example, religious or poetical myths or, say, plots for stories.

What is the characteristic difference between a scientific theory and a work of fiction? It is not, I hold, that the theory is possibly true while the descriptions in the story are not true, although truth and falsity have something to do with it. The difference is, I suggest, that the theory and the story are embedded in different critical traditions. They are meant to be “judged by quite different traditional standards (even though these standards may have something in common).

What characterises the theory is that it is offered as a solution to a scientific problem; that is, either a problem that has arisen before, in the critical discussion of earlier tentative theories, or (perhaps) a problem discovered by the author of the theory now offered, but discovered within the realm of the problems I and solutions belonging to the scientific tradition.

However, I am not leaving it at that. For the scientific tradition in its turn is, or was until recently, characterised by what may be called scientific realism. That is to say, it was inspired by the ideal of finding true solutions to its problems: solutions which corresponded to the facts.

This regulative ideal of finding theories which correspond to the facts is what makes the scientific tradition a realist tradition: it distinguishes between the world of our theories and the world of facts to which these theories belong.

Moreover, the natural sciences with their critical methods of problem solving, and some of the social sciences too, especially history and economics, have represented for quite a long time our best efforts in problem solving and fact finding (by fact finding I mean, of course, the discovery of statements or theories which correspond to facts). Thus these sciences contain, by and large, the best statements and theories from the point of view of truth; that is, those giving the best description of the world of facts, or of what one calls ‘reality’.

Now let us look at certain relations that hold between some of these sciences.

Take physics and chemistry for example; sciences which make assertions about all physical things and physical states, including living organisms.

Physics and chemistry are not very different, and there seems to be no great difference in the kind of things to which they apply, except that chemistry, as it is usually understood, becomes inapplicable at very high temperatures and also, perhaps, at very low ones. It therefore would not be very surprising if the hopes, held for a long time, that chemistry can be reduced to physics, were to come true, as indeed they seem to be doing.

Here we have a real paradigm case of a ‘reduction’; by a reduction I mean, of course, that all the findings of chemistry can be fully explained by (that is to say, deduced from) the principles of physics.

Although such a reduction would not be very surprising, it would be a very great scientific success. It would not only be an exercise in unification, but a real advance in understanding the world.

Let us assume that this reduction has been carried out completely. This might give us some hope that we may also reduce one day all the biological sciences to physics.

Now this would be a spectacular success, far greater than the reduction of chemistry to physics. Why? Because the kind of things to which physics and chemistry apply are really very similar from the start. Only think how difficult it would be to say whether the atomic theory is a physical or a chemical theory. In fact, for a long time it was both; and it is this common bond which provides the link which may lead, or perhaps has led, to their unification.

With living organisms the situation is different. They are, no doubt, subject to all kinds of physical and biological laws. Yet there appears to be some prima facie difference between living organisms and non-living things. Admittedly, we learn from science that there are transitory or intermediate stages, and also intermediate systems; and this gives us hope that a reduction might be achieved one day. Moreover, it seems not at all improbable that recent tentative theories about the origin of life on earth might be successfully put to the test, and that we might be able to create primitive living organisms artificially.

But even this would not necessarily mean a complete reduction. This is shown by the fact that chemists were able to create all sorts of chemicals, inorganic and organic, before understanding even their chemical composition, to say nothing about their physical structure. Thus even the control of chemical processes by purely physical means is not as such equivalent to a reduction of chemistry to physics. Reduction means much more. It means theoretical understanding: the theoretical penetration of the new field by the old field.

Thus we might find a recipe for creating some primitive forms of life from non-living matter without understanding, theoretically, what we were doing. Admittedly, this would be a tremendous encouragement to all those who seek for a reduction, and rightly so. But the way to a reduction might still be long; and we could not know whether it was not even impassable: there may be no theoretical reduction of biology to physics, just as there seems to be neither a theoretical reduction of mechanics to electrodynamics, nor a theoretical reduction the other way round.

If the situation is such that, on the one hand, living organisms may originate by a natural process from non-living systems, and that, on the other hand, there is no complete theoretical understanding of life possible in physical terms, then we might speak of life as an emergent property of physical bodies, or of matter.

Now I want to make it quite clear that as a rationalist I wish and hope to understand the world and that I wish and hope for a reduction. At the same time, I think it quite likely that there may be no reduction possible; it is conceivable that life is an emergent property of physical bodies.

My point here is that those believers in reduction who, for some philosophical or other reason, adopt a priori the dogmatic position that reduction must be possible, in a way destroy their triumph should reduction ever be achieved. For what will then be achieved ought to have been achieved all the time; so their triumph will be only the uninteresting one of having been proved right by events.

Only those who assert that the question cannot be settled a priori can claim that any successful reduction would be a tremendous discovery.

I have dwelt on this point so long because it has some bearing on the position of the next rung of the ladder-the emergence of consciousness.

There are philosophers, called ‘radical behaviourists’ or ‘physicalists’, who think that they have a priori reasons, such as Ockham’s razor, for asserting that our introspection of mental states or events, and our reports about mental states or events, are simply introspections and reports about ourselves qua physical systems: they are reports about physical states of these systems.

Two philosophers expected here this morning have defended such a view with brilliant arguments. They are Herbert Feigl and Willard Van Orman Quine. I should like to make a few critical remarks about their views.

Quine says, with a reference to Carnap and Feigl, that if theoretical progress can be ‘achieved by … positing distinctive mental states … behind physical behaviour, surely as much … could be achieved by positing….. certain correlative physiological states and events instead…. Lack of a detailed physiological explanation of the states is scarcely an objection to acknowledging them as states of human bodies…. The bodily states exist anyway; why add the others?’

Let me point out that Quine speaks here as a realist: ‘The bodily states exist anyway’, he says. Nevertheless, from the point of view I am adopting here, he is not what I should call a ’scientific realist’: he does not wait to see whether science will achieve a reduction here, as perhaps it may one day; instead he applies Ockham’s razor, pointing out that mental entities are not necessary for the theory.

But who knows what Ockham or anybody else might mean here by necessity? If mental entities or, better, mental states should exist – and I myself do not doubt that they do exist, then positing mental states is necessary for any true explanation of them; and should they one day be reduced to physical states, hen this will be a tremendous success. But there will be no success at all if we reject their existence by merely noting that we can explain things without them, by the simple method of confining ourselves to physical things and their behaviour.

To sum up my argument in brief: philosophical speculations of a materialistic or physicalistic character are very interesting, and may even be able to point the way to a successful scientific eduction. But they should be frankly tentative theories (as think Feigl’s theories are). Some physicalists do not, however, consider their theories as tentative, but as proposals to express everything in a physicalistic language; and they think these proposals have much in their favour because they are undoubtedly convenient: inconvenient problems such as the body-mind problem do indeed, most conveniently, disappear. So these physicalists think that there can be no doubt that these problems should be eliminated as pseudo-problems.

To this I would reply that by the same method we could have eliminated a priori all chemical states and problems connected with them: we could have said that they were obviously physical, and that there was no need to specify them in detail: that all we needed to do was to postulate the existence of some physical state correlative to each chemical state. : I think it is clear that the general adoption of such a proposal would have led to the attitude of not looking for the detailed reduction of chemistry to physics. No doubt, it would have dissolved the analogue of the body-mind problem-the problem of the relation of physics to chemistry; but the solution would have been linguistic; and as a consequence we should not have learned anything about the real world.

All this leads me to assert that realism should be at least tentatively pluralistic, and that realists should subscribe to the following pluralistic postulate:

We must beware of solving, or dissolving, factual problems

linguistically; that is, by the all too simple method of refusing to talk about them. On the contrary, we must be pluralists, at least to start with: we should first emphasise the difficulties, even if they look insoluble, as the body-mind problem may look to some.

If we can then reduce or eliminate some entities by way of scientific reduction, let us do so by all means, and be proud of the gain in understanding.

So I would say: let us work out in every case the arguments for emergence in detail, at any rate before attempting reduction.

To sum up and sharpen the considerations advanced in this section:

The reduction of chemistry to physics, apparently now well on the way, may be described as a paradigm case of a genuine scientific reduction which satisfies all the requirements of a good scientific explanation.

‘Good’ or ’scientific’ reduction is a process in which we learn much that is of great importance: we learn to understand and to explain the theories about the field to be reduced (in this case chemistry) and we learn a great deal about the power of the reducing theories (in this case physics).

It is conceivable, although not yet certain, that the reduction of chemistry to physics will be completely successful. It is also conceivable, though less likely, that we may one day have good reductions of biology, including physiology, to physics, and of psychology to physiology, and thus to physics.

I call bad reduction or ad hoc reduction the method of reduction by merely linguistic devices; for example, the method of physicalism which suggests that we postulate ad hoc the existence of physiological states to explain behaviour which we previously explained by postulating (though not by postulating ad hoc) mental states. Or in other words, by the linguistic device of saying that I report on a physiological state of mine when I report that I now feel that I understand the Schrödinger equation.

This second kind of reduction or the use of Ockham’s razor is bad, because it prevents us from seeing the problem. In ,the picturesque as well as hard-hitting terminology of Imre Lakatos, it is a disastrous case of a ‘degenerating problem shift’; and it may prevent either a good reduction, or the study of emergence, or both.

In order to avoid this disastrous method we must in each case try to learn as much as possible about the field which we hope to reduce. It may be that the field resists reduction; and we may even possess arguments to show why the field cannot be reduced. In this case we may have an example of genuine emergence.

If I may perhaps end my comments on the degenerating of behaviourism (especially linguistic behaviourism) with the following remark.

Behaviourists and materialists are anti-idealists: and they are, rightly, opponents of Berkeley’s ‘esse = percipi’ or

to be = to be observable.

According to them, ‘to be’ is ‘to be material’, ‘to behave as a body in space and time’. Nevertheless, it may be said that they do adhere, unconsciously, to Berkeley’s equation, although they put it in a slightly different verbal form:

to be = to be observed

or perhaps

to be = to be perceived.

For they say that only those things exist which can be observed. They do not realise that all observation involves interpretation in the light of theories, and that what they call ‘observable’ is what is observable in the light of pretty old-fashioned and primitive theories. Though I am all for common sense, I am also for enlarging the realm of common sense by learning from science. At any rate, it is not science but dubious philosophy (or outdated science) which leads to idealism, phenomenalism, positivism; or to materialism and behaviourism, or to any other of anti-pluralism.

Peirce – Heisenberg – Hegel – Ilyenkov – de Saussure
2. Pluralism and emergence in History

I shall not speak about the history of the universe, but only say a few words about the history of life on earth.

It seems that a very promising start has recently been made towards reconstructing the conditions under which life emerged on earth; and I think we may, perhaps, expect some major success soon. But while sanguine about emergence, even experimental emergence, I feel very sceptically inclined about reduction. This is due to certain thoughts of mine about the evolution of life.

It seems to me that evolutionary processes or major evolutionary changes are as unpredictable as historical processes or major historical changes. I hold this view because I am strongly inclined towards an indeterminist view of the world, somewhat more radical than Heisenberg’s: my indeterminism includes the thesis that even classical physics is indeterministic, and is thus more like that of Charles Sanders Peirce, or that of Alfred Landé. And I think that evolution proceeds largely probabilistically, under constantly changing conditions or problem situations, and that every tentative solution, whether more successful or less successful or even completely unsuccessful, creates a new problem situation. This seems to me to prevent a complete reduction as well as a complete understanding of the processes of life, although it does not prevent constant and far-reaching progress towards such understanding. (This argument should not be taken to be like Bohr’s application of his idea of complementarity to living organisms – an argument which seems to me very weak indeed.)

But I want to speak in this section mainly about human history, about the story of mankind. This, as I have indicated, is very largely the history of our knowledge – of our theories about the world – and, of course, of the repercussions of these products, which are of our own making, upon ourselves and our further productions.

It is obvious that one can adopt a physicalist or materialist attitude towards these theoretical products of ours; and it might be suspected that my emphasis upon the objective sense of knowledge – my emphasis upon theories as contained in books collected in libraries and as taught in universities-indicates that I sympathise with the physicalist or materialist interpretation of theories; I mean an interpretation which sees language as consisting of physical objects – noises, or printed letters-and which sees ourselves as conditioned, or dispositioned, to react to these noises or letters with certain characteristic kinds of physical behaviour.

But nothing is further from my intention than to encourage ad hoc reductions of this kind. Admittedly, if forced to choose between any subjectivist or personalist view of human knowledge and the materialist or physicalist view I have just tried to sketch, I should choose the latter; but this is emphatically not the alternative.

The history of ideas teaches us very clearly that ideas emerge in logical or, if the term is preferred, in dialectical contexts. My various schemata such as

P1 » TT » EE » P2

may indeed be looked upon as improvements and rationalisations of the Hegelian dialectical schema: they are rationalisations because they operate entirely within the classical logical organon of rational criticism, which is based upon the so-called law of contradiction; that is to say, upon the demand that contradictions, whenever we discover them, must be eliminated. Critical error-elimination on the scientific level proceeds by way of a conscious search for contradictions.

Thus history, and especially the history of ideas, teaches us that if we want to understand history, we must understand ideas and their objective logical (or dialectical) relationships.

I do not believe that anybody who has ever seriously gone into any chapter of the history of ideas will think that a reduction of these ideas could ever be successful. But I take it as my task here not so much to argue against the possibility of any reduction, as to argue for the recognition of emergent entities, and for the need to recognise and describe these emergent entia before one can seriously think about their possible elimination by way of reduction.

One of my main arguments for the emergent character of theories I have given elsewhere. My argument depends upon the conjecture that there is such a thing as a genuine growth of scientific knowledge; or in practical terms, that tomorrow, or a year hence, we may propose and test important theories of which nobody has seriously thought so far. If there is growth of knowledge in this sense, then it cannot be predictable by scientific means. For he who could so predict today by scientific means our discoveries of tomorrow could make them today; which would mean that there would be an end to the growth of knowledge.

On the other hand, unpredictability in principle has always been considered as the salient point of emergence; and it seems to me that my argument shows at any rate that the growth of knowledge must be unpredictable in principle.

But there are other arguments for the emergent character of theories, or of knowledge in the objective sense. I shall only mention an argument or two against the very popular and very naive view that theories can be reduced to the mental states of those who produce them, or of those who understand them. (Whether or not these mental states themselves may then perhaps be reduced to physical states in turn will not be further discussed.)

The idea that a theory in its objective or logical sense may be reduced to the mental states of those who hold the theory takes, as a rule, the form that the theory just is a thought. But this is a trivial mistake: it is the failure to distinguish between two senses of the word ‘thought’. In its subjective sense, the word ‘thought’ describes a mental experience or a mental process. But two mental experiences or processes, though they may stand in causal relations to each other, cannot stand in logical relations to each other.

Thus, if I say that certain ideas of the Buddha agree with certain ideas of Schopenhauer, or that they contradict certain ideas of Nietzsche, then I am not speaking about the mental thought-processes of these people or about their interrelations. If I say, however, that Nietzsche was influenced by certain ideas of Schopenhauer, then I do mean that certain thought processes of Nietzsche’s were causally influenced by his reading of Schopenhauer. So we have actually these two different worlds, the world of thought-processes, and the world of the products of thought-processes. While the former may stand in causal relationships, the latter stand in logical relationships.

The fact that certain theories are incompatible is a logical fact, and holds quite independently of whether or not anybody has noticed or understood this incompatibility. These purely objective logical relationships are characteristic of the entities which I have called theories, or knowledge, in the objective sense.

This may also be seen from the fact that the person who produces a theory may very often not understand it. Thus it might be argued without paradox that Erwin Schrödinger did not fully understand the Schrödinger equation, at any rate not until Max Born gave his statistical interpretation of it; or that Kepler’s area law was not properly understood by Kepler, who seems to have disliked it.

In fact, understanding a theory is something like an infinite task, so that we may well say that a theory is never fully understood, even though some people may understand some theories extremely well. Understanding a theory has, indeed, much in common with understanding a human personality. We may know or understand a man’s system of dispositions pretty well; that is to say, we may be able to predict how he would act in a number of different situations. But since there are infinitely many possible situations, of infinite variety, a full understanding of a man’s dispositions does not seem to be possible. Theories are similar: a full understanding of a theory would mean understanding all its logical consequences. But these are infinite in a non-trivial sense: there are infinitely many situations of infinite variety to which the theory might be applicable; that is to say, upon which some of its logical consequences may bear; and many of these situations have never been thought of; their possibility may not yet have been discovered. But this means that nobody, neither its creator nor anybody who has tried to grasp it, can have a full understanding of all the possibilities inherent in a theory; which shows again that the theory, in its logical sense, is something objective and something objectively existing – an object that we can study, something that we try to grasp. It is no more paradoxical to say that theories or ideas are our products and yet not fully understood by us than to say that our children are our products and yet not fully understood by us, or that honey is a product of the bee, yet not fully understood by any bee.

Thus, the study of the history of our theories or ideas-and a good case could be made for the view that all human history is largely a history of our theories or ideas-should make us all pluralists. For what exist, for the historian, are people in physical, social, mental, and ideological problem situations; people producing ideas by which they try to solve these problems, ideas which they try to grasp, to criticise, to develop.

The student of the history of ideas will find that ideas have a kind of life (this is a metaphor, of course); that they can be misunderstood, rejected, and forgotten; that they can reassert themselves, and come to life again. Without metaphor, however, we can say that they are not identical with any man’s thought, or belief; that they can exist even if universally misunderstood, and rejected.

All this may be reminiscent of Plato and Hegel. But there are great differences here. Plato’s ‘ideas’ were eternal, unchanging conceptions or notions; Hegel’s were dialectically self-changing conceptions or notions. The ideas which I find most important are not conceptions or notions at all. They correspond not to words but to statements or propositions.

In opposition to Plato and Hegel I consider tentative theories about the world-that is, hypotheses together with their logical consequences-as the most important citizens of the world of ideas; and I do not think (as Plato did) that their strangely non-temporal character makes them eternal and thereby more real than things that are generated and are subject to change, and to decay. On the contrary, a thing that can change and perish should for this very reason be accepted as prima facie real; and even an illusion is, qua illusion, a real illusion.

This is important in connection with the problem of time, and of change.

A historian cannot, I think, accept the doctrine that time and change are illusions; a doctrine upheld by some great physicists and philosophers such as Parmenides, Weyl, and Schrödinger. Nothing is more real than an event, an occurrence; and every event involves some change.

That the pluralistic universe in which the historian lives, with its individual men living individual lives, trying to solve their problems, producing children, and ideas about them, hoping and fearing and deceiving themselves and others, but always theorising, and often seeking not only happiness but also truth-that this pluralistic universe should be successfully ‘reduced’ to one or another kind of monism-this seems to me not only unlikely, but impossible. But this is not my point here. My point is that only after recognising the plurality of what there is in this world can we seriously begin to apply Ockham’s razor. To invert a beautiful formulation of Quine’s, only if Plato’s beard is sufficiently tough, and tangled by many entities, can it be worth our while to use Ockham’s razor. That the razor’s edge will be dulled in being used for this tough job is only to be expected. The job will no doubt be painful. But it is all in a day’s work.

Bohr – Einstein – Heisenberg – Quine – Bridgman – Ilyenkov – Wittgenstein
3. Realism and Subjectivism in Physics

There are two important fields in modern physics in which physicists have allowed subjectivism not only to enter, but to play an essential role: Boltzmann’s theory of the subjectivity of the direction of time, and Heisenberg’s interpretation of the indeterminacy formulae as determining a lower limit to the effect of the observer’s interference with the observed object.

There was also another intrusion of the subject, or of the observer, w hen Einstein brought in the observer in a number of imaginary thought experiments intended to elucidate relativity; but this is a field from which the observer was exorcised, slowly but steadily, by Einstein himself.

I shall not discuss this point further, nor shall I discuss the subjective theory of time which, in trying to tell us that time and change are human illusions, forgets that they are very real illusions which have in no way been reduced to anything else (and which, I conjecture, are not amenable to reduction). I shall not discuss all this because I have done so only recently. I merely want to say a few words about the Heisenberg formulae and their interpretation.

These formulae are usually derived in a fairly complicated manner; there is, for example, an interesting derivation by Weyl and another rather complicated one by Born.

Yet in fact the Heisenberg formula for energy depends neither on wave mechanics nor on Heisenberg’s matrix mechanics; nor do we need the commutation relations (which according to Hills are insufficient for the derivation of the formulae). It simply does not depend on the revolutionary new quantum mechanics of 1925-6, but follows directly from Planck’s old quantum postulate of 1900:

E = hf.

From this we get immediately

(2) DE = h Df.

By using the principle of harmonic resolving power,

(3) Df approx = 1/Dt,

we obtain from (2) and (3)

(4) DE approx = h / Dt,

which leads at once to

(5) DE . Dt approx = h;

that is to say, a form of Heisenberg’s so-called indeterminacy formulae.

In precisely the same way we obtain the Heisenberg formula for position and momentum from Duane’s principle (whose analogy to Planck’s principle has recently been stressed by Alfred Landé). It may be written

(6) Dpi approx = h / Dqi

According to Landé this may be interpreted as follows: a body (such as a grid or a crystal) endowed with the space-periodicity Dqi is entitled to change its momentum pi in multiples of Dpi approx = h / Dqi.

From (6) we obtain at once

(7) Dpi . Dqi approx = h,

which is another form of Heisenberg’s indeterminacy formulae.

Considering that Planck’s theory is a statistical theory, the Heisenberg formulae can be most naturally interpreted as statistical scatter relations, as I proposed more than thirty years ago. That is, they say nothing about the possible precision of measurements, nor anything about limits to our knowledge. But if they are scatter relations, they tell us something about the limits to the homogeneity of quantum-physical states, and therefore, though indirectly, about predictability.

For example, the formula Dpi . Dqi approx = h (which can be obtained from Duane’s principle just as DE . DT approx = h can be obtained from Planck’s principle) tells us, simply, that if we determine the coordinate x of a system (say, an electron) then, upon repetition of the experiment, the momentum will scatter.

Now how can such an assertion be tested? By making a long series of experiments with a fixed shutter opening Dx and by measuring, in every single case, the momentum Px. If these momenta scatter as predicted, then the formula has survived the test. But this shows that in order to test the scatter relations, we have actually measured, in every case, px with a precision far greater than Dpx; for otherwise we could not speak of Dpx, as the scatter of px.

Experiments of the kind described are carried out every day in all physical laboratories. But they refute Heisenberg’s indeterminacy interpretation, since measurements (though not the predictions based upon them) are more precise than this interpretation permits.

Heisenberg himself noted that such measurements are possible, but he said that it was ‘a matter of personal belief’ or personal taste’ whether or not we attach any meaning to them; and ever since this remark they have been universally disregarded as meaningless. But they are not meaningless, for they have a definite function: they are tests of the very formulae in question; that is, of the indeterminacy formulae qua scatter relations.

There is, therefore, no reason whatever to accept either Heisenberg’s or Bohr’s subjectivist interpretation of quantum mechanics. Quantum mechanics is a statistical theory because the problems it tries to solve-spectral intensities, for example -are statistical problems. There is, therefore, no need here for any philosophical defence of its non-causal character.

The irreducibility of statistical theories to deterministic theories (rather than the incompatibility of these two kinds of theories) should, however, be established. Arguments to this effect have been offered by Landé, and very different ones by myself.

To sum up, there is no reason whatsoever to doubt the realistic and objectivistic character of all physics. The role played by the observing subject in modern physics is in no way different from the role he played in Newton’s dynamics or in Maxwell’s theory of the electric field: the observer is, essentially, the man who tests the theory. For this, he needs a lot of other theories, competing theories and auxiliary theories. All this shows that we are not so much observers as thinkers.

Schlick – Brouwer – Quine – Carnap – Dewey – Hegel – Wittgenstein
4. Realism in Logic

I am opposed to looking upon logic as a kind of game. I know about so-called alternative systems of logic and I have actually invented one myself, but alternative systems of logic can be discussed from very different points of view. One might think that it is a matter of choice or convention which logic one adopts. I disagree with this view.

My theory is briefly this. I look upon logic as the theory of deduction or of derivability, or whatever one chooses to call it. Derivability or deduction involves, essentially, the transmission of truth and the retransmission of falsity: in a valid inference truth is transmitted from the premises to the conclusion. This can be used especially in so-called ‘proofs’. But falsity is also retransmitted from the conclusion to (at least) one of the premises, and this is used in disproofs or reputations, and especially in critical discussions.

We have premises and a conclusion; and if we show that the conclusion is false, and assume that the inference is valid, we know that at least one of our premises must be false. This is how logic is constantly used in critical discussion, for in a critical discussion we attempt to show that something is not in order with some assertion. We attempt to show it; and we may not succeed: criticism may be validly answered by counter-criticism.

What I should wish to assert is (1) that criticism is a most important methodological device; and (2) that if you answer criticism by saying, ‘I do not like your logic: your logic may be all right for you, but I prefer a different logic, and according to my logic this criticism is not valid’, then you may undermine the method of critical discussion.

Now I should distinguish between two main uses of logic, namely (1) its use in the demonstrative sciences-that is to say, the mathematical sciences-and (2) its use in the empirical sciences.

In the demonstrative sciences logic is used in the main for proofs-for the transmission of truth-while in the empirical sciences it is almost exclusively used critically-for the retransmission of falsity. Of course, applied mathematics comes in too, in which we implicitly make use of the proofs of pure mathematics, but the role of mathematics in the empirical sciences is somewhat dubious in several respects. (There exists a wonderful article by Schwartz to this effect.)

Thus in the empirical sciences logic is mainly used for criticism; that is, for refutation. (Remember my schema P1 » TT » EE» P2.)

Now, what I wish to assert is this. If we want to use logic in a critical context, then we should use a very strong logic, the strongest logic, so to speak, which is at our disposal; for we want our criticism to be severe. In order that the criticism should be severe we must use the full apparatus; we must use all the guns we have. Every shot is important. It doesn’t matter if we are over-critical: if we are, we shall be answered by counter-criticism.

Thus we should (in the empirical sciences) use the full or classical or two-valued logic. If we do not use it but retreat into the use of some weaker logic – say, the intuitionist logic, or some three-valued logic (as Reichenbach suggested in connection with quantum theory)-then, I assert, we are not critical enough; it is a sign that something is rotten in the state of Denmark (which in this case is the quantum theory in its Copenhagen interpretation, as I indicated earlier).

Now let us look, by contrast, at proofs. Every mathematician knows that considerable interest lies in proving a theorem with the help of a minimum apparatus. A proof which uses stronger means than necessary is mathematically unsatisfactory, and it is always interesting to find the weakest assumptions or minimum means which have to be used in a proof. In other words, we want the proof not only to be sufficient – that is to say valid-but we want it if possible to be necessary, in the sense that a minimum of assumptions have been used in the proof. This, I admit, is a somewhat sophisticated view. In unsophisticated mathematics we are happy and grateful if we can prove anything, but in more sophisticated mathematics we really want to know what is necessary for proving a theorem.

So if one can prove mathematical theorems with methods weaker than the full battery of classical logic, then this is extremely interesting from a mathematical point of view. Thus in proof theory we are interested in weakening if possible our classical logic, and we can, for example, introduce intuitionist logic or some other weaker logic such as positive logic, and investigate how far we can get without using the whole battery.

I think, incidentally, that the term ‘intuitionist logic’ is a misnomer. It is just a name for a very interesting and somewhat weakened form of classical logic invented by Brouwer and formalised by Heyting. I certainly do not want to say anything in favour of the philosophical theory called intuitionism though I should like to say something in favour of the Brouwer-Heyting logic. But I trust it will not be supposed that I am in any sense defending the authority of intuition in philosophy or logic or anywhere else. Leaving aside for the moment Brouwerian logic, one might say that intuitionism is the doctrine that intuitions are not only important but generally reliable. As against this I think that intuitions are very important but that as a rule they do not stand up to criticism. So I am not an intuitionist. However, Brouwerian or so-called ‘intuitionist logic’ is, from the standpoint of the present discussion, important because it is just a part, a genuine part, and thus a weakened form, of classical logic; that is to say, every inference which is valid from the point of view of intuitionist logic is also valid from the point of view of classical logic, while the opposite s not the case: we have inferences which may be validly drawn in classical logic but which are not valid in intuitionist logic. Thus if I can prove a theorem (so far proved only by classical means) with intuitionist logic, I have made a real mathematical discovery; for mathematical discoveries do not consist only in finding new proofs of new theorems, but they consist also in finding new proofs of old theorems; and a new proof of a theorem will be especially interesting if it uses weaker means than the old proof. A proof using stronger means one can always have for the asking, a fortiori; yet finding a weaker proof is a real mathematical achievement.

So intuitionistic logic is a very interesting approach to mathematics because it tries to prove as many mathematical theorems as possible with reduced logical means.

Intuitionistic logic has a further advantage: one can show that in it the so-called ‘law of excluded middle’ is not demonstrable (although it is a well-formed formula of the system) One can also show that if in any system whatsoever some well-formed formula is not demonstrable, then the system must be consistent. Generally speaking, the weaker the logical means we use, the less is the danger of inconsistency the danger that a contradiction is derivable. So intuitionist logic can also be looked upon as an attempt to make more certain that our arguments are consistent and that we do not get into hidden inconsistencies or paradoxes or antinomies. How safe such a weakened logic is, as such, is a question into which I do not want to enter now; but obviously it is at least a little safer than the full classical logic. I do not suppose it is always safe, but that is not my point. My point is this. If you wish to prove, or to establish something, you should use weak means. But for disestablishing it-that is to say, for criticising it-we may use strong means. Of course someone might say, ‘Look here, I can refute you even with weak means; I do not even need to use the whole of intuitionist logic.’ Still, that is not very important. The main thing is that for the rationalist any criticism is welcome-though he may reply to it by criticising the criticism.

Now this rationalist view is a realist view of logic. First, because it looks upon logic partly in connection with the methodology of the natural sciences which, I have tried to argue, is a realistic affair. Secondly, and this is a very special point, because it looks upon logical inference as truth transmitting or falsity re-transmitting; that is to say, it is concerned with the idea of truth.

I would assert that not the least important of the achievements of Alfred Tarski is that by introducing two ideas into logic, he has actually made logic very much a realistic affair. The first is Tarski’s idea (partly anticipated by Bolzano) that logical consequence is truth transmission. The second, I would say, is the rehabilitation of the correspondence theory of truth, the rehabilitation of the idea that truth is simply correspondence with the facts.

I think I may differ here a little from Quine, because I think that this idea of Tarski’s ought to be interpreted as destructive of relativism, and because I think that Tarski’s claim that his theory of truth is an ‘absolutistic’ theory of truth is correct. In order to explain this point, I will recount a very old story with a slightly new point to it. The old story is the story of the three main theories of truth. The new point is the elimination of the word ‘truth’ from the story, and with it, of the appearance that we are dealing here with words, or verbal definitions. However, for this elimination some preparatory discussion is needed.

Of the three main theories of truth, the oldest was the correspondence theory, the theory that truth is correspondence with the facts, or to put it more precisely, that a statement is true if (and only if) it corresponds to the facts, or if it adequately describes the facts. This is the theory which I think Tarski has rehabilitated. The second theory is the so-called coherence theory: a statement is regarded as true if (and only if) it coheres with the rest of our knowledge. The third theory is that truth is pragmatic utility or pragmatic usefulness.

Now, the coherence theory has all sorts of versions of which I shall mention just two. According to the first, truth is coherence with our beliefs, or more precisely, a given statement is true if it coheres with the rest of our beliefs. This I find a bit disconcerting because I do not want to put beliefs into logic, for well-known reasons. (If Peter believes p, and if p and q are interdeducible, we might say that Peter is logically bound to believe q. Yet he may not know that p and q are interdeducible, and he may in fact disbelieve q.)

According to the second version of the coherence theory a certain given statement, of which we do not know whether it is true or not, is to be accepted as true if (and only if) it coheres with the statements we have previously accepted. This version has the effect of making our knowledge utterly conservative: ‘entrenched’ knowledge can hardly be overthrown.

The theory of pragmatic utility is especially concerned with the problem of theories in the natural sciences such as physics. It says that we should accept a physical theory as true if it turns out in tests, and other applications, to be pragmatically useful, or successful.

I propose now to use something like a trick. My trick consists in this. I shall very soon, until very near the end of this paper, stop referring to truth. I shall not any longer ask, ‘What is truth?’ There are several reasons. My main reason is that I believe that ‘What is?’ or ‘What are?’ questions or, in other words, all verbal or definitional questions, should be eliminated. ‘What is?’ or ‘What are?’ questions I regard as pseudo-questions; they do not all seem to be so pseudo, but I do think they all are pseudo-questions. Questions such as, ‘What is life?’ or ‘What is matter?’ or ‘What is mind?’ or ‘What is logic?’ I think should not be asked. They are typically unfruitful questions.

So I think we should also discard the question, ‘What is truth?’ My first reason (just mentioned) for discarding the question ‘What is truth?’ one may call ‘anti-essentialism’. My second reason is even more important. It is that we should altogether avoid, like the plague, discussing the meaning of words. Discussing the meaning of words is a favourite game of philosophy, past and present: philosophers seem to be addicted to the idea that words and their meaning are important, and are the special concern of philosophy.

I will for your convenience present again here-on the next page-a table which I have used before.

On the left we have words or concepts and their meanings, and on the right we have statements or propositions or theories and their truth.
IDEAS
that is
DESIGNATIONS or TERMS
or CONCEPTS STATEMENTS or PROPOSITIONS
or THEORIES
may be formulated in
WORDS ASSERTIONS
which may be
MEANINGFUL TRUE
and their
MEANING TRUTH
May be reduced, by way of
DEFINITIONS DERIVATIONS
to that of
UNDEFINED CONCEPTS PRIMITIVE PROPOSITIONS
The attempt to establish (rather than reduce) by these means their
MEANING TRUTH
leads to an infinite regress

Now I have been taught by the experience of a lifetime in this field that one should always try to get away from the left side of the table and to keep to the right side. One should always keep to assertions, to theories, and the question of their truth. One should never get involved in verbal questions or questions of meaning, and never get interested in words. If challenged by the question of whether a word one uses really means this or perhaps that, then one should say: ‘I don’t know, and I am not interested in meanings; and if you wish, I will gladly accept your terminology.’ This never does any harm. One should never quarrel about words, and never get involved in questions of terminology. One should always keep away from discussing concepts. What we are really interested in, our real problems, are factual problems, or in other words, problems of theories and their truth. We are interested in theories and how they stand up to critical discussion; and our critical discussion is controlled by our interest in truth.

Having said this, I intend now to stop using the word ‘truth’.

Our problem is no longer: Is truth correspondence? Is truth coherence? Is truth usefulness? This being so, how can we formulate our real problem?

Our problem can be sharply formulated only by pointing out that the opponents of the correspondence theories all made an assertion. They all asserted that there cannot be such a thing as the correspondence between a statement and a fact. This is their central assertion. They say that this concept is meaningless (or that it is undefinable, which, incidentally, in my opinion does not matter, since definitions do not matter) In other words, the whole problem arises because of doubts, or scepticism, concerning correspondence: whether there is such a thing as a correspondence between a statement and a fact. It is quite clear that these doubts are serious (especially in view of the paradox of the liar).

It is also quite clear that, but for these doubts, the upholders of the coherence theory and of the theory of pragmatic usefulness would really have nothing to argue against. Nobody denies that pragmatic usefulness and such matters as predictive power are important. But should there exist something like the correspondence of a theory to the facts, then this would obviously be more important than mere self-consistency, and certainly also much more important than coherence with any earlier knowledge’ (or ‘belief’); for if a theory corresponds to the facts but does not cohere with some earlier knowledge, then this earlier knowledge should be discarded.

Similarly, if there exists something like the correspondence of theory to the facts, then it is clear that a theory which corresponds to the facts will be as a rule very useful; more useful, qua theory, than a theory which does not correspond to the facts. (On the other hand, it may be very useful for a criminal before a court of justice to cling to a theory which does not correspond to the facts; but as it is not this kind of usefulness which the pragmatists have in mind, their views raise a question which is very awkward for them: I mean the question, ‘Useful for whom?’.)

Although I am an opponent of pragmatism as a philosophy of science, I gladly admit that pragmatism has emphasised something very important: the question whether a theory has some application, whether it has, for example, predictive power. Praxis, as I have put it somewhere, is invaluable for the theoretician as a spur and at the same time as a bridle: it is a spur because it suggests new problems to us, and it is a bridle because it may bring us down to earth and to reality if we get lost in over-abstract theoretical flights of our imagination. All this is to be admitted. And yet, it is clear that the pragmatist position will be superseded by a realist position if we can meaningfully say that a statement, or a theory, may or may not correspond to the facts.

Thus the correspondence theory does not deny the importance of the coherence and pragmatist theories, though it does imply that they are not good enough. On the other hand, the coherence and pragmatist theories assert the impossibility or meaninglessness of the correspondence theory.

So without ever mentioning the word ‘truth’ or asking, ‘What does truth mean?’ we can see that the central problem of this whole discussion is not the verbal problem of defining ‘truth’ but the following substantial problem: can there be such a thing as a statement or a theory which corresponds to the facts, or which does not correspond to the facts?

Behind the doubts concerning the possibility of speaking about correspondence, there are various strong arguments.

First of all, there are paradoxes or antinomies which arise out of this correspondence idea. Secondly, there are the countless unsuccessful attempts to say more precisely what the correspondence between a statement and a fact consists of There is the attempt of Schlick, who said that correspondence is to be explained by a one-one relationship between the linguistic statement and the fact; that is, by uniqueness. A statement, he said, is ‘true’, or corresponds to the facts, if it stands to the facts of the world in a one-one relationship or in a unique relationship: non-correspondence or ‘falsity’ is the same as ambiguity. Of course, this is an unacceptable view, for many vague and ambiguous statements (such as ‘there are a few people somewhere in America’) may correspond to the facts; and vice versa, every general proposition or theory which corresponds to the facts corresponds to many facts, so that there is not a one-one relationship.

Moreover, a statement which does not correspond to the facts may be quite unambiguous. A murderer may say unambiguously, ‘I have not killed him.’ There is no ambiguity in this assertion; but it does not correspond to the facts. Clearly, Schlick’s attempt to explain correspondence misfires. Another even worse attempt is Wittgenstein’s. Wittgenstein suggested that a proposition is a picture of reality and that correspondence is a relationship very much like the one that holds between the groove on a gramophone record and the sounds which it denotes: a kind of projective relationship between facts and statements. The untenability of this view can easily be shown. One is reminded of the famous story of Livingstone being introduced by an interpreter to a Negro king whom he asked, ‘How are you?’. The Negro king answered with one word, and the interpreter began to talk and talk and talk and talk, for ten minutes, translating the word to Livingstone in the form of a long story of the king’s sorrows. Then Livingstone asked whether the king was in need of medical assistance, and then the king began to talk and talk and talk and talk and talk. And the interpreter translated it with one word: ‘No.’

No doubt this story is invented. But it is well invented; and it illustrates the weakness of the projection theory of language, especially as a theory of the correspondence between a statement and a fact.

But this is not all. The matter is even more serious; namely, Wittgenstein, after having formulated this theory, said that it is impossible to discuss the relationship of language to reality, or to discuss language at all. (Because language cannot be discussed by language.) This is a field in which words fail us. ‘It shows itself’ is his favourite expression to indicate the failure of words. Any attempt to go deeper into the relationship between language and reality or to discuss language more deeply or statements more deeply is, accordingly, bound to be meaningless. And although he says in the Preface of his book, ‘the truth of the thoughts that are here set forth seems to me unassailable and definitive’, he ends up by saying, ‘Anybody who understands me eventually recognises them [the propositions of the Tractatus] as nonsensical.’ (Because talk about language is meaningless.) No doubt this refers, apart from other things, especially to his theory of projection. His remark that his readers will see that what he says is meaningless thus confirms what the opponents of the correspondence theory have always said of the correspondence theory, namely that it is meaningless to speak about the correspondence between a statement and a fact.

So we are back at the real issue. It is this: is there or is there not a tenable correspondence theory? Can we or can we not speak meaningfully of the correspondence between a statement and a fact?

Now my assertion is that Tarski has rehabilitated the correspondence theory. This, I think, is a great achievement, and it is a great philosophical achievement. I say this because it has been denied by many philosophers (for example, by Max Black) that there is something philosophically important in Tarski’s achievement.

The key to the rehabilitation of the correspondence theory is a very simple and obvious observation made by Tarski. That is, if I want to speak about correspondence between a statement S and a fact F, then I have to do so in a language in which I can speak about both: statements such as S, and facts such as F. This seems to be frightfully trivial; but it is nevertheless decisive. It means that the language in which we speak in explaining correspondence must possess the means needed to refer to statements, and to describe facts. If I have a language which has both these means at its disposal, so that it can refer to statements and describe facts, then in this language -the metalanguage-I can speak about correspondence between statements and facts without any difficulty, as we shall see.

A metalanguage is a language in which we talk about some other language. For example, a grammar of the German language, written in English, uses English as a metalanguage in order to talk about German. The language about which we talk in the metalanguage (in this case English) is usually called the ‘object language’ (in this case German). The characteristic thing about a metalanguage is that it contains (metalinguistic) names of words and of statements of the object language, and also (metalinguistic) predicates, such as ‘noun (of the object language)’ or ‘verb (of the object language)’ or ’statement (of the object language)’. If a metalanguage is to suffice for our purpose it must also, as Tarski points out, contain the usual means necessary to speak about at least all those facts about which the object language can speak.

All this is the case if we use English as our metalanguage in order to speak about German (as the object language under investigation).

For example, we shall be able to say in the English metalanguage such things as:

The German words ‘Das Gras ist grün’ form a statement of the German language.

On the other hand, we shall be able to describe in our (English) metalanguage the fact which the German statement ‘Das Gras ist grün’ describes. We can describe this fact in English simply by saying that grass is green.

We can now make a statement in the metalanguage about the correspondence of a statement of the object language to the facts as follows. We can make the assertion: The German statement ‘Das Gras ist grün’ corresponds to the facts if, and only if, grass is green. (Or: ‘. . . only if it is a fact that grass is green.’)

This is very trivial. It is, however, important to realise the following: in our assertion, the words ‘Das Gras ist grün’, put within quotes, function as a metalinguistic (that is, an English) name of a German statement; on the other hand, the English words ‘grass is green’ occur in our assertion above without any quotation marks: they do not function as a name of a statement, but simply as the description of a fact (or alleged fact).

This makes it possible for our assertion to express a relationship between a (German) statement, and a fact. (The fact is neither German nor English, although it is, of course, described or spoken about in our metalanguage, which is English: the fact is non-linguistic, it is a fact of the real world, although we need of course a language if we wish to talk about it.) And what our metalinguistic assertion asserts is that a certain (German) statement corresponds to a certain fact (a non-linguistic fact, a fact of the real world) under conditions which are precisely stated.

We can, of course, replace the German object language by any other-even by English. Thus we can make the metalinguistic assertion:

The English statement ‘Grass is green’ corresponds to the facts if, and only if, grass is green.

This looks even more trivial. But it can hardly be denied; nor can it be denied that it expresses the conditions under which a statement corresponds to the facts.

Generally speaking, let ‘S’ be a (metalinguistic) name of a statement of the object language, and let ‘f’ be the abbreviation of an expression of the metalanguage that describes the (supposed) fact F which S describes. Then we can make the following metalinguistic assertion:

A statement S of the object language corresponds to the facts if, and only if, f. (or: . . . if it is a fact that f.)

Note that while ‘S’ is here a metalinguistic name of a statement, ‘f’ is not a name, but an abbreviation of an expression of the metalanguage describing a certain fact (the fact which we can name ‘F’).

We can now say that what Tarski did was to discover that in order to speak about the correspondence between a statement S and a fact F, we need a language (a metalanguage) in which we can speak about the statement S and state the fact F. (The former we speak about by using the name ‘S’, the latter by using a metalinguistic expression ‘f’ which states or describes F.)

The importance of this discovery is that it dispels all doubt about the meaningfulness of talking about the correspondence of a statement to some fact or facts.

Once this is done, we can, of course, replace the words corresponds to the facts’ by the words ‘is true’.

Tarski, apart from this, introduced a method of giving a definition of truth (in the sense of the correspondence theory) for any consistent formalised system. But this is not, I think, his main achievement. His main achievement is the rehabilitation of talk about correspondence (and truth). Incidentally, he showed under what circumstances such talk may lead to paradoxes, and how we can avoid these paradoxes; and he also showed how in ordinary talk about truth we can, and do, avoid paradoxes.

Once we have settled that we can use ‘truth’ in the sense of the correspondence of statements to facts, there is really nothing of importance to be added about the word ‘truth’. There is no doubt that correspondence to the facts is what we usually call ‘truth’; that in ordinary language it is correspondence that we call ‘truth’, rather than coherence or pragmatic usefulness. A judge who admonishes a witness to speak the truth and nothing but the truth does not admonish the witness to speak what he thinks is useful either for himself or for anybody else. The judge admonishes a witness to speak the truth and nothing but the truth but he does not say, ‘All we require of you is that you do not get involved in contradictions’, which he would say were he a believer in the coherence theory. But this is not what he demands of the witness.

In other words, the ordinary sense of ‘truth’ as it is used in courts of law is, no doubt, correspondence. But my main point is that this may be regarded as an afterthought, and as an unimportant afterthought. For if anybody should want to say, ‘No, in ordinary language, “truth” is used in a different sense’, I should not quarrel with him. I should suggest that we forget all about terminology: I should be prepared to use the terminology of my opponent, pointing out, however, that we have at least these three meanings at our disposal: this is the only thing about which I should be prepared to quarrel; but I should refuse to quarrel about words.

I should point out, though, that the correspondence theory of truth is a realistic theory; that is to say, it makes the distinction, which is a realistic distinction, between a theory and the facts which the theory describes; and it makes it possible to say that a theory is true, or false, or that it corresponds to the facts, thus relating the theory to the facts. It allows us to speak of a reality different from the theory. This is the main thing; it is the main point for the realist. The realist wants to have both a theory and the reality or the facts (don’t call it ‘reality’ if you don’t like it, just call it ‘the facts’) which are different from his theory about these facts, and which lie can somehow or other compare with the facts, in order to find out whether or not it corresponds to them. Of course, the comparison is always extremely difficult.

One last word about Tarski’s theory. Its whole purpose is often misinterpreted: it is wrongly assumed that it is intended to yield a criterion of truth. For-coherence was so intended, and likewise pragmatic usefulness; they strengthened the traditional view that any serious theory of truth should present us with a method of deciding whether or not a given statement is true.

Tarski has proved many things from his definition of truth. Among other things, he has proved that in a sufficiently powerful language (and in every language in which we can formulate mathematical or physical theories) there can be no criterion of truth; that is, no criterion of correspondence: the question of whether a proposition is true is not in general decidable for the languages for which we may form the concept of truth. Thus the concept of truth plays mainly the role of a regulative idea. It helps us in our search for truth that we know there is something like truth or correspondence. It does not give us a means of Finding truth, or of being sure that we have found it even if we have found it. So there is no criterion of truth, and we must not ask for a criterion of truth. We must be content with the fact that the idea of truth as correspondence to the facts has been rehabilitated. This has been done by Tarski; and I think that he has thereby rendered an immense service to the realistic outlook.

Although we have no criterion of truth, and no means of being even quite sure of the falsity of a theory, it is easier to find out that a theory is false than to find out that it is true (as I have explained in detail elsewhere). We have even good reasons to think that most of our theories-even our best theories are, strictly speaking, false; for they oversimplify or idealise the facts. Yet a false conjecture may be nearer or less near to the truth. Thus we arrive at the idea of nearness to the truth, or of a better or less good approximation to the truth; that is, at the idea of ‘verisimilitude’. I have tried to show that this idea can be rehabilitated in a way similar to Tarski’s rehabilitation of the idea of truth as correspondence to the facts. In order to do so I have used mainly the two Tarskian ideas mentioned here. One is the idea of truth. The other is the idea of logical consequence; or more precisely, of the set of logical consequences of a conjecture, or the content of a conjecture.

By incorporating into logic the idea of verisimilitude or approximation to truth, we make logic even more ‘realistic’. For it can now be used to speak about the way in which one theory corresponds better than another to the facts-the facts of the real world.

To sum up. As a realist I look upon logic as the organon of criticisms (rather than of proof) in our search for true and highly informative theories – or at least for new theories that contain more information, and correspond better to the facts, than our older theories. And I look upon criticism, in its turn, as our main instrument in promoting the growth of our knowledge about the world of facts.

How does a roller coaster work?


What you may not realize as you’re cruising down the track at 60 miles an hour is that the coaster has no engine. The car is pulled to the top of the first hill at the beginning of the ride, but after that the coaster must complete the ride on its own. You aren’t being propelled around the track by a motor or pulled by a hitch. The conversion of potential energy to kinetic energy is what drives the roller coaster, and all of the kinetic energy you need for the ride is present once the coaster descends the first hill..


Once you’re underway, different types of wheels help keep the ride smooth. Running wheels guide the coaster on the track. Friction wheels control lateral motion (movement to either side of the track). A final set of wheels keeps the coaster on the track even if it’s inverted. Compressed air brakes stop the car as the ride ends.

oil oil oil

In the context of the recent spike in oil prices, this clip from Glenn Beck show might be of interest to some.

Particularly noteworthy is the fact that the oil industry has been facing a thirty years [sic] moratorium on exploring new fields within the United States. According to the president of Shell Oil, John Hofmeister, the US is dependent on the order of over 60% of its overall consumption on oil imports. The moratorium, though not mentioned in the clip, I suspect has got a great deal to do with environmentalism. Speaking about sponsoring of international terrorism!

Another fact is that the profit margin of oil companies is about 8% on capital invested. There are other extremely interesting facts in the clip as well.

oil

what we forget in practice of jee is

reading between the lines when reading the line:visualization.

looking the problem in the solution once we solve the problem:conceptual validation.

aieee 2008 ranks estimated instantly at www.jeecarnot.com as u feed ur scores

yes we re delayed,since we were keen on being instant and fastest rank projectors narrow enough to make sense, for aieee and exams onwards.the carnot way…….if you now click on markscheck and request a link,the link will take you to the solutions and you ll see your rank the moment you submit your answers in the objective answer sheet given in the right end of the page.we thank you for demanding which motivated us to be better,i acknowledge the contribution of vikas whose insomnia driven mistakes pushed us to avoid human touch in rank estimator:) and apnerve who coded my algo in mins and saved our skin.
at the same time we have enabled the exam alert feature so as you get the relevant information related to your exams and counsellings before it becomes useless.
wishing you all the best,
carnot team.