Special relativity I: spacetime diagrams and the invariance of the interval
In this post I will explain, to my liking, (parts of) two papers which I highly recommend: one by Mermin, and one by Brill and Jacobson (full references are listed further down).
Both of them aim to give purely geometric derivations of the essential facts of special relativity, by the use of spacetime diagrams. Mermin's paper is particularly helpful to elucidate the meaning of spacetime diagrams, and that is where we start.
The main postulate of relativity is that all laws of physics look the same in every inertial frame of reference. A useful, but not entirely necessary, second postulate is that there is a thing called 'light' which is easy to produce and which travels in every (inertial) frame of reference at exactly the same speed $c$.
An event is just a name for a point in spacetime. It is something, or perhaps nothing, that happens at a particular location at a particular moment in time. Two different observers might assign different coordinates to events, but they will agree on what the events are. For instance: they may disagree on whether two beams of light reach two different points at the same time (the relativity of simultaneity), but they cannot disagree on whether two beams of light converge onto the same point at the same time! For the latter constitutes an event, which is an absolute notion.
A spacetime diagram is a useful graphical representation of events. To make our lives easier, we will think of a 2-dimensional spacetime (so that we have only one dimension of space), as opposed to our familiar 4-dimensional one. This is not quite enough to describe all interesting relativistic phenomena (see e.g. Thomas precession), but it's plenty enough for a good start. Let us agree, henceforth, that all observers use a particular system of units in which the speed of light is precisely $c = 1$. We will imagine our hypothetical universe has two observers, Alice and Bob, who are moving at uniform speed $v$ with respect to one another.
Alice's point of view
Imagine a spacetime diagram as an infinite sheet of paper. Alice will use this paper to depict events, so that points in the sheet of paper correspond to events in spacetime; there is much freedom in choosing how this representation works, as we will see. The set of all events which happen at a particular location in space will correspond, in the diagram, to a straight (but not necessarily vertical) line. (Why a straight line and not something curved? This is a convention, the possibility of which is justified by the translational symmetry of spacetime.) This straight line is called a line of constant position. Two different lines of constant position must be parallel, for a point of intersection would correspond to an event that happens at two different locations in space, contradicting the nature of events. Alice will space these lines according to some scale factor $\lambda$, such that two lines of constant position whose distance within the diagram is $d$ correspond to events that happen at locations of spatial distance $d/\lambda$ (in her frame of reference).
Analogously, the set of all events which occur at a particular moment in time will correspond, in the diagram, to a straight line called a line of constant time. Alice is free to choose the angle $\theta$ these lines make with the lines of constant position, and will (by convention) use the same factor $\lambda$ to convert actual distances in time to distances within the diagram (so that two lines of constant time whose distance within the diagram is $d$ correspond to events that happen at moments $d/\lambda$ apart in time). Each line of constant time crosses each line of constant position at precisely one event, which happens at that time and that location.
Now suppose there are two points $p, q$ on a line of constant time which are at distance $d$ in the diagram. They correspond to two events that happen at the same time and different locations; what is the actual distance $D$ between these locations? To find it, draw lines of constant position through $p$ and $q$. Then $D = d'/\lambda$, where $d'$ is the (perpendicular) distance in the diagram between the two lines. Since $d'/d = \sin(\theta)$, it follows that \[ D = \frac{d \sin(\theta)}{\lambda} = \frac{d}{\mu} \] where $\mu = \lambda/\sin(\theta)$ is another scaling factor. This same scaling factor also works for lines of constant position: if $p, q$ are at distance $d$ on a line of constant position, they represent events (at the same location) which are a distance $d/\mu$ apart in time.
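As a quick numerical sanity check of $D = d/\mu$, here is a minimal Python sketch. The values of $\lambda$, $\theta$ and $d$ are arbitrary, and drawing Alice's lines of constant position vertically is an assumption made purely for convenience; the helper name `mu` is hypothetical.

```python
import math

def mu(lam, theta):
    """Scaling factor along a grid line: mu = lambda / sin(theta)."""
    return lam / math.sin(theta)

# Illustrative values only (not taken from the papers).
lam = 2.0                  # Alice's scale factor lambda
theta = math.radians(60)   # angle between constant-time and constant-position lines
d = 3.0                    # diagram distance between p and q

# Lines of constant position are drawn vertical here, so a line of
# constant time points in the direction (sin(theta), cos(theta)).
p = (0.0, 0.0)
q = (d * math.sin(theta), d * math.cos(theta))

# Perpendicular separation of the vertical lines through p and q.
d_perp = abs(q[0] - p[0])

# lambda converts that separation into an actual spatial distance.
D = d_perp / lam

print(D, d / mu(lam, theta))
```

Both printed numbers should agree up to rounding, whatever values are chosen for `lam`, `theta` and `d`.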
Let us now bring in light. Suppose a beam of light (which we draw as a dashed line) travels between points $p, q$ in one unit of time. Draw lines of constant position and time through $p$ and $q$:
Since these events are one unit of time and one unit of space apart (due to $c = 1$), the figure in the diagram is a rhombus (i.e. a quadrilateral with all sides of equal length) whose sides have length $\mu$. Therefore the light line bisects the angle between the lines of constant position and constant time. As a consequence, light lines meet each other at right angles: indeed, the other light line through $p$ (not drawn) bisects the supplementary angle $\pi - \theta$, so that the angle between the two light lines is \[ \frac{\pi - \theta}{2} + \frac{\theta}{2} = \frac{\pi}{2} \]
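The perpendicularity of the two light lines can also be checked numerically. The sketch below is only an illustration under the same drawing assumption as before (lines of constant position vertical); `light_directions` and `dot` are hypothetical helper names. The two light directions are the diagonals of the rhombus spanned by one unit of time and one unit of space.

```python
import math

def light_directions(theta, mu=1.0):
    """Diagonals of the rhombus spanned by one unit of time and one unit of space."""
    # One unit of time along a line of constant position (drawn vertical).
    side_pos = (0.0, mu)
    # One unit of space along a line of constant time, at angle theta to the vertical.
    side_time = (mu * math.sin(theta), mu * math.cos(theta))
    # Light moving one way covers +1 unit of space per unit of time, the other way -1.
    diag_sum = (side_pos[0] + side_time[0], side_pos[1] + side_time[1])
    diag_diff = (side_time[0] - side_pos[0], side_time[1] - side_pos[1])
    return diag_sum, diag_diff

def dot(a, b):
    """Euclidean dot product of two vectors in the diagram."""
    return a[0] * b[0] + a[1] * b[1]

for deg in (30, 60, 90, 120):
    u, w = light_directions(math.radians(deg))
    print(deg, dot(u, w))
```

The dot product vanishes (up to rounding) for every $\theta$, because both sides of the rhombus have the same length $\mu$.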
Bob's point of view
Now Bob, who is moving at uniform speed $v$ with respect to Alice, (metaphorically) comes in and looks over her diagram. Since each point in Alice's diagram already corresponds to a unique event in spacetime, Bob has no freedom in how to relabel this diagram according to his notions of time and space. Since he moves at uniform velocity with respect to Alice, for a given position (in Bob's frame) the set of all events occurring at that position in space will form a straight line in the diagram. These are Bob's lines of constant position, which are tilted with respect to Alice's lines of constant position by an angle $\alpha$ such that $\tan(\alpha) = v$.
It may be tempting to immediately draw Bob's lines of constant time so that the angle they make with his lines of constant position is bisected by light lines. However, recall that by this point Bob has no freedom! Instead, we must prove that his lines of constant time satisfy this property. This is done via a very standard argument in relativity, which goes as follows. Suppose Bob is standing in the middle of a train and at point $p$ flashes beams of light towards each end of the train; the beams reach the ends at points $q, r$ and are then reflected back to him at point $s$. In the diagram, the two outer vertical-ish lines of constant position represent the two ends of the train, and the middle one represents Bob's own position. Let $t$ be the intersection of his line of constant position with the line $qr$.
Since he is standing at the middle of the train and the speed of light is the same in both directions, he considers $q$ and $r$ to be simultaneous events, so that the line $qr$ is a line of constant time. Note that $pqsr$ is a rectangle (its sides are light lines, which meet at right angles) with $qr$ and $ps$ its diagonals. The angle that the leftmost line of constant position makes with the light line $qs$ is the same as $\angle tpr$, which is the same as $\angle tqs$ because the diagonals of a rectangle cut it into congruent isosceles triangles. Thus indeed the light line $qs$ bisects the angle between Bob's constant position and constant time lines.
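The train argument can be replayed with coordinates. The sketch below is only an illustration: it assumes Alice's diagram is drawn with perpendicular axes and light lines at $45^\circ$ (a special case of the general construction), takes $0 \le v < 1$, and uses a hypothetical helper `train_events` with an arbitrary half-length `a` for the train. Events are written as pairs $(x, t)$ in Alice's coordinates.

```python
def train_events(v, a=1.0):
    """Events p, q, r, s of the train argument as (x, t) pairs in Alice's coordinates.

    Bob's worldline is x = v*t; the train ends follow x = v*t - a and x = v*t + a.
    """
    p = (0.0, 0.0)
    # Leftward light ray x = -t meets the rear end x = v*t - a.
    t_q = a / (1 + v)
    q = (-t_q, t_q)
    # Rightward light ray x = t meets the front end x = v*t + a.
    t_r = a / (1 - v)
    r = (t_r, t_r)
    # Reflected rays: x - t is constant along the ray from q,
    # x + t is constant along the ray from r; s is where they cross.
    c_minus = q[0] - q[1]
    c_plus = r[0] + r[1]
    s = ((c_plus + c_minus) / 2, (c_plus - c_minus) / 2)
    return p, q, r, s

v = 0.6
p, q, r, s = train_events(v)

# s lies on Bob's worldline: x / t equals v.
print(s[0] / s[1])
# The line qr (Bob's line of constant time) has dt/dx equal to v,
# i.e. it is the mirror image of his worldline in the light line.
print((r[1] - q[1]) / (r[0] - q[0]))
```

Both printed ratios should equal $v$ up to rounding: Bob's worldline has $dx/dt = v$ while his line of constant time has $dt/dx = v$, which is exactly the statement that the light line bisects the angle between them.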
Note then that Bob's lines of constant time are not the same as Alice's (unless Bob is stationary with respect to Alice). This means that events which Alice considers to happen simultaneously are considered by Bob to happen at different moments in time. This phenomenon is known as the relativity of simultaneity, and lies at the heart of many of special relativity's apparent paradoxes, a famous example being the Ladder paradox.
Let now $\lambda_B$ be Bob's scaling factor for distances between lines of constant position (which is forced upon him). This scaling factor need not equal Alice's $\lambda_A$. If $\lambda'_B$ is Bob's scaling factor for distances between lines of constant time, we must show that $\lambda'_B = \lambda_B$ (as is the case for Alice). Let a beam of light be emitted at a point $p$ and absorbed one unit of Bob's time later at point $q$. We draw lines of constant position and time through $p$ and $q$ to form a parallelogram:
If $\theta_B$ is the angle between Bob's lines of constant position and constant time, then the vertical-ish sides of the parallelogram are equal to $\mu'_B = \lambda'_B/\sin(\theta_B)$, and the horizontal-ish sides are equal to $\mu_B = \lambda_B/\sin(\theta_B)$. Since the diagonal of this parallelogram bisects the angle between its sides, it follows that it must be a rhombus, and hence $\mu_B = \mu'_B$, so that $\lambda'_B = \lambda_B$ as was to be shown.
We thus conclude that Bob's labeling of Alice's diagram works exactly the same way as if he had drawn the diagram himself to begin with, so that the procedure for drawing spacetime diagrams is indeed observer-independent. In his paper, Mermin goes on to establish a relationship between the scaling factors: \[ \lambda_A \mu_A = \lambda_B \mu_B \] From this relationship he extracts the invariance of the spacetime interval $(\Delta t)^2 - (\Delta x)^2$. I find his proofs a bit technical and confusing, so now is a good time to switch over to Brill and Jacobson's paper.
The invariance of the interval
Let us agree to orient our spacetime diagrams in such a way that the light lines make an angle of $45^\circ$ with the vertical direction, and so that the forward direction in time is the upwards direction in the diagram (as we have been doing). Any line whose angle with the vertical direction is less than $45^\circ$ may represent a line of constant position for some inertial observer (e.g. an observer who travels along such a line). Such lines are said to be timelike.
Reciprocally, any line whose angle with the horizontal direction is less than $45^\circ$ may represent a line of constant time for some inertial observer (e.g. the observer who travels along a line symmetric to the given one with respect to the diagonal). Such lines are said to be spacelike. The diagonal lines, which represent possible trajectories of light, are said to be lightlike (or sometimes null).
Given some timelike segment $pq$, its proper time $(pq)_m$ (the 'm' being for Minkowski) is defined to be the distance in time between $p$ and $q$ with respect to an observer for whom the line $pq$ is a line of constant position (e.g. an observer who travels along $pq$). In other words, it is the temporal distance measured by an observer for whom both events happen at the same point in space. Since all such observers are stationary with respect to one another, this concept is well-defined. We denote by $(pq)_e$ the usual Euclidean length of the segment (within the diagram). Therefore \[ (pq)_e = \mu (pq)_m \] where $\mu$ is the scaling factor for the aforementioned observer.
Similarly, given some spacelike segment $p'q'$, its proper length is defined to be the distance in space between $p'$ and $q'$ with respect to an observer for whom the line $p'q'$ is a line of constant time (i.e. for whom both events happen at the same moment in time). Again one has $(p'q')_e = \mu (p'q')_m$ for such an observer.
The interval between events $p$ and $q$ is defined as $(\Delta t)^2 - (\Delta x)^2$, where $\Delta t$ (resp. $\Delta x$) is the temporal (resp. spatial) distance between $p$ and $q$, in some reference frame; we will show that this is the same for all reference frames. Note that, with respect to an observer for whom $pq$ is a line of constant position, one has $(\Delta t)^2 - (\Delta x)^2 = (pq)_m^2$. Similarly, with respect to an observer for whom $pq$ is a line of constant time, one has $(\Delta t)^2 - (\Delta x)^2 = -(pq)_m^2$.
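To see what the claimed invariance looks like in numbers, here is a small Python sketch. It assumes the standard Lorentz boost formulas with $c = 1$, which are not derived in this post, so it is an illustration of the statement rather than a proof; `boost` and `interval` are hypothetical helper names.

```python
import math

def boost(t, x, v):
    """Standard Lorentz boost with c = 1 (assumed here, not derived in the post)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

def interval(dt, dx):
    """Spacetime interval (dt)^2 - (dx)^2."""
    return dt * dt - dx * dx

# Separation between two events in Alice's frame (illustrative values).
dt, dx = 5.0, 3.0

for v in (0.0, 0.3, 0.6, -0.8):
    dt_b, dx_b = boost(dt, dx, v)
    print(v, dt_b, dx_b, interval(dt_b, dx_b))
```

For $v = 0.6$ the spatial separation in the boosted frame vanishes, so the segment lies on that observer's line of constant position and the boosted $\Delta t$ is the proper time; the interval itself comes out the same for every $v$ up to rounding.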
Let now $pq$ be some timelike segment. We may construct upon this segment a rhombus whose diagonals are light lines; Brill and Jacobson call this a Minkowski square. There are actually two possibilities for how to do this (depending on whether the light beams from $pq$ go to the left or to the right), but they are congruent. If Alice is an observer for whom the line $pq$ is a line of constant position (and thus the adjacent side $pp'$ of the rhombus is a line of constant time), then the (Euclidean) area of this rhombus, within the diagram, is \[ (pq)_e^2 \sin(\theta_A) = (pq)_m^2 \mu_A^2 \sin(\theta_A) = (pq)_m^2 \frac{\lambda_A^2}{\sin(\theta_A)} = (pq)_m^2 \lambda_A \mu_A \] Hence the fact that the square of the proper time $(pq)_m^2$ is proportional to the area of the Minkowski square built upon $pq$ is equivalent to Mermin's proposition that the product of the scaling factors $\lambda \mu$ is equal for all observers. We will prove this fact following Brill and Jacobson. But first, a lemma.
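The area claim is easy to test in coordinates. The sketch below assumes a diagram drawn with perpendicular axes, light lines at $45^\circ$ and $\lambda = \mu = 1$ for the observer at rest in it (an illustrative special case); `minkowski_square` is a hypothetical helper. The second side of the rhombus is the mirror image of $pq$ in the light line.

```python
def minkowski_square(x, t):
    """Area and diagonals of the Minkowski square built on the segment from (0, 0) to (x, t)."""
    side = (x, t)
    # Mirror image of the side in the 45-degree light line: swap the coordinates.
    mirrored = (t, x)
    # Euclidean area of the parallelogram spanned by the two sides (cross product).
    area = abs(side[0] * mirrored[1] - side[1] * mirrored[0])
    diag1 = (side[0] + mirrored[0], side[1] + mirrored[1])
    diag2 = (side[0] - mirrored[0], side[1] - mirrored[1])
    return area, diag1, diag2

x, t = 3.0, 5.0   # a timelike segment, since |x| < t
area, diag1, diag2 = minkowski_square(x, t)

print(area, t * t - x * x)   # Euclidean area vs. squared proper time
print(diag1, diag2)          # both diagonals point along 45-degree light lines
```

In this special case the area equals $t^2 - x^2$ exactly, which is the squared proper time of the segment with proportionality constant $\lambda\mu = 1$.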
Suppose Alice and Bob meet at a point $p$ while traveling at some uniform speed with respect to one another. Afterwards, at point $q$, Alice emits a light signal which Bob receives at $r'$; similarly, at point $q'$ Bob emits a light signal which Alice receives at $r$.
If $(pq)_m = (pq')_m$, so that Alice and Bob each wait the same amount of time (with respect to their individual clocks) before sending the signal, then we must have $(pr)_m = (pr')_m$ (i.e. both will take equally long to receive their signal), since the situation is completely symmetric. If, instead, one had $(pq)_m = 2(pq')_m$, then we must have $(pr')_m = 2(pr)_m$. Indeed, we can imagine that Alice and Bob actually send two signals, at equally spaced time intervals of $(pq')_m$. Then they each also receive signals at equally spaced intervals of $(pr)_m$, again by symmetry, so that Bob waits $2(pr)_m$ before getting the second signal, which is the true one. In general (and this is our lemma), we have
\[ \frac{(pr')_m}{(pr)_m} = \frac{(pq)_m}{(pq')_m} \]
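The lemma can be checked numerically in Alice's frame. The sketch below is an illustration under stated assumptions: Alice sits at $x = 0$, Bob moves along $x = vt$ with $0 \le v < 1$, and Bob's proper time is obtained from coordinate time via the usual time-dilation factor, which this post has not yet established. The function name `reception_times` is hypothetical.

```python
import math

def reception_times(v, tau_alice, tau_bob):
    """Return ((pr)_m, (pr')_m) when Alice emits after tau_alice and Bob after tau_bob.

    Alice is at x = 0, Bob moves along x = v*t with 0 <= v < 1, and they meet at t = 0.
    Time dilation (assumed here) relates Bob's proper time to Alice's coordinate time.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    # Alice's signal: the ray x = t - tau_alice meets Bob's worldline x = v*t.
    t_recv_bob = tau_alice / (1.0 - v)
    pr_prime = t_recv_bob / gamma      # Bob's proper time at r'
    # Bob's signal: emitted at Alice-frame time gamma*tau_bob, position v*gamma*tau_bob.
    t_emit = gamma * tau_bob
    x_emit = v * t_emit
    pr = t_emit + x_emit               # light needs x_emit more time to reach x = 0
    return pr, pr_prime

v = 0.6
pr, pr_prime = reception_times(v, tau_alice=2.0, tau_bob=1.0)
print(pr, pr_prime, pr_prime / pr)
```

With Alice waiting twice as long as Bob before emitting, Bob's reception time comes out twice Alice's, in agreement with the ratio stated in the lemma.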
- Mermin, N. D. An introduction to space–time diagrams. American Journal of Physics 65, 476 (1997).
- Brill, D., Jacobson, T. Spacetime and Euclidean geometry. Gen Relativ Gravit 38, 643–651 (2006).
We now show that the squared proper time $(pq)_m^2$ of a given timelike segment $pq$ is proportional to the area of the Minkowski square built upon it (with proportionality constant $\lambda \mu$). First we observe that one may construct upon $pq$ a triangle $pqr$ with the two other sides being lightlike (again, there are two congruent possibilities for this triangle):
The triangle $pqr$ is a null triangle