Lorentz transformations - a derivation
From: Timo Nieminen (timo_at_physics.uq.edu.au)
Date: 01/06/05
- Next message: Uncle Al: "Re: The genius of the Absolute"
- Previous message: macromitch_at_internetCDS.com: "Re: The genius of the Absolute"
- Next in thread: Edward Green: "Re: Lorentz transformations - a derivation"
- Reply: Edward Green: "Re: Lorentz transformations - a derivation"
- Maybe reply: Eugene Shubert: "Re: Lorentz transformations - a derivation"
- Reply: Eli Botkin: "Re: Lorentz transformations - a derivation"
- Reply: mmeron_at_cars3.uchicago.edu: "Re: Lorentz transformations - a derivation"
- Reply: Dirk Van de moortel: "Re: Lorentz transformations - a derivation"
- Reply: Bilge: "Re: Lorentz transformations - a derivation"
- Reply: Franz Heymann: "Re: Lorentz transformations - a derivation"
- Reply: RP: "Re: Lorentz transformations - a derivation"
- Reply: jimshard: "Barwacz Space"
- Maybe reply: mmeron_at_cars3.uchicago.edu: "Re: Lorentz transformations - a derivation"
Date: Thu, 6 Jan 2005 11:32:27 +1000
Given the rather long threads on derivations of the Lorentz
transformations that seem to be making slow progress, I thought this
might be a worthwhile contribution. Feel free to copy and inflict on
correspondents in such threads!
This is a first draft only, and so could have a nice assortment of errors.
0. Introduction
The aim is to provide a simple and general derivation of the
homogeneous Lorentz transformations, without assuming that axes are
parallel, or that motion is along the x-axis. (Perhaps a revision
to avoid the use of Cartesian coordinates might be useful?)
Keeping in mind that the Lorentz transformations relate coordinates in
two inertial reference frames, we will restrict our attention to
such reference frames. At first, we will simply assume that all the
reference frames are in uniform relative motion (ie unaccelerating
and no rotational motion), and later, when some physics is introduced,
we will introduce the inertiality.
Permission is given to use the content of this post, including
publishing online, re-posting, etc.
Comments and corrections welcome.
0.1 Notation
r denotes a position vector; a number can be appended to distinguish
between two different position vectors, eg r1, r2. The components of
the position vector will, in general, differ between reference frames.
t denotes a time as measured in a given reference frame.
d_ij is the Kronecker delta.
Coordinates are specified by x, y, z or x1, y1, z1 etc when necessary.
The reference frame in which position vectors and times are measured
will be specified when necessary by a "subscript" letter, eg r_a, r1_a,
t_a, or (t,r)_a. Coordinates are x_a, y_a, z_a.
The scalar product of two vectors a and b is denoted by a.b
The product of two scalars, or of two matrices, is denoted by a b
The transpose of a matrix a is written as aT
Vectors are written as matrices with a single column when
used in matrix expressions; ie a is a column vector,
aT is a row vector.
Where a matrix is written in terms of its elements, the notation
[ a b c; d e f; g h i ] will be used to avoid problems with
non-fixed-width fonts. Here, a b c are the elements of the first row,
d e f the elements of the 2nd row etc.
Periods are left off ends of sentences where they could cause
confusion with mathematical notation (see above).
1. Rotations in 3D space
Consider a 3D Euclidean space with a Cartesian coordinate system such
that the distance between two points r1 and r2 is
ds = sqrt( (r1 - r2).(r1 - r2) )
Note that the scalar product is, in terms of coordinates,
r1.r2 = g_11 x1 x2 + g_22 y1 y2 + g_33 z1 z2
where g_11, g_22, g_33 are the diagonal elements of the metric tensor g.
For a Cartesian coordinate system, we have g_ij = d_ij
Note that we can write this as a matrix product:
r1.r2 = r1T g r2
which, in a Cartesian coordinate system, is r1.r2 = r1T r2
If we consider two Cartesian coordinate systems with coincident origins,
we can ask what linear transformations of coordinates result in
distances being invariant.
Such a transformation must be of the form:
x_b = a_11 x_a + a_12 y_a + a_13 z_a + c_1
y_b = a_21 x_a + a_22 y_a + a_23 z_a + c_2
z_b = a_31 x_a + a_32 y_a + a_33 z_a + c_3
or, more compactly, we can write this as a matrix equation
r_b = A r_a + C
Since we have specified that the origins are coincident, we have
C = (0,0,0); the transformation must be homogeneous.
If we have r_a = r1_a - r2_a, the distance between the points specified
by position vectors r1_a and r2_a must be the same in both coordinate
systems. Therefore
ds^2 = ds_a^2 = ds_b^2
= r_b.r_b
= (A r_a).(A r_a)
= (A r_a)T (A r_a)
= rT_a AT A r_a
which, since this must also equal r_a.r_a, means that
AT A = I
ie the matrices are orthogonal, and
inv(A) = AT
Therefore, the square of the determinant of A is
|A|^2 = 1
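As a quick numerical check of the above, here is a minimal sketch in plain Python. The matrix A (a rotation by 0.3 rad combined with a reflection of z), the vector r_a, and all helper names are hypothetical choices made for illustration only.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product a b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Matrix acting on a column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def dot(u, v):
    """Scalar product in a Cartesian coordinate system (g = I)."""
    return sum(x * y for x, y in zip(u, v))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Assumed example: rotate by t in the x-y plane, then reflect z.
t = 0.3
A = [[math.cos(t), math.sin(t), 0.0],
     [-math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, -1.0]]

ATA = matmul(transpose(A), A)   # should be the identity, up to rounding
r_a = [1.0, -2.0, 0.5]
r_b = matvec(A, r_a)

print("AT A =", ATA)
print("r_a.r_a =", dot(r_a, r_a), " r_b.r_b =", dot(r_b, r_b))
print("|A| =", det3(A))
```

Because this A includes a reflection, its determinant comes out as -1 rather than +1, while |A|^2 = 1 still holds.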
We can further note that 3x3 matrices with AT A = I form a group under
matrix multiplication, termed O(3) - the three-dimensional orthogonal
group.
We can identify two distinct classes of transformations in O(3):
|A| = +1, which are pure rotations, and |A| = -1, which are rotations
combined with a reflection.
That these transformations form a group means that:
1. The result of one rotation/reflection followed by another
rotation/reflection can be obtained by a single rotation/reflection.
2. If we replace pairs of rotation/reflection transformations by
equivalent single transformations, the order in which we do so does
not matter. (Note that this is associativity, not commutativity!)
3. There is a rotation/reflection which leaves the coordinates unchanged.
4. For any rotation/reflection, there is an inverse transform that
restores things to the original state.
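The group properties listed above can be illustrated numerically. The following is only a sketch: the two rotations (about z and about x), their angles, and the helper names are assumptions chosen for the example.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rot_z(t):
    """Rotation by angle t mixing the x and y coordinates."""
    return [[math.cos(t), math.sin(t), 0.0],
            [-math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation by angle t mixing the y and z coordinates."""
    return [[1.0, 0.0, 0.0],
            [0.0, math.cos(t), math.sin(t)],
            [0.0, -math.sin(t), math.cos(t)]]

def is_orthogonal(m, tol=1e-12):
    """True if mT m equals the identity to within tol."""
    p = matmul(transpose(m), m)
    n = len(m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

Rz, Rx = rot_z(0.4), rot_x(1.1)
P = matmul(Rz, Rx)   # one order of composition
Q = matmul(Rx, Rz)   # the other order

# Closure: both products are again orthogonal.
print(is_orthogonal(P), is_orthogonal(Q))

# Not commutative: the two products differ.
max_diff = max(abs(P[i][j] - Q[i][j]) for i in range(3) for j in range(3))
print("largest difference between P and Q:", max_diff)

# Inverse: the transpose undoes the transformation.
print(is_orthogonal(transpose(P)))
```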
If we exclude reflections (ie we restrict ourselves to pure rotations
with |A| = +1, which we will call proper rotations), these conditions
are still satisfied, so proper rotations also form a group, denoted
SO(3). Since all proper (ie reflection-free) rotations must form a
continuous group containing the identity transformation, this provides
a general way of identifying the subgroup we are interested in - it
must contain I. Euler's theorem states that all 3D orthogonal
transformations with |A| = +1 are rotations.
1.1 Rotations in n-dimensional space
We will make a diversion into n-dimensional rotations, to see how we can
parameterise rotations, and actually write down the elements of a
rotation matrix.
Note that the considerations in the above section apply equally to
dimensions other than 3 - SO(1), SO(2), SO(4) etc are the groups of
proper rotations in 1, 2, and 4 dimensions.
1.1.1 1D
Since in 1D, we have |A| = A_11, the only 1D rotation matrix is [1].
1.1.2 2D
The transformation A has 4 matrix elements, but the orthogonality
relations provide 3 equations relating these, so only one free
parameter is required to describe a rotation. Therefore, we can give
a single element of SO(2), and generate all other elements by raising
it to a power. That is, given G, an element of SO(2), G^a is also an
element. We can proceed by choosing an "infinitesimal generator" S such that
G = exp(-S)
Thus, we have
G^a = exp( - a S )
Noting that |A| = exp(-Tr(S)), the requirement that |A| = 1 means that
Tr(S) = 0. Since inv(A) = exp(S), and inv(A) = AT, we must have
ST = -S, so S is antisymmetric. Since this requires all diagonal
elements to be zero, the condition Tr(S) = 0 is satisfied automatically.
The matrix
S = [ 0 -1; 1 0 ]
is a suitable infinitesimal generator, since any 2x2 antisymmetric
matrix can be written as the product a S
S has an interesting property:
S^2 = [ -1 0; 0 -1], S^3 = [ 0 1; -1 0 ] = -S, S^4 = -S^2 = I
Therefore, if we write the series expansion for exp(-aS), all of the
higher powers of S can be reduced to S and S^2. Using this, we find
exp(-aS) = - sin(a) S - cos(a) S^2
Since S^2 = -I, we can write any 2D rotation matrix as
R = [ cos(a) sin(a); -sin(a) cos(a) ]
in which we can immediately recognise our (originally abstract)
parameter a as the angle of rotation.
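The series expansion can be checked directly. This is a minimal sketch: the helper expm (a truncated power series, not a library routine), the number of terms, and the value a = 0.7 are assumptions made for illustration.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm(m, terms=30):
    """Matrix exponential as a truncated power series: sum of m^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds m^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

S = [[0.0, -1.0],
     [1.0, 0.0]]
a = 0.7   # assumed rotation parameter

R_series = expm([[-a * x for x in row] for row in S])    # exp(-a S)
R_closed = [[math.cos(a), math.sin(a)],
            [-math.sin(a), math.cos(a)]]                 # closed form given above

max_diff = max(abs(R_series[i][j] - R_closed[i][j]) for i in range(2) for j in range(2))
print("largest difference between series and closed form:", max_diff)
```

For small matrices and modest values of a, a few tens of terms are ample; for serious work a library matrix exponential would be the better choice.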
1.1.3 3+D
The same considerations apply. We need only write a set of infinitesimal
generators which are a basis set in terms of which any antisymmetric
matrix can be written. A suitable basis is:
S_1 = [ 0 -1 0; 1 0 0; 0 0 0 ]
S_2 = [ 0 0 1; 0 0 0; -1 0 0 ]
S_3 = [ 0 0 0; 0 0 -1; 0 1 0 ]
and we can write any antisymmetric matrix as
S = a_1 S_1 + a_2 S_2 + a_3 S_3
We can proceed as for 2D (with somewhat more difficulty!) and write
down the 3D rotation matrix in terms of the 3 parameters a_i (left
as an exercise for the reader!)
The astute reader might note that the top left 2x2 block of S_1 is
exactly the same as our 2D S, and must behave in the same way, so
S_1^3 = -S_1, S_1^4 = -S_1^2 etc. The same also applies for S_2 and
S_3. In the simple case where two of the three parameters a_i are
zero, we obtain transformations which we can easily recognise as
rotations about the x, y, and z axes, with the non-zero parameter
being the angle of rotation.
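For readers who would rather check the exercise than do it, here is a hedged sketch. It compares the series for exp(-S) with a standard closed form (Rodrigues' rotation formula, which is not derived in this post): for antisymmetric S with theta^2 = a_1^2 + a_2^2 + a_3^2 we have S^3 = -theta^2 S, so exp(-S) = I - (sin(theta)/theta) S + ((1 - cos(theta))/theta^2) S^2. The parameter values and helper names are assumptions.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm(m, terms=30):
    """Matrix exponential as a truncated power series: sum of m^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# The basis of generators given in the text.
S_1 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
S_2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
S_3 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]

a = (0.4, -0.9, 0.25)   # assumed parameters a_1, a_2, a_3
S = [[a[0] * S_1[i][j] + a[1] * S_2[i][j] + a[2] * S_3[i][j]
      for j in range(3)] for i in range(3)]
S_sq = matmul(S, S)
theta = math.sqrt(sum(x * x for x in a))

# Closed form: I - (sin(theta)/theta) S + ((1 - cos(theta))/theta^2) S^2
R_closed = [[(1.0 if i == j else 0.0)
             - math.sin(theta) / theta * S[i][j]
             + (1.0 - math.cos(theta)) / theta ** 2 * S_sq[i][j]
             for j in range(3)] for i in range(3)]

R_series = expm([[-x for x in row] for row in S])   # exp(-S)

max_diff = max(abs(R_series[i][j] - R_closed[i][j]) for i in range(3) for j in range(3))
print("largest difference between series and closed form:", max_diff)
```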
The extension to dimensions higher than 3 is elementary, although
writing down the elements of R explicitly in terms of a_i becomes
progressively more painful.
2. The Lorentz transformations
The mathematics of rotations gives us a simple mechanism to derive
the Lorentz transformations.
Consider a 4D coordinate system with metric tensor
g_00 = -1, g_11 = 1, g_22 = 1, g_33 = 1
A length interval is then
ds = sqrt( rT g r )
Homogeneous linear transformations which leave this invariant must
satisfy AT g A = g, and since |g| is non-zero, we must have |A|^2 = 1
Restricting ourselves to proper rotations, we have |A| = 1
Since we have a metric tensor not equal to I, we must explicitly
include it when writing down our generator and infinitesimal generators.
We now require (g S) to be antisymmetric (we actually required this
for rotations in Cartesian systems, but since (g S) = (I S) = S, we
didn't write it down).
Thus, a suitable basis set for the infinitesimal generators is:
S_1 = [ 0 1 0 0; 1 0 0 0; 0 0 0 0; 0 0 0 0 ]
S_2 = [ 0 0 1 0; 0 0 0 0; 1 0 0 0; 0 0 0 0 ]
S_3 = [ 0 0 0 1; 0 0 0 0; 0 0 0 0; 1 0 0 0 ]
S_4 = [ 0 0 0 0; 0 0 -1 0; 0 1 0 0; 0 0 0 0 ]
S_5 = [ 0 0 0 0; 0 0 0 1; 0 0 0 0; 0 -1 0 0 ]
S_6 = [ 0 0 0 0; 0 0 0 0; 0 0 0 -1; 0 0 1 0 ]
Clearly, if we have a_1 = a_2 = a_3 = 0, our transformations are 3D
rotations of the last 3 coordinates, leaving the first coordinate
unchanged.
Since we now have S_1^3 = S_1 and S_1^4 = S_1^2, if we have only a_1
non-zero, we obtain
R = [ cosh(a_1) -sinh(a_1) 0 0; -sinh(a_1) cosh(a_1) 0 0; 0 0 1 0; 0 0 0 1 ]
and similarly for having only a_2 or a_3 non-zero.
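A numerical sketch of this step, under stated assumptions: expm is a hand-rolled truncated series (not a library call), and a_1 = 0.5 is an arbitrary value. It builds exp(-a_1 S_1), compares the entries with cosh and sinh, and confirms that AT g A = g.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm(m, terms=30):
    """Matrix exponential as a truncated power series: sum of m^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Metric with g_00 = -1, g_11 = g_22 = g_33 = 1.
g = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

# First generator from the basis in the text (symmetric, so that g S_1 is antisymmetric).
S_1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

a_1 = 0.5   # assumed parameter value
R = expm([[-a_1 * x for x in row] for row in S_1])   # exp(-a_1 S_1)

print("R[0][0], cosh(a_1):", R[0][0], math.cosh(a_1))
print("R[0][1], -sinh(a_1):", R[0][1], -math.sinh(a_1))
print("untouched coordinates:", R[2][2], R[3][3])

# Invariance condition AT g A = g.
check = matmul(matmul(transpose(R), g), R)
max_diff = max(abs(check[i][j] - g[i][j]) for i in range(4) for j in range(4))
print("largest deviation of RT g R from g:", max_diff)
```

The coordinates not involved in the transformation are left alone, so the corresponding diagonal entries of R are 1.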
We now have the Lorentz transformations and a general recipe for
writing any Lorentz transformation in terms of 6 parameters, of
which 3 specify a 3D rotation of the last 3 coordinates. Now it
is time to introduce some physics.
3. Lorentz transformations in physics
To make use of the above mathemachinery, we note that we can specify
an event - a combination of a position vector and a time - as a 4D
vector (at,r) = (at,x,y,z) where a is a scale factor so that at and
x (and y and z) have the same units. Since x has units of length, and
t has units of time, the scale factor a has units of velocity.
We adopt the postulate that the laws of physics are the same in all
inertial reference frames (the Principle of Relativity).
This requires us to specify what is meant by
an inertial reference frame: a reference frame in which an object acted
on by zero force is either stationary or moves in a straight line at
constant speed. This means that dr/dt is independent of time in all
inertial reference frames, where r(t) is the position of the force-free object.
If the object is inertial in any single reference frame, it will be
inertial in any reference frame related to the first by a linear
transformation. Therefore, the Lorentz transformations relate
inertial reference frames.
We adopt a further postulate: that the Maxwell equations correctly
describe the propagation of electromagnetic waves in free space in
all inertial reference frames. Directly from this, we see that the
speed of light in free space, c, must be the same in all inertial
reference frames.
Therefore, c is a good choice of scale factor, since it must be the
same in all inertial reference frames, so we write our 4-coordinates
as (ct,r). It is worth noting that if we postulate instead that
either (a) we can use the same scale factor in all inertial reference
frames or (b) that there is a speed that is the same in all inertial
reference frames, we reach the same point, but without having identified
our scale factor as the speed of light in free space. In that way,
we could obtain a result that would be undisturbed by falsification of
the Maxwell equations (eg by measurement of a non-zero photon mass).
However, we will be content to use the historical postulate.
If we consider two events: the launching of a pulse of light, with
4-coordinates (ct1,r1), and its reception (ct2,r2), if the speed of
light is to be the same in all inertial reference frames, we must
have sqrt((r2 - r1).(r2 - r1))/(t2 - t1) = c in all frames. Therefore,
sqrt((r2-r1).(r2-r1)) = ct2 - ct1
(r2-r1).(r2-r1) = (ct2 - ct1)^2
-(ct2 - ct1)^2 + (r2-r1).(r2-r1) = 0
If we write (ct,r) = (ct2,r2) - (ct1,r1), the left hand side of the
above expression is
(ct,r).(ct,r) = (ct,r)T g (ct,r)
Therefore, a linear transformation under which the scalar product
defined by the metric g_00 = -1, g_11 = g_22 = g_33 = 1 is
invariant results in the speed of light being the same in all
inertial reference frames.
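To see this with numbers, here is a small sketch in units where c = 1. The light-pulse separation (5, 3, 4, 0), the parameter 0.8, and the function names are assumptions chosen for the example; boost_x uses the closed-form cosh/sinh matrix for a transformation mixing ct and x.

```python
import math

def boost_x(a):
    """Closed-form transformation mixing ct and x, with parameter a."""
    ch, sh = math.cosh(a), math.sinh(a)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def interval(e):
    """Scalar product of (ct, x, y, z) with itself under g_00 = -1, g_ii = +1."""
    return -e[0] ** 2 + e[1] ** 2 + e[2] ** 2 + e[3] ** 2

def speed(e):
    """Distance travelled divided by ct, ie speed in units of c."""
    return math.sqrt(e[1] ** 2 + e[2] ** 2 + e[3] ** 2) / e[0]

# Separation between emission and reception of a light pulse in frame a:
# distance sqrt(3^2 + 4^2) = 5 covered in ct = 5.
e_a = [5.0, 3.0, 4.0, 0.0]
e_b = matvec(boost_x(0.8), e_a)

print("interval in a:", interval(e_a), " interval in b:", interval(e_b))
print("speed in a:", speed(e_a), " speed in b:", speed(e_b))
```

Both the time and the distance between the two events change, but their ratio does not.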
The Lorentz transformations obtained in section 2 are the
transformations which meet these requirements, and therefore must
be the correct transformations relating coordinates (ct,r) in
different reference frames, if the Principle of Relativity is valid,
and the Maxwell equations are correct.
The parameters (a_4,a_5,a_6) are those required to specify a spatial
rotation. What are the other three parameters (a_1,a_2,a_3)?
Since the space origins (r = 0) of different reference frames only
need to coincide at t = 0, clearly the reference frames can be
in relative motion.
As measured in frame a, the origin of frame b moves at a constant
velocity B = dr_a/d(ct_a). Since B is constant, and the 4-origins are
coincident, B = r_a/(ct_a), where (ct_a,r_a) = Lba (ct_b,0,0,0) and
Lba is the Lorentz transformation from frame b to frame a.
Consider the Lorentz transformation resulting from only a_1 being
non-zero: the velocity in such a case is (-tanh(a_1),0,0). Similarly,
it is (0,-tanh(a_2),0) and (0,0,-tanh(a_3)) when a_2 and a_3 are
the only non-zero parameters. Therefore, we must have
(a_1,a_2,a_3) = B atanh(|B|) / |B|
for the transformation from a to b (the transformation above was
from b to a) and we are done!
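The final recipe can be tried out numerically. This is a sketch under stated assumptions: boost_from_velocity and expm are hypothetical helper names, the velocity B = (0.3, -0.4, 0.5) is in units of c, and the exponential is a truncated series. The check is that the origin of frame b, which moves with velocity B in frame a, stays at r_b = 0 after transforming from a to b.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def expm(m, terms=30):
    """Matrix exponential as a truncated power series: sum of m^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Generators S_1, S_2, S_3 from section 2.
S_1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
S_2 = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
S_3 = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]

def boost_from_velocity(B):
    """Transformation from a to b, where b's origin has velocity B (units of c) in a."""
    speed = math.sqrt(sum(b * b for b in B))
    if speed == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    # (a_1, a_2, a_3) = B atanh(|B|) / |B|
    a = [b * math.atanh(speed) / speed for b in B]
    S = [[a[0] * S_1[i][j] + a[1] * S_2[i][j] + a[2] * S_3[i][j]
          for j in range(4)] for i in range(4)]
    return expm([[-x for x in row] for row in S])

B = (0.3, -0.4, 0.5)          # assumed velocity, |B| < 1
L_ab = boost_from_velocity(B)

# An event on the worldline of b's origin, as seen in frame a: r_a = B ct_a.
ct_a = 2.0
event_a = [ct_a] + [b * ct_a for b in B]
event_b = matvec(L_ab, event_a)

print("event in b:", event_b)   # spatial part should vanish up to rounding
```

The time component in frame b comes out smaller than ct_a by the factor sqrt(1 - |B|^2), which is the familiar time dilation.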
--
Timo Nieminen - Home page: http://www.physics.uq.edu.au/people/nieminen/
Shrine to Spirits: http://www.users.bigpond.com/timo_nieminen/spirits.html