Re: Lorentz transformations - a derivation
From: Eli Botkin (elibotkin_at_optonline.net)
Date: 01/06/05
- Previous message: Todd: "Re: Einstein's math and physical objects"
- In reply to: Timo Nieminen: "Lorentz transformations - a derivation"
- Next in thread: Timo Nieminen: "Re: Lorentz transformations - a derivation"
- Reply: Timo Nieminen: "Re: Lorentz transformations - a derivation"
- Messages sorted by: [ date ] [ thread ]
Date: Wed, 5 Jan 2005 22:50:31 -0500
Timo:
I haven't gone through your derivation and so can't comment.
Just want to tell you that W. Pauli presents the general form of the
transformation equations as a footnote (pages 10 and 11) in his book "Theory
of Relativity", Pergamon Press, 1958.
Eli Botkin
"Timo Nieminen" <timo@physics.uq.edu.au> wrote in message
news:Pine.LNX.4.50.0501061128280.13596-100000@localhost...
> Given the rather long threads on derivations of the Lorentz
> transformations that seem to be making slow progress, I thought this
> might be a worthwhile contribution. Feel free to copy and inflict on
> correspondents in such threads!
>
> This is a first draft only, and so could have a nice assortment of errors.
>
> 0. Introduction
>
> The aim is to provide a simple and general derivation of the
> homogeneous Lorentz transformations, without assuming that axes are
> parallel, or that motion is along the x-axis. (Perhaps a revision
> to avoid the use of Cartesian coordinates might be useful?)
>
> Keeping in mind that the Lorentz transformations relate coordinates in
> two inertial reference frames, we will restrict our attention to
> such reference frames. At first, we will simply assume that all the
> reference frames are in uniform relative motion (ie unaccelerating
> and no rotational motion), and later, when some physics is introduced,
> we will introduce the inertiality.
>
> Permission is given to use the content of this post, including
> publishing online, re-posting, etc.
>
> Comments and corrections welcome.
>
>
> 0.1 Notation
>
> r denotes a position vector; a number can be appended to distinguish
> between two different position vectors, eg r1, r2. The components of
> the position vector will, in general, differ between reference frames.
>
> t denotes a time as measured in a given reference frame.
>
> d_ij is the Kronecker delta.
>
> Coordinates are specified by x, y, z or x1, y1, z1 etc when necessary.
>
> The reference frame in which position vectors and times are measured
> will be specified when necessary by a "subscript" letter, eg r_a, r1_a,
> t_a, or (t,r)_a. Coordinates are x_a, y_a, z_a.
>
> The scalar product of two vectors a and b is denoted by a.b
>
> The product of two scalars, or of two matrices, is denoted by a b
>
> The transpose of a matrix a is written as aT
>
> Vectors are written as matrices with a single column when
> used in matrix expressions; ie a is a column vector,
> aT is a row vector.
>
> Where a matrix is written in terms of its elements, the notation
> [ a b c; d e f; g h i ] will be used to avoid problems with
> non-fixed-width fonts. Here, a b c are the elements of the first row,
> d e f the elements of the 2nd row etc.
>
> Periods are left off ends of sentences where they could cause
> confusion with mathematical notation (see above).
>
>
> 1. Rotations in 3D space
>
> Consider a 3D Euclidean space with a Cartesian coordinate system such
> that the distance between two points r1 and r2 is
>
> ds = sqrt( (r1 - r2).(r1 - r2) )
>
> Note that the scalar product is, in terms of coordinates,
>
> r1.r2 = g_11 x1 x2 + g_22 y1 y2 + g_33 z1 z2
>
> where g_11, g_22, g_33 are the diagonal elements of the metric tensor g.
> For a Cartesian coordinate system, we have g_ij = d_ij
>
> Note that we can write this as a matrix product:
>
> r1.r2 = r1T g r2
>
> which, in a Cartesian coordinate system, is r1.r2 = r1T r2
>
> If we consider two Cartesian coordinate systems with coincident origins,
> we can ask what linear transformations of coordinates result in
> distances being invariant.
>
> Such a transformation must be of the form:
>
> x_b = a_11 x_a + a_12 y_a + a_13 z_a + c_1
> y_b = a_21 x_a + a_22 y_a + a_23 z_a + c_2
> z_b = a_31 x_a + a_32 y_a + a_33 z_a + c_3
>
> or, more compactly, we can write this as a matrix equation
>
> r_b = A r_a + C
>
> Since we have specified that the origins are coincident, we have
> C = (0,0,0); the transformation must be homogeneous.
>
> If we have r_a = r1_a - r2_a, the distance between the points specified
> by position vectors r1_a and r2_a must be the same in both coordinate
> systems. Therefore
>
> ds^2 = ds_a^2 = ds_b^2
> = r_b.r_b
> = (A r_a).(A r_a)
> = (A r_a)T (A r_a)
> = rT_a AT A r_a
>
> which, since this must also equal r_a.r_a, means that
>
> AT A = I
>
> ie the matrices are orthogonal, and
>
> inv(A) = AT
>
> Therefore, the square of the determinant of A is
>
> |A|^2 = 1
>
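A quick numerical check of the argument above, as a minimal sketch in plain Python (lists as matrices). The helper names (rot_z, apply, dot, gram) and the test values are my own, not from the post. It takes one orthogonal matrix and confirms that AT A = I and that the squared distance is the same in both coordinate systems.

```python
import math

def rot_z(angle):
    """Orthogonal matrix mixing x and y by the given angle; z is untouched."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(A, r):
    """Matrix-vector product A r."""
    return [sum(A[i][j] * r[j] for j in range(len(r))) for i in range(len(A))]

def dot(u, v):
    """Cartesian scalar product (metric g = identity)."""
    return sum(p * q for p, q in zip(u, v))

def gram(A):
    """Return the product AT A."""
    n = len(A)
    return [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = rot_z(0.7)                                   # arbitrary test angle
r1_a, r2_a = [1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]  # arbitrary test points
r_a = [p - q for p, q in zip(r1_a, r2_a)]        # separation in system a
r_b = apply(A, r_a)                              # same separation in system b

ds2_a = dot(r_a, r_a)
ds2_b = dot(r_b, r_b)
ATA = gram(A)
print(ds2_a, ds2_b)
```

The two printed values should agree to rounding error, and ATA should be the identity matrix to the same accuracy.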
> We can further note that 3x3 matrices with AT A = I form a group under
> matrix multiplication, termed O(3) - the three-dimensional orthogonal
> group.
>
> We can identify two distinct classes of transformations in O(3):
> |A| = +1, which are pure rotations, and |A| = -1, which are rotations
> combined with a reflection.
>
> That these transformations form a group means that:
> 1. The result of one rotation/reflection followed by another
> rotation/reflection can be obtained by a single rotation/reflection.
> 2. If we replace pairs of rotation/reflection transformations by
> equivalent single transformations, the order in which we do so does
> not matter. (Note that this is associativity, not commutativity!)
> 3. There is a rotation/reflection which leaves the coordinates unchanged.
> 4. For any rotation/reflection, there is an inverse transform that
> restores things to the original state.
>
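The four group properties listed above can be spot-checked numerically. This is only a sketch for a few sample matrices, not a proof. The helper names and the chosen angles are my own assumptions.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def max_diff(A, B):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def is_orthogonal(A):
    return max_diff(matmul(transpose(A), A), I3) < 1e-12

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

mirror_z = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # a reflection

P, Q = rot_z(0.3), rot_x(1.1)
PQ = matmul(P, Q)

closure = is_orthogonal(PQ)                                    # property 1
assoc = max_diff(matmul(PQ, mirror_z),
                 matmul(P, matmul(Q, mirror_z)))               # property 2
identity_err = max_diff(matmul(I3, P), P)                      # property 3
inverse_err = max_diff(matmul(transpose(P), P), I3)            # property 4
commutator = max_diff(PQ, matmul(Q, P))    # non-zero: order of rotations matters
det_proper = det3(PQ)                      # product of two proper rotations
det_improper = det3(matmul(PQ, mirror_z))  # includes one reflection
```

The non-zero value of commutator illustrates the remark in item 2: the group is associative but not commutative.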
> If we exclude reflections (ie we restrict ourselves to pure rotations
> with |A| = +1, which we will call proper rotations), these conditions
> are still satisfied, so proper rotations also form a group, denoted
> SO(3). Since all proper (ie reflection-free) rotations must form a
> continuous group containing the identity transformation, this provides
> a general way of identifying the subgroup we are interested in - it
> must contain I. Euler's theorem states that all 3D orthogonal
> transformations with |A| = +1 are rotations.
>
> 1.1 Rotations in n-dimensional space
>
> We will make a diversion into n-dimensional rotations, to see how we can
> parameterise rotations, and actually write down the elements of a
> rotation matrix.
>
> Note that the considerations in the above section apply equally to
> dimensions other than 3 - SO(1), SO(2), SO(4) etc are the groups of
> proper rotations in 1, 2, and 4 dimensions.
>
>
> 1.1.1 1D
>
> Since in 1D, we have |A| = A_11, the only 1D rotation matrix is [1].
>
>
> 1.1.2 2D
>
> The transformation A has 4 matrix elements, but the orthogonality
> relations provide 3 equations relating these, so only one free
> parameter is required to describe a rotation. Therefore, we can give
> a single element of SO(2), and generate all other elements by raising
> it to a power. That is, given G, an element of SO(2), G^a is also an
> element. We can proceed by choosing an "infinitesimal generator" S such that
>
> G = exp(-S)
>
> Thus, we have
>
> G^a = exp( - a S )
>
> Noting that |G| = exp(-Tr(S)), the requirement that |G| = 1 means that
> Tr(S) = 0. Since inv(G) = exp(S), and inv(G) = GT, we must have
> ST = -S, so S is antisymmetric. Since this requires all diagonal
> elements to be zero, Tr(S) = 0 is then satisfied automatically.
>
> The matrix
>
> S = [ 0 -1; 1 0 ]
>
> is a suitable infinitesimal generator, since any 2x2 antisymmetric
> matrix can be written as the product a S
>
> S has an interesting property:
>
> S^2 = [ -1 0; 0 -1], S^3 = [ 0 1; -1 0 ] = -S, S^4 = -S^2 = I
>
> Therefore, if we write the series expansion for exp(-aS), all of the
> higher powers of S can be reduced to S and S^2. Using this, we find
>
> exp(-aS) = - sin(a) S - cos(a) S^2
>
> Since S^2 = -I, we can write any 2D rotation matrix as
>
> R = [ cos(a) sin(a); -sin(a) cos(a) ]
>
> in which we can immediately recognise our (originally abstract)
> parameter a as the angle of rotation.
>
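The series argument for the 2D case can be checked by summing the power series for exp(-aS) directly and comparing with the closed form. A minimal sketch; expm here is a hand-rolled truncated series (my own helper, not a library call), which is adequate for small a.

```python
import math

S = [[0.0, -1.0], [1.0, 0.0]]          # the 2D infinitesimal generator

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(M, terms=40):
    """Truncated power series I + M + M^2/2! + ... (fine for small matrices)."""
    n = len(M)
    total, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def rotation_from_generator(a):
    """R = exp(-a S), summed term by term."""
    return expm([[-a * x for x in row] for row in S])

def rotation_closed_form(a):
    """R = cos(a) I - sin(a) S, written out element by element."""
    return [[math.cos(a), math.sin(a)], [-math.sin(a), math.cos(a)]]

a = 0.6                                 # arbitrary test angle
R_series = rotation_from_generator(a)
R_closed = rotation_closed_form(a)
err = max(abs(R_series[i][j] - R_closed[i][j])
          for i in range(2) for j in range(2))
S_squared = matmul(S, S)                # should be -I
```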
>
> 1.1.3 3+D
>
> The same considerations apply. We need only write a set of infinitesimal
> generators which are a basis set in terms of which any antisymmetric
> matrix can be written. A suitable basis is:
>
> S_1 = [ 0 -1 0; 1 0 0; 0 0 0 ]
> S_2 = [ 0 0 1; 0 0 0; -1 0 0 ]
> S_3 = [ 0 0 0; 0 0 -1; 0 1 0 ]
>
> and we can write any antisymmetric matrix as
>
> S = a_1 S_1 + a_2 S_2 + a_3 S_3
>
> We can proceed as for 2D (with somewhat more difficulty!) and write
> down the 3D rotation matrix in terms of the 3 parameters a_i (left
> as an exercise for the reader!)
>
> The astute reader might note that the top left 2x2 block of S_1 is
> exactly the same as our 2D S, and must behave in the same way, so
> S_1^3 = -S_1, S_1^4 = -S_1^2 etc. The same also applies for S_2 and
> S_3. In the simple case where two of the three parameters a_i are
> zero, we obtain transformations which we can easily recognise as
> rotations about the z, y, and x axes (for a_1, a_2, and a_3
> respectively), with the non-zero parameter being the angle of rotation.
>
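For the 3D case, the "exercise for the reader" can at least be done numerically. The sketch below (helper names and test values are mine) builds S = a_1 S_1 + a_2 S_2 + a_3 S_3, sums the series for exp(-S), and checks that the result is orthogonal and that a_1 alone reproduces the 2D rotation in the x-y block.

```python
import math

S_1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
S_2 = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
S_3 = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(M, terms=40):
    """Truncated power series for the matrix exponential."""
    n = len(M)
    total, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def rotation(a):
    """R = exp(-(a_1 S_1 + a_2 S_2 + a_3 S_3)) for a = (a_1, a_2, a_3)."""
    S = [[a[0] * S_1[i][j] + a[1] * S_2[i][j] + a[2] * S_3[i][j]
          for j in range(3)] for i in range(3)]
    return expm([[-x for x in row] for row in S])

# Only a_1 non-zero: the 2D rotation appears in the x-y block, z is untouched.
a1 = 0.5
R_z = rotation((a1, 0.0, 0.0))
R_z_expected = [[math.cos(a1), math.sin(a1), 0.0],
                [-math.sin(a1), math.cos(a1), 0.0],
                [0.0, 0.0, 1.0]]
err_z = max_diff(R_z, R_z_expected)

# General parameters: the result should still satisfy RT R = I.
R = rotation((0.2, -0.4, 0.9))
RT = [list(col) for col in zip(*R)]
orth_err = max_diff(matmul(RT, R), identity(3))

# Each generator satisfies S^3 = -S, as noted in the text.
cube_ok = all(matmul(matmul(G, G), G) == [[-x for x in row] for row in G]
              for G in (S_1, S_2, S_3))
```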
> The extension to dimensions higher than 3 is elementary, although
> writing down the elements of R explicitly in terms of a_i becomes
> progressively more painful.
>
>
> 2. The Lorentz transformations
>
> The mathematics of rotations gives us a simple mechanism to derive
> the Lorentz transformations.
>
> Consider a 4D coordinate system with metric tensor
>
> g_00 = -1, g_11 = 1, g_22 = 1, g_33 = 1
>
> A length interval is then
>
> ds = sqrt( rT g r )
>
> Homogeneous linear transformations which leave this invariant must
> satisfy AT g A = g, and since |g| is non-zero, we must have |A|^2 = 1
> Restricting ourselves to proper rotations, we have |A| = 1
>
> Since we have a metric tensor not equal to I, we must explicitly
> include it when writing down our generator and infinitesimal generators.
> We now require (g S) to be antisymmetric (we actually required this
> for rotations in Cartesian systems, but since (g S) = (I S) = S, we
> didn't write it down).
>
> Thus, a suitable basis set for the infinitesimal generators is:
>
> S_1 = [ 0 1 0 0; 1 0 0 0; 0 0 0 0; 0 0 0 0 ]
> S_2 = [ 0 0 1 0; 0 0 0 0; 1 0 0 0; 0 0 0 0 ]
> S_3 = [ 0 0 0 1; 0 0 0 0; 0 0 0 0; 1 0 0 0 ]
> S_4 = [ 0 0 0 0; 0 0 -1 0; 0 1 0 0; 0 0 0 0 ]
> S_5 = [ 0 0 0 0; 0 0 0 1; 0 0 0 0; 0 -1 0 0 ]
> S_6 = [ 0 0 0 0; 0 0 0 0; 0 0 0 -1; 0 0 1 0 ]
>
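The requirement that (g S) be antisymmetric is easy to verify for the six basis generators listed above. A small sketch with integer matrices, so the comparison is exact; the dictionary layout and helper names are my own.

```python
g = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

generators = {
    "S_1": [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "S_2": [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]],
    "S_3": [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]],
    "S_4": [[0, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    "S_5": [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, -1, 0, 0]],
    "S_6": [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],
}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_antisymmetric(M):
    n = len(M)
    return all(M[i][j] == -M[j][i] for i in range(n) for j in range(n))

# (g S) must be antisymmetric for every generator.
checks = {name: is_antisymmetric(matmul(g, S)) for name, S in generators.items()}

# S_1, S_2, S_3 are themselves symmetric; only g S is antisymmetric for them.
s1_alone_antisymmetric = is_antisymmetric(generators["S_1"])

for name, ok in checks.items():
    print(name, ok)
```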
> Clearly, if we have a_1 = a_2 = a_3 = 0, our transformations are 3D
> rotations of the last 3 coordinates, leaving the first coordinate
> unchanged.
>
> Since we now have S_1^3 = S_1 and S_1^4 = S_1^2, if we have only a_1
> non-zero, we obtain
>
> R = [ cosh(a_1) -sinh(a_1) 0 0; -sinh(a_1) cosh(a_1) 0 0; 0 0 1 0; 0 0 0 1 ]
>
> and similarly for having only a_2 or a_3 non-zero.
>
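As a check on the case with only a_1 non-zero, one can sum the series for exp(-a_1 S_1) and compare with the cosh/sinh form. Since S_1 only mixes the first two coordinates, the last two rows of the result are those of the identity. The sketch below (my own helper names, hand-rolled series) also confirms RT g R = g.

```python
import math

g = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
S_1 = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(M, terms=40):
    """Truncated power series for the matrix exponential."""
    n = len(M)
    total, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def max_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def boost_series(a1):
    """R = exp(-a_1 S_1), summed term by term."""
    return expm([[-a1 * x for x in row] for row in S_1])

def boost_closed(a1):
    """Closed form; the y and z coordinates are left unchanged."""
    ch, sh = math.cosh(a1), math.sinh(a1)
    return [[ch, -sh, 0.0, 0.0], [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

a1 = 0.8                                   # arbitrary test parameter
R = boost_series(a1)
series_err = max_diff(R, boost_closed(a1))

RT = [list(col) for col in zip(*R)]
metric_err = max_diff(matmul(RT, matmul(g, R)), g)   # RT g R = g
```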
> We now have the Lorentz transformations and a general recipe for
> writing any Lorentz transformation in terms of 6 parameters, of
> which 3 specify a 3D rotation of the last 3 coordinates. Now it
> is time to introduce some physics.
>
>
> 3. Lorentz transformations in physics
>
> To make use of the above mathemachinery, we note that we can specify
> an event - a combination of a position vector and a time - as a 4D
> vector (at,r) = (at,x,y,z) where a is a scale factor so that at and
> x (and y and z) have the same units. Since x has units of length, and
> t has units of time, the scale factor a has units of velocity.
>
> We adopt the postulate that the laws of physics are the same in all
> inertial reference frames (the Principle of Relativity).
> This requires us to specify what is meant by
> an inertial reference frame: a reference frame in which an object acted
> on by zero force is either stationary or moves in a straight line at
> constant speed. This means that dr/dt is independent of time in all
> inertial reference frames, where r(t) is the position of the force-free object.
>
> If the object is inertial in any single reference frame, it will be
> inertial in any reference frame related to the first by a linear
> transformation. Therefore, the Lorentz transformations relate
> inertial reference frames.
>
> We adopt a further postulate: that the Maxwell equations correctly
> describe the propagation of electromagnetic waves in free space in
> all inertial reference frames. Directly from this, we see that the
> speed of light in free space, c, must be the same in all inertial
> reference frames.
>
> Therefore, c is a good choice of scale factor, since it must be the
> same in all inertial reference frames, so we write our 4-coordinates
> as (ct,r). It is worth noting that if we postulate instead that
> either (a) we can use the same scale factor in all inertial reference
> frames or (b) that there is a speed that is the same in all inertial
> reference frames, we reach the same point, but without having identified
> our scale factor as the speed of light in free space. In that way,
> we could obtain a result that would be undisturbed by falsification of
> the Maxwell equations (eg by measurement of a non-zero photon mass).
> However, we will be content to use the historical postulate.
>
> If we consider two events: the launching of a pulse of light, with
> 4-coordinates (ct1,r1), and its reception (ct2,r2), if the speed of
> light is to be the same in all inertial reference frames, we must
> have sqrt((r2 - r1).(r2 - r1))/(t2 - t1) = c in all frames. Therefore,
>
> sqrt((r2-r1).(r2-r1)) = ct2 - ct1
> (r2-r1).(r2-r1) = (ct2 - ct1)^2
> -(ct2 - ct1)^2 + (r2-r1).(r2-r1) = 0
>
> If we write (ct,r) = (ct2,r2) - (ct1,r1), the left hand side of the
> above expression is
> (ct,r).(ct,r) = (ct,r)T g (ct,r)
>
> Therefore, a linear transformation under which the scalar product
> defined by the metric g_00 = -1, g_11 = g_22 = g_33 = 1 is
> invariant results in the speed of light being the same in all
> inertial reference frames.
>
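To illustrate the light-pulse argument, here is a sketch that takes an emission/reception pair separated by a light-like interval in one frame, applies the a_1-only transformation from section 2, and checks that the interval is still zero and that |r|/(ct) is still 1 in the new frame. The event coordinates and helper names are made up for the example.

```python
import math

def boost_x(a1):
    """The a_1-only Lorentz transformation from section 2."""
    ch, sh = math.cosh(a1), math.sinh(a1)
    return [[ch, -sh, 0.0, 0.0], [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def apply(L, v):
    return [sum(L[i][j] * v[j] for j in range(4)) for i in range(4)]

def interval(d):
    """(ct,r)T g (ct,r) with g_00 = -1, g_11 = g_22 = g_33 = 1."""
    return -d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + d[3] ** 2

def speed_over_c(d):
    """|r| / (ct) for a 4-separation d = (ct, x, y, z)."""
    return math.sqrt(d[1] ** 2 + d[2] ** 2 + d[3] ** 2) / d[0]

# Emission and reception of a light pulse, as 4-coordinates (ct, x, y, z).
emit_a = [0.0, 0.0, 0.0, 0.0]
recv_a = [5.0, 3.0, 0.0, 4.0]            # |r| = 5 = ct, so the pulse moves at c
d_a = [q - p for p, q in zip(emit_a, recv_a)]

L = boost_x(0.8)                         # arbitrary test parameter
d_b = apply(L, d_a)                      # same separation in the other frame

interval_a, interval_b = interval(d_a), interval(d_b)
ratio_a, ratio_b = speed_over_c(d_a), speed_over_c(d_b)
print(interval_a, interval_b, ratio_a, ratio_b)
```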
> The Lorentz transformations obtained in section 2 are the
> transformations which meet these requirements, and therefore must
> be the correct transformations relating coordinates (ct,r) in
> different reference frames, if the Principle of Relativity is valid,
> and the Maxwell equations are correct.
>
> The parameters (a_4,a_5,a_6) are those required to specify a spatial
> rotation. What are the other three parameters (a_1,a_2,a_3)?
> Since the space origins (r = 0) of different reference frames only
> need to coincide at t = 0, clearly the reference frames can be
> in relative motion.
>
> As measured in frame a, the origin of frame b moves at a constant
> velocity B = dr_a/d(ct_a). Since B is constant, and the 4-origins are
> coincident, B = r_a/(ct_a), where (ct_a,r_a) = Lba (ct_b,0,0,0) and
> Lba is the Lorentz transformation from frame b to frame a.
>
> For the Lorentz transformation resulting from only a_1 being
> non-zero, the velocity would be (-tanh(a_1),0,0); similarly, it is
> (0,-tanh(a_2),0) or (0,0,-tanh(a_3)) when a_2 or a_3 is
> the only non-zero parameter. Therefore we must have
>
> (a_1,a_2,a_3) = B atanh(|B|) / |B|
>
> for the transformation from a to b (the transformation above was
> from b to a) and we are done!
>
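The final formula can be tested for a velocity that is not along a coordinate axis. The sketch below assumes the transformation from a to b is exp(-(a_1 S_1 + a_2 S_2 + a_3 S_3)) with (a_1,a_2,a_3) = B atanh(|B|) / |B|, which is my reading of sections 2 and 3. It checks that events on the worldline r_a = B ct_a map to r_b = 0, ie the origin of frame b is at rest in frame b. Helper names and the test velocity are my own.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(M, terms=40):
    """Truncated power series for the matrix exponential."""
    n = len(M)
    total, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def apply(L, v):
    return [sum(L[i][j] * v[j] for j in range(4)) for i in range(4)]

def params_from_velocity(B):
    """(a_1, a_2, a_3) = B atanh(|B|) / |B|, with B in units of c."""
    speed = math.sqrt(sum(b * b for b in B))
    if speed == 0.0:
        return (0.0, 0.0, 0.0)
    scale = math.atanh(speed) / speed
    return tuple(scale * b for b in B)

def lorentz_a_to_b(B):
    """exp(-(a_1 S_1 + a_2 S_2 + a_3 S_3)) for the boost generators S_1..S_3."""
    a1, a2, a3 = params_from_velocity(B)
    S = [[0.0, a1, a2, a3], [a1, 0.0, 0.0, 0.0],
         [a2, 0.0, 0.0, 0.0], [a3, 0.0, 0.0, 0.0]]
    return expm([[-x for x in row] for row in S])

B = (0.3, -0.4, 0.5)                         # arbitrary test velocity, |B| < 1
L_ab = lorentz_a_to_b(B)

ct_a = 2.0
event_a = [ct_a] + [ct_a * b for b in B]     # a point on the path of b's origin
event_b = apply(L_ab, event_a)

gamma = 1.0 / math.sqrt(1.0 - sum(b * b for b in B))
spatial_residual = max(abs(x) for x in event_b[1:])   # should vanish
time_dilation_err = abs(event_b[0] - ct_a / gamma)    # ct_b = ct_a / gamma
```

The second check is the usual time dilation factor, which falls out of the same matrix without further assumptions.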
>
> --
> Timo Nieminen - Home page: http://www.physics.uq.edu.au/people/nieminen/
> Shrine to Spirits: http://www.users.bigpond.com/timo_nieminen/spirits.html