A pre-metric generalization of the Lorentz transformation
D. H. Delphenich, Spring Valley, OH 45379
Abstract:
The concept of an observer and their associated rest space is defined in a pre-metric (i.e., projective-geometric) context that relates to time+space decompositions of the tangent bundle to space-time. The transformation from one observer to another when the two are in a state of relative motion is then defined, and its relationship to the Lorentz transformation is discussed. The group of all linear transformations that preserve the observer quadric, which generalizes the proper-time hyperboloid in Minkowski space, is defined and the reductions to some of its subgroups are described, as well as its extension to the group that preserves the fundamental quadric, which generalizes the light cone.
1. Introduction. – Traditionally, special relativity [ ] essentially begins with Minkowski space and the Lorentz group of transformations that preserve its scalar product. The concept of an observer is then introduced in the form of a Lorentzian frame – i.e., a linear frame that is orthonormal for the Minkowski scalar product. More precisely, one starts with a natural frame for a coordinate system on Minkowski space that is orthonormal for its scalar product. Concepts such as the rest frame and rest space for an observer are then typically introduced in a somewhat casual way that cries out for a more rigorous mathematical definition.
It will be shown in the following that one can start with a definition of an observer and their associated rest space in space-time that does not immediately require the introduction of the Minkowski scalar product, but still allows one to define the time-line of the observer, as well as their rest space. The concept of a rest frame then emerges naturally, although it would be incorrect to say that it exists uniquely. One can then proceed to investigate the transformation between two observers that exist in a state of relative motion. Furthermore, the consistency of the methodology with the better-established methodology of special relativity becomes quite natural.
The key to understanding what is going on is to understand what distinguishes projective geometry from metric geometry. In Felix Klein’s celebrated Erlanger Programm [ ], geometries were defined by some fundamental construction or relationship and the group of transformations that preserved that construction or relationship. In the case of metric geometry, the fundamental construction is the metric on the set of points that one calls “space,” and the transformations then become isometries of that metric.
By contrast, the fundamental relationship in projective geometry is that of “incidence.” That is, whether a given geometric object, such as a point, line, plane, etc., is a subset of another geometric object or vice versa. It is important to see that this is distinct from mere intersection, since two lines can intersect at a point without necessarily being incident. The transformations that preserve incidence are then projective transformations. Since one has essentially “symmetrized” the definition of incidence to mean that either one set of points is a subset of the other or the other way around, projective transformations will include duality transformations, which reverse the sense of set inclusion.
In Klein’s picture of geometries, one can then classify geometries by their groups of transformations. Indeed, in that scheme, the group of projective transformations of a projective space then becomes the ultimate geometric group, while the groups of transformations for all of the other geometries become subgroups of it. For instance, one could go from projective geometry to affine, elliptic, or hyperbolic geometry by introducing an appropriate “absolute quadric” on the points at infinity. That probably explains the somewhat-apocryphal observation of Sir Arthur Cayley that “all geometry is projective geometry.” Indeed, Klein also discussed the projective-geometric setting for the Lorentz group [ ].
Some of the key concepts that will be emphasized in this study will be time+space decompositions of a vector space $V$ and frames that are adapted to them. The main extension from Minkowski space will be from the proper-time hyperboloids and the light cone to some corresponding seven-dimensional quadric hypersurfaces in the Cartesian product of $V$ with its dual space.
The general plan of what follows is to begin in section 2 by reviewing some basic aspects of Minkowski space and the Lorentz group, but in a slightly more “basis-free” sort of way. Section 3 will then define the concept of an “observer” in terms of time+space splittings of a vector space, which will also allow one to define the time-line and rest space of an observer, as well as the concept of rest frames. Section 4 will then get into the core material that is concerned with transformations of observers in the more general sense and the conditions under which they can reduce to Lorentz boosts. In the final section, the group of transformations that preserve the fundamental quadric that is defined by all observers will be discussed.
Although the scope of the following discussion is mostly restricted to four-dimensional vector spaces as a generalization of Minkowski space, its application to more general space-time manifolds is immediate. One simply assumes that the constructions are being made in the tangent and cotangent spaces to the more general manifold, in the same way that Minkowski space gives way to Lorentzian manifolds.
2. Minkowski space. – For the sake of completeness, we shall review some of the basic definitions, notations, and conventions that apply to the established geometry of Minkowski space in a way that will lead naturally into the more general constructions. For a review of the approach that will be taken to linear algebra, one can confer, e.g., Hoffman and Kunze [ ].
a. Basic definitions. – For us, Minkowski space is a pair $(V, \eta)$ that consists of a four-dimensional real vector space $V$ and a symmetric, non-degenerate, bilinear form $\eta$ that takes any pair of vectors $(\mathbf{v}, \mathbf{w})$ in $V$ to a real number $\eta(\mathbf{v}, \mathbf{w})$, so one will also have:
(2.1) Symmetry: $\eta(\mathbf{v}, \mathbf{w}) = \eta(\mathbf{w}, \mathbf{v})$,
(2.2) Non-degeneracy: $\eta(\mathbf{v}, \mathbf{w}) = 0$ for all $\mathbf{w}$ iff $\mathbf{v} = 0$,
(2.3) Bilinearity: $\eta(a\mathbf{u} + b\mathbf{v}, \mathbf{w}) = a\,\eta(\mathbf{u}, \mathbf{w}) + b\,\eta(\mathbf{v}, \mathbf{w})$.
(Since we have already assumed symmetry, it becomes redundant to specify the linearity in the second position – i.e., $\mathbf{w}$.) So far, we have defined only a scalar product on $V$, and sometimes we shall use the alternate notation: (2.4) $\langle \mathbf{v}, \mathbf{w} \rangle \equiv \eta(\mathbf{v}, \mathbf{w})$. That makes $V$ an orthogonal space, by definition.
A linear frame on $V$ is a set $\{\mathbf{e}_\mu, \mu = 0, 1, 2, 3\}$ of four linearly-independent vectors in $V$. Hence, any vector $\mathbf{v} \in V$ can be expressed uniquely as a linear combination of those vectors: (2.5) $\mathbf{v} = v^\mu \mathbf{e}_\mu$. The real numbers $v^\mu$ are called the components of $\mathbf{v}$ relative to that choice of frame. Any choice of linear frame on $V$ will define a linear isomorphism of $\mathbb{R}^4$ with $V$ that takes the canonical frame vectors $\delta_\mu$ = {(1, 0, 0, 0), …, (0, 0, 0, 1)} for $\mathbb{R}^4$ to the frame $\mathbf{e}_\mu$ for $V$. Hence, any vector $(v^0, v^1, v^2, v^3)$ in $\mathbb{R}^4$ will go to the vector $\mathbf{v} = v^\mu \mathbf{e}_\mu$ in $V$. Of course, a different choice of frame will take the $v^\mu$ to a different vector, so the linear isomorphism is by no means unique.
A linear frame $\mathbf{e}_\mu$ is called orthonormal or Lorentzian if one has: (2.6) $\langle \mathbf{e}_\mu, \mathbf{e}_\nu \rangle = \eta(\mathbf{e}_\mu, \mathbf{e}_\nu) = \eta_{\mu\nu} = \mathrm{diag}[+1, -1, -1, -1]$. It is the choice of the matrix in the final right-hand side of (2.6) that specifies Minkowski space, and one calls that matrix the signature type of the scalar product. In particular, one says that $\eta$ has a normal hyperbolic signature type. (One should be advised that the opposite sign convention is also used, as well as the “imaginary time” convention that makes the matrix take the Euclidian form of an identity matrix.)
Any linear frame $\{\mathbf{f}_\mu, \mu = 0, 1, 2, 3\}$ can be deformed into an orthonormal frame $\mathbf{e}_\mu$ by an invertible linear transformation due to the fact that each vector $\mathbf{f}_\mu$ can be expressed in terms of its components with respect to the frame $\mathbf{e}_\nu$: (2.7) $\mathbf{f}_\mu = f^\nu_\mu \mathbf{e}_\nu$. However, the customary Gram-Schmidt algorithm for starting with $\mathbf{f}_\mu$ and defining a unique $\mathbf{e}_\mu$ breaks down whenever one encounters light-like vectors, which cannot be normalized.
From bilinearity, one can express the scalar product of two vectors $\mathbf{v}$ and $\mathbf{w}$ in component form when they are expressed in an orthonormal frame: (2.8) $\langle \mathbf{v}, \mathbf{w} \rangle = \eta_{\mu\nu} v^\mu w^\nu$. For any frame that is not orthonormal, the component matrix for $\eta$ will not generally take the diagonal form in (2.6), so we will denote it by $g_{\mu\nu}$, and relative to that frame, we will have: (2.9) $\langle \mathbf{v}, \mathbf{w} \rangle = g_{\mu\nu} v^\mu w^\nu$.
The dual space to $V$ is the four-dimensional real vector space $V^*$ whose elements are linear functionals on $V$; that is, if $\alpha \in V^*$ and $a\mathbf{v} + b\mathbf{w}$ is any linear combination of vectors in $V$ then $\alpha(a\mathbf{v} + b\mathbf{w})$ will be a real number, and one must have: (2.10) $\alpha(a\mathbf{v} + b\mathbf{w}) = a\,\alpha(\mathbf{v}) + b\,\alpha(\mathbf{w})$. A vector in $V^*$ will also be referred to as a covector. A coframe on $V$ is then a set $\{\theta^\mu, \mu = 0, 1, 2, 3\}$ of four linearly-independent covectors in $V^*$, so any covector $\alpha \in V^*$ can be expressed uniquely as a linear combination of the coframe members: (2.11) $\alpha = \alpha_\mu \theta^\mu$. The real numbers $\alpha_\mu$ are then the components of $\alpha$ relative to that choice of coframe.
A choice of coframe on $V$ will define a linear isomorphism of $V^*$ with $\mathbb{R}^{4*}$ that takes any covector $\alpha$ to the row vector $(\alpha_0, \alpha_1, \alpha_2, \alpha_3)$ of its components. Similarly, it will define an isomorphism of $V$ with $\mathbb{R}^4$ that takes any vector $\mathbf{v} \in V$ to $\theta^\mu(\mathbf{v}) = v^\mu$ in $\mathbb{R}^4$.
Any linear frame $\mathbf{e}_\mu$ on $V$ gives rise to a unique coframe $\theta^\mu$ on $V$ that is called its reciprocal coframe and is defined by the rule: (2.12) $\theta^\mu(\mathbf{e}_\nu) = \delta^\mu_\nu$, in which $\delta^\mu_\nu$ is the usual Kronecker delta symbol, which equals 1 when $\mu = \nu$ and 0 otherwise; i.e., as a matrix, it is the identity matrix for $\mathbb{R}^4$. (2.12) also says that the linear isomorphism of $V$ with $\mathbb{R}^4$ that is defined by the coframe $\theta^\mu$ is the inverse of the linear isomorphism of $\mathbb{R}^4$ with $V$ that is defined by $\mathbf{e}_\mu$. (The fact that the reciprocal coframe is unique comes from the fact that any linear map between vector spaces is defined uniquely by its values on a chosen frame in the source vector space.)
Hence, a choice of $\mathbf{e}_\mu$ will also define a linear isomorphism of $V$ with $V^*$ that takes the frame $\mathbf{e}_\mu$ to the coframe $\theta^\mu$, and any other vector $\mathbf{X}$ in $V$, whose components with respect to $\mathbf{e}_\mu$ are $X^\mu$, will go to the covector $\mathbf{X}^T = X_\mu \theta^\mu$, with $X_\mu = \delta_{\mu\nu} X^\nu$. In effect, all that one has done is to transpose the column vector whose components are $X^\mu$ to the row vector with the same components; hence, the notation $T$ for the transpose of a matrix.
However, a different choice of frame and reciprocal coframe will define a different linear isomorphism of $V$ with its dual. That is because the change of $\mathbf{e}_\mu$ to $\mathbf{f}_\mu$ will have to be accompanied by the inverse transformation of $\theta^\mu$ in order for the resulting coframe $\bar{\theta}^\mu$ to still be reciprocal to $\mathbf{f}_\mu$: (2.13) $\mathbf{f}_\mu = f^\nu_\mu \mathbf{e}_\nu$, $\bar{\theta}^\mu = \tilde{f}^\mu_\nu \theta^\nu$, in which the tilde signifies the matrix inverse. That means that if $\bar{X}^\mu$ are the components of $\mathbf{X}$ with respect to the new frame $\mathbf{f}_\mu$ then even though the components of $\mathbf{X}^T$ with respect to $\bar{\theta}^\mu$ will still be $\bar{X}_\mu = \delta_{\mu\nu} \bar{X}^\nu$, nonetheless the components of $\mathbf{X}$ and $\mathbf{X}^T$ with respect to $\mathbf{e}_\mu$ and $\theta^\mu$, resp., will be $f^\mu_\nu \bar{X}^\nu$ and $\bar{X}_\nu \tilde{f}^\nu_\mu$, resp., which are not mere transposes of each other.
Since Minkowski space also has a scalar product defined on it, one can define another linear isomorphism of $V$ with $V^*$ that is unique and independent of any choice of frame. One simply takes every vector $\mathbf{v}$ in $V$ to the linear functional $\mathbf{v}^*$ that will give: (2.14) $\mathbf{v}^*(\mathbf{w}) = \eta(\mathbf{v}, \mathbf{w})$ when it is evaluated on any vector $\mathbf{w}$. When one chooses an orthonormal frame $\mathbf{e}_\mu$ for $V$ and its reciprocal coframe $\theta^\mu$ for $V^*$, one can represent $\mathbf{v}$ in the form (2.5) and $\mathbf{v}^*$ in the form: (2.15) $\mathbf{v}^* = v_\mu \theta^\mu$, with $v_\mu = \eta_{\mu\nu} v^\nu$. Hence, the isomorphism that $\eta$ defines is just what is commonly called “lowering the index” in the conventional literature of relativity. One can also say that the matrix $\eta_{\mu\nu}$ of the scalar product is the matrix of the isomorphism.
The inverse isomorphism of $V^*$ with $V$ then takes the form of raising the index, and the matrix of that inverse is denoted by $\eta^{\mu\nu}$, which has the same components as $\eta_{\mu\nu}$ as a matrix. The matrix $\eta^{\mu\nu}$ then defines a scalar product on $V^*$ that is easiest to describe in a Lorentzian frame: (2.16) $\eta(\alpha, \beta) = \eta^{\mu\nu} \alpha_\mu \beta_\nu$.
However, when one is dealing with Minkowski space, the fact that the signature type is not definite (i.e., all of one sign) implies that the coframe $\mathbf{e}^*_\mu$ that is metric-dual to the orthonormal frame $\mathbf{e}_\mu$ will differ from its reciprocal coframe $\theta^\mu$ by three sign changes due to the fact that: (2.17) $\mathbf{e}^*_\mu(\mathbf{e}_\nu) = \langle \mathbf{e}_\mu, \mathbf{e}_\nu \rangle = \eta_{\mu\nu}$, but $\theta^\mu(\mathbf{e}_\nu) = \delta^\mu_\nu$. Thus, if $\mathbf{v}$ is a vector in $V$ then the components ($v_\mu = \delta_{\mu\nu} v^\nu$) of $\mathbf{v}^T$ relative to some frame $\mathbf{e}_\mu$ and its reciprocal coframe $\theta^\mu$ will also differ by three sign changes from those ($v_\mu = \eta_{\mu\nu} v^\nu$) of $\mathbf{v}^*$.
When $V$ has a scalar product defined on it, one can define a corresponding scalar product on $V^*$ by using the duality map $* : V^* \to V$ that comes from the scalar product on $V$: (2.18) $\langle \alpha, \beta \rangle \equiv \langle \alpha^*, \beta^* \rangle$. Hence, one can define orthonormality for a coframe $\theta^\mu$ on $V^*$ analogously: (2.19) $\langle \theta^\mu, \theta^\nu \rangle = \eta^{\mu\nu} = \mathrm{diag}[+1, -1, -1, -1]$. One notes that the component matrix $\eta^{\mu\nu}$ is the inverse to the component matrix $\eta_{\mu\nu}$.
b. Quadric hypersurfaces defined in Minkowski space. – The scalar product allows one to define a corresponding quadratic form $Q[\mathbf{v}]$ on Minkowski space by way of: (2.20) $Q[\mathbf{v}] = \langle \mathbf{v}, \mathbf{v} \rangle = \eta_{\mu\nu} v^\mu v^\nu = (v^0)^2 - (v^1)^2 - (v^2)^2 - (v^3)^2$. That defines three types of vectors in Minkowski space:
a) Time-like: $Q[\mathbf{v}] > 0$.
b) Light-like: $Q[\mathbf{v}] = 0$.
c) Space-like: $Q[\mathbf{v}] < 0$.
The set of all $\mathbf{v}$ such that $Q[\mathbf{v}]$ is the same constant then defines a quadric hypersurface in $V$. Depending upon whether that constant is positive, zero, or negative, it will be referred to as time-like, light-like, or space-like, respectively. In particular, when: (2.21) $Q[\mathbf{v}] = c^2$, where $c$ is the speed of light in vacuo, the vectors $\mathbf{v}$ are all time-like and the quadric is a hyperboloid of two sheets that one calls the proper-time hyperboloid. Its two sheets are then called the future and past sheets, although the choice of which to call future or past is arbitrary and is referred to as a time orientation.
When: (2.22) $Q[\mathbf{v}] = 0$, the quadric will be a spherical cone through the origin that one calls the light cone of Minkowski space. Relative to an orthonormal frame, it will then take the forms: (2.23) $0 = (v^0)^2 - (v^1)^2 - (v^2)^2 - (v^3)^2$ or $(v^0)^2 = (v^1)^2 + (v^2)^2 + (v^3)^2$. In the latter form, one sees that it can be regarded as a one-parameter family of concentric 2-spheres of radius $|v^0|$. The light cone also has a future and a past sheet (when one deletes the origin), but the choice of which is which is also arbitrary. However, the future sheet of the proper-time hyperboloid must be interior to the future light cone, for consistency.
b. Time+space splittings of Minkowski space. – Whenever one has a time-like vector $\mathbf{u}$ in Minkowski space, one can generate a line $[\mathbf{u}]$ through the origin by way of all its scalar multiples. We shall call that a time-line.
Because Minkowski space has a scalar product, one can then define a complementary hyperplane $\Sigma$ that is orthogonal to $[\mathbf{u}]$, which consists of all vectors $\mathbf{v} \in V$ that are orthogonal to $\mathbf{u}$; hence: (2.24) $\langle \mathbf{u}, \mathbf{v} \rangle = 0$. We shall call that orthogonal hyperplane the rest space of $\mathbf{u}$. The vectors of $\Sigma$ all become space-like with respect to the Minkowski scalar product, but that will be easier to see once we have introduced frames that are adapted to a time+space splitting, which we shall now define.
Since the line $[\mathbf{u}]$ and the hyperplane $\Sigma$ intersect only at the origin and their dimensions add up to four, they collectively define a direct-sum splitting $V = [\mathbf{u}] \oplus \Sigma$ that we shall call a time+space splitting of $V$ relative to $\mathbf{u}$. A different choice of $\mathbf{u}$ that is not collinear with the first one would then produce a different direct-sum splitting.
A linear frame $\mathbf{e}_\mu$ on $V$ is called adapted to the direct sum $[\mathbf{u}] \oplus \Sigma$ when one of its members (we shall always use $\mathbf{e}_0$) generates the line $[\mathbf{u}]$ by all of its scalar multiples, and the other three $\{\mathbf{e}_i, i = 1, 2, 3\}$ span the hyperplane $\Sigma$. When a frame that is adapted to $\mathbf{u}$ is also Lorentzian, one will also call it a rest frame for $\mathbf{u}$. There will then be as many rest frames for $\mathbf{u}$ as there are orthonormal frames on $\Sigma$, so we shall not give in to the popular temptation to refer to the rest frame of $\mathbf{u}$.
When the scalar product on Minkowski space is restricted to pairs of vectors in a spatial hyperplane $\Sigma$, it will define a scalar product on $\Sigma$. In fact, when $\mathbf{v} = \mathbf{v}_t + \mathbf{v}_s$, $\mathbf{w} = \mathbf{w}_t + \mathbf{w}_s$, since $\mathbf{v}_t$ is orthogonal to $\mathbf{w}_s$ and $\mathbf{w}_t$ is orthogonal to $\mathbf{v}_s$, one can express the scalar product of $\mathbf{v}$ and $\mathbf{w}$ as: (2.25) $\langle \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{v}_t, \mathbf{w}_t \rangle + \langle \mathbf{v}_s, \mathbf{w}_s \rangle = v^t w^t c^2 + \langle \mathbf{v}_s, \mathbf{w}_s \rangle$. When $\mathbf{v}$ and $\mathbf{w}$ are expressed relative to an adapted Lorentzian frame $\mathbf{e}_\mu$, that will make: (2.26) $\langle \mathbf{v}, \mathbf{w} \rangle = v^0 w^0 c^2 - \delta_{ij} v^i w^j$. Thus, one sees that the scalar product that is induced on $\Sigma$ is minus the Euclidian scalar product. Hence, the vectors in $\Sigma$ will all be space-like with respect to $\langle \cdot, \cdot \rangle$.
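The claim that every vector in the rest space of a time-like vector is space-like can be checked numerically. The following is a minimal sketch, assuming an orthonormal frame with signature (+, −, −, −) and units in which c = 1; the helper names `dot`, `classify`, and `combine`, and the sample vectors, are hypothetical choices made for illustration only.

```python
# Minkowski scalar product in an orthonormal frame, signature (+, -, -, -).
# Units with c = 1 are an assumption made for this sketch.
ETA = (1.0, -1.0, -1.0, -1.0)

def dot(v, w):
    """Scalar product <v, w> = eta_{mu nu} v^mu w^nu."""
    return sum(s * a * b for s, a, b in zip(ETA, v, w))

def classify(v, tol=1e-12):
    """Classify v by the sign of the quadratic form Q[v] = <v, v>."""
    q = dot(v, v)
    if q > tol:
        return "time-like"
    if q < -tol:
        return "space-like"
    return "light-like"

def combine(coeffs, vectors):
    """Linear combination of 4-vectors with the given coefficients."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(4))

# A time-like vector that is not aligned with the frame vector e_0.
u = (2.0, 1.0, 0.0, 0.0)

# Three independent vectors orthogonal to u; they span its rest space.
rest_basis = [
    (1.0, 2.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
]

# An arbitrary element of the rest space.
sample = combine((1.0, 3.0, -1.0), rest_basis)

print(classify(u))
print([classify(v) for v in rest_basis], classify(sample))
```

Choosing a `u` that is tilted relative to the frame makes the point that the rest space depends on the observer and is not simply the span of the three spatial frame vectors.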
When $V$ has been given a time+space splitting, any vector $\mathbf{X} \in V$ can be expressed uniquely as a sum $\mathbf{X}_t + \mathbf{X}_s$, where $\mathbf{X}_t = X^t \mathbf{u}$ for some unique real number $X^t$, and $\mathbf{X}_s$ is a unique vector that belongs to $\Sigma$. $\mathbf{X}_t$ is then called the temporal part of $\mathbf{X}$, while $\mathbf{X}_s$ is its spatial part, and $X^t$ is its temporal component relative to $\mathbf{u}$. One can use the scalar product to define $X^t$, and if one defines $\langle \mathbf{u}, \mathbf{u} \rangle = c^2$ then since: $\langle \mathbf{X}, \mathbf{u} \rangle = \langle \mathbf{X}_t, \mathbf{u} \rangle + \langle \mathbf{X}_s, \mathbf{u} \rangle = X^t \langle \mathbf{u}, \mathbf{u} \rangle = X^t c^2$, one will have: (2.27) $X^t = \frac{1}{c^2} \langle \mathbf{X}, \mathbf{u} \rangle$. Hence, that will make: (2.28) $\mathbf{X}_s = \mathbf{X} - \mathbf{X}_t = \mathbf{X} - \frac{1}{c^2} \langle \mathbf{X}, \mathbf{u} \rangle \mathbf{u}$.
If we replace $\langle \mathbf{X}, \mathbf{u} \rangle$ with $\mathbf{u}^*(\mathbf{X})$ then we can express $\mathbf{X}_t$ and $\mathbf{X}_s$ in the form: (2.29) $\mathbf{X}_t = \frac{1}{c^2} (\mathbf{u}^* \otimes \mathbf{u})(\mathbf{X})$, $\mathbf{X}_s = (I - \frac{1}{c^2} \mathbf{u}^* \otimes \mathbf{u})(\mathbf{X})$, and that, in turn, will allow us to define temporal and spatial projection operators: (2.30) $P_t = \frac{1}{c^2} \mathbf{u}^* \otimes \mathbf{u}$, $P_s = I - \frac{1}{c^2} \mathbf{u}^* \otimes \mathbf{u}$, which will then make: (2.31) $P_t(\mathbf{X}) = \mathbf{X}_t$, $P_s(\mathbf{X}) = \mathbf{X}_s$. $P_t$ and $P_s$ have the characteristic properties of projection operators, namely: (2.32) $P_t P_t = P_t$, $P_s P_s = P_s$, $P_t P_s = P_s P_t = 0$, $I = P_t + P_s$, in which $I$ is, of course, the identity operator on $V$. Relative to a linear frame on $V$, the matrices of the projection operators will look like: (2.33) $(P_t)^\mu_\nu = \frac{1}{c^2} u^\mu u_\nu$, $(P_s)^\mu_\nu = \delta^\mu_\nu - \frac{1}{c^2} u^\mu u_\nu$, in which $u_\mu = g_{\mu\nu} u^\nu$.
Since $\eta$ will also define a scalar product on $V^*$, the dual covector $\mathbf{u}^*$ will also define a direct-sum splitting $V^* = [\mathbf{u}^*] \oplus \Sigma^*$. Analogous projection operators on $V^*$ will be defined in that way.
When a vector space $V$ is given a direct-sum decomposition into $[\mathbf{u}] \oplus \Sigma$ that is analogous to a time+space splitting of Minkowski space, and $\Sigma$ is given a scalar product $\langle \cdot, \cdot \rangle_\Sigma$, one can extend it to a scalar product on $V$ by using the fact that any two vectors in $V$ can be represented in the form $a\mathbf{u} + \mathbf{v}$, $b\mathbf{u} + \mathbf{w}$, where $\mathbf{v}$ and $\mathbf{w}$ belong to $\Sigma$, so from the bilinearity of any scalar product and the fact that both $\mathbf{v}$ and $\mathbf{w}$ are assumed to be orthogonal to $\mathbf{u}$, one must have: (2.34) $\langle a\mathbf{u} + \mathbf{v}, b\mathbf{u} + \mathbf{w} \rangle = ab \langle \mathbf{u}, \mathbf{u} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$. Thus, when one lets $\langle \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{v}, \mathbf{w} \rangle_\Sigma$, the only thing that is left to be defined is $\langle \mathbf{u}, \mathbf{u} \rangle$, which can then be any non-zero real number. If $\langle \mathbf{v}, \mathbf{w} \rangle_\Sigma$ is the (negative) Euclidian scalar product on $\Sigma$ then choosing $\langle \mathbf{u}, \mathbf{u} \rangle = c^2$ would give the usual Minkowski scalar product on $V$.
c. Lorentz transformations. – When $V$ is Minkowski space, one can define a Lorentz transformation to be a linear map $L : V \to V$ that preserves the scalar product. Hence, for any pair of vectors $\mathbf{v}$, $\mathbf{w}$ in $V$ one must have: (2.35) $\langle L(\mathbf{v}), L(\mathbf{w}) \rangle = \langle \mathbf{v}, \mathbf{w} \rangle$. In matrix form, this is: (2.36) $v^T L^T \eta L w = v^T \eta w$, so if that is true for every choice of $\mathbf{v}$ and $\mathbf{w}$ then one must have: (2.37) $L^T \eta L = \eta$ or $\eta L^T \eta L = I$, since the matrix of $\eta$ is its own inverse. That also means that the inverse of $L$ will be: (2.38) $L^{-1} = \eta L^T \eta$. One might compare this to the Euclidian case, in which $\eta$ gets replaced with the identity matrix, so the inverse of an orthogonal transformation becomes its transpose.
The set of all Lorentz transformations, given the binary operation of composition of operators, becomes a group that is commonly called the Lorentz group. Because such transformations preserve scalar products, they will also preserve the associated quadratic form $Q[\,\cdot\,]$, and as a result, they will take time-like vectors to time-like vectors, light-like vectors to light-like ones, and space-like vectors to space-like ones; i.e., they will preserve the quadric hypersurfaces that are defined by $Q$.
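The matrix conditions (2.37) and (2.38) can be verified for a concrete case. The sketch below assumes units in which c = 1 and builds a sample Lorentz matrix as the product of a standard boost along the first spatial axis and a rotation about the third; the function names (`boost_x`, `rotation_z`, `matmul`, and so on) are hypothetical helpers written for this illustration and do not come from the text.

```python
import math

# Matrix of the Minkowski scalar product in an orthonormal frame.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(4) for j in range(4))

def boost_x(beta):
    """Boost along the first spatial axis with speed beta = v/c (c = 1 assumed)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(angle):
    """Spatial rotation about the third spatial axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# A sample Lorentz transformation: a boost followed by a rotation.
L = matmul(boost_x(0.6), rotation_z(0.3))

# Condition (2.37): L^T eta L = eta.
preserves_eta = close(matmul(matmul(transpose(L), ETA), L), ETA)

# Formula (2.38): the inverse of L is eta L^T eta.
L_inv = matmul(matmul(ETA, transpose(L)), ETA)
inverse_ok = close(matmul(L_inv, L), IDENTITY)

print(preserves_eta, inverse_ok)
```

The same check with `ETA` replaced by the identity matrix would reduce to the familiar statement that the inverse of an orthogonal matrix is its transpose.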
They will also take Lorentzian frames to other such things. When $V$ is given a time+space splitting relative to a time-like $\mathbf{u}$, the restriction of any Lorentz transformation that preserves that splitting to $\Sigma$ will be an orthogonal transformation of the Euclidian space that is defined on it. Thus, the Lorentz group includes all (proper and improper) spatial rotations.
However, it also includes the transformations that are sometimes called Lorentz transformations, in their own right, but they are also called boosts, which is the term that we shall employ. They define the essential difference between Galilean relativity and special relativity, as Einstein envisioned it, since they relate to the transformation between two Lorentzian frames that differ by being in a state of relative uniform motion with a relative velocity of $\mathbf{v}$.
The prototype of all such transformations can be defined on the Minkowski plane; i.e., the plane of $\mathbf{u}$ and $\mathbf{v}$, where $\mathbf{u}$ is time-like and $\mathbf{v}$ is spatial. For our purposes, it will be sufficient to represent it in the form: (2.39) $\bar{\mathbf{e}}_0 = \gamma (\mathbf{e}_0 - \frac{v}{c} \mathbf{e}_1)$, $\bar{\mathbf{e}}_1 = \gamma (-\frac{v}{c} \mathbf{e}_0 + \mathbf{e}_1)$, in which the Lorentzian frame $\{\mathbf{e}_0, \mathbf{e}_1\}$ is defined by the unit vectors in the directions of $\mathbf{u}$ and $\mathbf{v}$, respectively, while: (2.40) $\gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}$, ($v^2 \equiv -\langle \mathbf{v}, \mathbf{v} \rangle$) is the ubiquitous Fitzgerald-Lorentz factor, which accounts for such experimentally-observed phenomena as time dilation, length contraction, and the increase in relative mass.
In a sense, the boost in the direction of $\mathbf{v}$ that is defined by (2.39) is the “canonical” form for a boost, since any other boost in a different direction $\mathbf{v}'$ with the same value of $v$ can be brought into that form by a suitable rotation of the spatial axes. Hence, there are as many boosts as spatial rotations, which define a three-dimensional Lie group, and that makes the Lorentz group six-dimensional.
However, the boosts by themselves do not define a group, except when one considers all boosts in the same direction (i.e., all $0 \le v < c$). Once again, that is because the composition of two boosts in different directions can be factored into the product of a boost and a rotation. (That is the basis for “Thomas precession.”)
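The difference between composing collinear and non-collinear boosts can be seen in a small numerical experiment. This is a sketch under stated assumptions: units with c = 1, an orthonormal frame in which a pure boost is represented by a symmetric matrix, and hypothetical helper names (`boost_x`, `boost_y`, `is_symmetric`, `is_lorentz`) that are not part of the text.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(4) for j in range(4))

def boost_x(beta):
    """Boost along the first spatial axis, beta = v/c with c = 1 assumed."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_y(beta):
    """Boost along the second spatial axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, 0.0, -g * beta, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-g * beta, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def is_symmetric(a):
    return close(a, transpose(a))

def is_lorentz(a):
    """Check the defining condition L^T eta L = eta."""
    return close(matmul(matmul(transpose(a), ETA), a), ETA)

# Two collinear boosts compose into a single boost whose speed follows
# the relativistic addition law (b1 + b2) / (1 + b1 * b2).
collinear = matmul(boost_x(0.5), boost_x(0.5))
combined_beta = (0.5 + 0.5) / (1.0 + 0.5 * 0.5)

# Two boosts in different directions still give a Lorentz transformation,
# but the product matrix is no longer symmetric, so it is not a pure boost.
mixed = matmul(boost_x(0.6), boost_y(0.6))

print(close(collinear, boost_x(combined_beta)))
print(is_lorentz(mixed), is_symmetric(mixed))
```

The asymmetry of `mixed` is the matrix-level trace of the extra rotation that appears when the product is factored into a boost and a rotation.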
3. Space-time observers and time+space splittings. – We shall now introduce the concept of a space-time observer in a manner that does not require the introduction of a Minkowski scalar product. After that, we shall show how one might transform between observers and then show how it would relate to the Lorentz transformations. There is an appreciable volume of literature by now on the subject of observers, time+space splittings, transverse geometry, and the like, and many references can be found in the author’s paper [ ].
a. Basic definitions. – Space-time, for us, shall be a four-dimensional real vector space $V$, and its dual vector space shall be denoted by $V^*$. Since the elements of $V^*$ are linear functionals on vectors in $V$, there is a natural – i.e., canonical – bilinear pairing of $V$ and its dual that produces a real scalar by evaluating a functional on a vector: (3.1) $V^* \times V \to \mathbb{R}$, $(\alpha, \mathbf{X}) \mapsto \alpha(\mathbf{X})$.
If one introduces a linear frame $\{\mathbf{e}_\mu, \mu = 0, 1, 2, 3\}$ on $V$ and its reciprocal coframe $\{\theta^\mu, \mu = 0, 1, 2, 3\}$ on $V^*$ then one can express the vector $\mathbf{X}$ as a linear combination $X^\mu \mathbf{e}_\mu$ and the covector $\alpha$ in the form $\alpha_\mu \theta^\mu$, and one will then have: (3.2) $\alpha(\mathbf{X}) = \alpha_\mu X^\mu$.
Any non-zero vector $\mathbf{X}$ in $V$ defines a line $[\mathbf{X}]$ through the origin by way of all of its scalar multiples. It also defines a hyperplane $\Sigma^*$ in $V^*$ by way of all linear functionals that annihilate it; i.e.: (3.3) $\Sigma^*$ = {all $\alpha$ such that $\alpha(\mathbf{X}) = 0$}. Dually, any non-zero covector $\alpha$ will generate a line $[\alpha]$ through the origin of $V^*$ and a hyperplane $\Sigma$ in $V$ that consists of all vectors $\mathbf{v}$ that are annihilated by $\alpha$.
The key to understanding the present discussion is to see that the algebraic relationship $\alpha(\mathbf{X}) = 0$ expresses the geometric relationship of incidence in both cases of $\Sigma$ and its dual $\Sigma^*$. That is, when $\alpha(\mathbf{X}) = 0$, the vector $\mathbf{X}$ will be incident on the hyperplane that is defined by $\alpha$. Hence, one is dealing with a fundamentally projective-geometric concept.
In the event that the vector space is Minkowski space, so it also has a scalar product $\langle \cdot, \cdot \rangle$ defined on it, the natural linear isomorphism of $V$ with $V^*$ that takes any vector $\mathbf{X}$ in $V$ to a covector $\mathbf{X}^*$ in $V^*$ will make: (3.4) $\mathbf{X}^*(\mathbf{Y}) = \langle \mathbf{X}, \mathbf{Y} \rangle = \eta_{\mu\nu} X^\mu Y^\nu = X_\mu Y^\mu$ for any vector $\mathbf{Y}$ in $V$. Dually, any covector $\alpha$ in $V^*$ can be associated with a vector $\alpha^* = (\eta^{\mu\nu} \alpha_\nu) \mathbf{e}_\mu$ that makes the bilinear pairing of $V^*$ and $V$ take the form: (3.5) $(\alpha, \mathbf{X}) = \langle \alpha^*, \mathbf{X} \rangle = \alpha_\mu X^\mu$.
This has the effect of saying that incidence in Minkowski space is related to orthogonality. That is, $\mathbf{X}$ is incident on the hyperplane $\Sigma$ that is annihilated by $\alpha$ iff $\mathbf{X}$ is orthogonal to the vector $\alpha^*$. The hyperplane $\Sigma$ then becomes the orthogonal complement to the line $[\alpha^*]$.
One sees that the essential generalization from orthogonal spaces to projective geometry amounts to replacing the scalar product of vectors with the canonical bilinear pairing of vectors and covectors. The expansion of scope is due to the fact that not all linear isomorphisms of $V$ with $V^*$ can be represented in the form of metric isomorphisms, and that amounts to saying that the bilinear form on $V$ that is defined by a linear isomorphism $C : V \to V^*$, namely: (3.6) $\langle \mathbf{X}, \mathbf{Y} \rangle = C(\mathbf{X})(\mathbf{Y}) = C_{\mu\nu} X^\mu Y^\nu$, does not have to be symmetric in $\mathbf{X}$ and $\mathbf{Y}$; i.e., the component matrix $C_{\mu\nu}$ does not have to be symmetric in its indices. In projective geometry, such an isomorphism of $V$ with $V^*$ (or rather, its projection onto the projective spaces $PV$ and $PV^*$) is called a correlation.
Our definition of a space-time observer will be simply a pair $(u, \mathbf{u})$ that consists of a vector $\mathbf{u}$ in $V$ and a covector $u$ in $V^*$ that are constrained by the demand that (1): (3.7) $u(\mathbf{u}) = u_\mu u^\mu = c^2$. When one is dealing with Minkowski space and $u = \mathbf{u}^*$, this will take the form of saying that $\mathbf{u}$ would be time-like and lie on the proper-time hyperboloid.
However, since the relationship between $\mathbf{u}$ and $u$ is less specific than that of $\mathbf{u}$ and $\mathbf{u}^*$, one is no longer dealing with a three-dimensional quadric in $V$, but a seven-dimensional quadric in $V^* \times V$. In older literature [ ], that quadric (or rather, its projection onto $PV^* \times PV$, where $PV$ and $PV^*$ are the projective spaces that are defined by the sets of lines through the origin in $V$ and $V^*$) defined what was called an algebraic correspondence. If one has a scalar product on $V$, however, one can see that this more-general quadric contains the proper-time quadric in the form of all pairs of the form $(\mathbf{u}^*, \mathbf{u})$. Because of the central role that it plays in the theory, we shall refer to the quadric that is defined by (3.7) as the observer quadric.
The algebraic correspondence between vectors and covectors that is defined by the observer quadric is hardly a one-to-one correspondence. Indeed, if a vector $\mathbf{u}$ satisfies (3.7) for some fixed $u$ then any vector $\mathbf{u}'$ that differs from $\mathbf{u}$ by a vector $\mathbf{v} \in \Sigma$ will also satisfy it. Hence, a choice of $u$ will define only an affine hyperplane in $V$, but not a unique vector. Similarly, when one fixes $\mathbf{u}$, $u$ will be defined only up to a covector in $\Sigma^*$, so a choice of $\mathbf{u}$ will be associated with an affine hyperplane in $V^*$. One can then regard these two affine hyperplanes as equivalence classes of vectors and covectors under the equivalences: (3.8) $\mathbf{u} \sim \mathbf{u}'$ iff $\mathbf{u} - \mathbf{u}' \in \Sigma$, $u \sim u'$ iff $u - u' \in \Sigma^*$. One can also regard the affine hyperplane that corresponds to $u$ as the translate $\mathbf{u} + \Sigma$ of $\Sigma$ by $\mathbf{u}$, and similarly the affine hyperplane that corresponds to $\mathbf{u}$ is the translate $u + \Sigma^*$.
When one has an observer $(u, \mathbf{u})$, one can refer to the line $[\mathbf{u}]$ as the time-line of the observer and the hyperplane $\Sigma$ as the rest space of the observer. A rest frame then becomes a frame $\mathbf{e}_\mu$ that is adapted to those spaces in the sense that $\mathbf{e}_0$ generates the same line as $\mathbf{u}$, and the set of three frame members $\{\mathbf{e}_i, i = 1, 2, 3\}$ spans the hyperplane $\Sigma$.
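The statement that an observer fixes the partner vector or covector only up to an affine hyperplane can be illustrated with components. The following sketch assumes units in which c = 1 and a frame with its reciprocal coframe, so that the pairing is a plain sum of products of components; the names `pair`, `is_observer`, `u_cov`, and `u_vec`, and the sample numbers, are hypothetical.

```python
C = 1.0  # speed of light; the value 1 is an assumption made for this sketch

def pair(alpha, X):
    """Canonical pairing alpha(X) = alpha_mu X^mu in reciprocal (co)frames."""
    return sum(a * x for a, x in zip(alpha, X))

def is_observer(cov, vec, tol=1e-12):
    """Test the observer-quadric condition u(u) = c^2 of (3.7)."""
    return abs(pair(cov, vec) - C ** 2) <= tol

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

# A covector/vector pair on the observer quadric.  The covector is not the
# Minkowski dual (1, 0, 0, 0) of the vector, so no scalar product is used.
u_cov = (1.0, 0.5, 0.0, 0.0)
u_vec = (1.0, 0.0, 0.0, 0.0)

# A vector in the rest space: it is annihilated by u_cov.
v = (1.0, -2.0, 0.0, 0.0)

# A covector in the dual hyperplane: it annihilates u_vec.
w = (0.0, 3.0, -1.0, 2.0)

shifted_vec = add(u_vec, v)   # another vector for the same covector
shifted_cov = add(u_cov, w)   # another covector for the same vector

print(is_observer(u_cov, u_vec))
print(is_observer(u_cov, shifted_vec), is_observer(shifted_cov, u_vec))
print(is_observer(shifted_cov, shifted_vec))
```

Shifting the vector and the covector at the same time generally leaves the quadric, which reflects the fact that each affine hyperplane is defined relative to a fixed partner.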
Since a rest frame is not by any means unique at this point (due to the infinitude of linear frames on $\Sigma$), we shall not refer to “the” rest frame of an observer.
(1) Although it might seem more concise to use 1 on the right-hand side of this relation, the introduction of the speed of light in vacuo $c$ is to make the consistency with the usual formulation of special relativity more straightforward.
b. Time+space splittings of space-time. – When one is given an observer $(u, \mathbf{u})$, with the present definition, the fact that $u(\mathbf{u})$ is not equal to zero says that the time-line $[\mathbf{u}]$ does not lie in the rest space $\Sigma$; i.e., they are transverse subspaces of $V$. Since their dimensions are complementary, one can then say that they define a time+space splitting of $V$ in the form of a direct-sum decomposition: (3.9) $V = [\mathbf{u}] \oplus \Sigma$. Hence, any vector $\mathbf{X}$ in $V$ can be expressed uniquely in the form: (3.10) $\mathbf{X} = \mathbf{X}_t + \mathbf{X}_s = X^t \mathbf{u} + \mathbf{X}_s$, in which $\mathbf{X}_t$ belongs to $[\mathbf{u}]$ and $\mathbf{X}_s$ belongs to $\Sigma$. One again refers to $\mathbf{X}_t$ as the temporal part of $\mathbf{X}$ and $\mathbf{X}_s$ as its spatial part. The scalar $X^t$ is the temporal component of $\mathbf{X}$.
If a vector $\mathbf{X}$ is represented in time+space form, as in (3.10), then one will see that since $u$ annihilates all vectors in $\Sigma$: (3.11) $u(\mathbf{X}) = u(\mathbf{X}_t) = u(X^t \mathbf{u}) = X^t c^2$; i.e.: (3.12) $X^t = \frac{1}{c^2} u(\mathbf{X})$.
Because of the uniqueness of that decomposition, one can once more define projection operators $P_t : V \to [\mathbf{u}]$ and $P_s : V \to \Sigma$ that take any $\mathbf{X} \in V$ to: (3.13) $P_t(\mathbf{X}) = \mathbf{X}_t$, $P_s(\mathbf{X}) = \mathbf{X}_s$, which once again satisfy the traditional properties of projection operators: (3.14) $P_t P_t = P_t$, $P_s P_s = P_s$, $P_t P_s = P_s P_t = 0$, $I = P_t + P_s$. That allows one to represent the projection operators $P_t$ and $P_s$ in terms of $\mathbf{u}$ and $u$ as tensors of mixed type: (3.15) $P_t = \frac{1}{c^2} u \otimes \mathbf{u}$, $P_s = I - P_t = I - \frac{1}{c^2} u \otimes \mathbf{u}$.
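The projection operators in (3.15) and their properties (3.14) can be verified in components without any scalar product. This is a minimal sketch, assuming the sample value c = 2 and an observer whose covector is not a metric dual of its vector; the matrix layout `P[mu][nu] = u^mu u_nu / c^2` and all names in the block are illustrative assumptions.

```python
C = 2.0  # sample value for the speed of light, chosen for exact arithmetic

def pair(alpha, X):
    """Canonical pairing alpha(X) = alpha_mu X^mu."""
    return sum(a * x for a, x in zip(alpha, X))

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(P, X):
    """Apply the matrix P to the column of components of X."""
    return tuple(sum(P[i][j] * X[j] for j in range(4)) for i in range(4))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(4) for j in range(4))

# An observer (u_cov, u_vec) with u_cov(u_vec) = 4 * 0.5 + 1 * 2 = 4 = c^2.
u_cov = (4.0, 1.0, 0.0, 0.0)
u_vec = (0.5, 2.0, 0.0, 0.0)

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ZERO = [[0.0] * 4 for _ in range(4)]

# Components of P_t = (1/c^2) u (x) u and P_s = I - P_t.
P_t = [[u_vec[i] * u_cov[j] / C ** 2 for j in range(4)] for i in range(4)]
P_s = [[IDENTITY[i][j] - P_t[i][j] for j in range(4)] for i in range(4)]

X = (1.0, 2.0, 3.0, 4.0)
X_t_component = pair(u_cov, X) / C ** 2   # temporal component, as in (3.12)
X_t = apply(P_t, X)                       # temporal part
X_s = apply(P_s, X)                       # spatial part

print(X_t_component, X_t, X_s)
print(close(matmul(P_t, P_t), P_t), close(matmul(P_t, P_s), ZERO))
```

The spatial part `X_s` is annihilated by `u_cov`, which is the component form of the statement that it lies in the rest space of the observer.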
The component form of these expressions is still the same as in (2.33), but one will not generally have that $u = \mathbf{u}^*$ (i.e., $u_\mu = \eta_{\mu\nu} u^\nu$).
Dually, one has a time+space splitting of $V^*$ into a direct sum: (3.16) $V^* = [u] \oplus \Sigma^*$ and a unique representation of any covector $\alpha$ in the form: (3.17) $\alpha = \alpha_t + \alpha_s = a_t u + \alpha_s$, with analogous terminology for the components. Similarly, one has projection operators that are defined by that unique decomposition.
When a covector-vector pair $(\alpha, \mathbf{X})$ has been expressed in time+space form relative to an observer $(u, \mathbf{u})$ in the form: (3.18) $\mathbf{X} = \lambda (\mathbf{u} + \mathbf{v})$, $\alpha = \kappa (u - v)$, one will have: (3.19) $\alpha(\mathbf{X}) = \lambda \kappa (c^2 - v^2)$, since: (3.20) $u(\mathbf{u}) = c^2$, $u(\mathbf{v}) = v(\mathbf{u}) = 0$, $v(\mathbf{v}) \equiv v^2$.
We can define another quadric by: (3.21) $\alpha(\mathbf{X}) = 0$, which we shall call the fundamental quadric. Relative to the present choice of observer, as long as $\lambda$ and $\kappa$ are both non-vanishing (which would be equivalent to both $\mathbf{X}$ and $\alpha$ being non-vanishing), it will be defined by all $(v, \mathbf{v})$ such that: (3.22) $v(\mathbf{v}) = v^2 = c^2$. Hence, it can also be regarded as an affine quadric in $\Sigma^* \times \Sigma$.
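Formula (3.19) and the defining condition (3.22) of the fundamental quadric can be checked with sample components. The sketch below assumes units with c = 1, and it uses the hypothetical names `lam` and `kap` for the two scalar factors in (3.18), `v_vec` for the spatial vector, and `v_cov` for the spatial covector; none of the numbers come from the text.

```python
C = 1.0  # speed of light; the value 1 is an assumption made for this sketch

def pair(alpha, X):
    """Canonical pairing alpha(X) = alpha_mu X^mu."""
    return sum(a * x for a, x in zip(alpha, X))

# An observer with u_cov(u_vec) = c^2.
u_cov = (1.0, 0.5, 0.0, 0.0)
u_vec = (1.0, 0.0, 0.0, 0.0)

# A spatial vector (annihilated by u_cov) and a spatial covector
# (annihilating u_vec), as required by (3.20).
v_vec = (0.25, -0.5, 0.3, 0.0)
v_cov = (0.0, -1.0, 1.0, 0.0)

def alpha_of_X(lam, kap, spatial_vec):
    """Evaluate alpha(X) for X = lam (u + v) and alpha = kap (u - v)."""
    X = tuple(lam * (a + b) for a, b in zip(u_vec, spatial_vec))
    alpha = tuple(kap * (a - b) for a, b in zip(u_cov, v_cov))
    return pair(alpha, X)

v_squared = pair(v_cov, v_vec)            # plays the role of v^2 in (3.20)
direct = alpha_of_X(2.0, 3.0, v_vec)      # left-hand side of (3.19)
formula = 2.0 * 3.0 * (C ** 2 - v_squared)  # right-hand side of (3.19)

# Rescale the spatial vector so that v_cov(v_vec) = c^2; the pair then lies
# on the fundamental quadric (3.21) for any non-zero scalar factors.
scale = C ** 2 / v_squared
v_on_quadric = tuple(scale * x for x in v_vec)
on_quadric = alpha_of_X(2.0, 3.0, v_on_quadric)

print(v_squared, direct, formula, on_quadric)
```

The quantity `v_squared` is defined purely by the pairing of a covector with a vector, so it need not be the square of any norm; that is the sense in which the construction is pre-metric.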
4. Transformation of observers . – If one has two observers ( u , u ) and ( u , u ) then one will have two time+space decompositions of V and V accordingly. Hence, any vector X in V can be written in two different ways: (4.1) X = X t + X s = t s + X X depending upon which decomposition is used. pre-metric generalization of the Lorentz transformation. 15
Similarly, any covector $\alpha$ in $V^*$ can be written in two different ways:
(4.2) $\alpha = \alpha_t + \alpha_s = \bar{\alpha}_t + \bar{\alpha}_s$.
In particular, one can express $\bar{u}$ and $\bar{\mathbf{u}}$ themselves in terms of $u$ and $\mathbf{u}$:
(4.3) $\bar{u} = \bar{u}_t + \bar{u}_s$, $\bar{\mathbf{u}} = \bar{\mathbf{u}}_t + \bar{\mathbf{u}}_s$,
in which:
(4.4) $\bar{u}_t = \lambda u$, $\bar{u}_s(\mathbf{u}) = 0$, $\bar{\mathbf{u}}_t = \mu \mathbf{u}$, $u(\bar{\mathbf{u}}_s) = 0$.
Since both $(u, \mathbf{u})$ and $(\bar{u}, \bar{\mathbf{u}})$ must lie on the observer quadric, we must have:
$c_0^2 = \bar{u}(\bar{\mathbf{u}}) = (\lambda u + \bar{u}_s)(\mu \mathbf{u} + \bar{\mathbf{u}}_s) = \lambda\mu\, c_0^2 + \bar{u}_s(\bar{\mathbf{u}}_s)$.
If we define the spatial vector $\mathbf{v}$ and the spatial covector $v$ to make:
(4.5) $\bar{\mathbf{u}}_s = \mu \mathbf{v}$, $\bar{u}_s = -\lambda v$, $v(\mathbf{v}) \equiv v^2$,
then we will see that we can solve for the product of the scalars:
(4.6) $\lambda\mu = \left(1 - \frac{v^2}{c_0^2}\right)^{-1}$.
We immediately recognize the square of the Fitzgerald-Lorentz coefficient on the right-hand side of this. Hence, we have almost duplicated part of the Lorentz transformation between relativistic observers. However, since the left-hand side does not have to take the form of $\gamma^2$, in general, one can treat the case in which that is true as a special case. That is, $\lambda$ and $\mu$ are independent, except for the constraint (4.6), just as $u$ and $\mathbf{u}$ are independent, except for the constraint (3.7). Moreover, $\lambda$ and $\mu$ both become dependent upon $v$ in the process.
We can rewrite our decompositions of $\bar{\mathbf{u}}$ and $\bar{u}$ in terms of what we have established:
(4.7) $\bar{\mathbf{u}} = \mu(\mathbf{u} + \mathbf{v})$, $\bar{u} = \lambda(u - v)$.
That suggests that we can define any transformation of an observer to another observer by a pair $(v, \mathbf{v})$ such that $\mathbf{v}$ and $v$ are both spatial relative to the first observer $(u, \mathbf{u})$ and $v(\mathbf{v}) > 0$, along with a pair of non-zero scalars $\lambda$ and $\mu$ that are coupled by the constraint in (4.6). Hence, since $\mathbf{v}$ and $v$ each have three components, while $\lambda$ and $\mu$ add two more dimensions to the space of all transformations, but the condition (4.6) subtracts one dimension, we are left with seven. The condition that $v(\mathbf{v}) > 0$ does not reduce the dimension, since it is an inequality, not an equality, even when one adds the physically-motivated condition that $v < c_0$, which is also an inequality.
a.
Transformation of arbitrary vectors between observers. – An arbitrary vector $\mathbf{X}$ can be expressed in two different ways relative to two different observers:
(4.8) $\mathbf{X} = X^t \mathbf{u} + \mathbf{X}_s = \bar{X}^t \bar{\mathbf{u}} + \bar{\mathbf{X}}_s$.
Our first problem is to express $\bar{X}^t$ and $\bar{\mathbf{X}}_s$ in terms of $X^t$ and $\mathbf{X}_s$ and the parameters of the basic transformation (4.7). We can first say that:
$\bar{X}^t = \frac{1}{c_0^2}\, \bar{u}(\mathbf{X}) = \frac{\lambda}{c_0^2}\,(u - v)(X^t \mathbf{u} + \mathbf{X}_s)$
or
(4.9) $\bar{X}^t = \lambda \left[ X^t - \frac{1}{c_0^2}\, v(\mathbf{X}_s) \right]$.
We then have that:
$\bar{\mathbf{X}}_s = \mathbf{X} - \bar{X}^t \bar{\mathbf{u}} = X^t \mathbf{u} + \mathbf{X}_s - \lambda\mu \left[ X^t - \frac{1}{c_0^2}\, v(\mathbf{X}_s) \right] (\mathbf{u} + \mathbf{v}) = [(1 - \lambda\mu)\, \mathbf{u} - \lambda\mu\, \mathbf{v}]\, X^t + \left[ I + \frac{\lambda\mu}{c_0^2}\, (\mathbf{u} + \mathbf{v}) \otimes v \right] (\mathbf{X}_s)$.
Since:
$1 - \lambda\mu = 1 - \frac{c_0^2}{c_0^2 - v^2} = -\frac{v^2}{c_0^2 - v^2} = -\lambda\mu\, \frac{v^2}{c_0^2}$,
we can say that:
(4.10) $\bar{X}^t = \lambda \left[ X^t - \frac{1}{c_0^2}\, v(\mathbf{X}_s) \right]$, $\bar{\mathbf{X}}_s = -\lambda\mu \left( \frac{v^2}{c_0^2}\, \mathbf{u} + \mathbf{v} \right) X^t + \left[ I + \frac{\lambda\mu}{c_0^2}\, (\mathbf{u} + \mathbf{v}) \otimes v \right] (\mathbf{X}_s)$.
Hence, between these two equations, we can, in principle, express the transformation of temporal and spatial components of any vector under a change of observer. They also define an invertible linear transformation of those components.
However, the transformation that we defined does not take adapted frames to adapted frames. In particular, one notes that $\mathbf{u}$ ($X^t = 1$, $\mathbf{X}_s = 0$) goes to something whose temporal component is $\lambda$ and whose spatial component is $\bar{\mathbf{X}}_s = -\lambda\mu \left( \frac{v^2}{c_0^2}\, \mathbf{u} + \mathbf{v} \right)$, which is not zero. Similarly, a purely spatial vector, such as $\mathbf{X}_s$ ($X^t = 0$), goes to something with a non-zero temporal part, namely, $-\frac{\lambda}{c_0^2}\, v(\mathbf{X}_s)$.
What we have is a way of expressing the same vector $\mathbf{X}$ in terms of two different frames, while what we want now is a way of transforming $\mathbf{X}$ into a different vector. In particular, we want to extend the transformation that takes $\mathbf{u}$ to $\mu(\mathbf{u} + \mathbf{v})$. In terms of a frame change, that means that the vector $X^\mu e_\mu$ will go to the vector $X^\mu \bar{e}_\mu$.
Let us now define two adapted frames $e_\mu$ and $\bar{e}_\mu$ on $V$. By definition, $e_0$ will be collinear with $\mathbf{u}$ and $\bar{e}_0$ will be collinear with $\bar{\mathbf{u}}$, while $e_i$ will span $\Sigma$ and $\bar{e}_i$ will span $\bar{\Sigma}$. In anticipation of the eventual reduction to Lorentz transformations, we make the definitions and replacements:
(4.11) $\mathbf{u} = c_0\, e_0$, $\bar{\mathbf{u}} = c_0\, \bar{e}_0$, $\mathbf{v} = v^i e_i$, $X^t = \frac{1}{c_0} X^0$, $v(\mathbf{X}_s) = v_i X^i$.
Since we already know how to transform $\mathbf{u}$, we can now put it into the form:
(4.12) $\bar{e}_0 = \mu \left( e_0 + \frac{v^i}{c_0}\, e_i \right)$.
The second of equations (4.10) can then be put into the form:
(4.13) $\bar{X}^i \bar{e}_i = -\lambda\mu \left( \frac{v^2}{c_0^2}\, e_0 + \frac{v^i}{c_0}\, e_i \right) X^0 + X^i e_i + \frac{\lambda\mu}{c_0^2}\, (c_0\, e_0 + v^i e_i)\, v_j X^j$.
This allows us to express any vector in $\Sigma$ (for which $X^0 = 0$) in the form:
(4.14) $\bar{X}^i \bar{e}_i = \lambda\mu\, \frac{v_j X^j}{c_0}\, e_0 + X^i e_i + \lambda\mu\, \frac{v^i v_j X^j}{c_0^2}\, e_i$.
Hence, each basis vector $\bar{e}_i$ ($\bar{X}^j = X^j = \delta^j_i$) can be expressed in the form:
(4.15) $\bar{e}_i = \lambda\mu\, \frac{v_i}{c_0}\, e_0 + \left( 1 + \lambda\mu\, \frac{v^2}{c_0^2} \right) e_i$.
The term in parentheses reduces to:
$1 + \lambda\mu\, \frac{v^2}{c_0^2} = \lambda\mu$,
so:
$\bar{e}_i = \lambda\mu \left( \frac{v_i}{c_0}\, e_0 + e_i \right)$.
We then combine our formulas for transforming an adapted frame into an adapted frame into:
(4.16) $\bar{e}_0 = \mu \left( e_0 + \frac{v^i}{c_0}\, e_i \right)$, $\bar{e}_i = \lambda\mu \left( \frac{v_i}{c_0}\, e_0 + e_i \right)$.

b. Reduction to Lorentz transformations. – Recall the basic form for a Lorentz transformation (2.39):
(4.17) $\bar{e}_0 = \gamma \left( e_0 - \frac{v}{c_0}\, e_1 \right)$, $\bar{e}_1 = \gamma \left( -\frac{v}{c_0}\, e_0 + e_1 \right)$.
A comparison of these equations with (4.16) will show that one can make them take the same form by first setting $\lambda = \mu$ (so that, by (4.6), each becomes the actual Fitzgerald-Lorentz factor $\gamma$), adapting the spatial frame $e_i$ to the velocity vector $\mathbf{v} = v\, e_1$, setting $v_i = v^i$, and changing the sign on $v$.
One sees that one difference between the transformation (4.16) and a Lorentz transformation is that there are two scalar multipliers $\lambda$ and $\mu$ involved, rather than just the one, namely, $\gamma$. Another essential difference is the fact that in the case of Minkowski space a choice of $\mathbf{v}$ will imply a unique choice of $v$, while in the more general case of the fundamental quadric on $V^* \times V$, a choice of $\mathbf{v}$ will define only an affine hyperplane of covectors that could serve as $v$. Hence, whereas a choice of boost vector $\mathbf{v}$ in the Minkowski case will imply a single $\gamma$ and a unique Lorentz transformation, more generally, one must choose $\lambda$, $\mu$, $\mathbf{v}$, and $v$ independently (as long as they are consistent with the fundamental constraints). That is why the more general transformation of observers behaves more like a pair of independent Lorentz transformations of $V$ and $V^*$, while defining a unique linear isomorphism of $V$ with $V^*$ will reduce that to just one.

c. Dual transformations. – When we express an arbitrary covector $\alpha$ in the two forms:
(4.18) $\alpha = a_t u + \alpha_s = \bar{a}_t \bar{u} + \bar{\alpha}_s$,
with $\bar{u} = \lambda(u - v)$, we can follow through the same sequence of calculations as in (4.9) to (4.16) with appropriate alterations and obtain analogous results. We shall simply summarize those results.
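Equations (4.10) now take the form:
(4.19) $\bar{a}_t = \mu \left[ a_t + \frac{1}{c_0^2}\, \alpha_s(\mathbf{v}) \right]$, $\bar{\alpha}_s = -\lambda\mu \left( \frac{v^2}{c_0^2}\, u - v \right) a_t + \alpha_s \left[ I - \frac{\lambda\mu}{c_0^2}\, \mathbf{v} \otimes (u - v) \right]$.
Upon introducing the coframes $\theta^\mu$ and $\bar{\theta}^\mu$, with $u = c_0\, \theta^0$ and $\bar{u} = c_0\, \bar{\theta}^0$, equation (4.15) becomes:
(4.20) $\bar{\theta}^i = -\lambda\mu\, \frac{v^i}{c_0}\, \theta^0 + \left( 1 + \lambda\mu\, \frac{v^2}{c_0^2} \right) \theta^i$,
which makes the dual coframe transformations (4.16) now take the form:
(4.21) $\bar{\theta}^0 = \lambda \left( \theta^0 - \frac{v_i}{c_0}\, \theta^i \right)$, $\bar{\theta}^i = \lambda\mu \left( -\frac{v^i}{c_0}\, \theta^0 + \theta^i \right)$.
Analogous statements apply to their relationship to the Lorentz transformations.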
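The change of observer described in this section can be exercised on a small example. What follows is only a sketch under stated assumptions: a three-dimensional $V$, $c_0^2 = 9$, hypothetical components for the first observer and for the spatial pair, and the names `lam` and `mu` for the two scalars that multiply the covector and the vector of the new observer. It checks that the constraint on the product of the scalars puts the new pair on the observer quadric, and that the closed-form temporal and spatial parts of a vector agree with the ones obtained directly from the new observer.

```python
from fractions import Fraction as F

C0_SQ = F(9)                     # illustrative value of c_0**2 (an assumption)
u_cov = [F(3), F(1), F(0)]       # hypothetical first observer: covector
u_vec = [F(2), F(3), F(1)]       # hypothetical first observer: vector
v_vec = [F(1), F(-3), F(2)]      # spatial vector, u_cov(v_vec) = 0
v_cov = [F(-1), F(0), F(2)]      # spatial covector, v_cov(u_vec) = 0

def pair(cov, vec):
    """Canonical pairing cov(vec)."""
    return sum(a * b for a, b in zip(cov, vec))

v_sq = pair(v_cov, v_vec)                 # v^2 = v(v), here 3 < c_0^2
lam = F(3)                                # free non-zero scalar
mu = C0_SQ / (C0_SQ - v_sq) / lam         # fixed by lam*mu = c_0^2/(c_0^2 - v^2)

# New observer: covector lam*(u - v), vector mu*(u + v).
ubar_cov = [lam * (a - b) for a, b in zip(u_cov, v_cov)]
ubar_vec = [mu * (a + b) for a, b in zip(u_vec, v_vec)]
assert pair(ubar_cov, ubar_vec) == C0_SQ  # still on the observer quadric

# Split a sample vector relative to the first observer.
X = [F(5), F(-2), F(7)]
X_t = pair(u_cov, X) / C0_SQ
X_s = [x - X_t * c for x, c in zip(X, u_vec)]

# Direct split relative to the new observer.
Xbar_t_direct = pair(ubar_cov, X) / C0_SQ
Xbar_s_direct = [x - Xbar_t_direct * c for x, c in zip(X, ubar_vec)]

# Closed-form expressions in terms of X_t, X_s and the transformation data.
Xbar_t_formula = lam * (X_t - pair(v_cov, X_s) / C0_SQ)
Xbar_s_formula = [
    -lam * mu * (v_sq / C0_SQ * uc + vc) * X_t
    + xs
    + lam * mu / C0_SQ * (uc + vc) * pair(v_cov, X_s)
    for uc, vc, xs in zip(u_vec, v_vec, X_s)
]

assert Xbar_t_formula == Xbar_t_direct
assert Xbar_s_formula == Xbar_s_direct
assert pair(ubar_cov, Xbar_s_direct) == 0  # new spatial part lies in new rest space
```

The two scalars are deliberately unequal here (3 and 1/2) to emphasize that only their product is constrained; setting them equal recovers the single Fitzgerald-Lorentz factor.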
5. The group of linear transformations that preserve the canonical bilinear pairing. – So far, we have considered only transformations of observers, which generalized the Lorentz boosts, but not the more general transformations that might generalize the Euclidean rotations in the rest space.
Since an observer $(u, \mathbf{u})$ is an element of the vector space $V^* \times V$, we shall begin by considering the two types of linear transformations $V^* \times V \to V^* \times V$, $(\alpha, \mathbf{X}) \mapsto (\bar{\alpha}, \bar{\mathbf{X}})$. First, we have the transformations $(L_1, L_2)$ that belong to the direct product $GL(n) \times GL(n)$, so both $L_1$ and $L_2$ are invertible linear transformations of $V$. However, the action of $L_2$ on $V^*$ is by way of its transpose as a map $L_2 : V \to V$, namely, $L_2^T : V^* \to V^*$, $\alpha \mapsto L_2^T(\alpha)$, where:
(5.1) $L_2^T(\alpha)(\mathbf{X}) = \alpha(L_2(\mathbf{X}))$.
When $\alpha$ is represented as a row matrix and $L_2$ is an $n \times n$ matrix, the action in question will be simply the matrix multiplication of $\alpha$ times $L_2$. Hence, the equations of transformation in this case will be the pair of transformations:
(5.2) $\bar{\mathbf{X}} = L_1(\mathbf{X})$, $\bar{\alpha} = \alpha L_2$.
A second type of linear transformation that takes $V^* \times V$ to itself is a duality transformation. These consist of pairs $(\Phi_1, \Phi_2^T)$, where $\Phi_1 : V \to V^*$ and $\Phi_2^T : V^* \to V$, so $(\alpha, \mathbf{X})$ will go to $(\Phi_1(\mathbf{X}), \Phi_2^T(\alpha))$, which will make the equations look like:
(5.3) $\bar{\mathbf{X}} = \Phi_2^T(\alpha)$, $\bar{\alpha} = \Phi_1(\mathbf{X})$.
As mentioned above, one way of defining a duality transformation is by way of a scalar product, such as one has with Minkowski space, but not all duality transformations can be put into that form. More generally, one is defining a correlation. However, if one chooses a frame $e_\mu$ for $V$ and its reciprocal coframe $\theta^\mu$ then that will define a duality transformation $\Phi_e : V \to V^*$ that amounts to the transposition of the column vector of components of a vector in $V$ with respect to $e_\mu$ to produce a row vector of components for a covector in $V^*$ with respect to $\theta^\mu$.
Any other duality isomorphism can then be obtained by composing that isomorphism with a linear isomorphism of $V$:
(5.4) $\Phi = L^T \circ \Phi_e$.
That is because any other duality isomorphism will take the chosen frame $e_\mu$ to:
(5.5) $\Phi(e_\mu) = \Phi_{\mu\nu}\, \theta^\nu$,
in which the matrix $\Phi_{\mu\nu}$ is invertible. Hence, there are as many duality transformations as elements of $GL(n)$. Of course, a duality transformation does not itself belong to $GL(n)$.
The group that we shall first consider is the subgroup of $GL(n) \times GL(n)$ that consists of all elements $(L_1, L_2)$ that preserve the canonical bilinear pairing, so:
(5.6) $\langle L_2^T(\alpha), L_1(\mathbf{X}) \rangle = \langle \alpha, \mathbf{X} \rangle$ for all $\mathbf{X} \in V$, $\alpha \in V^*$.
When expressed in terms of matrices, that will take the form:
(5.7) $\bar{\alpha} \bar{\mathbf{X}} = \alpha L_2 L_1 \mathbf{X} = \alpha \mathbf{X}$ for all $\mathbf{X} \in V$, $\alpha \in V^*$,
which will imply that:
(5.8) $L_2 L_1 = I$; i.e., $L_2 = L_1^{-1}$.
Thus, the only pairs $(L_1, L_2)$ that preserve the canonical bilinear pairing will have the form $(L_1, L_1^{-1})$, which are then in one-to-one correspondence with the elements of $GL(n)$. One can then think of $GL(n)$ as acting linearly on $V^* \times V$ by taking $(\alpha, \mathbf{X})$ to $(\alpha L^{-1}, L(\mathbf{X}))$.
It is useful to note that since $L_1$ and $L_2$ are both invertible, as long as $L_1$ and $L_2^T$ belong to the same conjugacy class, one can always find an invertible matrix $A$ (unique only up to the isotropy subgroup described below) that makes $L_2$ take the form:
(5.9) $L_2^T = A L_1 A^{-1}$.
The conjugacy class of $L_1$ is then the orbit of $L_1$ in $GL(n)$ as $A$ ranges over all $A \in GL(n)$. Its isotropy subgroup consists of all $A$ that fix $L_1$ under the action that is defined in (5.9). Hence, all $A$ such that:
(5.10) $A L_1 = L_1 A$.
That isotropy subgroup then consists of all invertible matrices that commute with $L_1$, which include not only the scalar multiples of the identity matrix, but also $L_1$ itself, as well as its inverse.
Not all invertible matrices are conjugate to each other; in particular, conjugation will preserve eigenvalues, since the eigenvalues $\lambda$ of $L_1$ are solutions to the characteristic equation:
$\det [A L_1 A^{-1} - \lambda I] = \det [A (L_1 - \lambda I) A^{-1}] = \det [L_1 - \lambda I]$,
in which we have factored $I$ into $A A^{-1}$ and used the multiplication rule for determinants. Thus, there will be more than one orbit of $GL(n)$ when it acts upon itself by conjugation; i.e., more than one conjugacy class. When $L_1$ and $L_2$ do not belong to the same conjugacy class, one can still represent their relationship in the form:
(5.11) $L_2 = A L_1$
for some unique invertible $A$ ($= L_2 L_1^{-1}$). That will then give one a way of going from one conjugacy class to another.
When $L_2$ is expressed as in (5.9), so that $L_2 = (A^T)^{-1} L_1^T A^T$, one can convert the expressions in (5.8) into the form:
(5.12) $(A^T)^{-1} L_1^T A^T L_1 = I$; i.e., $(A^T)^{-1} L_1^T A^T = L_1^{-1}$.
One can also say that:
(5.13) $L_1^T A^T L_1 = A^T$.
In this form, one sees that the transformations in question include all of the orthogonal transformations on $V$ when the matrix $A$ defines a scalar product. For instance, when the matrix $A$ is the identity matrix, one will have the Euclidean orthogonal group, and when $A = \eta$ (suitably generalized to dimension $n$), one will have Minkowski space. However, as we mentioned before, that would limit one to only the symmetric invertible matrices, whereas one can now use invertible matrices with more general properties. In particular, when $n$ is even, $A$ can also be antisymmetric, and the linear transformations that it would then define are the symplectic transformations. Hence, we have effectively expanded our scope from scalar products to correlations.
On the other hand, when $A$ is not given any special properties, it can be any element of $GL(n)$, just as $L_1$ can. Hence, the system of equations (5.13) amounts to $n^2$ equations in $n^2$ unknowns, namely, the components of $L_1$. It must then have a unique solution, namely, $I$. Thus, it is only when one looks at more specific correlations, such as ones for which $A$ is symmetric or anti-symmetric, that one will get a non-trivial group that preserves $A$ under the specified action of $GL(n)$.
One then sees that for each choice of $A$, if two transformations $L_1$ and $L_2$ each satisfy (5.13) then so will the product $L_1 L_2$:
$(L_1 L_2)^T A^T (L_1 L_2) = L_2^T L_1^T A^T L_1 L_2 = L_2^T A^T L_2 = A^T$.
Since the identity transformation clearly takes $A^T$ to $A^T$, each choice of $A$ will define a subgroup of $GL(n)$.
As for the duality transformations, one must have:
(5.14) $\langle \Phi_1(\mathbf{X}), \Phi_2^T(\alpha) \rangle = \langle \alpha, \mathbf{X} \rangle$ for all $\mathbf{X} \in V$, $\alpha \in V^*$,
and in terms of matrices that is:
(5.15) $(\alpha \Phi_2 \Phi_1 \mathbf{X})^T = \alpha \Phi_2 \Phi_1 \mathbf{X} = \alpha \mathbf{X}$; i.e., $\Phi_2 = \Phi_1^{-1}$.
(The reason that transposing the term in parentheses does not change its value is that it is a scalar.) Thus, one is dealing with essentially the same transformations as in (5.13). Similarly, any duality transformation can be factored into:
(5.16) $\Phi_2 = A^{-1} \Phi_1^T A$.
When $\Phi_2$ is expressed in the form (5.16), this will imply that:
(5.17) $\Phi_1 A^{-1} \Phi_1^T A = I$; i.e., $\Phi_1^T A\, \Phi_1 = A$.
The fundamental quadric in $V^* \times V$, for which:
(5.18) $\langle \alpha, \mathbf{X} \rangle = 0$,
allows one to expand the group of linear transformations that preserve the canonical bilinear pairing to include all pairs $(\lambda, \mu)$ of non-zero scalars that multiply $\alpha$ and $\mathbf{X}$, since:
(5.19) $\langle \lambda\alpha, \mu\mathbf{X} \rangle = \lambda\mu\, \langle \alpha, \mathbf{X} \rangle$.
Hence, if $\langle \alpha, \mathbf{X} \rangle$ vanishes then so will $\langle \lambda\alpha, \mu\mathbf{X} \rangle$. Such transformations are then pairs of homotheties of $V$ and $V^*$ that are centered at their origins.
One can also say that the transformations that preserve the fundamental quadric consist of pairs of equivalence classes $([L], [L^{-1}])$, in which $[L]$ refers to all non-zero scalar multiples of the matrix $L$, which then defines an element of the group $PGL(n)$, which is then the $n$-dimensional projective group. It is the group of projective transformations of the projective space $\mathbb{R}P^{n-1}$ that $\mathbb{R}^n - 0$ projects onto by the map that takes every non-zero vector $\mathbf{v}$ in $\mathbb{R}^n$ to the line through the origin $[\mathbf{v}]$ that it generates. Such transformations are also called homographies, and they are often written in the form of a system of linear equations that look like:
(5.20) $\bar{X}^\mu = L^\mu{}_\nu X^\nu$,
in which $L^\mu{}_\nu$ is an invertible $n \times n$ matrix, while the components $X^\mu$ and $\bar{X}^\mu$ represent “homogeneous” coordinates of a point in $\mathbb{R}P^{n-1}$.
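The invariance statements of this section lend themselves to a quick numerical check. The sketch below is illustrative only: it works with $n = 2$, a hypothetical invertible matrix `L`, sample row and column components, and a rational Lorentz boost with $v/c_0 = 3/5$ (so that $\gamma = 5/4$). It confirms that a pair consisting of a transformation and its inverse preserves the canonical pairing, that rescaling a covector and a vector preserves only the vanishing of the pairing, and that the boost satisfies the matrix condition of the form $L^T A^T L = A^T$ with $A = \eta$.

```python
from fractions import Fraction as F

def pair(cov, vec):
    """Canonical pairing of a row of components with a column of components."""
    return sum(a * b for a, b in zip(cov, vec))

def mat_vec(L, x):
    """Matrix acting on a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in L]

def row_mat(a, L):
    """Row vector multiplied on the right by a matrix."""
    return [sum(a[i] * L[i][j] for i in range(len(a))) for j in range(len(L[0]))]

def matmul(A, B):
    """Product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(L):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = L
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

L = [[F(2), F(1)], [F(1), F(1)]]          # hypothetical element of GL(2)
L_inv = inverse_2x2(L)

alpha = [F(3), F(-1)]                      # sample covector (row)
X_on = [F(1), F(3)]                        # alpha(X_on) = 0: on the fundamental quadric
X_off = [F(2), F(5)]                       # alpha(X_off) = 1: off the quadric

# (alpha, X) -> (alpha L^{-1}, L X) preserves the pairing exactly.
for X in (X_on, X_off):
    assert pair(row_mat(alpha, L_inv), mat_vec(L, X)) == pair(alpha, X)

# Independent rescalings multiply the pairing by the product of the scalars,
# so they preserve only its vanishing.
lam, mu = F(7), F(-2)
scaled_on = pair([lam * a for a in alpha], [mu * x for x in X_on])
scaled_off = pair([lam * a for a in alpha], [mu * x for x in X_off])

# A boost with v/c_0 = 3/5 preserves eta = diag(1, -1).
eta = [[F(1), F(0)], [F(0), F(-1)]]
boost = [[F(5, 4), F(-3, 4)], [F(-3, 4), F(5, 4)]]
boost_preserves_eta = matmul(matmul(transpose(boost), transpose(eta)), boost) == transpose(eta)
```

The first check uses no scalar product at all, whereas the last one depends on the specific choice of the symmetric matrix, which mirrors the distinction drawn in the text between the full group acting on covector-vector pairs and its metric subgroups.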
6. Discussion. – One of the obvious extensions of the concepts that were discussed above is to general relativity; that is, to regard the vector spaces in question as tangent and cotangent spaces to a differentiable manifold. In particular, the concept of time+space splittings of the tangent and cotangent bundles is also related to the concept of “static” space-times, which is not surprising since the concept of something being static is closely related to the concept of it being at rest.
When one adopts the viewpoint of pre-metric electromagnetism [7, 8], which shifts the center of attention in space-time from the Lorentzian (i.e., metric) structure, which is actually implied by the way that electromagnetic waves propagate in space-time, to the electromagnetic constitutive laws, it also becomes natural to generalize the concept of the Lorentzian structure to dispersion laws for electromagnetic waves other than the traditional classical vacuum law. That would mostly affect the reduction of all transformations of observers and the observer quadric to ones that preserve the dispersion law. However, one should be aware that the higher-degree dispersion laws (such as the quartic one that defines the Fresnel wave surface) will typically have fewer symmetries than the ones that define spatial spheres, just as deforming a sphere to an ellipsoid of revolution will reduce its symmetry group from all spatial rotations to only the rotations about its axis of revolution.
References
1. W. Rindler, Essential Relativity, Van Nostrand Reinhold, NY, 1969.
2. F. Klein, “Vergleichende Betrachtungen über neuere geometrische Forschungen,” Math. Ann. (1893), 63–100; Gesammelte mathematische Abhandlungen, v. 1, Springer, 1921, pp. 460–497; English translation by Mellen Haskell in Bull. N. Y. Math. Soc. (1892–1893), 215–249; also available at arXiv:0807.3161.
3. F. Klein, “Über die geometrischen Grundlagen der Lorentzgruppe,” Jahrb. d. Deutschen Math.-Ver. (1910), 287–300; reprinted in Gesammelte mathematische Abhandlungen, pp. 555–574; English translation by D. H. Delphenich at neo-classical-physics.info.
4. K. Hoffman and R. Kunze, Linear Algebra, Prentice-Hall, NJ, 1962.
5. D. H. Delphenich, “Transverse geometry and physical observers,” arXiv:0711.2033.
6. B. L. Van der Waerden, Einführung in die Algebraische Geometrie, Dover, New York, 1945. English translation by D. H. Delphenich at neo-classical-physics.info.
7. F. W. Hehl and Y. N. Obukhov, Foundations of Classical Electrodynamics, Birkhäuser, Boston, 2003.
8. D. H. Delphenich, Pre-metric electromagnetism, Neo-classical Press, 2009.