-
10-14-11, 06:49 PM #1
Maxwell Could Have Done it All By Himself!
The Entire Special Theory of Relativity Derived Directly and Exclusively from the Classical Laws of Electrodynamics
Section I: Introduction
To expedite the process of teaching Special Relativity in standard undergraduate classes, certain postulates and hand-waving are typically used in order to provide heuristic derivations of the fundamental results. Although this is convenient for teaching purposes in order to provide a functional and reasonably coherent understanding of the theory and its consequences, many postulates are thereby introduced which can in fact be shown to be redundant and unnecessary. Additionally, the foundational logic provided to these students is further lacking in rigour due to common unjustified assumptions about special cases yielding the rules for general purposes.
Many popular textbooks on the subject are equally guilty of such heuristics, including several advanced textbooks. Even when more detailed examinations of the theory are invoked, e.g. in the study of electromagnetic field transformations, certain unjustified assumptions are utilized, and the final result only demonstrates that the resulting theory is self-consistent. In other cases, results of extreme importance, including the famous result $E = mc^2$, are deduced using only limiting cases and other questionable assumptions. Even Einstein and his peers are guilty of such reasoning in their papers. Unfortunately, rather than being guided to re-examine the logical foundations and justifications of the Special Theory of Relativity, more advanced students are typically encouraged to accept it as is and move on to studying alternative means of formulating it, and extending it into the realms of gravity and quantum mechanics.
I intend to present here a fully consistent derivation of the Special Theory of Relativity, directly from basic principles of space-time symmetry and the classical laws of electrodynamics. The goal is to place the theory at the same level of rigour as the electrodynamic laws postulated by James Clerk Maxwell as of 1861, and to give a deeper than usual insight into how they must inevitably arise. Often, modern assumptions, examples and experiments are used to demonstrate the relevancy of the theory, and I will attempt to show here that this relevancy follows directly from mathematical facts which were already understood many, many decades earlier. As a bonus, it means cranks arguing against the Special Theory of Relativity will not only have to argue with every single set of postulates that can be used to derive it, but also the very laws of electromagnetism themselves, and all of their proven consequences.
The work here is partially inspired by and extrapolated on various writings and notes I have browsed through over the years, as well as some personal reasoning. I have a hard time believing that anything I'll be doing here hasn't been done by someone else before, at least in private or in a relatively obscure or older publication. However, as I'm struggling to find everything put together in one place and one piece both on the internet and in my personal notes and books, from the bottom up, my plan is to do it here, at least in rough draft. Perhaps one day it could even lead to an actual publication, if there's enough interest.
I don't want to go too far into the boring details of the consequences in the most sophisticated situations, when such situations don't lead to any real insights into the fundamentals. I also don't want to go over too many complicated details in the actual theory of electrodynamics itself other than to summarize what's necessary, because there are any number of excellent and comprehensive sources covering such topics. I only aim to present here the basic essentials from which everything in Special Relativity can be derived, along with the necessary steps in the derivations, and to compare it to how these same essentials are deduced or presented in conventional teaching. I might present a summary or brief outline of certain secondary results, especially if they're needed in order to deduce further consequences, possibly with links to external sources where it's convenient for expediting things or where there might be further details of interest to investigate. I hope to include a few actual diagrams here or there, although for purposes of getting the basic writeup finished, I might try to keep them to a minimum and add more in later. As far as the math and physics level is concerned, there's no point in arguing against electrodynamics unless you have at least a decent undergrad understanding of what it actually says, so I'm aiming the mathematical parts at people who have attained this level at minimum, although I'll try to keep it as simple and concise as I can.
I very much welcome any on-topic comments, suggestions, contributions and other input. I simply find this kind of bottom-up approach to Relativity is really lacking out there, and it's nice to get this all into one place where we can show how things are really inevitable based on results going back 150 years. I aim to present the work I've already been doing to derive all this stuff on paper in a nice, clean, typed-out and publicly published format, to make corrections and improvements to it, and to gauge how much interest it actually generates from those in the scientific community and the general public. As I say, if I find there's enough interest out there, and similar material is lacking in the public and private domains, I might eventually be able to produce a publication of my own based on the kinds of discussions I hope to have here.
To begin the actual work, the next section will involve a derivation of the most general possible coordinate transforms, i.e. the "generalized Galilean transform", which relates the coordinates $(x, y, z)$ and $t$ in an arbitrary reference frame $S$ to the coordinates $(x', y', z')$ and $t'$ in another arbitrary reference frame $S'$. We will show that with either of two simple classical assumptions, these transforms inevitably reduce to the classical Galilean coordinate transforms, whereas with a single Relativistic assumption, the generalized Galilean transforms reduce to the Relativistic Lorentz transformations instead. However, we will make no such sets of assumptions, instead deriving the Lorentz transformations directly from the basic principles of space-time symmetries and the laws of electrodynamics in vacuum.
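As a rough illustration of why the choice between the two sets of transforms matters for electrodynamics, here is a minimal Python sketch. It uses the standard textbook forms of the Galilean and Lorentz boosts along one axis, which are not derived anywhere in this thread yet; the function names and the test velocity are hypothetical choices made only for the example. It follows a light pulse leaving the shared origin and asks how fast that pulse appears to move in the second frame.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def galilean(x, t, v):
    """Classical Galilean boost along x: time is left unchanged."""
    return x - v * t, t

def lorentz(x, t, v, c=C):
    """Standard Lorentz boost along x with relative velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)

def apparent_speed(transform, v, t=1.0):
    """Speed, in the boosted frame, of a light pulse that left the shared origin."""
    x_new, t_new = transform(C * t, t, v)
    return x_new / t_new

v = 0.5 * C  # hypothetical relative velocity of the two frames
print("Galilean:", apparent_speed(galilean, v) / C)  # fraction of c
print("Lorentz: ", apparent_speed(lorentz, v) / C)   # fraction of c
```

Under the Galilean boost the pulse comes out at about half of c in the second frame, whereas under the Lorentz boost it comes out at c again (up to rounding), which is the behaviour a frame-independent set of vacuum field equations would demand.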
Last edited by CptBork; 10-14-11 at 07:00 PM.
-
10-14-11, 09:12 PM #2
Section II: The Generalized Galilean Transform
In this section, I will use basic symmetry properties of space and time, as well as a careful set of coordinate definitions, in order to define the most general possible coordinate transformations that one could include in a physical theory. I'll show how it can be reduced to either the Lorentz or Galilean transformations given the right simple assumptions, but will then proceed to the next part without actually committing to any of these assumptions. I will be using some convenient choices of coordinates (with appropriate justifications) in order to simplify the math, so hopefully it doesn't get too ugly. As I mentioned in the previous section, at some point later I might want to supplement some of these arguments with some simple illustrative diagrams (any external help with this would absolutely be tremendously appreciated, because at my ability level, decent and precise graphical work of the mechanics type can be fairly time consuming). Anyhow, let's begin.
We start by considering an arbitrary inertial frame, which for now we shall call the "rest frame", possessing a chosen orthogonal coordinate system with coordinate vectors
, and the origin vector
. Similarly, consider another arbitrary inertial frame
, which for now we shall refer to as the "moving frame". For
we also have a chosen orthogonal coordinate system with the origin
. Every point
in
must correspond to a unique point
in
, and vice-versa. Thus there must be some invertible relationship
,
.
Suppose in the frame we made the arbitrary point translation
. Then we expect the corresponding alteration
to be independent of the position and time
. Otherwise, we could detect an inhomogeneity, i.e. a physical difference between different points in the
frame, such that the same translation for different points in
leads to different translations for the corresponding points in
, depending on where the points in
are located.
Suppose we label the points in as
, and we label corresponding points in
as
. Then the homogeneity of space and time can be represented by the mathematical condition
, where
is some real-valued undetermined function which is independent from
.
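The homogeneity condition can be checked numerically on toy examples. The sketch below is only an illustration under assumed notation: `affine` and `nonlinear` are hypothetical coordinate maps in one space and one time dimension, invented for the example. It measures how the change produced by a fixed translation varies from event to event; a map that respects homogeneity should give the same change everywhere.

```python
def displacement(f, x, t, dx, dt):
    """Change in f caused by shifting the event (x, t) by (dx, dt)."""
    return f(x + dx, t + dt) - f(x, t)

def affine(x, t):
    # Hypothetical affine map: a linear part plus a constant offset.
    return 2.0 * x - 0.5 * t + 3.0

def nonlinear(x, t):
    # Hypothetical map with a quadratic term, which breaks homogeneity.
    return x + 0.1 * x * x - 0.5 * t

EVENTS = [(0.0, 0.0), (1.0, 2.0), (-4.0, 7.0)]  # sample events (x, t)
SHIFT = (0.5, 0.25)                             # one fixed translation (dx, dt)

def spread(f):
    """Largest minus smallest displacement over the sample events."""
    values = [displacement(f, x, t, *SHIFT) for x, t in EVENTS]
    return max(values) - min(values)

print("affine spread:   ", spread(affine))
print("nonlinear spread:", spread(nonlinear))
```

The affine map shows no variation between events, while the quadratic one does. With a continuity assumption, requiring zero variation for every translation forces the map to be linear plus a constant, which is what the conditions in this post express.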
Thus we may express the following general conditions:
Here the's are also functions independent of
. We are free to define a new coordinate system by making the translations
,
,
,
, and then relabel the coordinates by dropping a prime (i.e.
). We are then left with the following transformation rules for relating the coordinates of a point
to the coordinates
for the same point:
or, in Einstein summation notation,. We have specifically performed a translation in space and time in the
frame (i.e. a translation in how the coordinate axes are defined) so that the origin
in
corresponds to the same point as the origin
in the
frame. There's much more to be done before this section is finished, but right now I need to take a quick break.
To be continued...
Last edited by CptBork; 10-15-11 at 03:34 AM.
-
10-15-11, 12:27 AM #3
Continued from Post #2:
Now that we've found the most general coordinate transforms in which the same point defines the origin for each system, we need to start relating the two in some fashion. The first principle we shall invoke is the most basic possible statement of reciprocity: if the spatial origin of system
is moving with speed
in system
, then the spatial origin
of system
is moving with speed
in system
.
So now we are ready to define our and
axis conventions. The coordinates assigned to given events/points in each system can vary depending on the chosen alignment, orientation and origin, but their relations to any other coordinate systems produced by spatial rotations, spatial reflections and translations are already uniquely defined, and do not alter the time coordinate (aside from uniform shifts in time, i.e. different start times on the clocks). We are thus free to choose a coordinate system of convenience in
, define a corresponding coordinate system for
, and a set of mathematical transformation rules between the two. We can then uniquely determine how any other choices of spatial alignments, spatial orientations and translations transform between
and
, and all such transformations will preserve the time relationship between the two frames, up to a uniform translation in time equivalent to changing the start times on their clocks.
We choose to align the coordinate axes of so that in this system, the velocity of the
origin is set to
. Reciprocally, we choose to align the coordinate axes of
such that it sees the origin of
moving with velocity
. Thus the point
corresponds to the point
. From this result we can immediately set
. Also, since
, we may conclude that
.
To define the axis, we must consider causality. If events in
occur at
and
, with
, then we say that the event occurring at
could have had a causal effect on the event occurring at the same position but the later time
, but there could be no causal effect in the opposite order. These two events have the corresponding times
and
in
. In order that system
has the same causal structure as system
, we must therefore require that
. So events occurring at increasing times at the origin of
correspond to events occurring at increasing times and uniform motion along the
axis in the
system. The positive direction of the
axis is chosen to be opposite to this motion if
, and in the same direction as this motion if
. Thus the
and
axes in the
system are now unambiguously defined independent of the orientation we choose for the
and
axes in
. Given a particular choice for these
and
axes, it now remains to define a suitable set of corresponding
and
axes orthogonal to
and
.
To be continued...
Last edited by CptBork; 10-15-11 at 12:57 AM.
-
10-15-11, 03:05 AM #4
Continued from Post #3:
So now it would be a good time to put together what we've derived so far and do a little cleanup. As we apply more symmetries and conditions as well as making additional use of some of our earlier reasoning, things will get even cleaner still. Here are the generalized coordinate transforms we've derived up to this point:
I would like to make a quick note on a mistaken statement I made in Post #3. There I showed that we could define a unique choice of the and
axes, but in fact we have only defined their uniqueness up to arbitrary positive scale factors/choices of units. Our scheme will ultimately guarantee uniform scaling as we progress further, but we'll keep things general for now.
I'd also like to make a note on the reciprocity of velocities between systems and
. In Post #3 I argued that we can define reciprocal velocities for the relative motions of the two origins, i.e. the origin of
is seen moving at speed
from system
, and the origin of
is seen from
to be moving with speed
. If we didn't select the same speed for each system, then the equivalency of the laws of physics in all inertial reference frames would require that we change the fundamental units and constants we work with in either
or
. Since we'd like to maintain the same set of units and physical constants in each system while preserving the same laws of physics, this requires that we set
to express the physical equivalence of these frames.
Up to here we have used the information that. Let's now also use the information that
. Then we can draw the following conclusions:
These conclusions clearly follow because they hold for arbitrary values of, and for now we are assuming that
. We will show in the end that the same conclusions follow for
.
Our generalized coordinate transforms have now become the following:
There, looking much nicer already ain't it?
To be continued...
Last edited by prometheus; 10-17-11 at 02:39 PM. Reason: as requested.
-
10-17-11, 11:37 PM #5
Continued from Post #4:
Starting in the system, consider the points
defining the
axis. These points correspond to
, which defines a point moving at a uniform rate along a line in
space. In order for the coordinate transformation to be invertible, we must pick either
or
. So this line passes through the origin and has a nontrivial component in the
plane. We may thus define the positive direction of the
axis by choosing
and
. Then the positive direction of
is uniquely set by demanding that the
system have the same orientation as the
system (i.e.
). By the requirement of invertibility, since we have chosen
, we must also necessarily choose
. Since we're free to choose our spatial and time scaling, we are free to choose
, and we will see soon that this choice uniquely defines our space and time scales in
.
Before we discuss the effect of coordinate rotations in on the scaling in
, we must first consider what happens to the functions
. We already defined these functions and showed them to be independent of
. So what else can they depend on? The only remaining physical variable that we have to work with in general is the velocity of the origin in
as seen from
,
. Since we want to start with spacetime as a backdrop for all events to occur therein, we will disregard the possibility of introducing any further physical variables relating the coordinate relations between two inertial frames. Thus we may demand
, and that
be a continuous function of
, so that arbitrarily small changes in inertial velocity will lead to arbitrarily small changes in the coordinate system. We also demand that if
, then the coordinate systems
and
completely coincide.
Thus we define, where
is the usual Kronecker Delta:
So let's summarize what we've got now:
Now what are the possible relationships between two different coordinate descriptions of a system, given that both descriptions are defined with the same point as their spacetime origin? Suppose we have two such descriptions,
and
with coordinate labels
and
respectively.
In order to preserve causality/time-ordering and the uniformity of equivalent intervals in both descriptions, we require that for some real-valued constant
. Similarly, to preserve the uniformity of equivalent spatial intervals, we require that
, with
also a real-valued constant. So we may ask, if we have a system
with coordinate descriptions
and
related by a spatial rotation which preserves the distance and time scales (
), along with the condition
, and thus
, how does our scheme affect the scaling and orientation of the resulting coordinate descriptions
and
of the inertial frame
? I will address this question in the next post, but right now it's a good time to take another break.
To be continued...
Last edited by CptBork; 10-17-11 at 11:58 PM.
-
10-18-11, 12:30 PM #6
Continued from Post #5:
Continuing with our questions about space and time scaling near the conclusion of Post #5, let's investigate what happens if we define two coordinate systems and
for the inertial frame
, both sharing the same spacetime origin and observing the spatial origin of frame
moving with velocity
(we allow for reversals of the
-axis). Using our scheme, we define two corresponding coordinate systems
and
for frame
, also sharing the same spacetime origin.
From our previous discussion, we know that the systems and
are related in the following way:
Then in frame, the point
corresponds to the point
. Thus for this point we have the relationship
. But we know that for this same point, the relationship
must also hold, which thus requires that we set
.
So to summarize: If coordinate systems and
for frame
share the same spacetime origin and observe the spatial origin of frame
moving with velocity
, then the following relationships must hold in the corresponding systems
and
defined by our scheme:
Now we need to calculate the values of that will correspond to the various rotations we intend to perform in the
frame. We will see that
in each of the cases under consideration. To simplify matters as much as possible, we should first consider transformations in
which send
.
So consider two systems and
related by such a transformation, i.e.:
Consider the point in inertial frame
. The corresponding time relationship in
is
. But since we already know that
, we have the direct implication that
.
So if we have two coordinate systems for frame which preserve the
and
-axes and the spatial distance
orthogonal to the
-axis, then the corresponding coordinate systems in frame
, as defined by our scheme, must share the following relationship:
Now we can start doing some rotations and cleaning things up!
Consider coordinate system derived for system
by the spatial rotation
. Then the corresponding relationships we've just derived show that
Since this relationship holds for arbitrary values of,
,
and
, we have demonstrated that
.
To summarize what we have so far:
In the next post, I will demonstrate from this same rotation that we must also require.
To be continued...
Last edited by CptBork; 10-18-11 at 07:29 PM.
-
10-18-11, 02:47 PM #7
Continued from Post #6:
Continuing from where we left off, again consider the following rotation in:
Then the corresponding relationships in are:
Thus we have and
, and so the result
which was demonstrated in Post #6 reduces to the requirement that
. So for arbitrary real-valued
,
,
and
, we have the following condition:
Since we have already shown that, a small bit of mathematical manipulation demonstrates that
as I earlier claimed.
Our transformations are starting to look very simple indeed:
Next consider the following rotation in
:
Then the corresponding relationship in the defined coordinates of is:
Then the condition reduces to the condition that
. We thus have the following relationship for arbitrary
and
:
Taking either or
immediately yields the condition
, which will help to simplify things.
Expanding both sides of the equation and applying this condition yields the following relation for arbitrary
and
:
Taking immediately shows that
. Then
.
So we must choose either or
. The condition
requires that
. The requirement that
be a continuous function of
thus forces us to choose
for all real values of
.
To summarize, we now have the following:
Having simplified things up to this point, we are now ready to consider rotations in which send
and
along with their effect on the resulting coordinate systems defined in
, which will ultimately permit us to justify the principle of reciprocity and thereby reduce everything down to a single undetermined function of
.
To be continued...
Last edited by CptBork; 10-18-11 at 02:58 PM.
-
10-18-11, 07:18 PM #8
Errr, isn't this essentially what Lorentz did over a century ago, to get the transform in question named after him? I'm sure I was exposed to exactly this line of derivation back in undergrad somewhere...
Isn't this more-or-less the same approach taken by Einstein in his book "Relativity"?
-
10-18-11, 08:01 PM #9
Continued from Post #7:
Before I proceed further, I'd like to make a few remarks about how my deductions up to this point differ from what I've experienced to be the standard conventional teachings. Firstly, when deriving either the classical Galilean or Relativistic Lorentz transforms, it's often taken for granted that the relationship between the coordinate systems consists of linear combinations. In my derivation, among a few others out there, I've taken care to justify this assumption by deriving it from the homogeneity of space and time, and setting the origins of the two coordinate systems to the same point.
Secondly, in deriving either set of transforms, it's generally assumed that independently of
and
. Likewise, it's also generally assumed that
independently of
and
. The standard reasoning is that, on a physical basis, space and time shouldn't have preferred directions for certain things, i.e. if
has its spatial origin moving along the
-axis of
then changing the orientation of the
and
axes shouldn't change the way in which the
and
axes transform. Furthermore, it's also generally assumed that if we pick
, then we're also free to pick
, and that the resulting axes will still be orthogonal as well as preserving the scaling relationship between different descriptions of the same frame with the same origin.
I have taken care to attempt to justify this second set of assumptions from the following:
- Basic postulates regarding the scaling relationships between the times and distances of different coordinate descriptions with the same spacetime origin in the same inertial frame
- The assumption that space is not only homogeneous, but also isotropic. Thus there is no preferred rotational orientation and so, after a careful and unambiguous definition of the
and
axes based on the
and
axes, with the motion of
set along the
-axis with speed
, the same coordinate relationships should hold with the same functions
, regardless of how the
and
axes are oriented orthogonal to
and
.
Now let's proceed as promised. Suppose we have the two coordinate systems for frame, labeled
and
with the relationship
.
In Post #6 I showed that the following relationships must hold in the corresponding systems and
:
In terms of and
, we also have the following relationships:
If we choose, then
and we obtain the following:
Thus we necessarily have, and thus
. So then we may simplify the relationship between
and
:
It then immediately follows that for arbitrary values of
and
, and thus:
Now at last we're ready to justify and employ the principle of reciprocity, which shall come in the next post.
To be continued...
Last edited by CptBork; 10-18-11 at 08:18 PM.
-
10-18-11, 08:09 PM #10
Well I haven't seen anything of his along the lines of what I'm doing. I'm sure there's a lot of similar work out there, possibly including works of Poincare and Lorentz themselves, in which they derive a generalized Galilean transform in presumably a similar way to how I'm approaching it. But when it came to electrodynamics, I looked but haven't found any evidence that they did it directly from Maxwell's equations, but rather from the assumption that not only do all electromagnetic disturbances propagate at light speed, but that an electromagnetic disturbance in one frame corresponds to a similar electromagnetic disturbance in another frame. I'm going to be doing it directly from Maxwell, i.e. not making any usage of the wave equation. In fact, I only need to use the 4 Maxwell's equations in one frame and 3 in the other, and the last one pops out automatically and inevitably. Plus I can thus show that the E-M fields have to transform a specific way in all cases, and not merely assume that the results from special cases hold in general, or make unjustified assumptions about how charge and current densities transform.
Not as far as I recall, nope. The differences might become more clear once I finally finish the task of deriving generalized Galilean and E-M field transforms without any specifically classical or Relativistic physical assumptions, and then apply Maxwell's equations in both frames to show that there's only one way things can work out, and it involves exactly what we're taught in school, but with hopefully a much more rigorous (and more historical) basis. And then I hope to do the same with further results in the theory all the way up to $E = mc^2$ and maybe even beyond, in a much more rigorous and inevitable fashion than is usually taught, and again all deducible directly from Maxwell (with one extra postulate I haven't yet admitted to, which is the assumption that Maxwell's equations hold in all frames). Yes it's true Einstein made this same demand, but I've never seen him do it from bottom up and show that this is the only way things could work out, but rather I've only seen him give some physical reasoning and show that the result is self-consistent.
Last edited by CptBork; 10-18-11 at 08:21 PM.
-
10-19-11, 01:46 AM #11
Hi CptBork,
You might like to know that there are very well known methods, dating back to Sophus Lie (1842-1899), that will generate the full set of symmetries of the Maxwell equations. If you are comfortable with Lie groups and algebras, then it should only take a little reading to understand the techniques. The best text is Olver's "Applications of Lie Groups to Differential Equations".
-
10-19-11, 10:40 AM #12
Hi Guest, thanks for the tip. I looked through my extensive personal archive and couldn't find the text, but on Google books the entirety of Section 2 is available and that looks like what I need to follow up on your reference, which I'm doing at present. My exposure to Lie Groups and Lie Algebras is unfortunately somewhat limited to what they generally reach in quantum mechanics/quantum field theory, and even though I took 4 undergrad courses and 2 grad courses specifically dealing with abstract algebra, I had the same teacher for 4 of those courses and they didn't seem too interested in dealing with concepts like matrix exponentiation and representation theory (I've filled some of the gaps in myself but have TONS I still need to learn).
I'm curious though if this approach will yield what I'm looking for. Let's say I define a set of transformations, but these transformations involve a single undetermined function of $v$, as I will be shortly deriving once I complete the section on generalized Galilean transforms. If I have this single undetermined function in my spacetime transforms, and I assume nothing whatsoever about how the $\vec{E}$ and $\vec{B}$ fields transform other than that the transformed variables preserve Maxwell's equations, is that sufficient to uniquely determine the transformations of these $\vec{E}$ and $\vec{B}$ fields as well as the undetermined function of $v$ in my spacetime transforms? To put it another way, let's assume my spacetime transformations are chosen to be the classical Galilean transforms. Could I not still define a transformation law
such that Maxwell's equations are preserved, even if they lead to some absurd physical consequences?
I could certainly see a conclusive result following if I made the assumption that my spacetime transformations are specifically Lorentzian. But if I haven't yet made that assumption, and have instead kept things general, and I don't make any attempt to relate the generalized symmetries in spacetime to symmetries in the vector fields $\vec{E}$ and $\vec{B}$, are you certain these Lie Group methods will still lead me to a unique result? As far as I can tell just from my preliminary reading, you need complete information on how space and time transform before you can determine how the $\vec{E}$ and $\vec{B}$ fields transform in response to preserve Maxwell's equations using this Lie Group approach, whereas my approach will not require full information on the spacetime transformations but will ultimately give a unique result for this undetermined function of $v$ I've been mentioning.
Last edited by CptBork; 10-19-11 at 10:49 AM.
-
10-19-11, 11:11 AM #13
Have you read the essay "How to teach special relativity" by John Bell? This is pretty much the approach he advocates. If you can get a hold of it, it's included as chapter 9 in his book "Speakable and unspeakable in quantum mechanics", and is well worth a read. His approach is completely different than yours though: he starts off by considering things like the electric field around a moving charge, which turns out to be length contracted in the direction of motion. (So imagine what this would do to matter composed of atoms held together by electric forces, including rulers that might be used to define a moving "frame". Incidentally this was originally derived by Heaviside around 1888-1889 and was the inspiration for Fitzgerald's length contraction hypothesis.)
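The contraction of a moving charge's field mentioned here can be made concrete with the standard closed-form result for a charge in uniform motion. The sketch below is a hedged illustration, not Bell's own presentation: it computes only the ratio of the moving charge's field strength to the static Coulomb value at the same distance from the charge's present position, and the function name and sample speed are hypothetical choices for the example.

```python
import math

def heaviside_factor(beta, theta):
    """Field strength of a uniformly moving charge divided by the static
    Coulomb value at the same distance from the charge's present position.

    beta  : speed as a fraction of the speed of light (0 <= beta < 1)
    theta : angle between the velocity and the line to the field point
    """
    return (1.0 - beta ** 2) / (1.0 - beta ** 2 * math.sin(theta) ** 2) ** 1.5

beta = 0.6  # hypothetical speed, chosen so that gamma = 1.25
print("along the motion:  ", heaviside_factor(beta, 0.0))
print("transverse to it:  ", heaviside_factor(beta, math.pi / 2))
```

Along the direction of motion the field is weakened by a factor 1 - beta^2, while transverse to it the field is strengthened by a factor gamma, so the field pattern is squashed towards the plane perpendicular to the velocity.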
Which makes me wonder: why do you insist on "deriving" the Lorentz transformation in the first place? Maybe it's because you haven't finished yet, but it's not clear what you're trying to show or even why you see a need to derive anything in the first place. In your introduction you say:
"The goal is to place the theory at the same level of rigour as the electrodynamic laws postulated by James Clerk Maxwell [...]"
At its heart relativity is really quite a simple theory: all it does is assert that all the laws of physics possess a certain symmetry we call "Lorentz covariance". I don't know about you but I find this a perfectly well defined (or at least easily qualifiable) definition of relativity, which as it happens is well supported by modern physics: all the most fundamental laws of nature, including some of the most precisely verified in the history of physics, are Lorentz covariant. Historically, Maxwell's theory was the first discovered to possess this symmetry, which led Einstein and others on to the idea that this might be a feature of all the laws of physics, and not just electrodynamics.
As far as generic "derivations" of the Lorentz transformation are concerned, it's interesting to have things like Einstein's derivation of the Lorentz transformation in terms of more-or-less experimentally verifiable postulates, or derivations in the style of Lévy-Leblond arguing that if we're going to have a relativity principle (i.e. a velocity-dependent symmetry in the style of the Galilean transformation), then the Galilean and Lorentz transformations are the only two "reasonable" possibilities of their "type". They strengthen the (already strong) case for relativity by showing that basically the only reasonable alternative to relativity now is to abandon the principle of relativity altogether.
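The claim that the Galilean and Lorentz transformations are the only two reasonable possibilities of their type is often illustrated with a one-parameter family of boosts. The sketch below uses that standard textbook parametrisation, which is an assumption brought in for illustration and is not taken from this thread: a constant `K` with dimensions of inverse velocity squared, where `K = 0` gives the Galilean boost and `K = 1/c**2` gives the Lorentz boost. All function names are hypothetical. The code checks that two successive boosts are again a boost of the same family.

```python
import math

def boost(v, K):
    """Boost matrix acting on the column (x, t) for relative velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - K * v * v)
    return [[gamma, -gamma * v],
            [-gamma * K * v, gamma]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combined_velocity(v1, v2, K):
    """Velocity of the single boost equivalent to boosting by v1, then v2."""
    return (v1 + v2) / (1.0 + K * v1 * v2)

def max_difference(a, b):
    """Largest entry-wise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

v1, v2 = 0.3, 0.5
for K in (0.0, 1.0):  # units with c = 1, so K = 1 plays the role of 1/c**2
    two_steps = matmul(boost(v2, K), boost(v1, K))
    one_step = boost(combined_velocity(v1, v2, K), K)
    print("K =", K, "mismatch:", max_difference(two_steps, one_step))
```

The mismatch is at rounding level for both values of `K`. For `K = 0` velocities simply add, and for `K = 1` they combine by the relativistic addition rule, so the only freedom left in this family is the single constant `K`, which experiment or electrodynamics then has to fix.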
But in my experience these derivations can get quite ad-hoc and hand-wavy, and don't make for an elegant formulation of relativity. For example the typical argument that translational symmetry implies linearity: what does that mean? Imposing translational symmetry on its own strictly speaking doesn't constrain anything. On its own it just means that if is a symmetry, then the complete symmetry group must also contain all the transformations of the form
. When you look at this more carefully things start to get hairier: intuitively the problem here is that this gets you a symmetry group that's too large. So in this sort of proof that translational invariance implies linearity, there are really additional hidden assumptions that are being applied. I'd have to think more carefully about how to formulate this properly, but intuitively they seem to be related to the idea that a velocity-dependent "boost" should be unique up to rotations and translations.
Reciprocity in my opinion is another problematic one. I think it only makes sense if you're assuming a priori that the symmetry group you're looking for doesn't contain dilations (and if you were, I'd be surprised if it was really necessary). We certainly can't justify it experimentally: it's not like we've ever actually performed an experiment with two observers in two rockets moving at high velocity past one another, testing that each sees the other moving at the same speed. (We can easily justify the removal of uniform dilations experimentally though, with the simple observation that there actually is a hierarchy of distance scales in the universe.)
With that said I'm not a fan of the way relativity is typically taught (especially to laymen), so for that I commend you for starting a thread like this (though if I were to teach relativity I'd approach it differently). As well as rebutting a lot of crank misconceptions in one place it could also give some interested laymen with confused ideas about relativity some insight about what relativity looks like to physicists and how we apply it.
Last edited by przyk; 10-19-11 at 11:18 AM.
-
10-19-11, 11:38 AM #14squishy
Why wouldn't you? In electrodynamics the electric and magnetic fields don't just drop out of the sky. They're defined in terms of the force acting on a test charge. The Lorentz force law $m\vec{a} = q(\vec{E} + \vec{v} \times \vec{B})$ is pretty much the operational definition of the electric and magnetic field vectors. Obviously any transformation in space-time immediately implies a transformation of the velocity and acceleration vectors, which in turn implies a transformation of $\vec{E}$ and $\vec{B}$.
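To make the "operational definition" point concrete, here is a minimal sketch of how the fields could be read off from forces on a test charge, assuming the non-relativistic SI form $\vec{F} = q(\vec{E} + \vec{v} \times \vec{B})$. The helper names (`lorentz_force`, `measure_fields`) and the sample field values are made up for illustration and do not come from the thread.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """Force on a charge q moving with velocity v: F = q (E + v x B)."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

def measure_fields(force_on, q):
    """Recover E and B from force measurements on a test charge q.

    force_on(v) is assumed to return the force on the charge at velocity v.
    """
    # A charge at rest feels only the electric field.
    F0 = force_on((0.0, 0.0, 0.0))
    E = tuple(f / q for f in F0)
    # The velocity-dependent part of the force is q (v x B).
    Fx = force_on((1.0, 0.0, 0.0))  # unit velocity along x: v x B = (0, -Bz, By)
    Fy = force_on((0.0, 1.0, 0.0))  # unit velocity along y: v x B = (Bz, 0, -Bx)
    dx = [(Fx[i] - F0[i]) / q for i in range(3)]
    dy = [(Fy[i] - F0[i]) / q for i in range(3)]
    B = (-dy[2], dx[2], -dx[1])
    return E, B

# Hypothetical field values, used only to exercise the sketch.
E_true = (1.0, 2.0, 3.0)
B_true = (0.5, -0.25, 2.0)
q = 2.0
E_meas, B_meas = measure_fields(lambda v: lorentz_force(q, E_true, v, B_true), q)
print(E_meas, B_meas)
```

Three force measurements (one at rest, two at known velocities) are enough to fix all six field components, which is why a change in how velocities and forces transform constrains how the fields must transform.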
-
10-19-11, 12:05 PM #15
That sounds like an interesting approach. I am aware of and have previously derived and solved problems with the Lienard-Wiechert potentials of a moving charge, so I'd definitely be interested in seeing what can be concluded via this approach, although I hope there's no hand-waving introduced at some point in the argument.
I think that should become more clear once I finally get things wrapped up and actually derive the Lorentz and E-M field transforms (hopefully soon). It's taking a lot longer than I expected partly because I realized I could make major improvements to my original reasoning regarding generalized Galilean transforms, so I scribbled down a whole new set of notes and calculations on paper, and this stuff of course takes a long time to type up. But the ultimate goal is to improve the rigour as compared to how the Lorentz transforms are generally taught/derived, and show that the necessary logic comes from much simpler and more classical examples and laws than what are typically used. My goal here is certainly not to re-invent the wheel, so once the initial work is done, we'll see if someone comes along and points out how it's already been done in the same or a better way in another location.
My goal is to turn this "might be" into a "must be", for all those who accept the classical laws of electrodynamics and don't believe the Earth is a magically preferred physical reference frame.
I'm perfectly happy to see as many different derivations put out there as possible. I merely want to add, especially for the doubters out there, that it also follows directly from laws which have been well established since the mid-1800s or earlier. Many people have trouble accepting the idea of space and time transforming in a non-classical way, but little to no trouble accepting things like Gauss's and Ampere's laws.
I'm perfectly happy to admit that my derivation of this result (linearity) contained some heuristics, some loose definitions and a few skipped steps. However, I am quite certain that with a slightly more precise description of the translations involved, the result can actually be shown to be perfectly rigorous without any further assumptions. In the discussion of reciprocity I plan to make in my continuation of Post #9, I am planning to link to the following paper which also discusses the principle and attempts to justify it: Reciprocity Principle and the Lorentz Transformations (warning: you may be required to access it either from a university connection or else a university VPN account if you want to download it for free). As part of their work, they too derive the linearity condition, but in a more rigorous and elegant fashion than I have employed here. However, with careful mathematical reasoning you can actually show that their approach is entirely equivalent to mine, i.e. one set of assumptions can be directly deduced from the other.
I was actually planning to use similar logic to what they used for spacetime translations in that paper, but I was only planning to use this approach when deriving the most general possible $\vec{E}$ and $\vec{B}$ transforms, where it becomes a bit more necessary as the starting point. In retrospect, I probably should have used the same method both for the spacetime and E-M field transformations, but I'll be linking to it anyhow for those who are interested.
I believe that with my careful definitions of the axes, many questions about possible variations in the distance and time scales/units have already been addressed, and that any further questions on the issue will also be dealt with shortly, in a very simple and convenient manner. As I said, I intend to justify the principle of reciprocity, as I will need it to reduce my transformations down to a single undetermined function, which I will probably choose to be
. This justification will include questions about possible changes in distance and time scaling, as I've already done for all the conclusions I've reached up to this point. Honestly I can't blame anyone who thinks everything up to this point looks boring as sh*t, but by carefully and rigorously setting up the initial foundations, it ultimately simplifies things a great deal once we start dealing with the more interesting results and how to justify them.
When I was a teenager (13-14 years old) and started learning certain details about the theory, such as the Lorentz transformations, I had a great deal of skepticism and honestly thought there must be some kind of mistake or reasonable alternative. What ultimately convinced me of Relativity's correctness was the sheer magnitude of evidence behind it as well as seeing some of its derivations. In retrospect, at a higher knowledge level I can now go back to the derivations which already convinced me in the first place, point out gaps and unjustified assumptions in the logic, and then seek improved derivations working directly from the bottom up with the virtually undisputed laws of classical electrodynamics. Will it end all crankery? No of course not, but it should certainly help reduce the number of redundant arguments about it.
But before we could even bother doing that, we would need to know/deduce how,
and
relate to
,
and
in the first place, wouldn't we?
Last edited by CptBork; 10-19-11 at 12:19 PM.
-
10-19-11, 12:42 PM #16Valued Senior Member
-
10-19-11, 03:45 PM #17squishy
Well that's fairly straightforward: Maxwell's equations are basically invariant under Lorentz transformations and isotropic (in space-time) dilations, so if you accept Maxwell's equations and you don't want there to be a preferred frame, all the laws of physics have to be Lorentz invariant.
"I'm perfectly happy to admit that my derivation of this result (linearity) contained some heuristics, some loose definitions and a few skipped steps. However, I am quite certain that with a slightly more precise description of the translations involved, the result can actually be shown to be perfectly rigorous without any further assumptions. In the discussion of reciprocity I plan to make in my continuation of Post #9, I am planning to link to the following paper which also discusses the principle and attempts to justify it: Reciprocity Principle and the Lorentz Transformations (warning: you may be required to access it either from a university connection or else a university VPN account if you want to download it for free). As part of their work, they too derive the linearity condition, but in a more rigorous and elegant fashion than I have employed here. However, with careful mathematical reasoning you can actually show that their approach is entirely equivalent to mine, i.e. one set of assumptions can be directly deduced from the other."
I'll look at the paper when I get the chance, but depending on exactly what you mean by "imposing" translation symmetry, I seriously doubt what you're claiming is possible. In your second post you say:
"[...] Otherwise, we could detect an inhomogeneity, i.e. a physical difference between different points in the frame [...]"
which I take to mean that you're imposing that there should be no privileged points in space-time. But the only requirement for this is that the symmetry group contains translations as a subgroup, and that alone doesn't imply linearity. A simple counterexample is to take the group of all coordinate diffeomorphisms. It's a group, and among a lot of other junk it contains translations, rotations, and Lorentz as well as Galilean boosts. There is nothing wrong with this logically. It's just that in practice this group is so large that only very trivial theories are going to be symmetrical under arbitrary diffeomorphisms - theories in which either almost nothing or almost everything is a valid solution. The only way you're going to be able to whittle this down is to explicitly assume something about the size of the group you're deriving or postulate that certain types of transformations aren't in it.
A mathematical constraint that gets you linearity is to impose that the full symmetry group has to commute in some sense with the translation group. By this I mean that if for any translation $T_a$ followed by a transformation $\Lambda$ you impose that there must be a translation $T_b$ such that $\Lambda \circ T_a = T_b \circ \Lambda$, then you get linearity. But what you're doing here is imposing that a translation in the rest frame followed by a boost is equivalent to a boost followed by a translation in the old rest frame. In general how would you justify imposing something like that?
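As a rough numerical illustration of this commutation condition (translate, then transform, equals transform, then translate by some fixed amount), the sketch below compares a standard Lorentz boost with a deliberately nonlinear coordinate change. The map `warp` is a hypothetical example of a diffeomorphism chosen for this sketch, and units with c = 1 are assumed.

```python
import math

def boost(event, v, c=1.0):
    """Standard Lorentz boost along x acting on an event (t, x)."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * (t - v * x / c**2), gamma * (x - v * t))

def boost06(event):
    """Boost with v = 0.6 c, fixed for the comparison below."""
    return boost(event, 0.6)

def warp(event):
    """A smooth, invertible, but nonlinear coordinate change (hypothetical)."""
    t, x = event
    return (t, x + 0.1 * x**3)

def translate(event, a):
    """Shift an event by the constant offset a = (dt, dx)."""
    return (event[0] + a[0], event[1] + a[1])

def induced_shift(f, event, a):
    """How far the image of `event` moves when the input is translated by `a`."""
    moved = f(translate(event, a))
    base = f(event)
    return (moved[0] - base[0], moved[1] - base[1])

a = (0.0, 1.0)
events = [(0.0, 0.0), (1.0, 2.0), (-3.0, 5.0)]
boost_shifts = [induced_shift(boost06, e, a) for e in events]
warp_shifts = [induced_shift(warp, e, a) for e in events]
print(boost_shifts)  # the same shift at every event: a translation maps to a translation
print(warp_shifts)   # the shift depends on where you start
```

For the boost, the induced shift is the same wherever you start, so a single translation in the new coordinates does the job. For `warp` it is not, even though `warp` belongs to a perfectly good group (all diffeomorphisms) that contains the translations.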
"I believe that with my careful definitions of the axes, many questions about possible variations in the distance and time scales/units have already been addressed [...]"
Actually that's another hidden assumption right there: you're assuming that there are distance and time scales - i.e. you're effectively assuming the symmetry group you're looking for doesn't contain dilations.
Actually, it would be helpful if you could explicitly list all the assumptions you think you need in one place. You seem to be making them up as you go along, and it's not clear on what basis you're picking them - are they supposed to be "reasonable" or supported experimentally or what?
"But before we could even bother doing that, we would need to know/deduce how [...] relate to [...] in the first place, wouldn't we?"
The acceleration is easy: its transformation is derivable from its definition $\vec{a} = d^2\vec{x}/dt^2$. That's a bit moot though, since including it in my last post was an error: I really should have written $\frac{d\vec{p}}{dt} = q(\vec{E} + \vec{v} \times \vec{B})$. If we go with the modern definition of momentum ($\vec{p} = \partial L / \partial \vec{v}$), then the way it transforms is going to depend on what you think the kinetic term in the Lagrangian should be in order to reproduce the right behaviour. Rotational symmetry is going to impose that momentum has to be parallel to the velocity though. That combined with q just means that you've only got an overall scaling parameter to play around with in the Lorentz force law - a vector equation. So the Lorentz force law is still going to impose quite a bit on the transformation properties of the electric and magnetic fields.
Additionally I'd argue that q should be invariant (or at worst its velocity dependence should be defined as part of the definition of the theory of electrodynamics). Unlike the mass or momentum, the electric charge is a parameter only relevant to a particle's electromagnetic interactions, and is included in the theory just because we see that not all charges are affected to the same degree in the same electromagnetic field. If you let it vary with velocity you're really just fudging the Lorentz force law.
Last edited by przyk; 10-19-11 at 03:55 PM.
-
10-19-11, 05:24 PM #18
But this involves a ton of hand-waving right here. Just because Maxwell's equations might be Lorentz invariant and space might be isotropic, why should that automatically imply that all other laws of physics are invariant in the same way? What can we treat as a Relativistic 4-vector, what must we treat differently? Besides, how do we know that the Lorentz transformations are the only way of preserving Maxwell's equations while conforming to other basic postulates, other than by a method such as I intend to employ? You can make assumptions like this and assume it to be physically reasonable, but it's not mathematically airtight IMO.
But that's the whole point of what I said. Another way of expressing it as follows:, independent of
.
In other words,
From this it immediately follows that, where
is independent of
, and we can apply similar reasoning for translations in
. I didn't assume linearity off the bat- we still have 4 arbitrary additive constants, which I forced to zero by demanding the origins of
and
coincide. As I say, my reasoning here can be shown to be essentially equivalent to what's done in the paper Reciprocity Principle and the Lorentz Transformations, but their approach has the advantage of not needing to assume off the bat that
and
are differentiable functions of
and
.
Actually, I defined a prescription for how distance and time scales should be defined/calibrated inbased on how they're defined/calibrated in
. Specifically I defined the correspondence
, which, as I demonstrated in Posts #6 and #9, when combined with the preservation of orientation, uniquely defines the
-axis and the distance and time scaling in
(remember, time can be related to distance through the velocity
).
They're supposed to be based on physically motivated math postulates, i.e. the preservation of causality and time-ordering in all descriptions of the same reference frame, the uniformity of time and distance intervals relative to the origin (i.e. they must be related by a simple scale factor). You certainly need to make some basic assumptions in order to define what we typically consider a "physically reasonable" mathematical theory.
I agree that there's probably tons of room for major improvement in my presentation, so perhaps in a future draft, or if the mods help me posthumously clean things up once all the dust settles, I could do a better job of laying my postulates out at the start and then referencing them as necessary. Would be cool if I could do what I can do with LaTeX and the hyperref package, so I could refer to equation numbers and then when you click on the number it takes you straight to the relevant equation.
Well let's derive the Lorentz transformations first, so we can deduce how accelerations transform as a secondary consequence.
Ah, but what if we were free to fudge with the Lorentz force law, thereby allowingand
to differ in turn? I've already done the calculations on paper and got the result
for arbitrary
,
(Maxwell's equations can be used to recover charge conservation separately in each frame, so the times don't really matter as long as they're fixed). It was by no means a trivial result to obtain, but things happen to cancel out just right, and it only ends up taking a few lines and a few slightly tricky calculus manipulations to deduce.
Last edited by CptBork; 10-19-11 at 05:34 PM.
-
10-19-11, 08:12 PM #19
Continued from Post #9:
Where we last left off, we had reduced the generalized Galilean transform down to the following:
Now recall how I originally defined thesystem. Just as
sees the spatial origin of
moving with velocity
,
sees the spatial origin of
moving with velocity
, thus setting the alignment of the positive
and
axes. We showed that the point
had a non-trivial component in the
plane. This permitted us to define the
-axis with the corresponding points
by picking the component orthogonal to
, and setting the
axis to be orthogonal to
,
and
while preserving the orientation of the
system. I also showed that this uniquely defined the scaling relationships between any two systems
and
produced by reversals of the
-axis and/or rotations in the
plane in system
. We deduced that the resulting scaling relationships were precisely such as to fix the values
for all such systems.
We have now shown that this same set of points actually corresponds to. Thus we can simplify our means of defining the y-axis:
Choose the positive orientations of theand
axes as before, with the spatial origin of
corresponding to the point
. We see that we can now define the
-axis by the correspondence
and we will produce the exact same
and
axes as before, along with the same coordinate relationships between
and
.
Equivalently, we can show that by defining the-axis through the correspondence
and picking
to preserve the same orientation found in system
, we again achieve the exact same results. This simplification will now allow us to justify the principle and application of the complete reciprocity principle, rather than the common approach of taking the entire thing as a physical postulate.
Starting with systemin a given inertial frame and the appropriate alignment of the velocity
, we have shown how to define a set of coordinates in the corresponding inertial frame
and a unique set of relationships between the two. Suppose we take this coordinate system
as a starting point for constructing another coordinate system in frame
, a second system we shall label as
.
Sinceand
share the same point as their spacetime origin, and the same relation holds between
and
, this means
and
share the same spacetime origin. Both systems
and
see the spatial origin of
moving at velocity
with time increasing for both systems, and so the
-axis coincides with
up to a strictly positive scale factor, as do the axes
and
.
Furthermore, we have the correspondences, and so the
and
axes coincide precisely, as do
and
.
Using identical logic to what I showed in Post #6, we must have the following results:
If we take, then
. Then the fact that
necessarily implies that we set
, from which
.
Thus necessarily, and the coincidence of their positive axes necessitates that
. Additionally,
.
So to conclude, we have. Thus in transforming
and then reversing the transform by
, we recover our original coordinate system, i.e.
. So before concluding this post, I shall summarize the transformation principles we've derived up to this point, along with the result regarding reciprocity.
For those interested in an alternative, more rigorous take on the principle of reciprocity, you might be interested in checking out the following link (requires a direct or VPN university connection to browse for free): Reciprocity Principle and the Lorentz Transformations. I find that much of their logic can be shown to be equivalent to what I've done here, although there are some notable differences in their approach.
To be continued...
Last edited by CptBork; 10-20-11 at 07:47 AM.
-
10-20-11, 12:13 AM #20
Continued from Post #19:
So at last, we're ready to wrap this section up. For convenient notational purposes, it will be best to express our transformations as matrix equations:
From these relationships, we deduce the following matrix equation:
The only non-trivial relationship we can extract from this equation is the following:
The conditionsand
can thus be expressed by the equivalent condition
, and we also have the condition
as before.
Our generalized Galilean transform now reduces to the following:
The Generalized Galilean Transform
Now let's quickly remark on some basic symmetry properties that automatically follow from this generalized transform, as they will be useful in constraining the possible transformations of the electromagnetic fields to be discussed in Section III. The correspondence between coordinate transforms inand
satisfies the following:
As remarked near the beginning of this section, other coordinate descriptions forand
can be uniquely related to the systems we have already constructed, and thus the relationship between any two such corresponding coordinate systems is also uniquely defined, even if one system sees the velocity of the other system's spatial origin not moving along the
-axis.
Now as promised, we shall see how the generalized Galilean transform reduces to either the classical Galilean transform or the Relativistic Lorentz transform given the right simple assumptions.
If we take one of several classical assumptions, such asor that
, then we necessarily conclude that
, and we recover the classical Galilean transform:
The Classical Galilean Transform
On the other hand, if we make either the relativistic assumptionor the assumption
, then we obtain
, and thus we recover the Relativistic Lorentz transform:
The Relativistic Lorentz Transform
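For readers who want something to play with, here is a minimal numerical sketch of the two limiting cases in their standard textbook form for a boost along x. It does not reproduce the generalized transform from this thread. The function names, the sample event, and the choice of units with c = 1 are assumptions made for this sketch.

```python
import math

def galilean(t, x, v):
    """Classical Galilean boost along x: time is absolute."""
    return t, x - v * t

def lorentz(t, x, v, c=1.0):
    """Standard Lorentz boost along x (textbook form, c = 1 by default)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)

def interval(t, x, c=1.0):
    """Spacetime interval (c t)^2 - x^2 of an event measured from the origin."""
    return (c * t) ** 2 - x**2

t, x, v = 2.0, 0.5, 0.6  # hypothetical sample event and relative velocity

# Reciprocity: boosting by v and then by -v recovers the original coordinates.
t1, x1 = lorentz(t, x, v)
t2, x2 = lorentz(t1, x1, -v)
print((t, x), (t2, x2))

# The Lorentz boost preserves the interval; the Galilean boost does not.
print(interval(t, x), interval(t1, x1), interval(*galilean(t, x, v)))

# At small velocities the two transforms nearly agree.
print(lorentz(t, x, 1e-4), galilean(t, x, 1e-4))
```

The reciprocity check mirrors the earlier result that transforming and then reversing the transform returns the original coordinate system. The interval check is one quick way to tell the two cases apart numerically.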
Rather than make any of these assumptions, we shall keep things general and proceed onward to Section III, where we will use the results from this section as well as further reasoning to deduce the most general possible transformations of the electromagnetic fields when transitioning between different inertial frames. In Section IV we will then require the consistency of Maxwell's equations in arbitrary reference frames, thereby showing that the coordinate transforms must inevitably reduce to the Lorentzian case as well as yielding the unique, general transformation rules between $\vec{E}$ and $\vec{B}$ fields in different reference frames (less generalized than the generalizations I will make in Section III, because we will also require that Maxwell's equations be satisfied at this point).
I'll make one last remark here which will be of use to us once we start applying Maxwell's equations. If we defineand
, then the chain rule implies that we may make the following partial derivative substitutions:
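As a generic illustration of this chain-rule step, consider any linear transform along the x-axis with velocity-dependent coefficients. The coefficient names $A$, $B$, $C$, $D$ are placeholders introduced for this sketch and are not the notation used in this thread.

```latex
% Assumed generic linear transform (placeholder coefficients, constant in t and x):
%   x' = A x + B t,   t' = C x + D t,   y' = y,   z' = z
\begin{align}
\frac{\partial}{\partial x} &= \frac{\partial x'}{\partial x}\frac{\partial}{\partial x'}
  + \frac{\partial t'}{\partial x}\frac{\partial}{\partial t'}
  = A\,\frac{\partial}{\partial x'} + C\,\frac{\partial}{\partial t'}, \\
\frac{\partial}{\partial t} &= \frac{\partial x'}{\partial t}\frac{\partial}{\partial x'}
  + \frac{\partial t'}{\partial t}\frac{\partial}{\partial t'}
  = B\,\frac{\partial}{\partial x'} + D\,\frac{\partial}{\partial t'}, \\
\frac{\partial}{\partial y} &= \frac{\partial}{\partial y'}, \qquad
\frac{\partial}{\partial z} = \frac{\partial}{\partial z'}.
\end{align}
% Special case: the standard Lorentz boost has A = D = \gamma, B = -\gamma v, C = -\gamma v / c^2, giving
\begin{align}
\frac{\partial}{\partial x} &= \gamma\left(\frac{\partial}{\partial x'} - \frac{v}{c^2}\frac{\partial}{\partial t'}\right), &
\frac{\partial}{\partial t} &= \gamma\left(\frac{\partial}{\partial t'} - v\,\frac{\partial}{\partial x'}\right).
\end{align}
```

Substitutions of this kind are what let you rewrite Maxwell's equations, which are first-order in these derivatives, in terms of the primed coordinates.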
And now this section is concluded, aside from the matter to be discussed as mentioned in the edit directly below. Thanks everyone for the questions, comments, suggestions, and for reading the thread!
Edit: There actually is one more thing to be done here. Based on a link provided by Przyk, I realized that the composition of velocities implies a certain constraint on the function, and we should explore this constraint and its consequences first before truly proceeding to Section III. This insight wasn't necessary for me when I applied Maxwell's equations and derived the unique result for the Lorentz transformations on paper, but it's nice to see that this final result is already constrained to a certain very familiar form, even before we begin discussing anything whatsoever about electromagnetic fields. I will continue this discussion as an addendum to this section before truly proceeding onwards to Section III.
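To give a feel for what a composition-of-velocities constraint looks like in the familiar case, here is a small sketch that composes two standard Lorentz boosts and reads the resulting velocity off the product. It assumes the textbook boost with c = 1, not the generalized transform and undetermined function discussed above. All names are made up for illustration.

```python
import math

def boost_matrix(v):
    """2x2 standard Lorentz boost acting on the column (t, x), with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def velocity_of(m):
    """Velocity encoded in a boost matrix: the line x' = 0 satisfies x = w t."""
    return -m[1][0] / m[1][1]

u, v = 0.5, 0.5  # hypothetical velocities, as fractions of c
combined = matmul(boost_matrix(u), boost_matrix(v))
w = velocity_of(combined)

# Compare with the relativistic addition formula (u + v) / (1 + u v).
print(w, (u + v) / (1 + u * v))
```

Requiring that two successive boosts along the same axis again form a boost of the same family is the kind of closure condition that restricts the allowed form of the transform before any electromagnetism enters.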
End of Section II (for now)...
Last edited by CptBork; 10-20-11 at 08:00 AM.