# Thread: Maxwell Could Have Done it All By Himself!

1. ## Maxwell Could Have Done it All By Himself!

The Entire Special Theory of Relativity Derived Directly and Exclusively from the Classical Laws of Electrodynamics

Section I: Introduction

To expedite the process of teaching Special Relativity in standard undergraduate classes, certain postulates and hand-waving are typically used in order to provide heuristic derivations of the fundamental results. Although this is convenient for teaching purposes in order to provide a functional and reasonably coherent understanding of the theory and its consequences, many postulates are thereby introduced which can in fact be shown to be redundant and unnecessary. Additionally, the foundational logic provided to these students is further lacking in rigour due to common unjustified assumptions about special cases yielding the rules for general purposes.

Many popular textbooks on the subject are equally guilty of such heuristics, including several advanced textbooks on the subject. Even when more detailed examinations of the theory are invoked, i.e. in the study of electromagnetic field transformations, certain unjustified assumptions are utilized, and the final result only demonstrates that the resulting theory is self-consistent. In other cases, results of extreme importance, including the famous result $E=mc^2$, are deduced using only limiting cases and other questionable assumptions. Even Einstein and his peers are guilty of such reasoning in their papers. Unfortunately, rather than guiding more advanced students to re-examine the logical foundations and justifications of the Special Theory of Relativity, they are instead typically encouraged to accept it as is and move on to studying alternative means of formulating it, and extending it into the realms of gravity and quantum mechanics.

I intend to present here a fully consistent derivation of the Special Theory of Relativity, directly from basic principles of space-time symmetry and the classical laws of electrodynamics. The goal is to place the theory at the same level of rigour as the electrodynamic laws postulated by James Clerk Maxwell as of 1861, and to give a deeper than usual insight into how they must inevitably arise. Often, modern assumptions, examples and experiments are used to demonstrate the relevancy of the theory, and I will attempt to show here that this relevancy follows directly from mathematical facts which were already understood many, many decades earlier. As a bonus, it means cranks arguing against the Special Theory of Relativity will not only have to argue with every single set of postulates that can be used to derive it, but also the very laws of electromagnetism themselves, and all of their proven consequences.

The work here is partially inspired by and extrapolated on various writings and notes I have browsed through over the years, as well as some personal reasoning. I have a hard time believing that anything I'll be doing here hasn't been done by someone else before, at least in private or in a relatively obscure or older publication. However, as I'm struggling to find everything put together in one place and one piece both on the internet and in my personal notes and books, from the bottom up, my plan is to do it here, at least in rough draft. Perhaps one day it could even lead to an actual publication, if there's enough interest.

I don't want to go too far into the boring details of the consequences in the most sophisticated situations, when such situations don't lead to any real insights into the fundamentals. I also don't want to go over too many complicated details in the actual theory of electrodynamics itself other than to summarize what's necessary, because there are any number of excellent and comprehensive sources covering such topics. I only aim to present here the basic essentials from which everything in Special Relativity can be derived, along with the necessary steps in the derivations, and to compare it to how these same essentials are deduced or presented in conventional teaching. I might present a summary or brief outline of certain secondary results, especially if they're needed in order to deduce further consequences, possibly with links to external sources where it's convenient for expediting things or where there might be further details of interest to investigate. I hope to include a few actual diagrams here or there, although for purposes of getting the basic writeup finished, I might try to keep them to a minimum and add more in later. As far as the math and physics level is concerned, there's no point in arguing against electrodynamics unless you have at least a decent undergrad understanding of what it actually says, so I'm aiming the mathematical parts at people who have attained this level at minimum, although I'll try to keep it as simple and concise as I can.

I very much welcome any on-topic comments, suggestions, contributions and other input. I simply find this kind of bottom-up approach to Relativity is really lacking out there, and it's nice to get this all into one place where we can show how things are really inevitable based on results going back 150 years. I aim to present the work I've already been doing to derive all this stuff on paper in a nice, clean, typed-out and publicly published format, and to make corrections and improvements to it while gauging how much interest it actually generates from those in the scientific community and the general public. As I say, if I find there's enough interest out there, and similar material is lacking in the public and private domains, I might eventually be able to produce a publication of my own based on the kinds of discussions I hope to have here.

To begin the actual work, the next section will involve a derivation of the most general possible coordinate transforms, i.e. the "generalized Galilean transform", which relates the coordinates $x,y,z$ and $t$ in an arbitrary reference frame $\mathbf{A}$ to the coordinates $x',y',z'$ and $t'$ in another arbitrary reference frame $\mathbf{A}'$. We will show that with either of two simple classical assumptions, these transforms inevitably reduce to the classical Galilean coordinate transforms, whereas with a single Relativistic assumption, the generalized Galilean transforms reduce to the Relativistic Lorentz transformations instead. However, we will make no such sets of assumptions, instead deriving the Lorentz transformations directly from the basic principles of space-time symmetries and the laws of electrodynamics in vacuum.

2. Section II: The Generalized Galilean Transform

In this section, I will use basic symmetry properties of space and time, as well as a careful set of coordinate definitions, in order to define the most general possible coordinate transformations that one could include in a physical theory. I'll show how it can be reduced to either the Lorentz or Galilean transformations given the right simple assumptions, but will then proceed to the next part without actually committing to any of these assumptions. I will be using some convenient choices of coordinates (with appropriate justifications) in order to simplify the math, so hopefully it doesn't get too ugly. As I mentioned in the previous section, at some point later I might want to supplement some of these arguments with some simple illustrative diagrams (any external help with this would absolutely be tremendously appreciated, because at my ability level, decent and precise graphical work of the mechanics type can be fairly time consuming). Anyhow, let's begin.

We start by considering an arbitrary inertial frame $\mathbf{A}$, which for now we shall call the "rest frame", possessing a chosen orthogonal coordinate system with coordinate vectors $\left(t,x,y,z\right)$, and the origin vector $\left(t=0,x=0,y=0,z=0\right)$. Similarly, consider another arbitrary inertial frame $\mathbf{A}'$, which for now we shall refer to as the "moving frame". For $\mathbf{A}'$ we also have a chosen orthogonal coordinate system with the origin $\left(t'=0,x'=0,y'=0,z'=0\right)$. Every point $\left(t,\vec{x}\right)$ in $\mathbf{A}$ must correspond to a unique point $\left(t',\vec{x'}\right)$ in $\mathbf{A}'$, and vice-versa. Thus there must be some invertible relationship $t'=t'\left(t,\vec{x}\right)$, $\vec{x'}=\vec{x'}\left(t,\vec{x}\right)$.

Suppose in the $\mathbf{A}$ frame we made the arbitrary point translation $\left(t,\vec{x}\right)\rightarrow\left(t+\mathrm{d}t,\vec{x}+\vec{\mathrm{d}x}\right)$. Then we expect the corresponding alteration $\left(t',\vec{x'}\right)\rightarrow\left(t'+\mathrm{d}t',\vec{x'}+\vec{\mathrm{d}x'}\right)$ to be independent of the position and time $\left(t,x,y,z\right)$. Otherwise, we could detect an inhomogeneity, i.e. a physical difference between different points in the $\mathbf{A}$ frame, such that the same translation for different points in $\mathbf{A}$ leads to different translations for the corresponding points in $\mathbf{A}'$, depending on where the points in $\mathbf{A}$ are located.

Suppose we label the points in $\mathbf{A}$ as $\left(x_0=t,x_1=x,x_2=y,x_3=z\right)$, and we label corresponding points in $\mathbf{A}'$ as $\left(x_0'=t',x_1'=x',x_2'=y',x_3'=z'\right)$. Then the homogeneity of space and time can be represented by the mathematical condition $\frac{\partial x'_i}{\partial x_j}=a_{ij}$, where $a_{ij}$ is some real-valued undetermined function which is independent of $\left(t,x,y,z\right)$.
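
As a brief aside on why this condition forces a linear relationship (a sketch only, using the index labels $x_j$ and writing the integration constants as $\alpha_i$): since each $a_{ij}$ is independent of position and time, all second derivatives vanish and the first derivatives can be integrated directly.

\begin{align} \frac{\partial x'_i}{\partial x_j}&=a_{ij} \quad\Longrightarrow\quad \frac{\partial^2 x'_i}{\partial x_j\,\partial x_k}=0\\
x'_i&=\sum_{j=0}^{3}a_{ij}x_j+\alpha_i \end{align}

Each $\alpha_i$ is simply the value of $x'_i$ at the point $\left(t=0,x=0,y=0,z=0\right)$ of $\mathbf{A}$.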

Thus we may express the following general conditions:

\begin{align} t'&=a_{00}t+a_{01}x+a_{02}y+a_{03}z+\alpha_0\\
x'&=a_{10}t+a_{11}x+a_{12}y+a_{13}z+\alpha_1\\
y'&=a_{20}t+a_{21}x+a_{22}y+a_{23}z+\alpha_2\\
z'&=a_{30}t+a_{31}x+a_{32}y+a_{33}z+\alpha_3 \end{align}

Here the $\alpha_i$'s are also functions independent of $\left(t,\vec{x}\right)$. We are free to define a new coordinate system by making the translations $t''=t'-\alpha_0$, $x''=x'-\alpha_1$, $y''=y'-\alpha_2$, $z''=z'-\alpha_3$, and then relabel the coordinates by dropping a prime (i.e. $x_i''\rightarrow x_i'$). We are then left with the following transformation rules for relating the coordinates of a point $\left(t,\vec{x}\right)$ to the coordinates $\left(t',\vec{x'}\right)$ for the same point:

\begin{align} t'&=a_{00}t+a_{01}x+a_{02}y+a_{03}z\\
x'&=a_{10}t+a_{11}x+a_{12}y+a_{13}z\\
y'&=a_{20}t+a_{21}x+a_{22}y+a_{23}z\\
z'&=a_{30}t+a_{31}x+a_{32}y+a_{33}z \end{align}

or, in Einstein summation notation, $x'_i=a_{ij}x_j$. We have specifically performed a translation in space and time in the $\mathbf{A}'$ frame (i.e. a translation in how the coordinate axes are defined) so that the origin $\left(0,0,0,0\right)$ in $\mathbf{A}$ corresponds to the same point as the origin $\left(0,0,0,0\right)$ in the $\mathbf{A}'$ frame. There's much more to be done before this section is finished, but right now I need to take a quick break.

To be continued...

3. Continued from Post #2:

Now that we've found the most general coordinate transforms in which the same point defines the origin for each system, we need to start relating the two in some fashion. The first principle we shall invoke is the most basic possible statement of reciprocity: if the spatial origin $\left(t,0,0,0\right)$ of system $\mathbf{A}$ is moving with speed $v'=\left|\left|\frac{\mathrm{d}\vec{x'}}{\mathrm{d}t'}\right|\right|$ in system $\mathbf{A}'$, then the spatial origin $\left(t',0,0,0\right)$ of system $\mathbf{A}'$ is moving with speed $v=\left|\left|\frac{\mathrm{d}\vec{x}}{\mathrm{d}t}\right|\right|=v'$ in system $\mathbf{A}$.

So now we are ready to define our $x'$ and $t'$ axis conventions. The coordinates assigned to given events/points in each system can vary depending on the chosen alignment, orientation and origin, but their relations to any other coordinate systems produced by spatial rotations, spatial reflections and translations are already uniquely defined, and do not alter the time coordinate (aside from uniform shifts in time, i.e. different start times on the clocks). We are thus free to choose a coordinate system of convenience in $\mathbf{A}$, define a corresponding coordinate system for $\mathbf{A}'$, and a set of mathematical transformation rules between the two. We can then uniquely determine how any other choices of spatial alignments, spatial orientations and translations transform between $\mathbf{A}$ and $\mathbf{A}'$, and all such transformations will preserve the time relationship between the two frames, up to a uniform translation in time equivalent to changing the start times on their clocks.

We choose to align the coordinate axes of $\mathbf{A}$ so that in this system, the velocity of the $\mathbf{A}'$ origin is set to $\vec{v}=\left(v_x,0,0\right)$. Reciprocally, we choose to align the coordinate axes of $\mathbf{A}'$ such that it sees the origin of $\mathbf{A}$ moving with velocity $\vec{v'}=\left(-v_x,0,0\right)$. Thus the point $\left(t,\vec{x}\right)=\left(t,0,0,0\right)$ corresponds to the point $\left(t',\vec{x'}\right)=\left(a_{00}t,-v_xt',0,0\right)=\left(a_{00}t,-a_{00}v_xt,0,0\right)$. From this result we can immediately set $a_{20}=a_{30}=0$. Also, since $x=y=z=0\Rightarrow x'=a_{10}t$, we may conclude that $a_{10}=-a_{00}v_x$.
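
Spelled out as a sketch, the substitution behind these conclusions is just the general linear transformation evaluated at $x=y=z=0$:

\begin{align} t'&=a_{00}t, & x'&=a_{10}t, & y'&=a_{20}t, & z'&=a_{30}t \end{align}

Demanding $y'=z'=0$ for every $t$ gives $a_{20}=a_{30}=0$, and demanding $x'=-v_xt'=-v_xa_{00}t$ for every $t$ gives $a_{10}=-a_{00}v_x$.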

To define the $t'$ axis, we must consider causality. If events in $\mathbf{A}$ occur at $\left(t=t_1,\vec{x}=0\right)$ and $\left(t=t_2,\vec{x}=0\right)$, with $t_2>t_1$, then we say that the event occurring at $t=t_1$ could have had a causal effect on the event occurring at the same position but the later time $t=t_2$, but there could be no causal effect in the opposite order. These two events have the corresponding times $t_1'=a_{00}t_1$ and $t_2'=a_{00}t_2$ in $\mathbf{A}'$. In order that system $\mathbf{A}'$ has the same causal structure as system $\mathbf{A}$, we must therefore require that $a_{00}>0$. So events occurring at increasing times at the origin of $\mathbf{A}$ correspond to events occurring at increasing times and uniform motion along the $x'$ axis in the $\mathbf{A}'$ system. The positive direction of the $x'$ axis is chosen to be opposite to this motion if $v_x>0$, and in the same direction as this motion if $v_x<0$. Thus the $t'$ and $x'$ axes in the $\mathbf{A}'$ system are now unambiguously defined independent of the orientation we choose for the $y$ and $z$ axes in $\mathbf{A}$. Given a particular choice for these $y$ and $z$ axes, it now remains to define a suitable set of corresponding $y'$ and $z'$ axes orthogonal to $x'$ and $t'$.

To be continued...

4. Continued from Post #3:

So now it would be a good time to put together what we've derived so far and do a little cleanup. As we apply more symmetries and conditions as well as making additional use of some of our earlier reasoning, things will get even cleaner still. Here are the generalized coordinate transforms we've derived up to this point:

\begin{align} t'&=a_{00}t+a_{01}x+a_{02}y+a_{03}z\\
x'&=-v_xa_{00}t+a_{11}x+a_{12}y+a_{13}z\\
y'&=a_{21}x+a_{22}y+a_{23}z\\
z'&=a_{31}x+a_{32}y+a_{33}z \end{align}

$a_{00}>0$

I would like to make a quick note on a mistaken statement I made in Post #3. There I showed that we could define a unique choice of the $x'$ and $t'$ axes, but in fact we have only defined their uniqueness up to arbitrary positive scale factors/choices of units. Our scheme will ultimately guarantee uniform scaling as we progress further, but we'll keep things general for now.

I'd also like to make a note on the reciprocity of velocities between systems $\mathbf{A}$ and $\mathbf{A}'$. In Post #3 I argued that we can define reciprocal velocities for the relative motions of the two origins, i.e. the origin of $\mathbf{A}$ is seen moving at speed $v'=\left|\left|\frac{\mathrm{d}\vec{x'}}{\mathrm{d}t'}\right|\right|$ from system $\mathbf{A}'$, and the origin of $\mathbf{A}'$ is seen from $\mathbf{A}$ to be moving with speed $v=\left|\left|\frac{\mathrm{d}\vec{x}}{\mathrm{d}t}\right|\right|=v'$. If we didn't select the same speed for each system, then the equivalency of the laws of physics in all inertial reference frames would require that we change the fundamental units and constants we work with in either $\mathbf{A}$ or $\mathbf{A}'$. Since we'd like to maintain the same set of units and physical constants in each system while preserving the same laws of physics, this requires that we set $v'=v$ to express the physical equivalence of these frames.

Up to here we have used the information that $x=y=z=0,\; t=t\quad\Longrightarrow\quad y'=z'=0,\; x'=-v_xt'$. Let's now also use the information that $x'=y'=z'=0,\; t'=t'\quad\Longrightarrow\quad y=z=0,\; x=v_xt$. Then we can draw the following conclusions:

\begin{align} 0&=-v_xa_{00}t+a_{11}v_xt & \Longrightarrow a_{11}&=a_{00}\\
0&=a_{21}v_xt & \Longrightarrow a_{21}&=0\\
0&=a_{31}v_xt & \Longrightarrow a_{31}&=0 \end{align}

These conclusions clearly follow because they hold for arbitrary values of $t$, and for now we are assuming that $v_x\neq 0$. We will show in the end that the same conclusions follow for $v_x=0$.

Our generalized coordinate transforms have now become the following:

\begin{align} t'&=a_{00}t+a_{01}x+a_{02}y+a_{03}z\\
x'&=a_{00}\left(x-v_xt\right)+a_{12}y+a_{13}z\\
y'&=a_{22}y+a_{23}z\\
z'&=a_{32}y+a_{33}z \end{align}

$a_{00}>0$

There, looking much nicer already ain't it?

To be continued...

5. Continued from Post #4:

Starting in the $\mathbf{A}$ system, consider the points $\left(t=0,x=0,y,z=0\right)$ defining the $y$ axis. These points correspond to $\left(t'=a_{02}y,x'=a_{12}y,y'=a_{22}y,z'=a_{32}y \right)$, which defines a point moving at a uniform rate along a line in $\left(t',\vec{x'}\right)$ space. In order for the coordinate transformation to be invertible, we must pick either $a_{22}\neq 0$ or $a_{32}\neq 0$. So this line passes through the origin and has a nontrivial component in the $y'-z'$ plane. We may thus define the positive direction of the $y'$ axis by choosing $a_{22}>0$ and $a_{32}=0$. Then the positive direction of $z'$ is uniquely set by demanding that the $\mathbf{A}'$ system have the same orientation as the $\mathbf{A}$ system (i.e. $\hat{x}\times\hat{y}=\pm\hat{z}\Longrightarrow \hat{x}'\times\hat{y}'=\pm\hat{z}'$). By the requirement of invertibility, since we have chosen $a_{32}=0$, we must also necessarily choose $a_{33}\neq 0$. Since we're free to choose our spatial and time scaling, we are free to choose $a_{22}=1$, and we will see soon that this choice uniquely defines our space and time scales in $\mathbf{A}'$.

Before we discuss the effect of coordinate rotations in $\mathbf{A}$ on the scaling in $\mathbf{A}'$, we must first consider what happens to the functions $a_{ij}$. We already defined these functions and showed them to be independent of $\left(t,\vec{x}\right)$. So what else can they depend on? The only remaining physical variable that we have to work with in general is the velocity of the origin in $\mathbf{A}'$ as seen from $\mathbf{A}$, $\vec{v}=\left(v_x,0,0\right)$. Since we want to start with spacetime as a backdrop for all events to occur therein, we will disregard the possibility of introducing any further physical variables relating the coordinate relations between two inertial frames. Thus we may demand $a_{ij}=a_{ij}\left(v_x\right)$, and that $a_{ij}$ be a continuous function of $v_x$, so that arbitrarily small changes in inertial velocity will lead to arbitrarily small changes in the coordinate system. We also demand that if $v_x=0$, then the coordinate systems $\mathbf{A}$ and $\mathbf{A}'$ completely coincide.

Thus we define $\lim_{v_x\to 0}a_{ij}\left(v_x\right)=\delta_{ij}$, where $\delta_{ij}$ is the usual Kronecker Delta:

$\delta_{ij}=\begin{cases}1, & i=j\\ 0, & i\neq j \end{cases}$

So let's summarize what we've got now:

\begin{align} t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x+a_{02}\left(v_x\right)y+a_{03}\left(v_x\right)z \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right)+a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z\\
y'&=y+a_{23}\left(v_x\right)z\\
z'&=a_{33}\left(v_x\right)z \end{align}

$a_{00}\left(v_x\right)>0,\quad a_{33}\left(v_x\right)\neq 0, \quad\lim_{v_x\to 0}a_{ij}\left(v_x\right)=\delta_{ij}$
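
As a rough consistency check on the invertibility requirement (a sketch; the matrix symbol $M$ is a label introduced here for illustration and is not part of the derivation above), the summary can be written as $x'_i=M_{ij}x_j$ with

$M=\begin{pmatrix} a_{00} & a_{01} & a_{02} & a_{03}\\ -v_xa_{00} & a_{00} & a_{12} & a_{13}\\ 0 & 0 & 1 & a_{23}\\ 0 & 0 & 0 & a_{33} \end{pmatrix},\qquad \det M=a_{00}\left(a_{00}+v_xa_{01}\right)a_{33}$

Because $M$ is block upper-triangular, its determinant is the product of the determinants of the two $2\times 2$ diagonal blocks. With $a_{00}>0$ and $a_{33}\neq 0$ already in hand, invertibility additionally needs $a_{00}+v_xa_{01}\neq 0$, which is worth keeping in mind as the remaining coefficients get pinned down.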

Now what are the possible relationships between two different coordinate descriptions of a system $\mathbf{B}$, given that both descriptions are defined with the same point as their spacetime origin? Suppose we have two such descriptions, $B_1$ and $B_2$ with coordinate labels $\left(t_1,x_1,y_1,z_1\right)$ and $\left(t_2,x_2,y_2,z_2\right)$ respectively.

In order to preserve causality/time-ordering and the uniformity of equivalent intervals in both descriptions, we require that $t_1=\alpha t_2$ for some real-valued constant $\alpha>0$. Similarly, to preserve the uniformity of equivalent spatial intervals, we require that $x_1^2+y_1^2+z_1^2=\beta^2\left(x_2^2+y_2^2+z_2^2\right)$, with $\beta\neq 0$ also a real-valued constant. So we may ask, if we have a system $\mathbf{A}$ with coordinate descriptions $\mathbf{A_1}$ and $\mathbf{A_2}$ related by a spatial rotation which preserves the distance and time scales ($t_1=t_2$), along with the condition $x_2=\pm x_1,\quad v_{x_2}=\pm v_{x_1}$, and thus $y_1^2+z_1^2=y_2^2+z_2^2$, how does our scheme affect the scaling and orientation of the resulting coordinate descriptions $\mathbf{A_1}'$ and $\mathbf{A_2}'$ of the inertial frame $\mathbf{A}'$? I will address this question in the next post, but right now it's a good time to take another break.

To be continued...

6. Continued from Post #5:

Continuing with our questions about space and time scaling near the conclusion of Post #5, let's investigate what happens if we define two coordinate systems $\mathbf{A_1}$ and $\mathbf{A_2}$ for the inertial frame $\mathbf{A}$, both sharing the same spacetime origin and observing the spatial origin of frame $\mathbf{A}'$ moving with velocity $\vec{v}=\left(\pm v_x,0,0\right)$ (we allow for reversals of the $x$-axis). Using our scheme, we define two corresponding coordinate systems $\mathbf{A_1}'$ and $\mathbf{A_2}'$ for frame $\mathbf{A}'$, also sharing the same spacetime origin.

From our previous discussion, we know that the systems $\mathbf{A_1}'$ and $\mathbf{A_2}'$ are related in the following way:

$t_2'=\alpha t_1', \quad \alpha>0$

Then in frame $\mathbf{A}'$, the point $x_1'=\pm v_xt_1',\quad y_1'=z_1'=0$ corresponds to the point $x_2'=\pm v_xt_2'=\pm\alpha v_xt_1',\quad y_2'=z_2'=0$. Thus for this point we have the relationship $x_2'^2=\alpha^2x_1'^2$. But we know that for this same point, the relationship $x_2'^2=\beta^2x_1'^2$ must also hold, which thus requires that we set $\beta^2=\alpha^2>0$.

So to summarize: If coordinate systems $\mathbf{A_1}$ and $\mathbf{A_2}$ for frame $\mathbf{A}$ share the same spacetime origin and observe the spatial origin of frame $\mathbf{A}'$ moving with velocity $\vec{v}=\left(\pm v_x,0,0\right)$, then the following relationships must hold in the corresponding systems $\mathbf{A_1}'$ and $\mathbf{A_2}'$ defined by our scheme:

\begin{align} t_2'&=\alpha t_1', \quad \alpha>0\\
x_2'^2+y_2'^2+z_2'^2&=\alpha^2\left(x_1'^2+y_1'^2+z_1'^2\right) \end{align}

Now we need to calculate the values of $\alpha$ that will correspond to the various rotations we intend to perform in the $\mathbf{A}$ frame. We will see that $\alpha=1$ in each of the cases under consideration. To simplify matters as much as possible, we should first consider transformations in $\mathbf{A}$ which send $t\rightarrow t,\ x\rightarrow x,\ v_x\rightarrow v_x,\ y^2+z^2\rightarrow y^2+z^2$.

So consider two systems $\mathbf{A_1}$ and $\mathbf{A_2}$ related by such a transformation, i.e.:
$\left(t_1,x_1,y_1,z_1\right)\longrightarrow\left(t_2=t_1,x_2=x_1,y_2,z_2\right),\quad y_1^2+z_1^2=y_2^2+z_2^2$

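Consider the point $\left(t_1=0,x_1,y_1=0,z_1=0\right)\longrightarrow\left(t_2=t_1=0,x_2=x_1,y_2=0,z_2=0\right)$ in inertial frame $\mathbf{A}$. The corresponding time relationship in $\mathbf{A}'$ is $t_1'=a_{01}\left(v_x\right)x_1\longrightarrow t_2'=a_{01}\left(v_x\right)x_1=t_1'$. But since we already know that $t_2'=\alpha t_1'$, we have the direct implication that $\alpha=1$ whenever $t_1'\neq 0$. If $a_{01}\left(v_x\right)$ happens to vanish, the same conclusion follows from the point $\left(t_1\neq 0,x_1=0,y_1=0,z_1=0\right)$, for which $t_1'=t_2'=a_{00}\left(v_x\right)t_1\neq 0$ because $a_{00}\left(v_x\right)>0$.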

So if we have two coordinate systems for frame $\mathbf{A}$ which preserve the $x$ and $t$-axes and the spatial distance $y^2+z^2$ orthogonal to the $x$-axis, then the corresponding coordinate systems in frame $\mathbf{A}'$, as defined by our scheme, must share the following relationship:

\begin{align} t_1'&=t_2'\\
x_1'^2+y_1'^2+z_1'^2&=x_2'^2+y_2'^2+z_2'^2 \end{align}

Now we can start doing some rotations and cleaning things up!

Consider coordinate system $\mathbf{A_2}$ derived from system $\mathbf{A_1}$ by the spatial rotation $t\to t,\ x\to x,\ v_x\to v_x,\ y\to -y,\ z\to -z$. Then the corresponding relationships we've just derived show that

\begin{align} t_1'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right)x+a_{02}\left(v_x\right)y+a_{03}\left(v_x\right)z\\
&=t_2'=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right)x-a_{02}\left(v_x\right)y-a_{03}\left(v_x\right)z \end{align}

Since this relationship holds for arbitrary values of $x$, $y$, $z$ and $t$, we have demonstrated that $a_{02}\left(v_x\right)=a_{03}\left(v_x\right)=0$.

To summarize what we have so far:

\begin{align} t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right)+a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z\\
y'&=y+a_{23}\left(v_x\right)z\\
z'&=a_{33}\left(v_x\right)z \end{align}

In the next post, I will demonstrate from this same rotation that we must also require $a_{12}\left(v_x\right)=a_{13}\left(v_x\right)=0$.

To be continued...

7. Continued from Post #6:

Continuing from where we left off, again consider the following rotation in $\mathbf{A}$: $t\to t,\ x\to x,\ v_x\to v_x,\ y\to -y,\ z\to -z$
Then the corresponding relationships in $\mathbf{A}'$ are:

\begin{align} x_1'&=a_{00}\left(v_x\right)\left(x-v_xt\right)+a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z & x_2'&=a_{00}\left(v_x\right)\left(x-v_xt\right)-a_{12}\left(v_x\right)y-a_{13}\left(v_x\right)z \\
y_1'&=y+a_{23}\left(v_x\right)z & y_2'&=-y-a_{23}\left(v_x\right)z=-y_1'\\
z_1'&=a_{33}\left(v_x\right)z & z_2'&=-a_{33}\left(v_x\right)z=-z_1'
\end{align}

Thus we have $y_1'^2=y_2'^2$ and $z_1'^2=z_2'^2$, and so the result $x_1'^2+y_1'^2+z_1'^2=x_2'^2+y_2'^2+z_2'^2$ which was demonstrated in Post #6 reduces to the requirement that $x_1'^2=x_2'^2$. So for arbitrary real-valued $t$, $x$, $y$ and $z$, we have the following condition:

$\left[a_{00}\left(v_x\right)\left(x-v_xt\right)+a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z\right]^2=\left[a_{00}\left(v_x\right)\left(x-v_xt\right)-a_{12}\left(v_x\right)y-a_{13}\left(v_x\right)z\right]^2$

Since we have already shown that $a_{00}\left(v_x\right)>0$, a small bit of mathematical manipulation demonstrates that $a_{12}\left(v_x\right)=a_{13}\left(v_x\right)=0$ as I earlier claimed.
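
For anyone who wants the manipulation spelled out, here is a sketch using the shorthand $P=a_{00}\left(v_x\right)\left(x-v_xt\right)$ and $Q=a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z$ (labels introduced here purely for illustration):

\begin{align} \left(P+Q\right)^2-\left(P-Q\right)^2&=4PQ=0\\
4a_{00}\left(v_x\right)\left(x-v_xt\right)\left[a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z\right]&=0 \end{align}

Choosing any $x$ and $t$ with $x-v_xt\neq 0$ makes the prefactor nonzero, so $a_{12}\left(v_x\right)y+a_{13}\left(v_x\right)z=0$ for all $y$ and $z$. Taking $\left(y,z\right)=\left(1,0\right)$ and then $\left(y,z\right)=\left(0,1\right)$ gives $a_{12}\left(v_x\right)=0$ and $a_{13}\left(v_x\right)=0$ respectively.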

Our transformations are starting to look very simple indeed:

\begin{align}
t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right) \\
y'&=y+a_{23}\left(v_x\right)z \\
z'&=a_{33}\left(v_x\right)z \end{align}

Next consider the following $90^\circ$ rotation in $\mathbf{A}$: $x\to x,\ t\to t,\ v_x\to v_x,\ y\to z,\ z\to -y$
Then the corresponding relationship in the defined coordinates of $\mathbf{A}'$ is:

\begin{align} t_1'&=t_2' && \\
x_1'&=x_2' && \\
y_1'&=y+a_{23}\left(v_x\right)z & y_2'&=z-a_{23}\left(v_x\right)y \\
z_1'&=a_{33}\left(v_x\right)z & z_2'&=-a_{33}\left(v_x\right)y \end{align}

Then the condition $x_1'^2+y_1'^2+z_1'^2=x_2'^2+y_2'^2+z_2'^2$ reduces to the condition that $y_1'^2+z_1'^2=y_2'^2+z_2'^2$. We thus have the following relationship for arbitrary $y$ and $z$:

$\left[y+a_{23}\left(v_x\right)z\right]^2+\left[a_{33}\left(v_x\right)z\right]^2=\left[z-a_{23}\left(v_x\right)y\right]^2+\left[a_{33}\left(v_x\right)y\right]^2$

Taking either $z=0$ or $y=0$ immediately yields the condition $a_{23}\left(v_x\right)^2+a_{33}\left(v_x\right)^2= 1$, which will help to simplify things.
Expanding both sides of the equation $\left[y+a_{23}\left(v_x\right)z\right]^2+\left[a_{33}\left(v_x\right)z\right]^2=\left[z-a_{23}\left(v_x\right)y\right]^2+\left[a_{33}\left(v_x\right)y\right]^2$ and applying this condition yields the following relation for arbitrary $y$ and $z$:

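$4a_{23}\left(v_x\right)yz=0$

To see this, note that with $a_{23}\left(v_x\right)^2+a_{33}\left(v_x\right)^2=1$ the left-hand side expands to $y^2+2a_{23}\left(v_x\right)yz+z^2$ and the right-hand side to $z^2-2a_{23}\left(v_x\right)yz+y^2$. Taking $y$ and $z$ both nonzero immediately shows that $a_{23}\left(v_x\right)=0$. Then $a_{23}\left(v_x\right)^2+a_{33}\left(v_x\right)^2=1\Longrightarrow a_{33}\left(v_x\right)^2=1$.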

So we must choose either $a_{33}\left(v_x\right)=1$ or $a_{33}\left(v_x\right)=-1$. The condition $\lim_{v_x\to 0}a_{ij}\left(v_x\right)=\delta_{ij}$ requires that $\lim_{v_x\to 0}a_{33}\left(v_x\right)=1$. The requirement that $a_{33}\left(v_x\right)$ be a continuous function of $v_x$ thus forces us to choose $a_{33}\left(v_x\right)=1$ for all real values of $v_x$.

To summarize, we now have the following:

\begin{align}
t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right) \\
y'&=y \\
z'&=z \end{align}
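
For orientation, here is how the two familiar transforms fit this reduced form. This is only an illustrative sketch: nothing derived so far fixes $a_{00}$ or $a_{01}$, and the constant $c$ (the speed of light in vacuum) and the symbol $\gamma$ are assumptions brought in from the standard textbook results, not quantities introduced in this derivation yet.

\begin{align} \text{Galilean:}\quad a_{00}\left(v_x\right)&=1, & a_{01}\left(v_x\right)&=0\\
\text{Lorentz:}\quad a_{00}\left(v_x\right)&=\gamma=\frac{1}{\sqrt{1-v_x^2/c^2}}, & a_{01}\left(v_x\right)&=-\frac{\gamma v_x}{c^2} \end{align}

Both choices satisfy $a_{00}\left(v_x\right)>0$ and $\lim_{v_x\to 0}a_{ij}\left(v_x\right)=\delta_{ij}$, so neither is ruled out by anything used up to this point.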

Having simplified things up to this point, we are now ready to consider rotations in $\mathbf{A}$ which send $x\to -x$ and $v_x\to -v_x$ along with their effect on the resulting coordinate systems defined in $\mathbf{A}'$, which will ultimately permit us to justify the principle of reciprocity and thereby reduce everything down to a single undetermined function of $v_x$.

To be continued...

8. Originally Posted by CptBork
I intend to present here a fully consistent derivation of the Special Theory of Relativity, directly from basic principles of space-time symmetry and the classical laws of electrodynamics..
Errr, isn't this essentially what Lorentz did over a century ago, to get the transform in question named after him? I'm sure I was exposed to exactly this line of derivation back in undergrad somewhere...

Originally Posted by CptBork
To begin the actual work, the next section will involve a derivation of the most general possible coordinate transforms, i.e. the "generalized Galilean transform", which relates the coordinates $x,y,z$ and $t$ in an arbitrary reference frame $\mathbf{A}$ to the coordinates $x',y',z'$ and $t'$ in another arbitrary reference frame $\mathbf{A}'$. We will show that with either of two simple classical assumptions, these transforms inevitably reduce to the classical Galilean coordinate transforms, whereas with a single Relativistic assumption, the generalized Galilean transforms reduce to the Relativistic Lorentz transformations instead. However, we will make no such sets of assumptions, instead deriving the Lorentz transformations directly from the basic principles of space-time symmetries and the laws of electrodynamics in vacuum.
Isn't this more-or-less the same approach taken by Einstein in his book "Relativity"?

9. Continued from Post #7:

Before I proceed further, I'd like to make a few remarks about how my deductions up to this point differ from what I've experienced to be the standard conventional teachings. Firstly, when either deriving the classical Galilean or Relativistic Lorentz transforms, it's often taken for granted that the relationship between the coordinate systems consists of linear combinations. In my derivation, among a few others out there, I've taken care to justify this assumption by deriving it from the homogeneity of space and time, and setting the origins of the two coordinate systems to the same point.

Secondly, in deriving either set of transforms, it's generally assumed that $\left(t,x\right)\rightarrow\left(t',x'\right)$ independently of $y$ and $z$. Likewise, it's also generally assumed that $\left(y,z\right)\rightarrow\left(y',z'\right)$ independently of $t$ and $x$. The standard reasoning is that, on a physical basis, space and time shouldn't have preferred directions for certain things, i.e. if $\mathbf{A}'$ has its spatial origin moving along the $x$-axis of $\mathbf{A}$, then changing the orientation of the $y$ and $z$ axes shouldn't change the way in which the $t'$ and $x'$ axes transform. Furthermore, it's also generally assumed that if we pick $y'=y$, then we're also free to pick $z'=z$, and that the resulting axes will still be orthogonal as well as preserving the scaling relationship between different descriptions of the same frame with the same origin.

I have taken care to attempt to justify this second set of assumptions from the following:

1. Basic postulates regarding the scaling relationships between the times and distances of different coordinate descriptions with the same spacetime origin in the same inertial frame
2. The assumption that space is not only homogeneous, but also isotropic. Thus there is no preferred rotational orientation and so, after a careful and unambiguous definition of the $t',\ x',\ y'$ and $z'$ axes based on the $t,\ x,\ y$ and $z$ axes, with the motion of $\mathbf{A}'$ set along the $+x$-axis with speed $v_x$, the same coordinate relationships should hold with the same functions $a_{ij}=a_{ij}\left(v_x\right)$, regardless of how the $y$ and $z$ axes are oriented orthogonal to $x$ and $t$.

Now let's proceed as promised. Suppose we have the two coordinate systems for frame $\mathbf{A}$, labeled $\mathbf{A_1}$ and $\mathbf{A_2}$ with the relationship $t_1\to t_2=t_1,\ x_1\to x_2=-x_1,\ v_{x_1}=v_x\to v_{x_2}=-v_x,\quad y_1^2+z_1^2\to y_2^2+z_2^2$.

In Post #6 I showed that the following relationships must hold in the corresponding systems $\mathbf{A_1}'$ and $\mathbf{A_2}'$:

$t_2'=\alpha t_1',\quad \alpha>0$
$x_2'^2+y_2'^2+z_2'^2=\alpha^2\left(x_1'^2+y_1'^2+z_1'^2\right)$

In terms of $\mathbf{A_1}$ and $\mathbf{A_2}$, we also have the following relationships:

\begin{align} t_1'&=a_{00}\left(v_x \right)t+a_{01}\left(v_x\right)x & t_2'&=a_{00}\left(-v_x\right)t-a_{01}\left(-v_x\right)x \\
x_1'&=a_{00}\left(v_x\right)\left(x-v_xt\right) & x_2'&=-a_{00}\left(-v_x\right)\left(x-v_xt\right) \\
y_1'&=y_1 & y_2'&=y_2 \\
z_1'&=z_1 & z_2'&=z_2 \end{align}

If we choose $x=t=0$, then $x_1'=x_2'=0$ and we obtain the following:

$y_2'^2+z_2'^2=\alpha^2\left(y_1'^2+z_1'^2\right)=\alpha^2\left(y_1^2+z_1^2\right)=\alpha^2\left(y_2^2+z_2^2\right)=\alpha^2\left(y_2'^2+z_2'^2\right)$

Thus we necessarily have $\alpha^2=1$, and thus $\alpha>0\Rightarrow \alpha=1$. So then we may simplify the relationship between $\mathbf{A_1}'$ and $\mathbf{A_2}'$:

$t_2'=t_1',\qquad x_2'^2+y_2'^2+z_2'^2=x_1'^2+y_1'^2+z_1'^2$

It then immediately follows that $a_{00}\left(v_x\right)t+a_{01}\left(v_x\right)x=a_{00}\left(-v_x\right)t-a_{01}\left(-v_x\right)x$ for arbitrary values of $x$ and $t$, and thus:

$a_{00}\left(-v_x\right)=a_{00}\left(v_x\right),\quad a_{01}\left(-v_x\right)=-a_{01}\left(v_x\right)$
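As a sanity check on these parity conditions, the sketch below evaluates them for the two familiar candidate coefficient sets. The closed forms are assumptions borrowed from the standard end results (Galilean: $a_{00}=1,\ a_{01}=0$; Lorentz: $a_{00}=\gamma,\ a_{01}=-\gamma v_x/c^2$); nothing at this point in the thread has derived them, and the function names are purely illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def galilean(v):
    """ASSUMED Galilean coefficients (a00, a01); not derived in the thread yet."""
    return 1.0, 0.0


def lorentz(v):
    """ASSUMED Lorentz coefficients (a00, a01) = (gamma, -gamma*v/c^2)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma, -gamma * v / C**2


def satisfies_parity(coeffs, v, tol=1e-12):
    """Check a00(-v) == a00(v) (even) and a01(-v) == -a01(v) (odd)."""
    a00_plus, a01_plus = coeffs(v)
    a00_minus, a01_minus = coeffs(-v)
    even = math.isclose(a00_minus, a00_plus, rel_tol=tol)
    odd = math.isclose(a01_minus, -a01_plus, rel_tol=tol, abs_tol=1e-300)
    return even and odd


if __name__ == "__main__":
    v = 0.6 * C
    print("Galilean satisfies parity:", satisfies_parity(galilean, v))
    print("Lorentz satisfies parity: ", satisfies_parity(lorentz, v))
```

Both candidates pass, which is consistent with the parity conditions not yet being able to distinguish between them; a coefficient such as $a_{00}=1+v_x/c$ would fail the evenness check.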

Now at last we're ready to justify and employ the principle of reciprocity, which shall come in the next post.

To be continued...

Errr, isn't this essentially what Lorentz did over a century ago, to get the transform in question named after him? I'm sure I was exposed to exactly this line of derivation back in undergrad somewhere...
Well I haven't seen anything of his along the lines of what I'm doing. I'm sure there's a lot of similar work out there, possibly including works of Poincare and Lorentz themselves, in which they derive a generalized Galilean transform in presumably a similar way to how I'm approaching it. But when it came to electrodynamics, I looked but haven't found any evidence that they did it directly from Maxwell's equations, but rather from the assumption that not only do all electromagnetic disturbances propagate at light speed, but that an electromagnetic disturbance in one frame corresponds to a similar electromagnetic disturbance in another frame. I'm going to be doing it directly from Maxwell, i.e. not making any usage of the wave equation. In fact, I only need to use the 4 Maxwell's equations in one frame and 3 in the other, and the last one pops out automatically and inevitably. Plus I can thus show that the E-M fields have to transform a specific way in all cases, and not merely assume that the results from special cases hold in general, or make unjustified assumptions about how charge and current densities transform.

Isn't this more-or-less the same approach taken by Einstein in his book "Relativity"?
Not as far as I recall, nope. The differences might become more clear once I finally finish the task of deriving generalized Galilean and E-M field transforms without any specifically classical or Relativistic physical assumptions, and then apply Maxwell's equations in both frames to show that there's only one way things can work out, and it involves exactly what we're taught in school, but with hopefully a much more rigorous (and more historical) basis. And then I hope to do the same with further results in the theory all the way up to $E=mc^2$ and maybe even beyond, in a much more rigorous and inevitable fashion than is usually taught, and again all deducible directly from Maxwell (with one extra postulate I haven't yet admitted to, which is the assumption that Maxwell's equations hold in all frames). Yes it's true Einstein made this same demand, but I've never seen him do it from bottom up and show that this is the only way things could work out, but rather I've only seen him give some physical reasoning and show that the result is self-consistent.

11. Originally Posted by CptBork
But when it came to electrodynamics, I looked but haven't found any evidence that they did it directly from Maxwell's equations, but rather from the assumption that not only do all electromagnetic disturbances propagate at light speed, but that an electromagnetic disturbance in one frame corresponds to a similar electromagnetic disturbance in another frame. I'm going to be doing it directly from Maxwell, i.e. not making any usage of the wave equation.
Hi CptBork,

You might like to know that there are very well known methods, dating back to Sophus Lie (1842-1899), that will generate the full set of symmetries of the Maxwell equations. If you are comfortable with Lie groups and algebras, then it should only take a little reading to understand the techniques. The best text is Olver's "Applications of Lie Groups to Differential Equations".

12. Originally Posted by Guest254
Hi CptBork,

You might like to know that there are very well known methods, dating back to Sophus Lie (1842-1899), that will generate the full set of symmetries of the Maxwell equations. If you are comfortable with Lie groups and algebras, then it should only take a little reading to understand the techniques. The best text is Olver's "Applications of Lie Groups to Differential Equations".
Hi Guest, thanks for the tip. I looked through my extensive personal archive and couldn't find the text, but on Google books the entirety of Section 2 is available and that looks like what I need to follow up on your reference, which I'm doing at present. My exposure to Lie Groups and Lie Algebras is unfortunately somewhat limited to what they generally reach in quantum mechanics/quantum field theory, and even though I took 4 undergrad courses and 2 grad courses specifically dealing with abstract algebra, I had the same teacher for 4 of those courses and they didn't seem too interested in dealing with concepts like matrix exponentiation and representation theory (I've filled some of the gaps in myself but have TONS I still need to learn).

I'm curious though if this approach will yield what I'm looking for. Let's say I define a set of transformations $\left(t,x,y,z\right)\longrightarrow\left(t',x',y',z'\right)$, but these transformations involve a single undetermined function of $v_x$, as I will be shortly deriving once I complete the section on generalized Galilean transforms. If I have this single undetermined function in my spacetime transforms, and I assume nothing whatsoever about how the $\vec{E}$ and $\vec{B}$ fields transform other than that the transformed variables preserve Maxwell's equations, is that sufficient to uniquely determine the transformations of these $\vec{E}$ and $\vec{B}$ fields as well as the undetermined function of $v_x$ in my spacetime transforms? To put it another way, let's assume my spacetime transformations are chosen to be the classical Galilean transforms. Could I not still define a transformation law $\left(\vec{E},\vec{B}\right)\rightarrow\left(\vec{E'},\vec{B'}\right)$ such that Maxwell's equations are preserved, even if it leads to some absurd physical consequences?

I could certainly see a conclusive result following if I made the assumption that my spacetime transformations are specifically Lorentzian. But if I haven't yet made that assumption, and have instead kept things general, and I don't make any attempt to relate the generalized symmetries in spacetime to symmetries in the vector fields $\vec{E}$ and $\vec{B}$, are you certain these Lie Group methods will still lead me to a unique result? As far as I can tell just from my preliminary reading, you need complete information on how space and time transform before you can determine how the $\vec{E}$ and $\vec{B}$ fields transform in response to preserve Maxwell's equations using this Lie Group approach, whereas my approach will not require full information on the spacetime transformations but will ultimately give a unique result for this undetermined function of $v_x$ I've been mentioning.

13. Originally Posted by CptBork
I looked but haven't found any evidence that they did it directly from Maxwell's equations, but rather from the assumption that not only do all electromagnetic disturbances propagate at light speed, but that an electromagnetic disturbance in one frame corresponds to a similar electromagnetic disturbance in another frame.
Have you read the essay "How to teach special relativity" by John Bell? This is pretty much the approach he advocates. If you can get a hold of it, it's included as chapter 9 in his book "Speakable and unspeakable in quantum mechanics", and is well worth a read. His approach is completely different than yours though: he starts off by considering things like the electric field around a moving charge, which turns out to be length contracted in the direction of motion. (So imagine what this would do to matter composed of atoms held together by electric forces, including rulers that might be used to define a moving "frame". Incidentally this was originally derived by Heaviside around 1888-1889 and was the inspiration for Fitzgerald's length contraction hypothesis.)
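The flattening of the field described here can be made concrete with a minimal sketch. It assumes the standard textbook expression for the field of a point charge in uniform motion, $E/E_{\mathrm{Coulomb}}=\left(1-\beta^2\right)/\left(1-\beta^2\sin^2\theta\right)^{3/2}$, where $\theta$ is the angle between the velocity and the line from the charge's present position to the field point. The function name is made up for illustration.

```python
import math


def field_ratio(beta, theta):
    """Field magnitude of a uniformly moving point charge, divided by the
    static Coulomb value at the same distance.

    beta  : speed as a fraction of c (0 <= beta < 1)
    theta : angle between the velocity and the line from the charge's
            present position to the field point, in radians
    """
    return (1.0 - beta**2) / (1.0 - (beta * math.sin(theta)) ** 2) ** 1.5


if __name__ == "__main__":
    beta = 0.6
    print("along the motion:   ", field_ratio(beta, 0.0))
    print("transverse to motion:", field_ratio(beta, math.pi / 2))
```

At $\beta=0.6$ the field is weakened to $1-\beta^2$ of the Coulomb value along the line of motion and strengthened by $1/\sqrt{1-\beta^2}$ transverse to it, which is the squashed field pattern referred to above.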

Which makes me wonder: why do you insist on "deriving" the Lorentz transformation in the first place? Maybe it's because you haven't finished yet, but it's not clear what you're trying to show or even why you see a need to derive anything in the first place. In your introduction you say:
The goal is to place the theory at the same level of rigour as the electrodynamic laws postulated by James Clerk Maxwell [...]
At its heart relativity is really quite a simple theory: all it does is assert that all the laws of physics possess a certain symmetry we call "Lorentz covariance". I don't know about you but I find this a perfectly well defined (or at least easily qualifiable) definition of relativity, which as it happens is well supported by modern physics: all the most fundamental laws of nature, including some of the most precisely verified in the history of physics, are Lorentz covariant. Historically, Maxwell's theory was the first discovered to possess this symmetry, which led Einstein and others on to the idea that this might be a feature of all the laws of physics, and not just electrodynamics.

As far as generic "derivations" of the Lorentz transformation are concerned, it's interesting to have things like Einstein's derivation of the Lorentz transformation in terms of more-or-less experimentally verifiable postulates, or derivations in the style of Lévy-Leblond arguing that if we're going to have a relativity principle (i.e. a velocity-dependent symmetry in the style of the Galilean transformation), then the Galilean and Lorentz transformations are the only two "reasonable" possibilities of their "type". They strengthen the (already strong) case for relativity by showing that basically the only reasonable alternative to relativity now is to abandon the principle of relativity altogether.

But in my experience these derivations can get quite ad-hoc and hand-wavy, and don't make for an elegant formulation of relativity. For example the typical argument that translational symmetry implies linearity: what does that mean? Imposing translational symmetry on its own strictly speaking doesn't constrain anything. On its own it just means that if $x^{\nu} \rightarrow f^{\mu}(x^{\nu})$ is a symmetry, then the complete symmetry group must also contain all the transformations of the form $x^{\nu} \rightarrow f^{\mu}(x^{\nu} + \delta^{\nu}) + \varepsilon^{\mu}$. When you look at this more carefully things start to get hairier: intuitively the problem here is that this gets you a symmetry group that's too large. So in this sort of proof that translational invariance implies linearity, there are really additional hidden assumptions that are being applied. I'd have to think more carefully about how to formulate this properly, but intuitively they seem to be related to the idea that a velocity-dependent "boost" should be unique up to rotations and translations.

Reciprocity in my opinion is another problematic one. I think it only makes sense if you're assuming a priori that the symmetry group you're looking for doesn't contain dilations (and if you were, I'd be surprised if it was really necessary). We certainly can't justify it experimentally: it's not like we've ever actually performed an experiment with two observers in two rockets moving at high velocity past one another, testing that each sees the other moving at the same speed. (We can easily justify the removal of uniform dilations experimentally though, with the simple observation that there actually is a hierarchy of distance scales in the universe.)

With that said I'm not a fan of the way relativity is typically taught (especially to laymen), so for that I commend you for starting a thread like this (though if I were to teach relativity I'd approach it differently). As well as rebutting a lot of crank misconceptions in one place it could also give some interested laymen with confused ideas about relativity some insight about what relativity looks like to physicists and how we apply it.

14. Originally Posted by CptBork
[...] if I don't make any attempt to relate the generalized symmetries in spacetime to symmetries in the vector fields $\vec{E}$ and $\vec{B}$
Why wouldn't you? In electrodynamics the electric and magnetic fields don't just drop out of the sky. They're defined in terms of the force acting on a test charge. The Lorentz force law
$
m \bar{a} = q \bigl[ \bar{E} + \bar{v} \times \bar{B} \bigr]
$
is pretty much the operational definition of the electric and magnetic field vectors. Obviously any transformation in space-time immediately implies a transformation of the velocity and acceleration vectors, which in turn implies a transformation of $\bar{E}$ and $\bar{B}$.
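To illustrate how a velocity transformation drags a field transformation along with it, here is a minimal sketch under deliberately simple assumptions: Galilean velocity addition $\bar{v}'=\bar{v}-\bar{u}$, a frame-independent force and charge, and $\bar{B}'=\bar{B}$. None of these assumptions come from the thread; under them the force law is only preserved if $\bar{E}'=\bar{E}+\bar{u}\times\bar{B}$. All helper names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def lorentz_force(q, E, v, B):
    """F = q (E + v x B)."""
    return tuple(q * c for c in add(E, cross(v, B)))


def galilean_fields(E, B, u):
    """ASSUMPTION (illustration only): E' = E + u x B and B' = B."""
    return add(E, cross(u, B)), B


if __name__ == "__main__":
    q = 2.0
    E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
    v, u = (3.0, 0.0, -1.0), (1.0, 1.0, 0.0)

    F = lorentz_force(q, E, v, B)
    E_p, B_p = galilean_fields(E, B, u)
    F_p = lorentz_force(q, E_p, sub(v, u), B_p)
    print(F, F_p)  # the two force vectors agree component by component
```

The point is only that once the kinematics is fixed, the force law leaves little freedom in how the fields transform; the Galilean choice here is a stand-in, not a claim about the correct transformation.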

15. Originally Posted by przyk
Have you read the essay "How to teach special relativity" by John Bell? This is pretty much the approach he advocates. If you can get a hold of it, it's included as chapter 9 in his book "Speakable and unspeakable in quantum mechanics", and is well worth a read. His approach is completely different than yours though: he starts off by considering things like the electric field around a moving charge, which turns out to be length contracted in the direction of motion. (So imagine what this would do to matter composed of atoms held together by electric forces, including rulers that might be used to define a moving "frame". Incidentally this was originally derived by Heaviside around 1888-1889 and was the inspiration for Fitzgerald's length contraction hypothesis.)
That sounds like an interesting approach. I am aware of and have previously derived and solved problems with the Lienard-Wiechert potentials of a moving charge, so I'd definitely be interested in seeing what can be concluded via this approach, although I hope there's no hand-waving introduced at some point in the argument.

Originally Posted by przyk
Which makes me wonder: why do you insist on "deriving" the Lorentz transformation in the first place? Maybe it's because you haven't finished yet, but it's not clear what you're trying to show or even why you see a need to derive anything in the first place.
I think that should become more clear once I finally get things wrapped up and actually derive the Lorentz and E-M field transforms (hopefully soon). It's taking a lot longer than I expected partly because I realized I could make major improvements to my original reasoning regarding generalized Galilean transforms, so I scribbled down a whole new set of notes and calculations on paper, and this stuff of course takes a long time to type up. But the ultimate goal is to improve the rigour as compared to how the Lorentz transforms are generally taught/derived, and show that the necessary logic comes from much simpler and more classical examples and laws than what are typically used. My goal here is certainly not to re-invent the wheel, so once the initial work is done, we'll see if someone comes along and points out how it's already been done in the same or a better way in another location.

Originally Posted by przyk
At its heart relativity is really quite a simple theory: all it does is assert that all the laws of physics possess a certain symmetry we call "Lorentz covariance". I don't know about you but I find this a perfectly well defined (or at least easily qualifiable) definition of relativity, which as it happens is well supported by modern physics: all the most fundamental laws of nature, including some of the most precisely verified in the history of physics, are Lorentz covariant. Historically, Maxwell's theory was the first discovered to possess this symmetry, which led Einstein and others on to the idea that this might be a feature of all the laws of physics, and not just electrodynamics.
My goal is to turn this "might be" into a "must be", for all those who accept the classical laws of electrodynamics and don't believe the Earth is a magically preferred physical reference frame.

Originally Posted by przyk
As far as generic "derivations" of the Lorentz transformation are concerned, it's interesting to have things like Einstein's derivation of the Lorentz transformation in terms of more-or-less experimentally verifiable postulates, or derivations in the style of Lévy-Leblond arguing that if we're going to have a relativity principle (i.e. a velocity-dependent symmetry in the style of the Galilean transformation), then the Galilean and Lorentz transformations are the only two "reasonable" possibilities of their "type". They strengthen the (already strong) case for relativity by showing that basically the only reasonable alternative to relativity now is to abandon the principle of relativity altogether.
I'm perfectly happy to see as many different derivations put out there as possible. I merely want to add, especially for the doubters out there, that it also follows directly from laws which have been well established since the mid-1800s or earlier. Many people have trouble accepting the idea of space and time transforming in a non-classical way, but little to no trouble accepting things like Gauss's and Ampere's laws.

Originally Posted by przyk
But in my experience these derivations can get quite ad-hoc and hand-wavy, and don't make for an elegant formulation of relativity. For example the typical argument that translational symmetry implies linearity: what does that mean? Imposing translational symmetry on its own strictly speaking doesn't constrain anything. On its own it just means that if $x^{\nu} \rightarrow f^{\mu}(x^{\nu})$ is a symmetry, then the complete symmetry group must also contain all the transformations of the form $x^{\nu} \rightarrow f^{\mu}(x^{\nu} + \delta^{\nu}) + \varepsilon^{\mu}$. When you look at this more carefully things start to get hairier: intuitively the problem here is that this gets you a symmetry group that's too large. So in this sort of proof that translational invariance implies linearity, there are really additional hidden assumptions that are being applied. I'd have to think more carefully about how to formulate this properly, but intuitively they seem to be related to the idea that a velocity-dependent "boost" should be unique up to rotations and translations.
I'm perfectly happy to admit that my derivation of this result (linearity) contained some heuristics, some loose definitions and a few skipped steps. However, I am quite certain that with a slightly more precise description of the translations involved, the result can actually be shown to be perfectly rigorous without any further assumptions. In the discussion of reciprocity I plan to make in my continuation of Post #9, I am planning to link to the following paper which also discusses the principle and attempts to justify it: Reciprocity Principle and the Lorentz Transformations (warning: you may be required to access it either from a university connection or else a university VPN account if you want to download it for free). As part of their work, they too derive the linearity condition, but in a more rigorous and elegant fashion than I have employed here. However, with careful mathematical reasoning you can actually show that their approach is entirely equivalent to mine, i.e. one set of assumptions can be directly deduced from the other.

I was actually planning to use similar logic to what they used for spacetime translations in that paper, but I was only planning to use this approach when deriving the most general possible $\vec{E}$ and $\vec{B}$ transforms, where it becomes a bit more necessary as the starting point. In retrospect, I probably should have used the same method both for the spacetime and E-M field transformations, but I'll be linking to it anyhow for those who are interested.

Originally Posted by przyk
Reciprocity in my opinion is another problematic one. I think it only makes sense if you're assuming a priori that the symmetry group you're looking for doesn't contain dilations (and if you were, I'd be surprised if it was really necessary). We certainly can't justify it experimentally: it's not like we've ever actually performed an experiment with two observers in two rockets moving at high velocity past one another, testing that each sees the other moving at the same speed. (We can easily justify the removal of uniform dilations experimentally though, with the simple observation that there actually is a hierarchy of distance scales in the universe.)
I believe that with my careful definitions of the $t',\ x',\ y',\ z'$ axes, many questions about possible variations in the distance and time scales/units have already been addressed, and that any further questions on the issue will also be dealt with shortly, in a very simple and convenient manner. As I said, I intend to justify the principle of reciprocity, as I will need it to reduce my transformations down to a single undetermined function, which I will probably choose to be $a_{00}\left(v_x\right)$. This justification will include questions about possible changes in distance and time scaling, as I've already done for all the conclusions I've reached up to this point. Honestly I can't blame anyone who thinks everything up to this point looks boring as sh*t, but by carefully and rigorously setting up the initial foundations, it ultimately simplifies things a great deal once we start dealing with the more interesting results and how to justify them.

Originally Posted by przyk
With that said I'm not a fan of the way relativity is typically taught (especially to laymen), so for that I commend you for starting a thread like this (though if I were to teach relativity I'd approach it differently). As well as rebutting a lot of crank misconceptions in one place it could also give some interested laymen with confused ideas about relativity some insight about what relativity looks like to physicists and how we apply it.
When I was a teenager (13-14 years old) and started learning certain details about the theory, such as the Lorentz transformations, I had a great deal of skepticism and honestly thought there must be some kind of mistake or reasonable alternative. What ultimately convinced me of Relativity's correctness was the sheer magnitude of evidence behind it as well as seeing some of its derivations. In retrospect, at a higher knowledge level I can now go back to the derivations which already convinced me in the first place, point out gaps and unjustified assumptions in the logic, and then seek improved derivations working directly from the bottom up with the virtually undisputed laws of classical electrodynamics. Will it end all crankery? No of course not, but it should certainly help reduce the number of redundant arguments about it.

Originally Posted by przyk
Why wouldn't you? In electrodynamics the electric and magnetic fields don't just drop out of the sky. They're defined in terms of the force acting on a test charge. The Lorentz force law
$
m \bar{a} = q \bigl[ \bar{E} + \bar{v} \times \bar{B} \bigr]
$
is pretty much the operational definition of the electric and magnetic field vectors. Obviously any transformation in space-time immediately implies a transformation of the velocity and acceleration vectors, which in turn implies a transformation of $\bar{E}$ and $\bar{B}$.
But before we could even bother doing that, we would need to know/deduce how $m'$, $a'$ and $q'$ relate to $m$, $a$ and $q$ in the first place, wouldn't we?

16. Originally Posted by CptBork
(with one extra postulate I haven't yet admitted to, which is the assumption that Maxwell's equations hold in all frames).
If you assume that electromagnetic waves propagate at the same speed regardless of source or observer (receiver), which is the same assumption as the one made for light, you get the same result.
If you manage to prove it without this assumption, then yes, it would be something.

17. Originally Posted by CptBork
My goal is to turn this "might be" into a "must be", for all those who accept the classical laws of electrodynamics and don't believe the Earth is a magically preferred physical reference frame.
Well that's fairly straightforward: Maxwell's equations are basically invariant under Lorentz transformations and isotropic (in space-time) dilations, so if you accept Maxwell's equations and you don't want there to be a preferred frame, all the laws of physics have to be Lorentz invariant.

I'm perfectly happy to admit that my derivation of this result (linearity) contained some heuristics, some loose definitions and a few skipped steps. However, I am quite certain that with a slightly more precise description of the translations involved, the result can actually be shown to be perfectly rigorous without any further assumptions. In the discussion of reciprocity I plan to make in my continuation of Post #9, I am planning to link to the following paper which also discusses the principle and attempts to justify it: Reciprocity Principle and the Lorentz Transformations (warning: you may be required to access it either from a university connection or else a university VPN account if you want to download it for free). As part of their work, they too derive the linearity condition, but in a more rigorous and elegant fashion than I have employed here. However, with careful mathematical reasoning you can actually show that their approach is entirely equivalent to mine, i.e. one set of assumptions can be directly deduced from the other.
I'll look at the paper when I get the chance, but depending on exactly what you mean by "imposing" translation symmetry, I seriously doubt what you're claiming is possible. In your second post you say:
[...] Otherwise, we could detect an inhomogeneity, i.e. a physical difference between different points in the frame [...]
which I take to mean that you're imposing that there should be no privileged points in space-time. But the only requirement for this is that the symmetry group contains translations as a subgroup, and that alone doesn't imply linearity. A simple counterexample is to take the group of all coordinate diffeomorphisms. It's a group, and among a lot of other junk it contains translations, rotations, and Lorentz as well as Galilean boosts. There is nothing wrong with this logically. It's just that in practice this group is so large that only very trivial theories are going to be symmetrical under arbitrary diffeomorphisms - theories in which either almost nothing or almost everything is a valid solution. The only way you're going to be able to whittle this down is to explicitly assume something about the size of the group you're deriving or postulate that certain types of transformations aren't in it.

A mathematical constraint that gets you linearity is to impose that the full symmetry group has to commute in some sense with the translation group. By this I mean that if for any translation followed by a transformation $x \rightarrow f(x + \delta)$ you impose that there must be a translation $\varepsilon$ such that $f(x + \delta) = f(x) + \varepsilon$, then you get linearity. But what you're doing here is imposing that a translation in the rest frame followed by a boost is equivalent to a boost followed by a translation in the old rest frame. In general how would you justify imposing something like that?
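The condition $f(x+\delta)=f(x)+\varepsilon$ can be spot-checked numerically. The sketch below is one-dimensional and uses made-up example maps; it tests whether the increment $f(x+\delta)-f(x)$ is the same at every sample point. Passing for a single $\delta$ is a necessary check only, since the condition is meant to hold for every $\delta$.

```python
def increments(f, points, delta):
    """Return f(x + delta) - f(x) for each sample point x."""
    return [f(x + delta) - f(x) for x in points]


def is_translation_compatible(f, points, delta, tol=1e-9):
    """True if the increment is the same (within tol) at every sample point,
    i.e. a shift by delta before f equals some fixed shift epsilon after f."""
    incs = increments(f, points, delta)
    return max(incs) - min(incs) <= tol


def affine(x):
    """Hypothetical affine map: slope 2.5, offset 1.0."""
    return 2.5 * x + 1.0


def cubic(x):
    """Hypothetical nonlinear map used as a counterexample."""
    return x**3


if __name__ == "__main__":
    points = [-2.0, 0.0, 1.0, 3.5]
    delta = 0.25
    print("affine:", is_translation_compatible(affine, points, delta))
    print("cubic: ", is_translation_compatible(cubic, points, delta))
```

The affine map gives the same increment everywhere, while the cubic map does not, which is the sense in which this commutation requirement singles out linear (affine) transformations.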

I believe that with my careful definitions of the $t',\ x',\ y',\ z'$ axes, many questions about possible variations in the distance and time scales/units have already been addressed [...]
Actually that's another hidden assumption right there: you're assuming that there are distance and time scales - i.e. you're effectively assuming the symmetry group you're looking for doesn't contain dilations.

Actually, it would be helpful if you could explicitly list all the assumptions you think you need in one place. You seem to be making them up as you go along, and it's not clear on what basis you're picking them - are they supposed to be "reasonable" or supported experimentally or what?

But before we could even bother doing that, we would need to know/deduce how $m'$, $a'$ and $q'$ relate to $m$, $a$ and $q$ in the first place, wouldn't we?
The acceleration is easy: its transformation is derivable from its definition $\frac{\mathrm{d}^{2}\bar{x}}{\mathrm{d}t^{2}}$. That's a bit moot though, since including it in my last post was an error: I really should have written $\frac{\mathrm{d}\bar{p}}{\mathrm{d}t}$. If we go with the modern definition of momentum ($\bar{p} = \frac{\partial\mathcal{L}}{\partial \dot{\bar{x}}}$), then the way it transforms is going to depend on what you think the kinetic term in the Lagrangian should be in order to reproduce the right behaviour. Rotational symmetry is going to impose that momentum has to be parallel to the velocity though. That combined with q just means that you've only got an overall scaling parameter to play around with in the Lorentz force law - a vector equation. So the Lorentz force law is still going to impose quite a bit on the transformation properties of the electric and magnetic fields.

Additionally I'd argue that q should be invariant (or at worst its velocity dependence should be defined as part of the definition of the theory of electrodynamics). Unlike the mass or momentum, the electric charge is a parameter only relevant to a particle's electromagnetic interactions, and is included in the theory just because we see that not all charges are affected to the same degree in the same electromagnetic field. If you let it vary with velocity you're really just fudging the Lorentz force law.

18. Originally Posted by przyk
Well that's fairly straightforward: Maxwell's equations are basically invariant under Lorentz transformations and isotropic (in space-time) dilations, so if you accept Maxwell's equations and you don't want there to be a preferred frame, all the laws of physics have to be Lorentz invariant.
But this involves a ton of hand-waving right here. Just because Maxwell's equations might be Lorentz invariant and space might be isotropic, why should that automatically imply that all other laws of physics are invariant in the same way? What can we treat as a Relativistic 4-vector, what must we treat differently? Besides, how do we know that the Lorentz transformations are the only way of preserving Maxwell's equations while conforming to other basic postulates, other than by a method such as I intend to employ? You can make assumptions like this and assume it to be physically reasonable, but it's not mathematically airtight IMO.

Originally Posted by przyk
I'll look at the paper when I get the chance, but depending on exactly what you mean by "imposing" translation symmetry, I seriously doubt what you're claiming is possible. In your second post you say:

Originally Posted by CptBork
[...] Otherwise, we could detect an inhomogeneity, i.e. a physical difference between different points in the frame [...]
which I take to mean that you're imposing that there should be no privileged points in space-time. But the only requirement for this is that the symmetry group contains translations as a subgroup, and that alone doesn't imply linearity.
But that's the whole point of what I said. Another way of expressing it is as follows: $t'\left(t+\mathrm{d}t,\vec{x}+\vec{\mathrm{d}x}\right)=t'\left(t,\vec{x}\right)+\mathrm{d}t'$, where $\mathrm{d}t'$ is independent of $\left(t,\vec{x}\right)$.
In other words, $t'\left(t_1+\mathrm{d}t,\vec{x_1}+\vec{\mathrm{d}x}\right)-t'\left(t_1,\vec{x_1}\right)=t'\left(t_2+\mathrm{d}t,\vec{x_2}+\vec{\mathrm{d}x}\right)-t'\left(t_2,\vec{x_2}\right)=\mathrm{d}t'$

From this it immediately follows that $\frac{\partial t'}{\partial x_j}=a_{0j}$, where $a_{0j}$ is independent of $\left(t,x,y,z\right)$, and we can apply similar reasoning for translations in $\vec{x'}$. I didn't assume linearity off the bat- we still have 4 arbitrary additive constants, which I forced to zero by demanding the origins of $\mathbf{A}$ and $\mathbf{A}'$ coincide. As I say, my reasoning here can be shown to be essentially equivalent to what's done in the paper Reciprocity Principle and the Lorentz Transformations, but their approach has the advantage of not needing to assume off the bat that $t'$ and $\vec{x'}$ are differentiable functions of $t$ and $\vec{x}$.

Originally Posted by przyk
Actually that's another hidden assumption right there: you're assuming that there are distance and time scales - i.e. you're effectively assuming the symmetry group you're looking for doesn't contain dilations.
Actually, I defined a prescription for how distance and time scales should be defined/calibrated in $\mathbf{A}'$ based on how they're defined/calibrated in $\mathbf{A}$. Specifically I defined the correspondence $\left(t=0,x=0,y=y,z=0\right)\rightarrow\left(t'=t' ,x'=x',y'=y,z'=0\right)$, which, as I demonstrated in Posts #6 and #9, when combined with the preservation of orientation, uniquely defines the $z'$-axis and the distance and time scaling in $\mathbf{A}'$ (remember, time can be related to distance through the velocity $v_x$).

Originally Posted by przyk
Actually, it would be helpful if you could explicitly list all the assumptions you think you need in one place. You seem to be making them up as you go along, and it's not clear on what basis you're picking them - are they supposed to be "reasonable" or supported experimentally or what?
They're supposed to be based on physically motivated math postulates, i.e. the preservation of causality and time-ordering in all descriptions of the same reference frame, the uniformity of time and distance intervals relative to the origin (i.e. they must be related by a simple scale factor). You certainly need to make some basic assumptions in order to define what we typically consider a "physically reasonable" mathematical theory.

I agree that there's probably tons of room for major improvement in my presentation, so perhaps in a future draft, or if the mods help me posthumously clean things up once all the dust settles, I could do a better job of laying my postulates out at the start and then referencing them as necessary. It would be cool if I could do what I can do with LaTeX and the hyperref package, so I could refer to equation numbers and have a click on the number take you straight to the relevant equation.

Originally Posted by przyk
The acceleration is easy: its transformation is derivable from its definition $\frac{\mathrm{d}^{2}\bar{x}}{\mathrm{d}t^{2}}$.
Well let's derive the Lorentz transformations first, so we can deduce how accelerations transform as a secondary consequence.

Originally Posted by przyk
Additionally I'd argue that q should be invariant (or at worst its velocity dependence should be defined as part of the definition of the theory of electrodynamics). Unlike the mass or momentum, the electric charge is a parameter only relevant to a particle's electromagnetic interactions, and is included in the theory just because we see that not all charges are affected to the same degree in the same electromagnetic field. If you let it vary with velocity you're really just fudging the Lorentz force law.
Ah, but what if we were free to fudge with the Lorentz force law, thereby allowing $q$ and $q'$ to differ in turn? I've already done the calculations on paper and got the result

$q'=\int\mathrm{d}^3x'\,\rho'\left(t',\vec{x'} \right)=\int\mathrm{d}^3x\,\rho\left(t,\vec{x} \right)=q$ for arbitrary $t$, $t'$ (Maxwell's equations can be used to recover charge conservation separately in each frame, so the times don't really matter as long as they're fixed). It was by no means a trivial result to obtain, but things happen to cancel out just right, and it only ends up taking a few lines and a few slightly tricky calculus manipulations to deduce.

19. Continued from Post #9:

Where we last left off, we had reduced the generalized Galilean transform down to the following:

\begin{align} t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right) \\
y'&=y \\
z'&=z \\
a_{00}\left(-v_x\right)&=a_{00}\left(v_x\right) \\
a_{01}\left(-v_x\right)&=-a_{01}\left(v_x\right) \end{align}

Now recall how I originally defined the $\mathbf{A}'$ system. Just as $\mathbf{A}$ sees the spatial origin of $\mathbf{A}'$ moving with velocity $\vec{v}=\left(v_x,0,0\right)$, $\mathbf{A}'$ sees the spatial origin of $\mathbf{A}$ moving with velocity $\vec{v'}=\left(-v_x,0,0\right)$, thus setting the alignment of the positive $t$ and $x$ axes. We showed that the point $\left(t=0,x=0,y,z=0\right)$ had a non-trivial component in the $y'-z'$ plane. This permitted us to define the $y'$-axis with the corresponding points $\left(t',x',y'=y,z'=0\right)$ by picking the component orthogonal to $x'$, and setting the $z'$ axis to be orthogonal to $t'$, $x'$ and $y'$ while preserving the orientation of the $\mathbf{A}$ system. I also showed that this uniquely defined the scaling relationships between any two systems $\mathbf{A_1}'$ and $\mathbf{A_2}'$ produced by reversals of the $x$-axis and/or rotations in the $y-z$ plane in system $\mathbf{A}$. We deduced that the resulting scaling relationships were precisely such as to fix the values $a_{ij}\left(v_x\right)$ for all such systems.

We have now shown that this same set of points actually corresponds to $\left(t'=0,x'=0,y'=y,z'=0\right)$. Thus we can simplify our means of defining the $y'$-axis:

Choose the positive orientations of the $x'$ and $t'$ axes as before, with the spatial origin of $\mathbf{A}$ corresponding to the points $\left(t',x'=-v_xt',y'=0,z'=0\right)$. We see that we can now define the $y'$-axis by the correspondence $\left(t=0,x=0,y=y,z=0\right)\longrightarrow\left(t'=0,x'=0,y'=y,z'=0\right)$ and we will produce the exact same $y'$ and $z'$ axes as before, along with the same coordinate relationships between $\mathbf{A}$ and $\mathbf{A}'$.

Equivalently, we can show that by defining the $z'$-axis through the correspondence $\left(t=0,x=0,y=0,z=z\right)\longrightarrow\left(t'=0,x'=0,y'=0,z'=z\right)$ and picking $y'$ to preserve the same orientation found in system $\mathbf{A}$, we again achieve the exact same results. This simplification will now allow us to justify the complete reciprocity principle and its application, rather than following the common approach of taking the entire thing as a physical postulate.

Starting with system $\mathbf{A}$ in a given inertial frame and the appropriate alignment of the velocity $\vec{v}$, we have shown how to define a set of coordinates $\mathbf{A}'$ in the corresponding inertial frame and a unique set of relationships between the two. Suppose we take this coordinate system $\mathbf{A}'$ as a starting point for constructing another coordinate system in the original frame of $\mathbf{A}$, a second system we shall label as $\mathbf{A}''$.

Since $\mathbf{A}$ and $\mathbf{A}'$ share the same point as their spacetime origin, and the same relation holds between $\mathbf{A}'$ and $\mathbf{A}''$, this means $\mathbf{A}$ and $\mathbf{A}''$ share the same spacetime origin. Both systems $\mathbf{A}$ and $\mathbf{A}''$ see the spatial origin of $\mathbf{A}'$ moving at velocity $\vec{v}=\vec{v''}=\left(v_x,0,0\right)$ with time increasing for both systems, and so the $+x$-axis coincides with $+x''$ up to a strictly positive scale factor, as do the axes $+t$ and $+t''$.

Furthermore, we have the correspondences $\left(t=0,x=0,y=y,z=0\right)\longrightarrow\left(t '=0,x'=0,y'=y,z'=0\right)\longrightarrow\left(t''= 0,x''=0,y''=y'=y,z''=0\right)$, and so the $y$ and $y''$ axes coincide precisely, as do $z$ and $z''$.

Using identical logic to what I showed in Post #6, we must have the following results:

\begin{align} t''&=\alpha t,\quad \alpha>0 \\
x''^2+y''^2+z''^2&=\alpha^2\left(x^2+y^2+z^2\right) \end{align}

If we take $x=t=0$, then $x'=t'=0\Longrightarrow x''=t''=0$. Then the fact that $y''=y,\ z''=z$ necessarily implies that we set $\alpha^2=1$, from which $\alpha>0\Rightarrow\alpha=1$.

Thus necessarily $x''^2=x^2$, and the coincidence of their positive axes necessitates that $x''=x$. Additionally, $\alpha=1\Rightarrow t''=t$.

So to conclude, we have $t''=t,\ x''=x,\ y''=y,\ z''=z$. Thus in transforming $\mathbf{A}\rightarrow\mathbf{A}'$ and then reversing the transform by $\mathbf{A}'\rightarrow\mathbf{A}''$, we recover our original coordinate system, i.e. $\mathbf{A}=\mathbf{A}''$. So before concluding this post, I shall summarize the transformation principles we've derived up to this point, along with the result regarding reciprocity.

\begin{align} t'&=a_{00}\left(v_x\right)t+a_{01}\left(v_x\right) x & t&=a_{00}\left(v_x\right)t'-a_{01}\left(v_x\right)x' \\
x'&=a_{00}\left(v_x\right)\left(x-v_xt\right) & x&=a_{00}\left(v_x\right)\left(x'+v_xt'\right) \\
y'&=y & y&=y' \\
z'&=z & z&=z' \end{align}

$a_{00}\left(-v_x\right)=a_{00}\left(v_x\right)=a_{00}\left(\left|v_x\right|\right),\quad a_{01}\left(-v_x\right)=-a_{01}\left(v_x\right)$

For those interested in an alternative, more rigorous take on the principle of reciprocity, see the following link (requires a direct or VPN university connection to browse for free): Reciprocity Principle and the Lorentz Transformations. I find that much of their logic can be shown to be equivalent to what I've done here, although there are some notable differences in their approach.

To be continued...

20. Continued from Post #19:

So at last, we're ready to wrap this section up. For convenient notational purposes, it will be best to express our transformations as matrix equations:

\begin{align} \begin{pmatrix} t'\\x'\\y'\\z' \end{pmatrix} &= \begin{pmatrix} a_{00}\left(v_x\right) & a_{01}\left(v_x\right) & 0 & 0 \\
-v_xa_{00}\left(v_x\right) & a_{00}\left(v_x\right) & 0 & 0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} t\\x\\y\\z \end{pmatrix}, &\qquad\qquad \begin{pmatrix} t\\x\\y\\z \end{pmatrix}&= \begin{pmatrix} a_{00}\left(v_x\right) & -a_{01}\left(v_x\right) & 0 & 0 \\
v_xa_{00}\left(v_x\right) & a_{00}\left(v_x\right) & 0 & 0 \\ 0&0&1&0 \\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} t'\\x'\\y'\\z' \end{pmatrix} \end{align}

From these relationships, we deduce the following matrix equation:

$\begin{pmatrix} a_{00}\left(v_x\right) & a_{01}\left(v_x\right) \\
-v_xa_{00}\left(v_x\right) & a_{00}\left(v_x\right) \end{pmatrix}\begin{pmatrix} a_{00}\left(v_x\right) & -a_{01}\left(v_x\right) \\
v_xa_{00}\left(v_x\right) & a_{00}\left(v_x\right) \end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$

The off-diagonal entries of the product vanish identically, so the only non-trivial relationship we can extract from this equation comes from the diagonal entries:

$a_{00}\left(v_x\right)\left[a_{00}\left(v_x\right)+v_xa_{01}\left(v_x\right) \right]=1\Longrightarrow a_{01}\left(v_x\right)=\frac{1-\left[a_{00}\left(v_x\right)\right]^2}{v_xa_{00}\left(v_x\right)}$
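As a quick sanity check, the relationship above can be tested numerically. The following is only a minimal sketch: the helper names (`a01_from_a00`, `forward`, `inverse`, `round_trip`) are hypothetical, and the sample values of $a_{00}$ and $v_x$ are arbitrary choices rather than results of the derivation. It builds the two $2\times 2$ blocks with $a_{01}$ fixed by the formula above and confirms that their product is the identity.

```python
from fractions import Fraction

def a01_from_a00(a00, v):
    """a01 = (1 - a00^2) / (v * a00), as fixed by reciprocity (requires v != 0)."""
    return (1 - a00 ** 2) / (v * a00)

def matmul2(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def forward(a00, v):
    """(t, x) block of the transform A -> A'."""
    a01 = a01_from_a00(a00, v)
    return ((a00, a01), (-v * a00, a00))

def inverse(a00, v):
    """(t, x) block of the transform A' -> A."""
    a01 = a01_from_a00(a00, v)
    return ((a00, -a01), (v * a00, a00))

def round_trip(a00, v):
    """Forward block times inverse block; should be the identity."""
    return matmul2(forward(a00, v), inverse(a00, v))

# Arbitrary sample values; exact rationals avoid floating-point noise.
for a00, v in [(Fraction(5, 4), Fraction(3, 5)), (Fraction(2), Fraction(1, 3))]:
    print(a00, v, round_trip(a00, v))
```

Note that the second sample pair is deliberately not a Lorentz factor: the product is the identity for any positive $a_{00}$ once $a_{01}$ is chosen this way, which is why the relation alone does not yet single out the Lorentz transform.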

The conditions $\lim_{v_x\to 0}a_{00}\left(v_x\right)=1$ and $\lim_{v_x\to 0}a_{01}\left(v_x\right)=0$ can thus be expressed by the equivalent condition $\lim_{v_x\to 0}\frac{1-a_{00}\left(v_x\right)}{v_x}=0$, and we also have the condition $a_{00}\left(v_x\right)>0$ as before.

Our generalized Galilean transform now reduces to the following:

The Generalized Galilean Transform

\begin{align} t'&=a_{00}\left(\left|v_x\right|\right)t+\frac{1-\left[a_{00}\left(\left|v_x\right|\right)\right]^2}{v_xa_{00}\left(\left|v_x\right|\right)}x & t&=a_{00}\left(\left|v_x\right|\right)t'-\frac{1-\left[a_{00}\left(\left|v_x\right|\right)\right]^2}{v_xa_{00}\left(\left|v_x\right|\right)}x' \\
x'&=a_{00}\left(\left|v_x\right|\right)\left(x-v_xt\right) & x&=a_{00}\left(\left|v_x\right|\right)\left(x'+v_x t'\right) \\
y'&=y & y&=y' \\
z'&=z & z&=z' \end{align}

Now let's quickly remark on some basic symmetry properties that automatically follow from this generalized transform, as they will be useful in constraining the possible transformations of the electromagnetic fields to be discussed in Section III. The correspondence between coordinate transforms in $\mathbf{A}$ and $\mathbf{A}'$ satisfies the following:

$\left(t,x,y,z\right)\longrightarrow\left(t,\pm x,ay+bz,cy+dz\right)\Longrightarrow\left(t',x',y', z'\right)\longrightarrow\left(t',\pm x',ay'+bz',cy'+dz'\right)$
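The symmetry property above can be spot-checked numerically. The sketch below rests on loudly stated assumptions: the function `a00` is a hypothetical even sample function (the text only requires evenness, positivity and $a_{00}(0)=1$), and all helper names are illustrative. Reversing the $x$-axis in $\mathbf{A}$ also reverses the sign of $v_x$, so the check compares "remap, then boost with $\pm v_x$" against "boost with $v_x$, then remap".

```python
from fractions import Fraction

def a00(v):
    """Hypothetical even sample function with a00(0) = 1 (an assumption, not derived)."""
    return 1 + v * v

def a01(v):
    """Reciprocity relation from the text; odd in v whenever a00 is even (v != 0)."""
    return (1 - a00(v) ** 2) / (v * a00(v))

def boost(event, v):
    """Generalized Galilean transform of (t, x, y, z) for velocity v along x."""
    t, x, y, z = event
    return (a00(v) * t + a01(v) * x, a00(v) * (x - v * t), y, z)

def remap(event, sign, m):
    """(t, x, y, z) -> (t, sign*x, a*y + b*z, c*y + d*z) with m = ((a, b), (c, d))."""
    t, x, y, z = event
    (a, b), (c, d) = m
    return (t, sign * x, a * y + b * z, c * y + d * z)

def commutes(event, v, sign, m):
    """True if remapping A and then boosting equals boosting and then remapping A'."""
    return boost(remap(event, sign, m), sign * v) == remap(boost(event, v), sign, m)

event = (Fraction(2), Fraction(3), Fraction(5), Fraction(7))
quarter_turn = ((Fraction(0), Fraction(-1)), (Fraction(1), Fraction(0)))
print(commutes(event, Fraction(1, 2), +1, quarter_turn))
print(commutes(event, Fraction(1, 2), -1, quarter_turn))
```

The $y$-$z$ part passes through trivially because $y'=y$ and $z'=z$; the $x$-reversal is the case that actually relies on $a_{00}$ being even and $a_{01}$ being odd.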

As remarked near the beginning of this section, other coordinate descriptions for $\mathbf{A}$ and $\mathbf{A}'$ can be uniquely related to the systems we have already constructed, and thus the relationship between any two such corresponding coordinate systems is also uniquely defined, even if one system sees the velocity of the other system's spatial origin not moving along the $x$-axis.

Now as promised, we shall see how the generalized Galilean transform reduces to either the classical Galilean transform or the Relativistic Lorentz transform given the right simple assumptions.

If we take one of several classical assumptions, such as $t'=t$ or that $t=0\Rightarrow x'=x$, then we necessarily conclude that $a_{00}\left(\left|v_x\right|\right)\equiv 1$, and we recover the classical Galilean transform:

The Classical Galilean Transform

\begin{align} t'&=t & t&=t' \\
x'&=x-v_xt & x&=x'+v_xt' \\
y'&=y & y&=y' \\
z'&=z & z&=z' \end{align}

On the other hand, if we make either the relativistic assumption $\left(t=t,x=ct,y=0,z=0\right)\longrightarrow\left(t'=t',x'=ct',y'=0,z'=0\right)$ or the assumption $\left(t=t,x=-ct,y=0,z=0\right)\longrightarrow\left(t'=t',x'=-ct',y'=0,z'=0\right)$, then we obtain $a_{00}\left(\left|v_x\right|\right)=\gamma\left(v_x\right)\equiv\frac{1}{\sqrt{1-v_x^2/c^2}}$, and thus we recover the Relativistic Lorentz transform:

The Relativistic Lorentz Transform

\begin{align} t'&=\gamma\left(v_x\right)\left(t-v_xx/c^2\right) & t&=\gamma\left(v_x\right)\left(t'+v_xx'/c^2\right) \\
x'&=\gamma\left(v_x\right)\left(x-v_xt\right) & x&=\gamma\left(v_x\right)\left(x'+v_xt'\right) \\
y'&=y & y&=y' \\
z'&=z & z&=z' \end{align}
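To illustrate how the two special cases fall out of the generalized transform, here is a small sketch under stated assumptions: units with $c=1$, the sample velocity $v_x=3/5$ (so that $\gamma=5/4$ is an exact rational), and hypothetical helper names. It feeds a light ray $x=\pm ct$ through the generalized transform and reports the speed $x'/t'$ seen in $\mathbf{A}'$ for the Lorentz choice $a_{00}=\gamma$ and for the Galilean choice $a_{00}=1$.

```python
from fractions import Fraction

def a01_from_a00(a00, v):
    """Reciprocity relation a01 = (1 - a00^2) / (v * a00), valid for v != 0."""
    return (1 - a00 ** 2) / (v * a00)

def transform(t, x, a00, v):
    """(t, x) part of the generalized Galilean transform."""
    a01 = a01_from_a00(a00, v)
    return a00 * t + a01 * x, a00 * (x - v * t)

def image_speed(a00, v, c):
    """Speed x'/t' in A' of the ray x = c*t, evaluated at the event (t=1, x=c)."""
    t_p, x_p = transform(Fraction(1), c, a00, v)
    return x_p / t_p

c = Fraction(1)          # assumed units with c = 1
v = Fraction(3, 5)       # sample velocity
gamma = Fraction(5, 4)   # 1 / sqrt(1 - v^2 / c^2) for v = 3/5

print(a01_from_a00(gamma, v) == -gamma * v / c ** 2)  # True
print(image_speed(gamma, v, c))         # 1   (ray still moves at +c)
print(image_speed(gamma, v, -c))        # -1  (ray still moves at -c)
print(image_speed(Fraction(1), v, c))   # 2/5 (Galilean: c - v)
```

With $a_{00}=\gamma$ the coefficient $a_{01}$ reduces to $-\gamma v_x/c^2$, matching the $t'$ row of the Lorentz transform, while $a_{00}=1$ gives $a_{01}=0$ and the familiar Galilean velocity subtraction.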

Rather than make any of these assumptions, we shall keep things general and proceed onward to Section III, where we will use the results from this section as well as further reasoning to deduce the most general possible transformations of the electromagnetic fields when transitioning between different inertial frames. In Section IV we will then require the consistency of Maxwell's equations in arbitrary reference frames, thereby showing that the coordinate transforms must inevitably reduce to the Lorentzian case as well as yielding the unique, general transformation rules between $\vec{E}$ and $\vec{B}$ fields in different reference frames (less generalized than the generalizations I will make in Section III, because we will also require that Maxwell's equations be satisfied at this point).

I'll make one last remark here which will be of use to us once we start applying Maxwell's equations. If we define $k_1:=a_{00}\left(\left|v_x\right|\right)$ and $k_2:=\frac{1-\left[a_{00}\left(\left|v_x\right|\right)\right]^2}{v_xa_{00}\left(\left|v_x\right|\right)}=\frac{ 1-k_1^2}{v_xk_1}$, then the chain rule implies that we may make the following partial derivative substitutions:

\begin{align} \frac{\partial}{\partial t'}&=k_1\left(\frac{\partial}{\partial t}+v_x\frac{\partial}{\partial x}\right) & \frac{\partial}{\partial x'}&=-k_2\frac{\partial}{\partial t}+k_1\frac{\partial}{\partial x} & \frac{\partial}{\partial y'}&=\frac{\partial}{\partial y} & \frac{\partial}{\partial z'}&=\frac{\partial}{\partial z} \end{align}
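These substitutions follow from applying the chain rule to the inverse transform $t=k_1t'-k_2x'$, $x=k_1\left(x'+v_xt'\right)$. A rough numerical check with central finite differences is sketched below; the test field `f`, the step size and the sample values of $k_1$ and $v_x$ are all arbitrary assumptions chosen purely for illustration.

```python
import math

def k2_from_k1(k1, v):
    """k2 = (1 - k1^2) / (v * k1), as defined in the text (v != 0)."""
    return (1 - k1 ** 2) / (v * k1)

def f(t, x):
    """Arbitrary smooth test field in unprimed coordinates (hypothetical choice)."""
    return math.sin(2.0 * t + 0.5 * x) + t * x ** 2

def f_primed(tp, xp, k1, v):
    """The same field written in primed coordinates via the inverse transform."""
    k2 = k2_from_k1(k1, v)
    return f(k1 * tp - k2 * xp, k1 * (xp + v * tp))

def central_diff(g, s, h=1e-5):
    """Second-order central difference approximation of g'(s)."""
    return (g(s + h) - g(s - h)) / (2 * h)

def residuals(k1, v, t, x):
    """Differences between direct primed derivatives and the chain-rule formulas."""
    k2 = k2_from_k1(k1, v)
    tp, xp = k1 * t + k2 * x, k1 * (x - v * t)
    df_dt = central_diff(lambda s: f(s, x), t)
    df_dx = central_diff(lambda s: f(t, s), x)
    d_tp = central_diff(lambda s: f_primed(s, xp, k1, v), tp)
    d_xp = central_diff(lambda s: f_primed(tp, s, k1, v), xp)
    return (d_tp - k1 * (df_dt + v * df_dx),
            d_xp - (-k2 * df_dt + k1 * df_dx))

# One Lorentz-like and one generic sample; both residual pairs should be tiny.
print(residuals(1.25, 0.6, 0.3, 0.7))
print(residuals(1.10, 0.4, 0.3, 0.7))
```

The residuals are limited only by finite-difference and rounding error, and they stay small for the generic value of $k_1$ as well, since the substitutions do not depend on $k_1$ taking the Lorentz form.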

And now this section is concluded, aside from the matter to be discussed as mentioned in the edit directly below. Thanks everyone for the questions, comments, suggestions, and for reading the thread!

Edit: There actually is one more thing to be done here. Based on a link provided by Przyk, I realized that the composition of velocities implies a certain constraint on the function $a_{00}\left(\left|v_x\right|\right)$, and we should explore this constraint and its consequences first before truly proceeding to Section III. This insight wasn't necessary for me when I applied Maxwell's equations and derived the unique result for the Lorentz transformations on paper, but it's nice to see that this final result is already constrained to a certain very familiar form, even before we begin discussing anything whatsoever about electromagnetic fields. I will continue this discussion as an addendum to this section before truly proceeding onwards to Section III.

End of Section II (for now)...
