General Relativity Primer

Discussion in 'Physics & Math' started by Markus Hanke, May 16, 2013.

  1. Markus Hanke Registered Senior Member

    Messages:
    381
    1. PRELIMINARIES

    Before we really get into the swing of things, a little bit of housekeeping is necessary.
    Much of the maths involved in GR contains upper and lower indices, so the first thing we all need to be clear about is the Einstein summation convention. This notational convention states that, if the same index appears once as an upper and once as a lower index within the same term, a summation over all possible values of that index is implied, without having to explicitly write down the summation sign.
    For example :

    \(\displaystyle{\alpha _{i}e^{i}\equiv \sum_{i=1}^{3}\alpha _{i}e^{i}=\alpha _{1}e^{1}+\alpha _{2}e^{2}+\alpha _{3}e^{3}}\)

    Or another example :

    \(\displaystyle{T{_{k}}^{k}\equiv \sum_{k=1}^{3}T{_{k}}^{k}=T{_{1}}^{1}+T{_{2}}^{2}+T{_{3}}^{3}}\)

    Furthermore, in the maths of GR there is a distinction between the use of Latin letters and Greek letters : Latin letters refer to spatial components and thus take on only the values 1,2,3. Greek letters, on the other hand, refer to space-time components, and thus take on the values 0,1,2,3.

    Example :

    \(\displaystyle{a_{i}x^{i}\equiv a_{1}x^{1}+a_{2}x^{2}+a_{3}x^{3}}\)

    but

    \(\displaystyle{a_{\mu }x^{\mu }\equiv a_{0}x^0+a_{1}x^{1}+a_{2}x^{2}+a_{3}x^{3}}\)
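    To make the convention concrete, here is a minimal sketch in code ( the component values are my own made-up numbers, not from the text ), writing out the implied sums as explicit loops :

```python
# A sketch of the summation convention, with made-up component values.
# Index 0 is the time component.

a = [2.0, 1.0, -1.0, 3.0]   # covariant components a_mu
x = [1.0, 4.0, 2.0, 5.0]    # contravariant components x^mu

# Latin index : a_i x^i implies a sum over i = 1..3 only
latin_sum = sum(a[i] * x[i] for i in range(1, 4))    # 4 - 2 + 15 = 17

# Greek index : a_mu x^mu implies a sum over mu = 0..3
greek_sum = sum(a[m] * x[m] for m in range(4))       # 2 + 17 = 19
```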

    Lastly, it is very important to distinguish between covariant ( lower ) and contravariant ( upper ) indices - in the general case these are not freely interchangeable ! We will take a closer look at the mathematical and physical meaning of that distinction later. For the moment just be clear that in the general case

    \(\displaystyle{A_{\mu }\neq A^{\mu }}\)

    In the next section we will take a closer look at coordinate bases, and the distinction between covariant and contravariant quantities.
     
    2. MANIFOLDS AND COORDINATES

    Now let's talk a bit about manifolds. Mathematically, and in layman's terms, an n-dimensional manifold is a topological space which can locally ( i.e. in the immediate vicinity of each point ) be approximated by ordinary n-dimensional Euclidean space. The emphasis here is on locally, because that feature does not necessarily hold globally ( more about that later ). In GR we specifically deal with smooth, differentiable 4-manifolds. Differentiable means ( in simple terms ) that we can do calculus on it, i.e. we can define derivatives at each point, and smooth means that we can form derivatives of any arbitrary order. The "4" means that we are dealing with a manifold of exactly four dimensions.

    The differentiability and smoothness conditions imply that such manifolds cannot have any discontinuities or other topological defects which prevent us from forming derivatives at such points; this very simple and intuitive definition leads us to a very important fact - in GR there can be no singularities. You may now jump up and say : "Wait a minute, what about the Big Bang, and Black Holes ?". The answer is that neither of these types of singularities is part of space-time; GR does not predict a singularity at the Big Bang, and it does not predict a singularity at the "centre" of a black hole. This is a misconception which stems from failing to understand the definition of a manifold. In actual fact, GR makes no prediction whatsoever about what happens at the centre of a gravitational collapse, and it makes no prediction whatsoever about the Big Bang. These events are not part of physical space-time, so GR cannot be applied to them; in fact, none of the physics of GR is even defined or definable at such points - they are simply outside the domain of applicability of the model.

    Anyway, back to the topic at hand. The notion of a manifold as we have defined it above is rather abstract, and in its current form not terribly useful. In order to extract some real physics from GR ( i.e. numerical predictions & calculations ), we are going to need three elements :

    1. A chart, i.e. a set of coordinates, and possibly an atlas, being a set of charts on different regions of our manifold
    2. A connection, which defines the notion of transport along curves, which in turn can be used to represent the intrinsic geometry of our manifold
    3. A metric, which enables us to define measurements such as lengths, angles, volumes etc on our manifold

    Let's start with the first element, a set of coordinates. As mentioned above GR deals with a 4-dimensional manifold, so we are going to need a set of four coordinates to represent a single point; we will denote these general coordinates with numbers, like so :

    \(\displaystyle{\left \{ x^{0},x^{1},x^{2},x^{3} \right \}}\)

    The reason why we don't call them t,x,y,z is that we do not wish to restrict ourselves to Cartesian coordinates; it would be just as valid and correct to use a system of polar coordinates, or spherical coordinates, or cylindrical coordinates, or any other system you can dream up. Which brings us to a very important property of GR called general covariance. What this means is that all the laws of physics which we write down in the context of GR are independent of the choice of particular coordinates, in other words, their form does not change if we decide to choose a different coordinate system. This makes perfect sense, because the moon stays in orbit regardless of whether we describe its motion in terms of Cartesian coordinates, or spherical coordinates; the laws of physics don't care about the labels we assign to the basis vectors on our manifold, so it is meaningful to try and formulate them in such a way that they hold in any coordinate system.

    Consider now a general 4-vector A on our 4-dimensional manifold - locally, since by above definition of a manifold we can approximate a local neighbourhood by ordinary Euclidean space, any such vector can be decomposed in terms of the basis vectors of our coordinate system by multiplying each one with a scalar :

    \(\displaystyle{\mathbf{A}=A_{\mu }\mathbf{e}^{\mu }=A_{0}\mathbf{e}^{0}+A_{1}\mathbf{e}^{1}+A_{2}\mathbf{e}^{2}+A_{3}\mathbf{e}^{3}}\)

    Now, it is important to reiterate that this is a local decomposition, so with a BIG wave of the hand, and totally without rigorous proof, I now state that the set of basis vectors can be written as

    \(\displaystyle{\left \{ \mathbf{e}^{\mu } \right \}\Rightarrow \left \{ dx^{\mu } \right \}}\)

    to denote that we are now working in an infinitesimally small neighbourhood on our manifold. This may seem very abstract, but will become important later. Much of GR's math is first written in terms of infinitesimals like the above, and then integrated over regions of the manifold to obtain more general results. We will encounter the "dx" notation for basis vectors quite often in future chapters of this presentation.

    Finally, to conclude this post, a word or two about covariant and contravariant vectors. I remind you again that in GR contravariant components are denoted with upper indices, and covariant components with lower indices. So what's the difference ? This once again boils down to the notion of general covariance, i.e. the independence of any particular coordinate system; in other words, if we apply a coordinate transformation to any relation written in terms of vectors, we want those vectors to behave in just such a way that the general form of the relation does not change during the transformation. So here it goes :

    For a contravariant vector ( upper indices ! ) to be independent of the coordinate basis, its components must contra-vary with our change in coordinates. For example, imagine a position vector which is 1000m long; we now perform a ( rather trivial ) transformation of our coordinate system from metres to kilometres, i.e. the basis vectors become longer in our new system. To compensate, the components of the original vector must become smaller ( the value goes from 1000 to 1 ) - they contra-vary, in the opposite sense, to represent the same physics. Examples of contravariant vectors would be anything to do with positions, and their derivatives such as velocity, acceleration etc.

    For a covariant vector ( lower indices ! ), the situation is reversed - for it to retain its physical meaning under a coordinate transformation, it must co-vary ( vary in the same sense ) with the basis vectors. This is generally the case if we are dealing with gradients, and such vectors would have dimensions inverse to the ones listed above, so for example "per meter".
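    The metres-to-kilometres example above can be sketched numerically; all the numbers below ( the 1000 m position, the 0.5 per-metre gradient ) are my own illustrative choices :

```python
# Behaviour of components under a change of units from metres to
# kilometres; all values are illustrative.

scale = 1000.0             # one new basis vector ( 1 km ) = 1000 old ( 1 m )

position_m = 1000.0        # contravariant component : 1000 in metres
gradient_per_m = 0.5       # covariant component : 0.5 per metre

# contravariant components contra-vary ( divide by the scale factor )
position_km = position_m / scale           # 1.0

# covariant components co-vary ( multiply by the scale factor )
gradient_per_km = gradient_per_m * scale   # 500.0

# the physical quantity ( component times basis length ) is unchanged
assert position_m * 1.0 == position_km * scale
```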

    Later on when we deal with tensors ( which can have mixed indices ) it is very important to realize and remember that covariant and contravariant indices are not freely interchangeable; one cannot just put an upper index down and vice versa and expect everything to still work fine. There are however ways to "raise" or "lower" indices, which we will discuss in a future post.

    Now, that was a handful. Everyone still with me ? In the next post I will introduce the very crucial concept of the infinitesimal line element, which brings us closer to the notion of the metric tensor.
     
    3. THE LINE ELEMENT

    Now that we have a better understanding of manifolds and coordinates ( hopefully ), let us start thinking about how to measure things on those manifolds.

    Consider a very simple scenario, a flat piece of paper with two random dots drawn on it. We are given a Cartesian coordinate basis \(\left \{ x,y \right \}\) where x,y are just real numbers, and we want to find the distance between the two dots in terms of that coordinate basis. The solution is of course elementary - all we need to do is find the difference in y and the difference in x, and apply the Pythagorean theorem ( we denote the distance with "s" ) :

    \(\displaystyle{s^2=(\triangle x)^2+(\triangle y)^2}\)

    Now we want to generalize this result so that we are able to define the arc length of any arbitrary curve on our flat piece of paper. In order to do this we allow our two dots to approach each other infinitesimally closely, so that we can re-write the above relation in terms of infinitesimals, like so :

    \(\displaystyle{ds^2=(dx)^2+(dy)^2}\)

    In other words, the above relation gives us a measurement of the distance between two infinitesimally close neighbouring points on our flat surface. One would naively think that this distance is zero, but not so ! Why ? Because, if we now go and form the sum of infinitely many of these infinitesimal distances along some arbitrary curve ( denoted C ), we get

    \(\displaystyle{s=\int_{C}ds=\int_{C}\sqrt{dx^2+dy^2}}\)

    which is a line integral, and we find that when we evaluate that integral along a finite segment of our curve C ( i.e. between two well defined points on the curve ) we get a finite, well defined result. This is just the arc length of the curve segment. This shows us that the ( scalar ) quantity ds is not just zero, but indeed a valid and meaningful measurement of an infinitesimal distance between two neighbouring points on a manifold, even though this does not initially appear to make intuitive sense ! We shall henceforth call ds the line element, or length element.

    To generalise the notion a little further, we need to recognize that the notation "ds" implies that we are dealing with a change of some sort; if s denotes an arc length, then ds can alternatively be thought of as a change in arc length. Now, with a bit of handwaving, an arc length can also be expressed as some vector x ( the magnitude of which is just the scalar arc length itself ), which enables us to re-write our definition for ds in a more general form in terms of changes in that position vector :

    \(\displaystyle{ds^2=d\mathbf{x}\cdot d\mathbf{x}=\left \langle d\mathbf{x}|d\mathbf{x} \right \rangle}\)

    where the angle brackets denote the inner product, which, in Euclidean space, is just the normal vector dot product. This derivation of the above is by no means rigorous, but you will find by referring to any differential geometry textbook that the end result is indeed correct.
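    The claim that summing infinitely many infinitesimal distances gives a finite arc length is easy to check numerically; here is a small sketch ( parametrisation and step count are my own choices ) that sums short chords along a unit circle and recovers the circumference :

```python
import math

# Approximate the line integral of ds = sqrt(dx^2 + dy^2) along a unit
# circle by a sum of very short chords; the result should approach the
# circumference 2*pi.

n = 100000
s = 0.0
for k in range(n):
    t0 = 2 * math.pi * k / n
    t1 = 2 * math.pi * (k + 1) / n
    dx = math.cos(t1) - math.cos(t0)
    dy = math.sin(t1) - math.sin(t0)
    s += math.sqrt(dx * dx + dy * dy)
# s is now very close to 2*pi = 6.2831...
```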

    Now, thus far we have restricted ourselves to flat 2-dimensional Euclidean spaces, and Cartesian coordinates. Is there a way to generalize this to arbitrary dimensions, and arbitrary coordinate systems, perhaps even curvilinear coordinate systems ? As it turns out, there is. In fact, we have already done most of the work ! Let us consider first what happens if we stay in Cartesian coordinates on flat Euclidean space, but go to 3 dimensional space; the vector x now has three components, and the above general relation for the line element becomes

    \(\displaystyle{ds^2=d\mathbf{x}\cdot d\mathbf{x}=dx^2+dy^2+dz^2}\)

    without any further work or ado. This can in the exact same way be done for any n-dimensional Euclidean space using Cartesian coordinates. The result is just simply the sum of the squared differentials.

    Now pay close attention, the following bit is crucial for the understanding of the mathematics of General Relativity. We are now asking ourselves - how can we generalize the above even further, and obtain a line element in any number of dimensions, and for any coordinate system, whether flat or curvilinear ? Having such a line element would enable us to define measurements on any arbitrary manifold, so this would be a very powerful tool.

    Let us work through it with a concrete example, being spherical coordinates in three dimensions. As the name implies, spherical coordinates allow us to easily specify a point on the surface of a sphere in three-dimensional space. Since we are working in three dimensions, we are of course expecting three coordinates. What are they ? Well, first we have to specify the radius of the sphere, or else we can't pinpoint a surface. Once this is done we then need two angles, which correspond to the North-South and East-West directions, in order to uniquely specify a point on the surface. We denote the so-obtained set of coordinate labels as

    \(\displaystyle{\left \{ r,\theta ,\phi \right \}}\)

    Since the entire setup is spherically symmetric ( duh ! ), a little bit of elementary trigonometry shows us that an infinitesimal displacement along each of the new coordinate directions corresponds to the following Cartesian displacements :

    \(\displaystyle{\left \{ dx,dy,dz \right \}\rightarrow \left \{ dr,r\, d\theta ,r\sin \theta \, d\phi \right \}}\)

    which is, if you look very closely, just the differential of each new coordinate label multiplied by some factor ( this fact will become very important in a second ). Let's insert these displacements into our line element, and see what it looks like :

    \(\displaystyle{ds^2=dr^2+r^{2}d\theta ^{2}+r^{2}\sin ^{2}\theta \, d\phi ^{2}}\)

    Again, this is just a sum of the ( squared ) coordinate infinitesimals multiplied by some factor, i.e. for a general set of arbitrary coordinates it has the general form

    \(\displaystyle{ds^2=g_{0}(dx^{0})^2+g_{1}(dx^{1})^2+...+g_{n}(dx^{n})^2}\)

    or more explicitly

    \(\displaystyle{ds^2=g_{0}dx^{0}dx^{0}+g_{1}dx^{1}dx^{1}+...+g_{n}dx^{n}dx^{n}}\)

    As you can see above the pattern so far is that we form the terms by multiplying the coordinate differentials with themselves, and then multiply the result with some factor. But why restrict ourselves to multiplying the differentials only with themselves ? Surely, we can conceive of coordinate systems where the individual terms are dependent on sets of two coordinates, i.e. where diagonals in a coordinate grid are important. This will give us mixed terms of the form "dxdz" or "dydx" and such like. If we want to allow such mixed terms in our line element in order to obtain the most general expression for any coordinate system in any number of dimensions ( we definitely do ! ), we get :

    \(\displaystyle{ds^2=g_{00}dx^{0}dx^{0}+...+g_{0n}dx^{0}dx^{n}+...+g_{n0}dx^{n}dx^{0}+...+g_{nn}dx^{n}dx^{n}}\)

    Take a careful look at the coefficients in front of our coordinate differentials - if we allow mixed terms in the sum, we need not just a simple set of coefficients, but a whole matrix of them ( 4x4 in the case of four dimensions ) to cover all possible combinations. Now, the above looks horrible and is very unwieldy; however, since this is just a sum of product terms, we can make use of the aforementioned Einstein summation convention and rewrite it like so :

    \(\displaystyle{ds^2=\sum_{\mu =0}^{3}\sum_{\nu =0}^{3}g_{\mu \nu }dx^{\mu }dx^{\nu }=g_{\mu \nu }dx^{\mu }dx^{\nu }}\)

    which is much more clear and concise. And there you have it - the above is a general line element, valid in any number of dimensions, for any coordinate system we care to pick. What it does is give us the measurement of an infinitesimally small arc length segment on any arbitrary manifold; we can integrate this element to obtain arc lengths of curves on our manifold. For example, we could use this to calculate the proper time of an observer, since that is just the arc length of his world line.
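    Here is the general line element in action, as a small numerical sketch ( the sample point and displacements are my own choices ) : it evaluates the double sum \(g_{\mu \nu }dx^{\mu }dx^{\nu }\) for the spherical metric derived above, and checks it against the squared Cartesian distance between the two neighbouring points.

```python
import math

# ds^2 = g_mu_nu dx^mu dx^nu as an explicit double sum, evaluated for
# the 3-d spherical metric and checked against the squared Cartesian
# distance between two nearby points. All sample values are illustrative.

def to_cartesian(r, th, ph):
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

r, th, ph = 2.0, 0.7, 1.2
dx = [1e-6, 2e-6, -1e-6]              # (dr, dtheta, dphi)

# metric coefficients g_mu_nu for the coordinates (r, theta, phi)
g = [[1.0, 0.0, 0.0],
     [0.0, r**2, 0.0],
     [0.0, 0.0, r**2 * math.sin(th)**2]]

# the implied double summation, written out explicitly
ds2 = sum(g[mu][nu] * dx[mu] * dx[nu]
          for mu in range(3) for nu in range(3))

# squared Cartesian distance between the two neighbouring points
p0 = to_cartesian(r, th, ph)
p1 = to_cartesian(r + dx[0], th + dx[1], ph + dx[2])
ds2_cartesian = sum((b - a)**2 for a, b in zip(p0, p1))
# the two agree to leading order in the infinitesimals
```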

    The matrix of coefficients \(g_{\mu \nu }\) is called the metric tensor, and is our first encounter with a new class of mathematical objects called tensors. We will take a closer look at them and their properties in the next post.

    I must reiterate at this point that all of the above is valid only on differentiable manifolds, which we assume we are dealing with here. All manifolds that GR handles are differentiable ones.

    Phew. Everyone still with me ?
     
    4. TENSORS

    Now we need to tackle a topic which many people dread the most when dealing with GR, that of tensors. Since the notion of tensors is crucial to the understanding of GR, I will ask you to set aside your preconceived ideas and prejudices, and simply follow the below. The purpose of this is not to give an exhaustive, mathematically rigorous and precise lecture on tensor calculus, but simply to convey the basic underlying ideas. Again I will opt for clarity at the expense of mathematical rigour.

    Let us once again take a closer look at the general expression for the squared line element from the last post :

    \(\displaystyle{ds^2=g_{\mu \nu }dx^{\mu }dx^{\nu }}\)

    If you refer back to topic (2) above, you will see that we came to the conclusion that the differentials "dx" can be considered basis vectors in our coordinate system; keeping that in mind, the conceptual form of the above relation is

    (scalar) = (tensor) x (vector) x (vector)

    So what does this mean ? It means quite simply that the tensor in the above relation acts on two vectors, processes them, and produces a scalar as the end result. And that is what tensors do - they are a special class of "functions" which take quantities such as vectors and other tensors as input, process them, and produce a real number as output. In more formal terms, tensors are multilinear maps, which map covariant and contravariant quantities into real numbers.

    The intuitive meaning behind tensors is that they establish geometric relationships between quantities. Examples are the dot product in three dimensions ( a tensor which maps two vectors into a scalar ), or our metric tensor ( which again maps two vectors into a scalar ). The crucial bit to understand is that tensors do this without reference to any particular coordinate system; for example, the above mentioned dot product looks algebraically different in different coordinate systems, but the geometric relationship, i.e. the interpretation, behind it is the same. Tensors are therefore covariant objects, i.e. they and their relationship retain their form under arbitrary coordinate transformations. This is where their true power lies, because if we formulate a physical law in terms of tensors, then that formulation will be valid for any coordinate system, which of course means for any observer on a given manifold.

    So once again - tensors represent geometric relationships between quantities, and they are invariant under coordinate transformations, i.e. they establish these relationships consistently and in the same form for all observers on a given manifold. And that is all there is to it. They are not some magical entity which is impossible to grasp and understand, they are simply functions which take some input, crunch it up, and spit out some output, and they do this consistently in the same way for all observers. It really is that simple.

    The rank of a tensor is the number of independent indices it has - for example, our metric tensor \(g_{\mu \nu }\) has two indices, so it is a rank-2 tensor. The tensor \(\displaystyle{R{^{\lambda }}_{\mu \nu \xi }}\) would be of rank 4, since it has four indices.

    The valence of a tensor denotes the number of covariant and contravariant indices it has. \(\displaystyle{R{^{\lambda }}_{\mu \nu \xi }}\) has valence (1,3), and the metric tensor \(g_{\mu \nu }\) has valence (0,2). Tensors which have both upper and lower indices are also called mixed tensors.
    The significance of the mixed upper and lower indices can easily be understood in terms of tensors being "maps" or "functions"; a general type-(M,N) tensor is a function which takes exactly M co-vectors ( "row vectors" ) and N vectors ( "column vectors" ) as input, processes them, and outputs a real number. It's that simple. For example, the type (0,2) metric tensor simply takes two vectors and maps them into a real number. The above mentioned rank-4, type (1,3) tensor maps one co-vector and three vectors into a real number. How difficult is this, really ?
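    The "tensor as a machine" picture can be made concrete in a few lines; the sketch below ( using the plain Euclidean metric and made-up input vectors ) implements a type (0,2) tensor as a function which takes two vectors and returns a real number :

```python
# A type (0,2) tensor as a "machine" : two vectors in, one real number
# out, sketched with the Euclidean metric in 3 dimensions.

g = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

def g_map(u, v):
    # g(u, v) = g_ij u^i v^j
    return sum(g[i][j] * u[i] * v[j] for i in range(3) for j in range(3))

u = [1.0, 2.0, 3.0]
v = [4.0, -1.0, 2.0]

out = g_map(u, v)    # for this metric, just the dot product : 4 - 2 + 6 = 8

# multilinearity : scaling one input scales the output
assert g_map([2.0 * c for c in u], v) == 2.0 * out
```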

    All tensors can be represented by matrices, but not all matrices are automatically tensors ( because not all matrices transform correctly under arbitrary coordinate transformations ), so be careful here. A rank-2 tensor in n dimensions can be represented by an ( n x n ) matrix, a rank-3 tensor in n dimensions by an ( n x n x n ) array, and so on. Rank-0 tensors are scalars, rank-1 tensors are vectors.

    Tensors can be added and multiplied by the usual laws of matrix algebra, and there is also an operator called the tensor product ( \(\otimes \) ) defined on them. I will skip the details here, since I don't think I will need this for my presentation. We can also do calculus with tensors, i.e. we can define the notion of a tensor derivative, which is called the covariant derivative. I will come back to this later, since we will need to talk about connections first before we can define that derivative.

    Lastly I will show you how to raise and lower indices on tensors. The basic idea is very simply to sum the index in question with the opposite index of the metric tensor defined on our manifold; i.e. to "lower" an upper index, you sum it up with a corresponding lower index of the metric tensor, thereby eliminating it and leaving only a lower index, and vice versa :

    \(\displaystyle{g^{\mu \nu }A_{\nu }=A^{\mu }}\)

    and

    \(\displaystyle{g_{\mu \nu }A^{\mu }=A_{\nu }}\)

    Likewise for indices on rank-2 tensors :

    \(\displaystyle{A^{\mu \nu }=g^{\mu \xi }g^{\nu \pi }A_{\xi \pi }}\)

    The trick is simply to keep careful track of what gets summed with what, and what is left after the summation. The summation of an upper and a lower index in the same term eliminates that pair of indices. You can also sum an upper and a lower index on a single tensor; this is called an index contraction, and it reduces the rank of the tensor by 2. For example :

    \(\displaystyle{R{^{\alpha }}_{\beta \alpha \gamma }=R_{\beta \gamma }}\)
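    Raising and lowering is easy to sketch explicitly. The example below uses the diagonal metric diag(1, -1, -1, -1) and a made-up sample vector; lowering flips the signs of the spatial components, and raising again recovers the original components :

```python
# Raising and lowering an index with a diagonal example metric.

eta = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

A_upper = [2.0, 1.0, -3.0, 0.5]                     # A^mu

# lowering : A_nu = g_mu_nu A^mu ( sum over mu )
A_lower = [sum(eta[mu][nu] * A_upper[mu] for mu in range(4))
           for nu in range(4)]

# for this metric the inverse has the same entries, so raising the
# index again must recover the original components : A^mu = g^mu_nu A_nu
A_raised = [sum(eta[mu][nu] * A_lower[nu] for nu in range(4))
            for mu in range(4)]
```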

    Before we wrap it up, a word of caution - in the general case it is best not to attempt to "visualize" a tensor. It is tempting to try and find a visualization, like one can visualize a vector as an "arrow", but in the case of tensors this is not in general possible, or even helpful. There are such methods of visualisation for lower rank tensors, but from my own experience I can say with confidence that these lead to more confusion than anything else. Just understand a tensor to be a little machine crunching up vectors and other tensors, and spitting out a real number. This works for tensors of any rank and valence.

    This concludes our little introduction to tensors. In the next chapter we will take a closer look at one particular very important tensor, the metric tensor. All of this may seem like a lot of maths, but bear with me, we will soon be coming to the physics of GR, which will then be a breeze to understand given our newly found maths knowledge.
     
    5. THE METRIC TENSOR

    Without a shadow of a doubt the metric tensor is the single most important object in GR; it encapsulates the measurements and geometry of our space-time manifold, and it is what the ( yet to be developed ) Einstein field equations are solved for. It is thus only fair that we devote an entire post to it.

    Firstly, we need to distinguish between the metric tensor and the metric. The former refers to the mathematical object we defined previously via the line element :

    \(\displaystyle{ds^{2}=\left \langle d\mathbf{x}|d\mathbf{x} \right \rangle=g_{\mu \nu }dx^{\mu }dx^{\nu }}\)

    We have mentioned before that this metric tensor, in the general case, is a function of position ( its elements depend on position ), i.e. \(\displaystyle{g_{\mu \nu }=g_{\mu \nu }(\mathbf{x})}\). As such what we are actually dealing with on our space-time is a field of metric tensors, i.e. every point in space-time has a metric tensor associated with it. This tensor field as a whole is called the metric of our manifold.

    A very important property of the metric tensor is that it is symmetric : \(\displaystyle{g_{\mu \nu }=g_{\nu \mu }}\). We can swap the order of the indices without affecting the maths or physics. This makes intuitive sense, since this tensor defines an inner product on vectors or covectors ( which is symmetric in its arguments ), so the outcome should not depend on the order in which the vectors/covectors are inserted. Also, the property of symmetry reduces the number of independent elements of the tensor; we will later see that in the general four-dimensional case only 10 out of the 16 elements are functionally independent, which will make our lives easier once it comes to tackling the field equations of GR.
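    The counting of independent components is a one-liner to verify : a symmetric matrix is fixed by its entries on and above the diagonal, which the sketch below counts for the 4x4 case.

```python
# A symmetric rank-2 tensor in n dimensions has n*(n+1)/2 independent
# components -- 10 out of 16 in four dimensions.

n = 4
independent = sum(1 for mu in range(n) for nu in range(n) if mu <= nu)
```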

    Now, as mentioned previously the main function of the metric tensor is to enable us to define measurements on our manifold. We have already seen one example of this, being the line element, which can be integrated along a curve to yield arc length. We can also define the norm or length of a given vector in terms of the metric tensor as

    \(\displaystyle{\left \| \mathbf{x} \right \|=\sqrt{g(\mathbf{x},\mathbf{x})}=\sqrt{g_{\mu \nu }x^{\mu }x^{\nu }}}\)

    Likewise, we can define the angle between two vectors x,y as

    \(\displaystyle{\varphi =\arccos \left ( \frac{g\left ( \mathbf{x},\mathbf{y} \right )}{\left \| \mathbf{x} \right \|\left \| \mathbf{y} \right \|}\right )}\)
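    Both formulas are easy to try out; the sketch below ( Euclidean 2-d metric, made-up vectors ) computes a norm and the angle between two vectors, which for these particular inputs should come out as \(\pi /4\) :

```python
import math

# Norm and angle from the metric, sketched in 2 dimensions.

g = [[1.0, 0.0],
     [0.0, 1.0]]

def g_inner(u, v):
    # g(u, v) = g_ij u^i v^j
    return sum(g[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

x = [3.0, 0.0]
y = [1.0, 1.0]

norm_x = math.sqrt(g_inner(x, x))    # 3.0
norm_y = math.sqrt(g_inner(y, y))    # sqrt(2)
angle = math.acos(g_inner(x, y) / (norm_x * norm_y))    # pi/4
```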

    So now we know how to calculate lengths, vector norms and angles in terms of the metric tensor; the last thing we need is a volume element, preferably a generalized one which we can also use for surfaces and volumes in any number of dimensions. Without proof or derivation ( refer to any textbook on differential geometry, if you are interested ), here it is :

    \(\displaystyle{vol_{g}:=\sqrt{\left | det(g) \right |}dx^{0}\wedge ...\wedge dx^{n}}\)

    where \(\wedge\) denotes the wedge product known from exterior algebra, and det is the usual matrix determinant. This wedge product takes into account orientation as well, so that the signs are all correct. As an example, the total area of a surface S in two dimensions in any arbitrary coordinate system becomes

    \(\displaystyle{\int \int_{S}\sqrt{\left | det(g) \right |}dx^{0}dx^{1}}\)

    as expected.
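    As a concrete check of this area element, the sketch below integrates \(\sqrt{\left | det(g) \right |}\) over the surface of a sphere of radius r, using the induced surface metric diag( r², r² sin²θ ) in coordinates ( θ, φ ); the result should approach the familiar surface area 4πr². All numerical choices are my own.

```python
import math

# Integrate sqrt(|det g|) dtheta dphi over a sphere of radius r.
# det g = r^2 * r^2 sin^2(theta), so sqrt(|det g|) = r^2 sin(theta).

r = 2.0
n_th = 2000                      # midpoint-rule resolution ( my choice )
d_th = math.pi / n_th

area = 0.0
for i in range(n_th):
    th = (i + 0.5) * d_th
    sqrt_det_g = r**2 * math.sin(th)
    # the phi integral is trivial and contributes a factor 2*pi
    area += sqrt_det_g * d_th * 2 * math.pi
# area is now very close to 4*pi*r^2
```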

    One more feature of metric tensors which we should be aware of is the notion of the metric signature. This is quite simply the set of signs of the diagonal elements of the metric tensor, written in a coordinate basis in which it is diagonal; so, for example, Minkowski space-time would have a metric signature of {+,-,-,-}. Basically what this shows us is that we have one temporal and three spatial dimensions.

    Let us, just for clarity, give two explicit examples of metric tensors, so everyone can see what it actually looks like. The first one is trivial, it is simply the metric tensor of Minkowski space-time, i.e. Special Relativity :

    \(\displaystyle{g_{\mu \nu }=\begin{pmatrix} 1& 0& 0& 0\\ 0& -1& 0& 0\\ 0& 0& -1& 0\\ 0& 0& 0& -1 \end{pmatrix}}\)

    The second example is spherical polar coordinates in three dimensions :

    \(\displaystyle{g_{ik}=\begin{pmatrix} 1& 0& 0\\ 0& r^2& 0\\ 0& 0& r^{2}\sin ^{2}\theta \end{pmatrix}}\)
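    The Minkowski example can be played with directly; the sketch below reads off its signature and evaluates ds² for a sample displacement ( my own made-up numbers, in units with c = 1 ), which comes out positive, i.e. timelike, in the {+,-,-,-} convention :

```python
# Signature of the Minkowski metric, and ds^2 for a sample displacement.

eta = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

signature = ['+' if eta[mu][mu] > 0 else '-' for mu in range(4)]
# one '+' ( temporal ) and three '-' ( spatial )

dx = [1.0, 0.1, 0.2, 0.0]    # a displacement mostly along the time axis
ds2 = sum(eta[mu][nu] * dx[mu] * dx[nu]
          for mu in range(4) for nu in range(4))
# ds2 = 1 - 0.01 - 0.04 = 0.95 > 0 : a timelike interval
```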

    Before we wrap it up, take note of this - we have been speaking a lot about what the metric tensor is, but it is equally important to realize what it is not :

    - it is not a force
    - it is not a potential ( energy )
    - it is not curvature

    All it is, is simply a way to define measurements at each point in space-time.

    This concludes our little discussion of the metric tensor. In the next chapter we will discuss connections and the covariant derivative, which will enable us to extract information about the geometry of a manifold from the metric tensor. We will then finally be able to introduce the notions of geodesics and curvature, and how those are mathematically described.

    Stay tuned !
     
    6. CONNECTIONS AND COVARIANT DERIVATIVES

    Now that we have a fair understanding of how measurements and coordinates are defined on our manifold, we need to talk about how to extract more information about its geometry from the metric tensor. This is important, because we need some way to quantify geometric properties of space-time in order to formulate physical laws on it, and about it. Clearly, we would expect that information to be in a covariant form, since a mere transformation of coordinates should leave the underlying geometric properties of the manifold untouched; in other words, all observers in our space-time should agree on its geometric properties. We therefore expect the quantitative measure of those properties to show up in the form of tensors, which is indeed what happens, as we shall see shortly.

    From now on, when we talk about manifolds, we shall assume that these are Riemannian manifolds - such manifolds are differentiable, smooth, and equipped with a metric at each and every point. That means that wherever we go on the manifold, the notions of inner product, vectors, tangents, measurements, derivatives etc are all well defined in a consistent manner. Physically, this is a reasonable assumption. So, henceforth, when I refer to "manifold" I shall actually mean "Riemannian manifold".

    To start with, picture a rather trivial scenario, a flat Euclidean space which is smooth and differentiable everywhere. This could, for example, be a 2-dimensional sheet of paper stretching into infinity in all directions. Clearly we have no problem forming derivatives at each point of such a space, i.e. defining a tangent space at each point of this flat manifold; since the tangent space of a flat Euclidean surface is just another copy of that same surface, we can also with confidence say that derivatives, or more precisely tangent vectors along the same direction at different points on the manifold, are parallel. I state this now without formal proof, since it is intuitively obvious for this simple scenario. In more practical terms this means that we can define a common differential operator for the operation "take the derivative" that is valid everywhere on our rather boring manifold. For some vector field \(A^i\), that is simply

    \(\displaystyle{A{^{i}}_{|k}=\frac{\partial A^{i}}{\partial x^{k}}}\)

    The single pipe before the index denotes the ordinary partial derivative with respect to the k-th coordinate ( I will talk about this particular notation a bit later ).

    Now let us perform that same thought experiment on a different kind of manifold, the 2-dimensional surface of a ( 3-dimensional ) sphere. Let us pick an arbitrary point A on the sphere's surface - the tangent space at that point would be a flat plane, which is why it is unsurprisingly called the tangent plane. Now pick another point B on the same sphere's surface, and again visualize the tangent plane at that new point. Finally, compare those two tangent planes, and you will notice immediately that they are not parallel; instead they are at an angle relative to each other. What that means is that, if we naively calculate the derivative along the same direction but at different points of the surface, we get a different result ( a vector with the same magnitude but pointing in a different direction ). That is rather awkward, since it makes it a lot harder to define a differential operator for the operation "take the derivative with respect to the k-th coordinate" which gives comparable results no matter where on the surface we are. The above expression for the ordinary partial derivative is clearly insufficient for this purpose. If we wish to find a differential operator which generalises the notion of "partial derivative", and which is equally valid on all points of the manifold in question, we have to somehow compensate for the change in orientation of our tangent plane as we go from point A to point B. We will denote this new, generalized operation by a double pipe before the index :

    \(\displaystyle{A{^{i}}_{||k}=\frac{\partial A^{i}}{\partial x^{k}}+(\text{correction terms})}\)

    Now we need to find these correction terms. Let's go about this systematically - the LHS is an object with two indices; since the whole point of the exercise is to find a covariant generalisation of the partial derivative, this must turn out to be a rank-2 tensor ( or else we will have failed in our endeavour ). The ordinary partial derivative on the RHS also carries two indices, so, since tensors of the same type add component by component, the correction term must likewise carry two free indices. At the same time the correction will in general depend on all coordinates for a general manifold. The simplest possible object which contains a summation over all coordinates, yet leaves us with two free indices, is an object with a total of three indices ( one of which is summed over all coordinates ). Without further proof I now present to you the covariant derivative :

    \(\displaystyle{A{^{i}}_{||k}=\frac{\partial A^{i}}{\partial x^{k}}+\Gamma _{kp}^{i}A^{p}}\)

    This is the sought-after generalisation of the notion of "partial derivative", which is valid on any Riemannian manifold of any dimension, and in any coordinate system. The coefficients \(\Gamma _{kp}^{i}\) are called the Christoffel Symbols of the Second Kind - I state here, again without proof, that these are not in fact tensors; this is intuitively plausible since, being the correction terms in the above covariant derivative, they must explicitly depend on the chosen coordinates, so they can't be tensors. One can also form Christoffel Symbols of the First Kind by contraction with the metric tensor :

    \(\displaystyle{\Gamma _{ikp}=g_{im}\Gamma _{kp}^{m}}\)

    The Christoffel symbols of the second kind can be explicitly calculated from the metric tensor and its ordinary partial derivatives, like so :

    \(\displaystyle{\Gamma _{kp}^{m}=\frac{1}{2}g^{mi}\left ( \frac{\partial g_{ik}}{\partial x^p} +\frac{\partial g_{ip}}{\partial x^k}-\frac{\partial g_{kp}}{\partial x^i} \right)}\)
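    If you want to play with this formula yourself, here is a small sketch ( Python, using the sympy library - I am assuming here that you have it available ) which evaluates it for the unit 2-sphere with coordinates \(( \theta ,\phi )\) :

    ```python
    # A minimal sketch, assuming sympy is available: evaluate the Christoffel
    # symbols of the second kind directly from the formula above, for the
    # unit 2-sphere with coordinates (theta, phi).
    import sympy as sp

    theta, phi = sp.symbols('theta phi')
    x = [theta, phi]                       # the coordinates x^1, x^2
    g = sp.Matrix([[1, 0],                 # metric g_ik of the unit sphere
                   [0, sp.sin(theta)**2]])
    g_inv = g.inv()                        # inverse metric g^mi

    def christoffel(m, k, p):
        # Gamma^m_kp = 1/2 g^mi ( d_p g_ik + d_k g_ip - d_i g_kp )
        return sp.simplify(sp.Rational(1, 2) * sum(
            g_inv[m, i] * (sp.diff(g[i, k], x[p])
                           + sp.diff(g[i, p], x[k])
                           - sp.diff(g[k, p], x[i]))
            for i in range(2)))

    print(christoffel(0, 1, 1))   # Gamma^theta_phiphi = -sin(theta)cos(theta)
    print(christoffel(1, 0, 1))   # Gamma^phi_thetaphi = cot(theta)
    ```

    The two printed symbols, \(\Gamma _{\phi \phi }^{\theta }=-\sin \theta \cos \theta\) and \(\Gamma _{\theta \phi }^{\phi }=\cot \theta\) ( together with its symmetric partner \(\Gamma _{\phi \theta }^{\phi }\) ), are the only non-vanishing ones on the sphere; we will meet them again when we talk about parallel transport and geodesics.
    
    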

    The covariant derivative can be defined not just for vectors, but for tensors of any rank; adding an extra rank will add another correction term for the additional index. Swapping co- and contravariant indices will simply change the sign before the Christoffel symbol. For example, the covariant derivative of a rank-2, type (1,1) tensor with respect to some coordinate p would be

    \(\displaystyle{T{^{i}}_{k||p}=T{^{i}}_{k|p}+\Gamma _{pm}^{i}T{^{m}}_{k}-\Gamma _{kp}^{m}T{^{i}}_{m}}\)
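    As a quick sanity check of this rule for a rank-2 tensor with two lower indices ( two negative correction terms ), one can verify symbolically that the covariant derivative of the metric tensor itself vanishes everywhere when the Christoffel symbols are computed from the metric as above. A sketch, again assuming sympy is available, using the unit 2-sphere :

    ```python
    # A sketch, assuming sympy: check that g_ik||p = 0 on the unit 2-sphere
    # when the Christoffel symbols are built from the metric itself.
    import sympy as sp

    theta, phi = sp.symbols('theta phi')
    x = [theta, phi]
    g = sp.Matrix([[1, 0], [0, sp.sin(theta)**2]])
    gi = g.inv()

    def Gamma(m, k, p):
        # Gamma^m_kp computed from the metric, as in the formula above
        return sp.Rational(1, 2) * sum(
            gi[m, i] * (sp.diff(g[i, k], x[p]) + sp.diff(g[i, p], x[k])
                        - sp.diff(g[k, p], x[i])) for i in range(2))

    def cov_deriv_g(i, k, p):
        # g_ik||p = d_p g_ik - Gamma^m_pi g_mk - Gamma^m_pk g_im
        return sp.simplify(sp.diff(g[i, k], x[p])
                           - sum(Gamma(m, p, i) * g[m, k] for m in range(2))
                           - sum(Gamma(m, p, k) * g[i, m] for m in range(2)))

    # every component vanishes: this connection preserves the metric
    assert all(cov_deriv_g(i, k, p) == 0
               for i in range(2) for k in range(2) for p in range(2))
    ```

    This metric-preserving property will become important shortly, when we discuss which connection GR actually uses.
    
    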

    As a final remark, let me state that in the general case the covariant derivative is not commutative, i.e. the order of differentiation does matter : \(\displaystyle{A_{a||b||c}\neq A_{a||c||b}}\). This fact is of crucial importance, as we will see in the next chapter.

    Now, let us go back for a minute to the initial example of the tangent planes at points on the surface of a sphere. What we have done now is that we have defined a generalisation of the partial derivative which is equally valid on all points of our sphere. To put this in a slightly different way, we have succeeded in finding a way to "connect" the tangent planes with an object ( the covariant derivative ) that compensates for the changes in orientation as we move one plane from point A along some curve into point B. This works for all points along any arbitrary curve on our manifold. This object, our covariant derivative, since in a sense it "connects" tangent planes in a consistent manner, is also called an affine connection. The "affine" here refers, so far as I understand this, to the fact that the connection preserves the notion of parallelism when connecting our tangent planes - this is what the correction terms in the covariant derivative do. As Wikipedia ( lol ) will confirm for you, an affine connection can always be defined as the covariant derivative on the tangent bundle, i.e. on the set of all tangent spaces on a manifold, just like we have done.

    But we are not done yet. Let us think about this a bit more - our initial motivation was that tangent vectors at different points on a general manifold aren't necessarily parallel, unlike in the special case of the Euclidean flat space. Rather, they are at an angle. How do we arrive at such an angle between two tangent planes ? Well, one way is to smoothly "roll" a tangent plane from point A to point B along a curve without letting it slip or twist along the way. If our surface is flat, the tangent planes will of course coincide. If the surface is not flat, as is the case for our sphere, the tangent planes will be at an angle relative to each other. So far so intuitive. However, there is another way to arrive at the same result - instead of "rolling" the tangent plane along a curve on a curved surface, we could just as well pretend that the surface is actually flat, and instead "twist" the tangent plane around a straight line on a flat surface. The end result is the same - two planes at an angle relative to each other. By the same token we could combine these two operations, and partly "roll", partly "twist" the plane, to arrive at always the same angle. In fact, we see now that there are infinitely many combinations of "rolling" and "twisting", which all yield the same result !

    ( continued in next post )
     
  10. Markus Hanke Registered Senior Member

    ( Connections and Covariant Derivatives - continued )

    What this shows us is that there is not just one way of "connecting" tangent planes on a given manifold, but actually infinitely many, all characterized by suitable combinations of "rolling" and "twisting" them along some general curve. In a more formal manner, the "rolling without slipping or twisting" reflects curvature, whereas the "twisting" reflects the notion of torsion. Both of these are properties of the connection which we choose to connect tangent spaces at points on our manifold. We immediately see that there are infinitely many connections for any given manifold - we can choose any suitable combination of curvature and torsion to define a connection between tangent spaces.

    But of course, we want to make our life as easy and as intuitive as possible. For one thing, if the end result is the same, we might as well pick a connection that is not a mix of curvature and torsion, but either pure torsion, or pure curvature; we expect that this will simplify our field equations later ( since less information is required ). And indeed it does, as we shall see. For general relativity, Einstein has chosen for us the condition that torsion must vanish on our connection; this is mere convention. The other condition which we can apply is that the connection cannot affect the inner product between vectors, as we expect tangent vectors at different points to vary only in terms of direction ( which the connection compensates for ), but not in length. In other words, the metric must be "parallel" at all points of the manifold, in that measurements of invariant quantities do not differ. It can be shown that these two conditions specify exactly one, unique connection on our Riemann manifold - it is called the Levi-Civita connection. Its properties are that it is everywhere torsion-free, and that it preserves the metric everywhere. Since GR is based on this connection, we will obtain space-times which possess curvature but no torsion. The Levi-Civita connection is given by our covariant derivative, and therefore explicitly by the Christoffel symbols defined earlier; these symbols are therefore also called connection coefficients.
    We could have gone the other way and chosen a connection which is metric-preserving but everywhere curvature-free ( only has torsion ); this is called the Weitzenböck connection. Einstein himself considered this possibility in his later years, which resulted in a model called Teleparallelism. On this thread we will deal only with General Relativity, i.e. going forward we will always assume a Levi-Civita connection, which has only curvature but no torsion.

    So much for connections and the covariant derivative. I hope you are still with me - this was not an easy post, since I find explaining these notions without "losing" the reader along the way very challenging. These aren't very intuitive concepts, so I can only hope that I was able to at least adequately explain them. One of the most important points to take away from here is the fact that curvature is a property of the connection on our space-time, and not directly a property of the metric tensor ( even though this tensor ultimately "encodes" this geometric information ). We can start with the same metric tensor and arrive at a completely flat space-time that contains only torsion simply by choosing a different connection. That is why I stated in chapter (5) that the metric tensor is not curvature. We will see in the next post that curvature is quantitatively defined and calculated purely via the Christoffel symbols, i.e. the connection.

    We now possess the mathematical tools to obtain a quantitative measure of the geometric properties of our Riemann manifolds, specifically those equipped with the Levi-Civita connection, since this is the connection Einstein chose to use in his theory of GR. In the next post we will therefore investigate how to quantify curvature, and introduce the Riemann curvature tensor and its contractions, the Ricci tensor and Ricci scalar. Once that is done we can at long last talk about the physics of GR !
     
  11. Markus Hanke Registered Senior Member

    7. CURVATURE

    The last chapter was a bit of a beast, and probably the most difficult to understand conceptually. I hope everyone is still with me. Even if you haven't understood much of covariant derivatives and connections, just take with you this very basic information :

    1. Covariant derivatives generalise the notion of partial derivatives to non-trivial manifolds in a consistent manner
    2. The covariant derivative can also be thought of as a way to connect tangent planes at different points, and is thus also called a connection
    3. There are infinitely many possible connections for any given manifold
    4. The principal invariants of a connection are its curvature and its torsion. Curvature is thus defined by the connection, not the metric tensor.
    5. In GR we deal with the Levi-Civita connection, where all torsion vanishes

    We will now take a closer look at how to quantify the notion of curvature. Let us once again consider the scenario from post 16, where we were talking about a flat sheet of paper, and drew a tangent vector at any point on that ( rather trivial ) manifold. Obviously, the tangent plane ( the set of all possible tangent vectors ) just coincided with the manifold itself. If we start at point A, draw our tangent vector, and then transport that vector along some arbitrary closed curve in such a way that it remains parallel to the original one, eventually returning to point A, we will find that the transported vector exactly coincides with the original vector we started with. Such a procedure is called parallel transport. The outcome is no surprise, since all tangent planes at all points are just the same as the manifold itself on our flat Euclidean sheet.
    But what happens if we do the same on the surface of a sphere ? Imagine a sphere, and pick three arbitrary points A,B and C on it. Now draw a tangent vector at point A, and start parallel transporting that vector along the curve A -> B -> C -> A. Here is a handy interactive visualisation tool :

    http://demonstrations.wolfram.com/ParallelTransportOnA2Sphere/
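    If you prefer numbers over pictures, here is a numerical sketch of the same idea ( plain Python with a hand-written RK4 integrator - a toy calculation, not production code ) : we parallel transport a tangent vector once around a circle of latitude \(\theta _0\) on the unit sphere, using the transport equation \(dA^{i}/d\phi =-\Gamma _{j\phi }^{i}A^{j}\) and the sphere's Christoffel symbols. The closed-form result is a rotation by the angle \(2\pi \cos \theta _0\), so the vector only returns to itself on the equator ( \(\theta _0=\pi /2\) ) :

    ```python
    # A sketch: parallel transport around a circle of latitude theta0 on the
    # unit sphere, integrating dA^i/dphi = -Gamma^i_{j phi} A^j with RK4.
    # After one full loop the vector is rotated by 2*pi*cos(theta0).
    import math

    def transport(theta0, steps=10000):
        cot = math.cos(theta0) / math.sin(theta0)
        def rhs(A):
            # Gamma^theta_phiphi = -sin*cos, Gamma^phi_thetaphi = cot(theta0)
            return (math.sin(theta0) * math.cos(theta0) * A[1], -cot * A[0])
        A = (1.0, 0.0)                    # start with A = e_theta
        h = 2 * math.pi / steps
        for _ in range(steps):
            k1 = rhs(A)
            k2 = rhs((A[0] + h/2*k1[0], A[1] + h/2*k1[1]))
            k3 = rhs((A[0] + h/2*k2[0], A[1] + h/2*k2[1]))
            k4 = rhs((A[0] + h*k3[0], A[1] + h*k3[1]))
            A = (A[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
                 A[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
        return A

    A = transport(math.pi / 3)            # latitude 60 degrees from the pole
    # closed form: A^theta = cos(2*pi*cos(theta0)) = cos(pi) = -1
    print(A[0])
    ```

    For \(\theta _0=\pi /3\) we get a rotation by \(2\pi \cos \theta _0=\pi\), i.e. the transported vector comes back pointing in exactly the opposite direction - a drastic illustration of the path dependence of parallel transport on a curved surface.
    
    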

    As you will see in the above visualisation, if we parallel transport tangent vectors along closed curves on a sphere, the resulting vector will not coincide with the original vector we started with. This is precisely because the geometry of the surface deviates from a flat Euclidean space; since in GR we only deal with the Levi-Civita connection, we can say without further ado that this deviation from Euclidean geometry is a measure of curvature. More practically, we can define a measure for a manifold's curvature by examining how much parallel-transported tangent vectors deviate from the original vector. To do that, let us make the closed curve along which we parallel-transport a vector infinitesimally small. Without proof and a good bit of handwaving I am telling you now that the difference between the vectors is quantified by the failure of the covariant derivative to commute - in mathematical language this is just the difference

    \(\displaystyle{A_{i||k||p}-A_{i||p||k}=?}\)

    This difference vanishes on flat Euclidean spaces ( like our flat sheet from before ), but it does not in general vanish for more complicated manifolds. Now, what does this difference evaluate to ? Let us look at the relation more closely; as it is written now, it is a linear combination of two objects which contain three indices each. In order for an equation like this to make any sense we expect the right hand side ( "?" ) to leave us with three indices also. On the other hand - above we have defined this whole notion as the difference between two vectors after parallel transport; the difference between two vectors is of course also a vector ! So, we need to find some combination which involves a vector, but still leaves us with three indices. The simplest such combination involves a summation over one index, and is

    \(\displaystyle{A_{i||k||p}-A_{i||p||k}=R{^{m}}_{ikp}A_{m}}\)

    Again without rigorous proof I can tell you that the left hand side actually behaves like a tensor of rank 3; on the right hand side we see a vector, and an object with four indices. If we got a rank-3 tensor on the left, and a vector on the right, then the object in between must be a rank-4 tensor ( handwave, handwave... ). See any differential geometry textbook for rigorous proofs. The object \(R{^{m}}_{ikp}\) is called the Riemann curvature tensor; it contains all information about the intrinsic geometry of our manifold at each point. Indeed, if we were to choose a different connection which permits torsion too, then the Riemann curvature tensor would also contain information about the torsion on our manifold.
    From the definition of the covariant derivative in the previous chapter, we can obtain an explicit expression for the Riemann tensor in terms of the connection coefficients :

    \(\displaystyle{R{^{m}}_{ikp}=\partial _{k}\Gamma _{pi}^{m}-\partial _{p}\Gamma _{ki}^{m}+\Gamma _{k\lambda }^{m}\Gamma _{pi}^{\lambda }-\Gamma _{p\lambda }^{m}\Gamma_{ki}^{\lambda }}\)

    As you can see this is just a combination of the Christoffel symbols and their derivatives. In four dimensions this rank-4 tensor technically has 256 components, but due to the fact that torsion vanishes in GR, and due to various other symmetries this tensor has, it can be shown that only 20 of these components are actually functionally independent.

    One can form two other tensors by simply contracting a pair of indices - the first one is the Ricci tensor :

    \(\displaystyle{R_{\mu \nu }=R{^{\lambda }}_{\mu \lambda \nu }}\)

    and then by further contraction the Ricci scalar :

    \(\displaystyle{R=R{^{\lambda }}_{\lambda }}\)

    The geometrical interpretation of these objects is as follows :

    1. The Riemann curvature tensor contains all information as to the geometry of our manifold at each point. It is the base object from which all other curvature tensors can be derived. It can be thought of as a measure of failure of tangent vectors to coincide with themselves after having been parallel transported along some infinitesimal curve in the vicinity of each point on the manifold.
    2. The Ricci tensor is a measure of how the volume of a sphere at a given point deviates from the volume of a similar sphere in flat Euclidean space.
    3. The Ricci scalar measures the deviation of the sum of angles in a triangle compared to a similar triangle in flat Euclidean space ( i.e. 180 degrees ).

    In a way all of these objects are measures of how much the geometry of our manifold deviates from Euclidean geometry. There are yet other notions of curvature ( e.g. Weyl curvature, which is a measure of how shapes change compared to Euclidean geometry ), but again, all of these derive in one way or another from the Riemann curvature tensor.
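    To tie all of this together, here is a symbolic sketch ( sympy again, same assumption as before ) which computes the Riemann tensor, the Ricci tensor and the Ricci scalar from the Christoffel symbols for a sphere of radius r; the well-known result is a constant Ricci scalar of \(2/r^2\) :

    ```python
    # A sketch, assuming sympy: Riemann tensor, Ricci tensor and Ricci
    # scalar of a sphere of radius r, built up from the Christoffel symbols.
    import sympy as sp

    theta, phi = sp.symbols('theta phi')
    r = sp.symbols('r', positive=True)
    x = [theta, phi]
    n = 2
    g = sp.Matrix([[r**2, 0], [0, r**2 * sp.sin(theta)**2]])
    gi = g.inv()

    def Gamma(m, k, p):
        return sp.Rational(1, 2) * sum(
            gi[m, i] * (sp.diff(g[i, k], x[p]) + sp.diff(g[i, p], x[k])
                        - sp.diff(g[k, p], x[i])) for i in range(n))

    def Riemann(m, i, k, p):
        # R^m_ikp = d_k Gamma^m_pi - d_p Gamma^m_ki
        #           + Gamma^m_kl Gamma^l_pi - Gamma^m_pl Gamma^l_ki
        return sp.simplify(
            sp.diff(Gamma(m, p, i), x[k]) - sp.diff(Gamma(m, k, i), x[p])
            + sum(Gamma(m, k, l) * Gamma(l, p, i) for l in range(n))
            - sum(Gamma(m, p, l) * Gamma(l, k, i) for l in range(n)))

    # Ricci tensor by contraction, Ricci scalar by further contraction
    Ricci = sp.Matrix(n, n, lambda mu, nu: sp.simplify(
        sum(Riemann(l, mu, l, nu) for l in range(n))))
    R_scalar = sp.simplify(sum(gi[mu, nu] * Ricci[mu, nu]
                               for mu in range(n) for nu in range(n)))
    print(R_scalar)   # 2/r**2 : constant positive curvature, as expected
    ```

    Note how the curvature falls off as \(1/r^2\) - the bigger the sphere, the "flatter" it looks locally, exactly as intuition suggests.
    
    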

    So there you have it. We now have a way to quantify geometric properties, specifically curvature, on our manifolds; the next step towards GR will be to somehow relate this to the notion of energy in all its various forms. Before we do that though, we will need to consider how curves and trajectories on our manifolds are described; this leads us to the notion of geodesics, which we will discuss in the next section.
     
  12. Markus Hanke Registered Senior Member

    8. WORLDLINES AND GEODESICS

    In General Relativity, particles and bodies are no longer modelled as moving through space ( as is the case in Newtonian dynamics ), but as stationary worldlines in space-time. In our everyday world we can only see one "slice" of space-time at any given moment, and a succession of such slices within a given period ( much like rapidly flipping through the pages in a book ), giving us the impression that things around us "move" in space. In fact, they are stationary 4-dimensional entities in a 4-dimensional space-time, of which we can see only a succession of "slices". For example, the moon moving in a ( roughly ) circular orbit around earth in the everyday world of our senses would be described as a static helix in 4-dimensional space-time.
    Interestingly, these "slices" are well ordered; we don't suddenly jump backwards to an earlier "slice", then back forward again...we perceive a well ordered succession. Somehow, there is a preferred direction along the time axis, being the future. This is an interesting topic, which is further explored in a model called Causal Dynamical Triangulations where this is the result of geometric considerations, which is however outside the scope of this primer.

    Mathematically, worldlines are 4-dimensional vector functions, the components of which are parametrized by some parameter, like so :

    \(\displaystyle{x^{\mu }(\lambda)}\)

    This is analogous to Newtonian dynamics, but in four dimensions instead of three. What this means is simply that we have a vector with four components, each one of which is dependent on the same parameter \(\lambda\). Choosing a value for \(\lambda\) then specifies exactly one unique point on our worldline. This is all pure maths - it is up to us to determine a physical meaning for the parameter; often it makes sense, for example, to associate it with proper time along a worldline, since integrating the line element over a range of this parameter gives the arc length of that piece of the worldline. More on this later.

    Worldlines can be any lines on a given manifold, connecting any points; we have not imposed any restrictions on them. But what if we wish to find the shortest curve between two given points in space-time ? In Euclidean space this is just a straight line, but on a general 4-manifold with curvature and/or torsion the situation is not so straightforward. For example, we all know that the shortest connection between two points on the surface of a sphere is a segment of a great circle. Clearly, we have to generalise the notion of "shortest connection" in a way that takes into account the intrinsic geometry of the manifold we are on.

    So let us consider some general, non-trivial manifold with curvature, in any number of dimensions. We pick two points on that manifold, A and B, and draw a connecting curve between them. We know already that the arc length of that curve will be given by

    \(\displaystyle{s=\int_{A}^{B}ds=\int_{A}^{B}\sqrt{g_{\mu \nu }dx^{\mu }dx^{\nu }}}\)

    as we found out when we talked about line elements and the metric tensor earlier on. In order to find the shortest possible route from A to B we turn the above into a problem of variational calculus by stating

    \(\displaystyle{\delta s=0}\)

    i.e. we wish to find an extremum of the arc length functional. To do so we can make full and good use of the well established principles of variational calculus, and first define a Lagrangian function :

    \(\displaystyle{L\left ( \dot{x}(\lambda ),x(\lambda ) \right )=const\cdot \sqrt{g_{\mu \nu }\dot{x}^{\mu }\dot{x}^{\nu }}}\)

    The constant can be chosen by imposing the condition that this reduces to the usual Newtonian dynamics in the absence of curvature, which yields const = -mc. The above expression can then be inserted in the usual Euler-Lagrange equations

    \(\displaystyle{\frac{\mathrm{d} }{\mathrm{d} \lambda }\frac{\partial L}{\partial \dot{x}^{\kappa }}=\frac{\partial L}{\partial x^{\kappa }}}\)

    This can be explicitly evaluated; I skip the maths, since they are pretty tedious, and simply state the end result :

    \(\displaystyle{\frac{\mathrm{d^2} x^{\kappa }}{\mathrm{d} \lambda ^2}=-\Gamma _{\mu \nu }^{\kappa }\frac{\mathrm{d} x^{\mu }}{\mathrm{d} \lambda }\frac{\mathrm{d} x^{\nu }}{\mathrm{d} \lambda }}\)

    These are the so-called geodesic equations. Given appropriate boundary conditions this set of differential equations allows us to calculate the geodesics ( i.e. the shortest, or more generally extremal, curves between points ) on any given manifold in any number of dimensions. A trivial example would be flat Euclidean space - on such a manifold there is no curvature, so in Cartesian coordinates all Christoffel symbols vanish, leaving us simply with

    \(\displaystyle{\frac{\mathrm{d^2} x^{\kappa }}{\mathrm{d} \lambda ^2}=0}\)

    the solutions of which are of course straight lines, as expected.
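    A slightly less trivial illustration ( plain Python with a simple Euler integration - a qualitative sketch, not production code ) : integrating the geodesic equations on the unit sphere, using the Christoffel symbols we met earlier. Starting on the equator and heading due east, the solution should stay on the equator, since great circles are the geodesics of a sphere :

    ```python
    # A sketch: integrate the geodesic equations on the unit sphere with
    # coordinates (theta, phi). The nonzero Christoffel symbols give
    #   theta'' =  sin(theta)cos(theta) (phi')^2
    #   phi''   = -2 cot(theta) theta' phi'
    import math

    def geodesic(theta0, dtheta0, dphi0, lam_max=2 * math.pi, steps=10000):
        th, ph, dth, dph = theta0, 0.0, dtheta0, dphi0
        h = lam_max / steps
        for _ in range(steps):
            ddth = math.sin(th) * math.cos(th) * dph**2
            ddph = -2.0 * (math.cos(th) / math.sin(th)) * dth * dph
            th, ph = th + h * dth, ph + h * dph
            dth, dph = dth + h * ddth, dph + h * ddph
        return th, ph

    # start on the equator (theta = pi/2), moving purely in the phi direction
    th, ph = geodesic(math.pi / 2, 0.0, 1.0)
    print(th)   # stays at pi/2 along the whole curve: the equator is a geodesic
    print(ph)   # advances to 2*pi: one full trip around the great circle
    ```

    Starting with any tilt away from the equator, the same code traces out a different great circle - the coordinate components then oscillate, but the curve remains an extremal one.
    
    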

    Physically, geodesics are trajectories which are taken by particles under the influence of gravity only, in other words, by particles in free fall. This is very important, so take careful note of it : particles in free fall trace out geodesics in space-time. There are three types of geodesics :

    1. Time-like geodesics : these are geodesics between events in space-time that are causally connected. In other words, these are "ordinary" geodesics traced out by "ordinary" particles in free fall, as we would see them in the real world.
    2. Null geodesics : These are geodesics which are traced out by massless particles. Most notably, these are the curves light propagates on. This type of geodesic forms the boundary between causally connected and disconnected events in space-time; any particle moving on such a geodesic can only do so moving exactly at the speed of light, and no proper time elapses along it.
    3. Space-like geodesics : These are geodesics between events which are not causally connected. In order to trace out such a geodesic one would need to be moving at superluminal speeds, which is obviously unphysical.

    The above geodesic equation forms, together with the yet to be developed field equations, one of the cornerstones of GR. From a practical point of view these are just the equations of motion of GR, which we can evaluate once a metric tensor is given to us. This tells us another very important principle - in GR, the trajectories of freely falling particles are entirely determined by the geometry of space-time; no forces are required. It is of course possible to define the notion of forces ( e.g. through the relative acceleration between particles as they free fall ), but that is a secondary phenomenon, and not intrinsic to the dynamics of GR.

    In the next chapter we will take a look at how energy is described in GR, which, at long last, will then enable us to formulate a relation between energy and space-time geometry. That relation is given by the Einstein Field Equations.
     
  13. Markus Hanke Registered Senior Member

    9. ENERGY IN GENERAL RELATIVITY

    The final piece of the puzzle ( promise ! ) before we can derive the famous Einstein Field Equations is a covariant description of energy in all its forms. Not surprisingly, this turns out to be a tensor, more specifically a symmetric rank-2 tensor. It is called the Stress-Energy-Momentum tensor ( abbreviated SEM tensor ), and is usually denoted \(T^{\mu \nu}\) as a matter of convention. Its geometrical interpretation is that it represents the flux of the \(\mu\)th component of the relativistic 4-momentum vector across a surface of constant \(x^{\nu}\) coordinate; its various components can thus be taken to have the following meanings :

    \(T^{00}\) : Energy density
    \(T^{n0}, T^{0n}\) : Momentum densities ( n=1..3 )
    \(T^{nn}\) : Pressure ( n=1..3 )
    \(T^{12},T^{13},T^{23}\) : Shear stress
    \(T^{21},T^{31},T^{32}\) : Momentum flux

    The SEM tensor as it is used in GR is symmetric in its indices, i.e. \(T^{\mu \nu}=T^{\nu \mu}\). The total energy-momentum of any given system is simply the sum of all contributing forms of energy ( note that the energy of the gravitational field itself does not appear here - it is not part of the SEM tensor ), i.e.

    \(\displaystyle{T^{\mu \nu }=T_{matter}^{\mu \nu }+T_{em}^{\mu \nu }+...}\)

    In classical ( closed ) systems total energy is always conserved; this corresponds to the condition that

    \(\displaystyle{\partial _{\nu }T^{\mu \nu }=0}\)

    In GR we are dealing with non-inertial coordinate systems, so we have to generalise this using the covariant derivative to

    \(\displaystyle{T{^{\mu \nu }}_{||\nu }=0}\)

    The physical meaning, namely that total energy remains conserved, is the same, but the above version is valid for all observers, whether inertial or not. This is also called the continuity equation.

    Let us give two specific examples which we will need later on; the first one is the SEM tensor of an ideal fluid in equilibrium :

    \(\displaystyle{T^{\mu \nu }=\left ( \rho +\frac{P}{c^2} \right )u^{\mu }u^{\nu }-g^{\mu \nu }P}\)

    Herein P denotes pressure, \(\rho\) is the density, and u is the fluid's 4-velocity.
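    To make this concrete, here is a tiny numerical sketch ( numpy, with made-up illustrative values for \(\rho\) and P, and c = 1 ) evaluating this expression in flat Minkowski space-time for a fluid at rest; it reduces, as it should, to a diagonal tensor carrying the energy density and an isotropic pressure :

    ```python
    # A sketch: the ideal-fluid SEM tensor from the text, evaluated in flat
    # Minkowski space-time (signature +,-,-,-) for a fluid at rest, with
    # illustrative values rho = 2, P = 3 and c = 1 (geometrised units).
    import numpy as np

    c, rho, P = 1.0, 2.0, 3.0
    g = np.diag([1.0, -1.0, -1.0, -1.0])   # g^{mu nu} of flat space-time
    u = np.array([c, 0.0, 0.0, 0.0])       # 4-velocity of a fluid at rest

    # T^{mu nu} = (rho + P/c^2) u^mu u^nu - g^{mu nu} P
    T = (rho + P / c**2) * np.outer(u, u) - g * P

    print(T)
    # reduces to diag(rho*c^2, P, P, P): energy density in T^00,
    # isotropic pressure on the spatial diagonal, no shear or momentum flux
    ```

    In the non-relativistic limit \(P/c^2 \ll \rho\), the pressure contribution to the time-time component becomes negligible and \(T^{00}\) is dominated by the rest mass-energy density, which is why Newtonian gravity gets away with sourcing the field by \(\rho\) alone.
    
    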

    The other example is the SEM tensor for an electromagnetic field in vacuum :

    \(\displaystyle{T^{\mu \nu }=\frac{1}{\mu _{0}}\left ( F^{\mu \alpha }g_{\alpha \beta }F^{\nu \beta }-\frac{1}{4}g^{\mu \nu }F_{\delta \gamma }F^{\delta \gamma } \right )}\)

    wherein F is the electromagnetic field tensor defined through the potential A as

    \(\displaystyle{F_{\mu \nu }=\partial _{\mu }A_{\nu }-\partial _{\nu }A_{\mu }}\)

    Now that we have a way to describe the energy of a system, we are at long last in a position to derive a relation between energy and space-time curvature. This relation is the Einstein Field Equations, which we will obtain in the next chapter.
     
  14. Markus Hanke Registered Senior Member

    10. THE EINSTEIN FIELD EQUATIONS

    Now that we have all the tools we need, we can go ahead and derive the Einstein Field Equations ( EFEs ), which, together with the geodesic equations, determine the dynamics of GR as a theory. There are different ways to derive these equations; here we will follow roughly the route Einstein himself took.

    Let's start with Newton's gravity, the field equations of which are

    \(\displaystyle{\nabla ^2\phi =4\pi G\rho}\)

    This is of the general form div(gravity field) = constant * mass density. The generalisation of this equation should, in the spirit of GR, be a covariant equation, i.e. an equation which is valid independently of any particular choice of coordinates. In other words, we are expecting to find a tensor equation. Given that, we can guess at what the right hand side should be, because a tensor which contains terms for mass density is just the SEM tensor introduced in the last chapter. We leave the left hand side undetermined for now :

    \(\displaystyle{\Delta =\kappa T_{\mu \nu }}\)

    wherein \(\kappa\) is some proportionality constant. For this type of equation to make any sense, the left hand side must also be a rank-2 tensor, or else the equation can't hold. Furthermore, we know that the SEM tensor is symmetric under exchange of indices, thus the left hand side must exhibit the same symmetry. Lastly, given that GR is a geometric theory, we would expect the left hand side to be made up of the metric tensor and its ordinary and covariant derivatives. The easiest "setup" that fulfills all of these conditions is a linear combination of the Ricci tensor, the Ricci scalar, and the metric tensor itself. We thus try the following ansatz :

    \(\displaystyle{R_{\mu \nu }+\lambda g_{\mu \nu }R+\Lambda g_{\mu \nu }=\kappa T_{\mu \nu}}\)

    wherein \(\lambda\) and \(\Lambda\) are as yet unknown constants. To determine those constants we can first apply the energy conservation condition \(\displaystyle{T{^{\mu \nu }}_{||\nu }=0}\), giving us :

    \(\displaystyle{\left ( R^{\mu \nu }+\lambda g^{\mu \nu }R \right )_{||\nu }=0}\)

    Bear in mind here that the covariant derivative of the metric tensor itself vanishes. The above expression can be directly evaluated to determine the constant \(\lambda\); I will skip the maths here since it is very tedious without giving much physical insight. The result is

    \(\displaystyle{\lambda =-\frac{1}{2}}\)

    which, once inserted back into our ansatz, yields

    \(\displaystyle{R_{\mu \nu }-\frac{1}{2}g_{\mu \nu }R+\Lambda g_{\mu \nu }=\kappa T_{\mu \nu }}\)

    These are the Einstein Field Equations in their full form. The constant \(\kappa\) can be calculated from the fact that these field equations need to reduce to the usual Newtonian theory in the non-relativistic limit, and evaluates to

    \(\displaystyle{\kappa =\frac{8\pi G}{c^4}}\)

    \(\Lambda\) is called the cosmological constant, and can physically be interpreted as being a measure of vacuum energy density across the universe. It only plays a role in cosmological solutions of the field equations, and can be taken to vanish in all other cases. If, as a matter of notational convenience, we introduce the so-called Einstein tensor

    \(\displaystyle{G_{\mu \nu }=R_{\mu \nu }-\frac{1}{2}g_{\mu \nu }R}\)

    and disregard the cosmological constant for now, we can write the field equations as

    \(\displaystyle{G_{\mu \nu }=\kappa T_{\mu \nu }}\)

    I will also give you an alternative ( but completely equivalent ) form of the equations, the trace-reversed field equations ( the overall minus sign below is tied to the sign convention adopted for the Ricci tensor ) :

    \(\displaystyle{R_{\mu \nu }=-\frac{8\pi G}{c^4}(T_{\mu \nu }-\frac{1}{2}Tg_{\mu \nu })}\)

    This form of the EFEs can be handy in certain cases, as we will see in the next chapter.
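    For completeness, here is the trace reversal itself, sketched in two lines : contract \(\displaystyle{G_{\mu \nu }=\kappa T_{\mu \nu }}\) with \(g^{\mu \nu }\), using \(g^{\mu \nu }g_{\mu \nu }=4\) :

    \(\displaystyle{g^{\mu \nu }\left ( R_{\mu \nu }-\frac{1}{2}g_{\mu \nu }R \right )=R-2R=-R=\kappa T}\)

    Substituting \(R=-\kappa T\) back into the field equations then gives

    \(\displaystyle{R_{\mu \nu }=\kappa \left ( T_{\mu \nu }-\frac{1}{2}Tg_{\mu \nu } \right )}\)

    With the opposite sign convention for the Ricci tensor ( the convention used in the explicit component calculations later in this thread ) an overall minus sign appears in front of \(\kappa\); the physical content is the same either way.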

    Since all the indices run 0...3, this is a system of 16 coupled, non-linear partial differential equations. However, both sides of the equation are symmetric under exchange of the indices, so in general only 10 of these equations are functionally independent. This number can be reduced further by imposing additional symmetries for particular physical scenarios, as we will see later on. Remember that the unknowns these equations are solved for are the components of the metric tensor.

    Because the EFEs are non-linear and coupled, they are generally very difficult to solve. There is no general solution to these equations, and no prescribed "recipe" for finding solutions. There are a number of exact analytical solutions for special cases, and quite a few numerical solutions obtained with simulation software; there is also a way to linearize the equations, which, if I find the time, we might have a look at in a future post. The general meaning of the field equations, however, is quite clear just by looking at their general form :

    Space-time Geometry = Energy Content

    And this is exactly what GR is all about; gravity is not described in terms of forces, but as a geometric property of space-time itself. Space-time curvature and energy content are one and the same thing; we can interpret this as energy "curving" space-time, or equivalently as curvature manifesting itself as energy, e.g. mass.

    So there you have it. With the above field equations and the geodesic equations derived earlier the Theory of General Relativity is uniquely determined. This concludes the first part of our presentation.
     
    Last edited: May 17, 2013
  15. eram Sciengineer Valued Senior Member

    Messages:
    1,877
    Cool, excellent material.

    Next time, when Farsight tries to argue you can always direct him here. Though he might just say "you know f*** all about physics."

    Perhaps I could use this to "sound clever" too.

     
  16. Markus Hanke Registered Senior Member

    Messages:
    381
    Well, after several years of having been active on forums ( public and invitation-only alike ) under the same user name, my level of knowledge or lack thereof is simply a matter of public record. That is why I don't generally feel the need to respond to allegations of "you don't know physics". People can just Google me, and form their own opinions based on what they find.

    That's pretty pointless. He'd just say I hide behind the maths, or leave some comment about "showing off".

    P.S.: I am not finished here yet, there's more to come, specifically the actual physics of GR. At the moment though I don't have the time.
     
    Last edited: May 16, 2013
  17. ash64449 Registered Senior Member

    Messages:
    795
    Wow!! It would be great if I could understand!!!
     
  18. Markus Hanke Registered Senior Member

    Messages:
    381
    Go through it again carefully, from post #1 onwards in order. You should be able to at least get the main ideas, if not all the maths details...
     
  19. eram Sciengineer Valued Senior Member

    Messages:
    1,877
    Well, we haven't heard from him for a while, which could be a good thing.

    Is all this on the Science forums as well?
     
  20. Markus Hanke Registered Senior Member

    Messages:
    381
    It's on The Science Forum. I haven't put it up anywhere else, since other sites use different tags to write LaTeX, so I'd have to change the posts first, and I don't have time for it right now. Perhaps in the near future.
     
  21. eram Sciengineer Valued Senior Member

    Messages:
    1,877
    Reading this, I can say that Einstein was by far the most creative physicist of the early 20th century.

    No wonder he said that "Imagination is more important than knowledge."
     
  22. Markus Hanke Registered Senior Member

    Messages:
    381
    You can learn to solve problems, even problems you haven't encountered before, by developing an understanding of the underlying principles. However, it takes a very special state of mind to perform a paradigm shift, and look at something in a completely different light without compromising its scientific value. You cannot learn to do this. There are few people in history who have had that very special and precious ability, and it is they who are now remembered for laying the foundations of modern physics. Einstein was one of them, and his achievement ( a paradigm shift from the classical mechanics of forces to space-time geometry ) is all the more remarkable because it was largely a one-man show.

    We all understand that GR will one day be superseded by a more powerful model, but Einstein's achievement in performing that paradigm shift will forever stand out as truly remarkable in science history.
     
  23. Markus Hanke Registered Senior Member

    Messages:
    381
    11. THE EXTERIOR SCHWARZSCHILD METRIC

    This post now constitutes the beginning of the second part of my presentation, where I will show you some of the solutions of the EFEs, and what kind of physics they yield; after all, it is the physics we are after, even though we have spent much time thus far discussing the maths behind GR.

    I will begin by giving an explicit example of how to obtain a simple solution to the EFEs - the exterior Schwarzschild metric. We will spend some time on this metric since it is quite important; many scenarios in astrophysics can be approximated using this metric. There are two kinds of Schwarzschild metrics ( SM ), the exterior SM and the interior SM. The former is an example of a vacuum solution to the field equations, i.e. a solution which describes the space-time outside an energy distribution; the latter describes the geometry of space-time in the interior of an energy distribution. Vacuum solutions, in general, are characterized by the fact that the SEM tensor vanishes. I will use the exterior SM to show you how to mathematically obtain an exact solution to the EFEs; in future posts, when I present to you other solutions, I will only give you the end result without explicitly writing down the solution steps. The reason is that, as you will see below, obtaining a solution is extremely tedious even in the simplest of cases, never mind more complicated ones.

    So let's get to work - I will copy most of this from my other thread, since there's no point in reinventing the wheel. Recall first the definitions of the basic entities used in the EFEs :

    Einstein Field Equations ( trace-reversed, without cosmological constant )

    (1) \(\displaystyle{R_{\mu \nu }=-\frac{8\pi G}{c^4}(T_{\mu \nu }-\frac{1}{2}Tg_{\mu \nu })}\)

    Ricci tensor :

    (2) \(\displaystyle{R_{\mu \nu }=R{^{\rho }}_{\mu \rho \nu }=\frac{\partial \Gamma _{\mu \rho }^{\rho }}{\partial x^{\nu }}-\frac{\partial \Gamma _{\mu \nu }^{\rho }}{\partial x^\rho }+\Gamma _{\mu \rho }^{\sigma }\Gamma _{\sigma \nu }^{\rho }-\Gamma _{\mu \nu }^{\sigma }\Gamma _{\sigma \rho }^{\rho }}\)

    Christoffel symbols :

    (3) \(\displaystyle{\Gamma _{\lambda \mu }^{\sigma }=\frac{1}{2}g^{\sigma \nu }(\frac{\partial g_{\mu \nu }}{\partial x^\lambda }+\frac{\partial g_{\lambda \nu }}{\partial x^\mu }-\frac{\partial g_{\mu \lambda }}{\partial x^\nu })}\)

    Contracted Christoffel symbols :

    (4) \(\displaystyle{\Gamma _{\mu \rho }^{\rho }=\frac{\partial \ln\sqrt{\left | g \right |}}{\partial x^{\mu }}}\)
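    As a sanity check on definition (3), here is a minimal numerical sketch ( plain Python with finite differences; the unit 2-sphere metric \(g=\mathrm{diag}(1,\sin^{2}\theta )\) is my own illustrative choice, not part of the Schwarzschild calculation below ) that evaluates Christoffel symbols directly from the definition and reproduces the known results \(\Gamma _{\phi \phi }^{\theta }=-\sin\theta \cos\theta\) and \(\Gamma _{\theta \phi }^{\phi }=\cot\theta\) :

```python
import math

# Christoffel symbols of the unit 2-sphere, computed directly from
# equation (3) using central finite differences. Coordinates: x = (theta, phi).

def metric(x):
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta)**2]]

def metric_inv(x):
    theta, _ = x
    return [[1.0, 0.0], [0.0, 1.0 / math.sin(theta)**2]]

def d_metric(x, lam, h=1e-6):
    # partial derivative of the metric w.r.t. coordinate number lam
    xp, xm = list(x), list(x)
    xp[lam] += h
    xm[lam] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x, sig, lam, mu):
    # Gamma^sigma_{lam mu} = (1/2) g^{sig nu} ( dg_{mu nu}/dx^lam
    #                          + dg_{lam nu}/dx^mu - dg_{mu lam}/dx^nu )
    ginv = metric_inv(x)
    return sum(0.5 * ginv[sig][nu] * (d_metric(x, lam)[mu][nu]
                                      + d_metric(x, mu)[lam][nu]
                                      - d_metric(x, nu)[mu][lam])
               for nu in range(2))

x = (0.7, 0.3)  # arbitrary sample point (theta, phi)
print(christoffel(x, 0, 1, 1))  # Gamma^theta_{phi phi}, should match -sin(0.7)*cos(0.7)
print(christoffel(x, 1, 0, 1))  # Gamma^phi_{theta phi}, should match cot(0.7)
```

    The same brute-force approach works for any metric you can write down, which makes it a handy way to check hand calculations like the ones below.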

    Every solution of the field equations requires an ansatz. The exterior Schwarzschild Metric is based on the following assumptions about the source of the gravitational field :

    1. It is static ( i.e. not changing )
    2. It is spherically symmetric
    3. There is no angular momentum ( i.e. it doesn't rotate )
    4. It is electrically neutral

    Given these, particularly the spherical symmetry condition and the fact that it is static, we can make the following ansatz :

    (5) \(\displaystyle{ds^2=B(r)c^2dt^2-A(r)dr^2-r^2(d\theta ^2+\sin^2\theta \, d\phi ^2)}\)

    with two as yet unspecified functions A(r) and B(r). Our task will be to find these two functions from the field equations.
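    Written out as a matrix, the ansatz (5) corresponds to the diagonal metric ( with the coordinate choice \(x^{0}=ct\), \(x^{1}=r\), \(x^{2}=\theta\), \(x^{3}=\phi\), which is the choice consistent with the Christoffel symbols listed below ) :

    \(\displaystyle{\left ( g_{\mu \nu } \right )=\mathrm{diag}\left ( B(r),\: -A(r),\: -r^{2},\: -r^{2}\sin^{2}\theta \right )}\)

    All the Christoffel symbols and Ricci tensor components that follow are obtained by inserting these four components and their inverses into definitions (3) and (2).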

    In the vacuum outside the source ( \(T_{\mu \nu }=0\) ) the Einstein Field Equations (1) then reduce to

    (6) \(\displaystyle{R_{\mu \nu }=0}\)

    which is a set of partial differential equations for the unknown functions A(r) and B(r).

    The elements of the Christoffel symbols which do not vanish are

    \(\displaystyle{\Gamma_{01}^{0}=\Gamma _{10}^{0}=\frac{B'}{2B}}\)

    \(\displaystyle{\Gamma_{11}^{1}=\frac{A'}{2A}}\)

    \(\displaystyle{\Gamma_{12}^{2}=\Gamma _{21}^{2}=\frac{1}{r}}\)

    \(\displaystyle{\Gamma_{13}^{3}=\Gamma _{31}^{3}=\frac{1}{r}}\)

    \(\displaystyle{\Gamma_{23}^{3}=\Gamma _{32}^{3}=\cot\theta }\)

    \(\displaystyle{\Gamma_{00}^{1}=\frac{B'}{2A} }\)

    \(\displaystyle{\Gamma_{22}^{1}=-\frac{r}{A} }\)

    \(\displaystyle{\Gamma_{33}^{1}=-\frac{r\sin^2\theta }{A}}\)

    \(\displaystyle{\Gamma_{33}^{2}=-\sin\theta \cos\theta }\)

    The non-vanishing elements of the Ricci tensor are then

    \(\displaystyle{R_{00}=-\frac{B''}{2A}+\frac{B'}{4A}(\frac{A'}{A}+\frac{B'}{B})-\frac{B'}{rA} }\)

    \(\displaystyle{R_{11}=\frac{B''}{2B}-\frac{B'}{4B}(\frac{A'}{A}+\frac{B'}{B})-\frac{A'}{rA} }\)

    \(\displaystyle{R_{22}=-1-\frac{r}{2A}(\frac{A'}{A}-\frac{B'}{B})+\frac{1}{A}}\)

    \(\displaystyle{R_{33}=R_{22}\sin^2\theta }\)

    From the above we obtain the system of equations

    \(\displaystyle{R_{00}=0}\)

    \(\displaystyle{R_{11}=0}\)

    \(\displaystyle{R_{22}=0}\)

    \(\displaystyle{R_{33}=0}\)

    We now write

    \(\displaystyle{\frac{R_{11}}{A}+\frac{R_{00}}{B}=-\frac{1}{rA}(\frac{A'}{A}+\frac{B'}{B})=0}\)

    and, doing some algebra, we obtain from this

    \(\displaystyle{A(r)B(r)=const.}\)
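    The "some algebra" here is a single integration, since \(r\) and \(A\) are non-zero :

    \(\displaystyle{\frac{A'}{A}+\frac{B'}{B}=\frac{\mathrm{d}}{\mathrm{d}r}\ln\left ( AB \right )=0\;\;\Rightarrow \;\;AB=\mathrm{const.}}\)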

    We also know that the gravitational field vanishes at infinity, i.e. space-time must become flat for \(\displaystyle{r \to \infty }\); we therefore demand

    \(\displaystyle{A(r)\xrightarrow[]{r \to \infty }1}\)

    \(\displaystyle{B(r)\xrightarrow[]{r \to \infty }1}\)

    and therefore

    \(\displaystyle{A(r)=\frac{1}{B(r)}}\)

    Now we can insert this into the remaining equations :

    \(\displaystyle{R_{22}=-1+rB'+B=0}\)

    \(\displaystyle{R_{11}=\frac{B''}{2B}+\frac{B'}{rB}=\frac{1}{2rB}\frac{\mathrm{d} R_{22}}{\mathrm{d} r}=0}\)
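    The first of these equations can in fact be integrated directly, since \(rB'+B=(rB)'\) :

    \(\displaystyle{\frac{\mathrm{d}}{\mathrm{d}r}\left ( rB \right )=1\;\;\Rightarrow \;\;rB=r+C\;\;\Rightarrow \;\;B(r)=1+\frac{C}{r}}\)

    with an integration constant \(C\), conventionally written as \(C=-2a\); the second equation is then satisfied automatically, being proportional to \(\mathrm{d}R_{22}/\mathrm{d}r\).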

    One can easily verify that these two differential equations are solved by

    \(\displaystyle{B(r)=1-\frac{2a}{r}}\)

    \(\displaystyle{A(r)=\frac{1}{1-2a/r}}\)

    with an integration constant a. This constant is determined by the condition that the solution of the field equations must reduce to the usual Newtonian law in the weak-field limit; therefore

    \(\displaystyle{a=\frac{GM}{c^2}}\)
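    As a cross-check, one can verify numerically that \(B(r)=1-2a/r\) really does satisfy \(R_{22}=-1+rB'+B=0\) at any exterior radius ( a sketch in plain Python; the rounded solar-mass value is an assumption used purely for illustration ) :

```python
# Verify numerically that B(r) = 1 - 2a/r solves R_22 = -1 + r*B'(r) + B(r) = 0.
G = 6.674e-11     # m^3 kg^-1 s^-2 (assumed rounded value)
c = 2.998e8       # m/s (assumed rounded value)
M = 1.989e30      # kg, roughly one solar mass (assumed value)
a = G * M / c**2  # ~1.48e3 m

def B(r):
    return 1.0 - 2.0 * a / r

def R22(r):
    h = 1e-5 * r  # step for the central difference, scaled to r
    B_prime = (B(r + h) - B(r - h)) / (2.0 * h)
    return -1.0 + r * B_prime + B(r)

for r in (1.0e4, 1.0e7, 1.0e11):  # sample exterior radii in metres
    assert abs(R22(r)) < 1e-9
print("R_22 vanishes at all sample radii")
```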

    Putting all this back into the ansatz (5) gives us the solution of the Einstein field equation we were looking for :

    \(\displaystyle{ds^2=(1-\frac{2a}{r})c^2dt^2-\frac{dr^2}{1-\frac{2a}{r}}-r^2(d\theta ^2+\sin^2\theta \, d\phi ^2)}\)

    This is called the Exterior Schwarzschild Metric; it is the simplest non-trivial vacuum solution of the field equations without cosmological constant. We will discuss its properties in more detail in the next post.
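    To give a sense of scale before the next post ( a quick sketch; the rounded solar mass is an assumed value ) : the quantity \(2a=2GM/c^{2}\) appearing in the metric is the Schwarzschild radius, the radius at which the \(dt^{2}\) coefficient vanishes. For the Sun it comes out at about 3 km :

```python
# Schwarzschild radius r_s = 2GM/c^2 of a solar-mass body.
G = 6.674e-11      # m^3 kg^-1 s^-2 (assumed rounded value)
c = 2.998e8        # m/s (assumed rounded value)
M_sun = 1.989e30   # kg (assumed rounded value)

r_s = 2 * G * M_sun / c**2
print(r_s)         # ~2.95e3 m, i.e. roughly 3 km
```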
     
