Sunday, October 8, 2017

The gravitational field

Today we will start implementing the 7-point roadmap in the case of the gravitational field. Technically gravity does not form a gauge theory, but since it was the starting point of Weyl's insight, I will start with this as well, and next time I will show how the program works in the case of the electromagnetic field.

1. The gauge group

The "gauge group" in this case is the group of general coordinate transformations in a real four-dimensional Riemannian manifold M. Now the argument against Diff M as a gauge group comes from locality. An active diffeomorphism can move a state localized near the observer to one localized far away, which can be physically different. However, for the sake of argument I will abuse this today and consider Diff M as a "gauge group" because of the deep similarities (which I will explore in subsequent posts) between this and proper gauge theories like electromagnetism and Yang-Mills.

2. The covariant derivative giving rise to the gauge field

For a vector field \(f^\alpha\) the covariant derivative is defined as follows:

\(D_\rho f^\alpha = \partial_\rho f^\alpha +{\Gamma}^{\alpha}_{\rho\sigma} f^\sigma\)

where \({\Gamma}^{\alpha}_{\rho\sigma}\) is called an affine connection. If we demand that the metric tensor is covariantly constant under D (and that the connection is symmetric in its lower indices) we find that the connection is:

\({\Gamma}^{\sigma}_{\mu\nu} = \frac{1}{2}g^{\sigma\rho}[g_{\rho\mu,\nu} + g_{\rho\nu,\mu} - g_{\mu\nu,\rho}]\)

where \(f_{\rho,\sigma}  = \partial_\sigma f_\rho\) 
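As a small numerical illustration of the connection formula, here is a sketch that evaluates \({\Gamma}^{\sigma}_{\mu\nu} = \frac{1}{2}g^{\sigma\rho}[g_{\rho\mu,\nu} + g_{\rho\nu,\mu} - g_{\mu\nu,\rho}]\) for the unit 2-sphere. The choice of the 2-sphere, the finite-difference step, and all function names are my assumptions for illustration only; the expected values \({\Gamma}^{\theta}_{\phi\phi} = -\sin\theta\cos\theta\) and \({\Gamma}^{\phi}_{\theta\phi} = \cot\theta\) are the standard textbook results.

```python
import math

def metric(x):
    """Metric g_{ab} of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, used here for the inverse metric g^{ab}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(x, k, h=1e-6):
    """Central finite difference approximating g_{ab,k} at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)]
            for a in range(2)]

def christoffel(x):
    """Return gamma[s][m][n] = 1/2 g^{sr} (g_{rm,n} + g_{rn,m} - g_{mn,r})."""
    dim = 2
    g_inv = inverse_2x2(metric(x))
    # dg[k][a][b] holds the derivative g_{ab,k}
    dg = [metric_derivative(x, k) for k in range(dim)]
    gamma = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    for s in range(dim):
        for m in range(dim):
            for n in range(dim):
                gamma[s][m][n] = 0.5 * sum(
                    g_inv[s][r] * (dg[n][r][m] + dg[m][r][n] - dg[r][m][n])
                    for r in range(dim))
    return gamma

theta = 1.0
gamma = christoffel((theta, 0.3))
print("Gamma^theta_{phi phi} =", gamma[0][1][1])
print("Gamma^phi_{theta phi} =", gamma[1][0][1])
```

The derivatives are taken numerically only to keep the sketch free of external libraries; a symbolic package would give the exact expressions.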

3. The integrability condition

We define this condition as the commutativity of the covariant derivative. If we define the notation: \(D_\mu D_\nu f_\sigma = f_{\sigma;\nu\mu}\) we can write this condition as:

\(f_{\rho;\mu\nu} - f_{\rho;\nu\mu} = 0\)

Computing the expression above yields:

\(f_{\rho;\mu\nu} - f_{\rho;\nu\mu} = -f_\sigma {R}^{\sigma}_{\rho\mu\nu}\)
\({R}^{\sigma}_{\rho\mu\nu} = {\Gamma}^{\tau}_{\rho\mu}{\Gamma}^{\sigma}_{\tau\nu} - {\Gamma}^{\tau}_{\rho\nu}{\Gamma}^{\sigma}_{\tau\mu} + {\Gamma}^{\sigma}_{\rho\mu,\nu} - {\Gamma}^{\sigma}_{\rho\nu,\mu}\)

4. The curvature

From above the integrability condition is \({R}^{\sigma}_{\rho\mu\nu} = 0\) and R is called the Riemann curvature tensor.

5. The algebraic identities

The algebraic identities come from the symmetry properties of the curvature tensor, which reduce the 256 components to only 20 independent ones. I am too tired to type the proof of the reduction to 20, but you can easily find the proof online.
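The counting behind the number 20 can be sketched in a few lines. This is the standard argument for the fully lowered tensor of a metric-compatible connection, not a proof: antisymmetry in each index pair, symmetry under exchange of the two pairs, and the cyclic identity. The function name is hypothetical.

```python
from math import comb

def riemann_independent_components(n):
    """Count independent components of R_{abcd} in n dimensions."""
    pairs = comb(n, 2)                    # antisymmetric index pairs [ab]
    symmetric = pairs * (pairs + 1) // 2  # symmetric matrix in the pair indices
    cyclic = comb(n, 4)                   # constraints from R_{a[bcd]} = 0
    return symmetric - cyclic

for n in (2, 3, 4):
    print(n, n ** 4, riemann_independent_components(n))
```

For n = 4 this gives 6 index pairs, 21 entries of a symmetric 6x6 matrix, and one cyclic constraint, in agreement with the closed formula \(n^2(n^2-1)/12\).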

6. The homogeneous differential equations

If we take the covariant derivative of the Riemann tensor we obtain a differential identity known as the Bianchi identity:

\({R}^{\sigma}_{\rho\mu\nu;\tau} + {R}^{\sigma}_{\rho\tau\mu;\nu} + {R}^{\sigma}_{\rho\nu\tau;\mu} = 0\)

7. The inhomogeneous differential equations

This equation is of the form:

geometric concept = physical concept

In this case we use the stress-energy tensor \(T_{\mu\nu}\) and we look for a geometric object with the same mathematical properties: symmetric, divergenceless, and built out of the curvature tensor. The left-hand side is the Einstein tensor:

\(G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\)

The constant of proportionality comes from recovering Newton's gravitational equation in the nonrelativistic limit. In the end one obtains Einstein's equation:

\(G_{\mu\nu} = 8\pi G T_{\mu\nu}\)

Next time I will go through the same process for the electromagnetic field and map the similarities between the two cases. Please stay tuned.

Sunday, September 24, 2017

The Math of Gauge Theories

With a bit of a delay I am resuming the posts on gauge theory and today I will talk about the math involved. 

In gauge theory you consider the base space-time as a manifold and you attach at each point an object called a fiber, forming what is called a fiber bundle. The picture which you should have in mind is that of a rug.

The nature of the fibers is unimportant at the moment, but they should obey at least the properties of a linear space. 

Physically, think of the fibers as internal degrees of freedom at each spacetime point; a physical configuration then corresponds to a definite point along each fiber.

The next key concept is that of a gauge group. A gauge group is the group of transformations which do not affect the observables of the theory. 

Mathematically, the gauge symmetry depends on how we relate points between nearby fibers, and to make this precise we need only one critical step: define a covariant derivative.

Why do we need this? Because an arbitrary gauge transformation does not change the physics, while the ordinary derivative sees both the infinitesimal changes of the fields and the infinitesimal changes due to an arbitrary gauge transformation. Basically, we need to compensate for the derivative of an arbitrary gauge transformation.

If d is the ordinary derivative, let's call D the covariant derivative and their difference (which is a linear operator) is called either a differential connection, a gauge field, or a potential:

A(x) = D - d

A and d act differently: d "sees" the neighbourhood behaviour but ignores the value of the function on which it acts, and A acts on the value but is blind to the neighbourhood behaviour.

The condition we will impose on D is that it must satisfy the Leibniz identity because it is a derivative:

D(fg) = (Df)g+f(Dg)

which in turn demands:

A(fg) = (Af)g+f(Ag)

In general only one part of A may be used to compensate for gauge transformations, and the remaining part represents an external field that may be interpreted as a potential. When no external potentials are involved, A usually respects integrability conditions. Those conditions depend on the concrete gauge theory and we will illustrate this in subsequent posts.

When external fields are present, the integrability conditions are not satisfied and this is captured by what is called a curvature. The name comes from general relativity where lack of integrability is precisely the space-time curvature.

The symmetry properties arising out of the construction of the curvature give rise to algebraic identities.

Next in gauge theories we have the homogeneous and inhomogeneous differential equations. Examples of homogeneous differential equations are the Bianchi identities in general relativity and the two homogeneous Maxwell equations. The inhomogeneous equations are related to the sources of the fields (current in electrodynamics, and the stress-energy tensor in general relativity).

So to recap, the steps used to build a gauge theory are:

1. the gauge group
2. the covariant derivative giving rise to the gauge field
3. integrability condition
4. the curvature
5. the algebraic identities
6. the homogeneous equations
7. the inhomogeneous equations

In the following posts I will spell out this outline first for general relativity and then for electromagnetism. Technically general relativity is not a gauge theory because diffeomorphism invariance cannot be understood as a gauge group, but the mathematical similarities are striking and there is a deep connection between diffeomorphism invariance and gauge theory which I will spell out in subsequent posts. So for now please accept this sloppiness, which will get corrected in due time.

Monday, September 4, 2017

The Bohm-Aharonov effect

Today we come back to gauge theory and continue on Weyl's ideas. With the advent of quantum mechanics Weyl realized that he could reinterpret his change in scale as a change in the phase of the wavefunction. Suppose we make the following change to the wavefunction:

\(\psi \rightarrow \psi e^{ie\lambda/\hbar}\)

The overall phase does not affect the Born rule and we did not change the physics (here \(\lambda\) does not depend on space and time and it is called a global phase transformation). Let's make this phase change depend on space and time: \(\Lambda = \Lambda (x,t) \) and see where it leads. 

To justify this assume we are studying charged particle motion in an electromagnetic field and suppose that \(\Lambda\) corresponds to a gauge transformation for the electromagnetic field potentials \(A\) and \(\phi\):

\(A\rightarrow A + \nabla \Lambda\)
\(\phi \rightarrow \phi - \partial_t \Lambda\)

This should not change the physics and in particular it should not change Schrodinger's equation. To make Schrodinger's equation invariant under a local \(\Lambda\) change we need to add  \(-eA\) to the momentum quantum operator:

\(-i\hbar \nabla \rightarrow -i\hbar \nabla -eA\)
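As a quick check of the spatial part (a sketch, using the sign conventions above and the local phase change \(\psi \rightarrow \psi e^{ie\Lambda/\hbar}\)), apply the shifted operator to the transformed wavefunction with the transformed potential \(A + \nabla\Lambda\):

\((-i\hbar\nabla - eA - e\nabla\Lambda)(\psi e^{ie\Lambda/\hbar}) = e^{ie\Lambda/\hbar}(-i\hbar\nabla + e\nabla\Lambda - eA - e\nabla\Lambda)\psi = e^{ie\Lambda/\hbar}(-i\hbar\nabla - eA)\psi\)

The \(e\nabla\Lambda\) term produced by differentiating the phase is cancelled by the shift in \(A\), so the combination \(-i\hbar\nabla - eA\) only picks up the same overall phase as \(\psi\) itself.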

And the Schrodinger equation of a charged particle in an electromagnetic field reads:

\([\frac{1}{2m}{(-i\hbar\nabla -eA)}^2 + e\phi +V]\psi = i\hbar\frac{\partial \psi}{\partial t}\)

But why do we have the additional \(eA\) term to begin with? Its origin is in the Lorentz force. If \(B = \nabla \times A\) and \(E = -\nabla \phi - \dot{A}\), the Lagrangian takes the form:

\(L = \frac{1}{2} mv^2 - e\phi + ev\cdot A\)

which yields the canonical momenta to be:

\(p_i = \frac{\partial L}{\partial \dot{x}_i} = mv_i + eA_i\)

and adding \(-eA\) to the momenta in the Hamiltonian yields the Lorentz force from Hamilton's equations of motion.

Coming back to Schrodinger's equation, we notice that the electric and magnetic fields E and B do not enter the equation; instead we have the electromagnetic potentials. Suppose we have a long solenoid which has a nonzero magnetic field B inside, and zero magnetic field outside. Outside the solenoid, in classical physics we cannot detect whether or not current flows through the wire. However, the vector potential is not zero outside the solenoid (\(\nabla\times A = 0\) does not imply \(A=0\)) and the Schrodinger equation has different solutions when \(A = 0\) and \(A\ne 0\).

From this insight Bohm and Aharonov came up with a clever experiment to put this to the test: in a double slit experiment, after the slits they proposed to add a long solenoid. Record the interference pattern with no current flowing through the solenoid and repeat the experiment with the current creating a magnetic field inside the solenoid. Since the electrons do not enter the solenoid, from classical physics we should expect no difference, but in quantum mechanics the vector potential is not zero and the interference pattern shifts. Unsurprisingly the experiment confirms precisely the theoretical computation.
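To get a feel for the size of the effect, here is a minimal sketch of the standard result that the relative phase between the two paths around the solenoid is \(\Delta\varphi = e\Phi/\hbar\), where \(\Phi\) is the enclosed magnetic flux. The function names and the solenoid numbers are hypothetical illustration values, not data from the actual experiments; the constants are the exact SI values.

```python
import math

H = 6.62607015e-34          # Planck constant in J s (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)
HBAR = H / (2 * math.pi)

def ab_phase(flux_weber):
    """Relative phase in radians between the two paths enclosing the flux."""
    return E_CHARGE * flux_weber / HBAR

def solenoid_flux(b_tesla, radius_m):
    """Flux through an ideal long solenoid with uniform field inside."""
    return b_tesla * math.pi * radius_m ** 2

# Flux that shifts the interference pattern by exactly one full fringe.
flux_quantum = H / E_CHARGE
print("one-fringe flux h/e in Wb:", flux_quantum)

# Hypothetical solenoid: 1 mT field, 1 micrometer radius.
phase = ab_phase(solenoid_flux(1e-3, 1e-6))
print("fringe shift in units of one period:", phase / (2 * math.pi))
```

The pattern shift is periodic in the flux with period \(h/e\), so only the flux modulo \(h/e\) is observable in this setup.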

There are several important points to be made. First, there is no classical explanation of the effect: E and B are not fundamental, but \(\phi\) and \(A\) are. It is mind boggling that even today there are physicists who do not accept this and continue to look for effects rooted in E and B. Second, the gauge symmetry is not just an accidental symmetry of Maxwell's equations but a basic physical principle which turns out to govern all fundamental forces in nature. Third, the right framework for gauge theory is geometrical and we will explore this in depth in subsequent posts. Please stay tuned.

Due to travel, the next post is delayed 2 days.

Sunday, August 20, 2017

Impressions from Yellowstone

I was on vacation for a week in Yellowstone, so I will put the physics posts on hold because I want to share what I saw. First, the park is simply amazing and I highly recommend visiting if you have the chance. You need at least 3 days as a bare minimum. The main road is shaped like the number 8 and on the west (left) side you get to see lots of fuming hot spots ejecting steam and sulfur.

The colors are due to bacteria and different bacteria live at different temperatures giving the hot spots rings of color.

On the south side you get the geysers and Old Faithful which erupts every 90 minutes.

You need to be there approximately 1 hour before the eruption to get a seat on the benches which surround Old Faithful. There are other geysers but you don't know when they erupt.

On the east side at the bottom of the 8 there is Yellowstone lake which gives rise to Yellowstone river and the Yellowstone canyon. Not much to do at the lake, the water is very cold.  The river forms two large waterfalls and you can visit them on both sides.

Coming north on the east side, you encounter more waterfalls and some bison. If you are lucky you get to see bears in the distance, usually eating a dead moose. By the way, bear spray is a big business ripoff. You can buy one for $50, but you should rent one for $10/day when you hike in the forest. Even better, just buy a $1 bell to wear to let the wildlife know you are there (bears avoid people if they can hear them coming).

You can hike Mt. Washburn (4 hour round trip hike) to get a panoramic view of the park 50 miles in any direction.

There is nothing to see in the east-west part of the road at the middle of the 8, and at the north of the east road there is another road leading east into Lamar Valley. Here is where you see a ton of wildlife: bison, moose, wolves. There are literally thousands of bison in big herds which often cross the road.

Driving in the park is slow (25 mph) due to the many attractions on the side and the traffic jams caused by animals. You need one day for the north part, one day for the south loop, and one day for Lamar Valley.

Yellowstone sits on top of a supervolcano which erupted 7 times in the past: when it erupts it covers half of the US with volcanic ash. There is a stationary hot spot of magma, and because the tectonic plate moves, different eruptions occur in different places. The past eruption locations trace a clear path on the map.

Yellowstone park is located in the caldera (the volcano crater) of the last eruption.

Sunday, August 6, 2017

The origins of gauge theory

After a bit of absence I am back resuming my usual blog activity. However I am extremely busy and I will create new posts every two weeks from now on. I am starting now a series explaining gauge theory and today I will start at the beginning with Hermann Weyl's proposal.

In 1918 Hermann Weyl attempted to unify gravity with electromagnetism (the only two forces known at the time) and in the process he introduced the idea of gauge theory. He expounded his ideas in his book "Space Time Matter", a book which I personally find hard to read. Usually the leading physics people have crystal clear original papers (von Neumann, Born, Schrodinger), but Weyl's book combines mathematical musings with metaphysical ideas in an unclear direction. The impression I got was of a mathematical, physical and philosophical random walk, testing in all possible ways and directions to see where he could make progress. He got lucky and his lack of cohesion saved the day, because he could not spot simple counter arguments against his proposal which could have stopped him cold in his tracks. But what was his motivation and what was his approach?

Weyl liked the local character of general relativity and proposed (for purely philosophical reasons) the idea that all physical measurements are relative. In particular, the norm of a vector should not be thought of as an absolute value, but as a value that can change at various points of spacetime. To compare norms at different points, you need a "gauge", like the device used on train tracks to make sure the rails remain at a fixed distance from each other. Another word he used was "calibration", but the name "gauge" stuck.

So now suppose we have a norm \(N(x)\) of a vector and we do a shift to \(x + dx\). Then:

\(N(x+dx) = N(x) + \partial_{\mu}N dx^{\mu}\)

Also suppose that there is a scaling factor \(S(x)\):

\(S(x+dx) = S(x) + \partial_{\mu}S dx^{\mu}\)

and so, taking \(S(x) = 1\) at the starting point, to first order we get that N changes by:

\(( \partial_{\mu} + \partial_{\mu} S) N dx^{\mu} \)

Since for a second gauge \(\Lambda\), \(S\) transforms like:

\(\partial_{\mu} S \rightarrow \partial_{\mu} S  +\partial_{\mu} \Lambda \)

and since in electromagnetism the potential changes like:

\(A_{\mu}  \rightarrow A_{\mu}  +\partial_{\mu} \Lambda \)

Weyl conjectured that \(\partial_{\mu} S = A_{\mu}\).

However this is disastrous because (as pointed out by Einstein to Weyl on a postcard) it implies that clocks would change their frequencies based on the paths they travel (and since you can make atomic clocks, it implies that atomic spectra are not stable).

Later on, with the advent of quantum mechanics, Weyl changed his idea of scale change into that of a phase change for the wavefunction and the original objections became moot. Still, more needed to be done for gauge theory to become useful.

Next time I will talk about Bohm-Aharonov and the importance of potentials in physics as a segue into the proper math for gauge theory.

Please stay tuned.

Monday, July 10, 2017

The main problem of MWI is the concept of probability

Now it is my turn to present the counter arguments against many worlds. All known derivations of the Born rule in MWI have (documented) issues of circularity: in the derivation the Born rule is injected in some form or another. However the problem is deeper: there is no good way to define probability in MWI.

Probability can be defined either in the frequentist approach, as the limit of the frequency for a large number of trials, or subjectively, as information update in the Bayesian approach. Both approaches make the same predictions.

It is generally assumed by all MWI supporters that branch counting leads to incorrect predictions, and because of this the focus is shifted to subjective probabilities and the "apparent emergence" of the Born rule. However this implicitly breaks the frequentist-subjective probability relationship. The only way one can use the frequentist approach is by using branch counting. Let's have a simple example.

Suppose you work at a factory which makes fair (quantum) coins which land 50% up and 50% down. Your job is quality assurance and you are tasked with finding the defective coins. Can you do your job in a MWI quantum universe? The only thing you can do is to flip the coin many times and see if it lands about 50% up and 50% down. For a fair coin there is no issue. However for a biased coin (say 80%-20%) you get the very same outcomes as in the case of the fair coins and you cannot do your job.

There is only one way to fix the problem: consider that the world does not split in 2 up and down branches, but say in 1 million up and 1 million down branches. In this case you can think that in the unfair case the world splits in 1.6 million up worlds, and 400 thousand down worlds. This would fix the concept of probability in MWI restoring the link between frequentist and subjective probabilities, but this is not what MWI supporters claim. Plus, this has problems of its own with irrational numbers and the solution is only approximate to some limit of precision which can be refuted by any experiment run long enough.

So to boil the problem down, in MWI there is no outcome difference in the case of a fair coin versus an unfair coin toss: in both cases you get an "up world" and a "down world". Repeating the coin toss any number of times does not change the nature of the problem in any way. Physics is an experimental science and we test the validity of the theories against experiments. Discarding branch counting in MWI is simply unscientific.
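The coin example can be spelled out for three tosses. The sketch below assumes, as in the argument above, one branch per outcome sequence; it compares the relative frequencies obtained by counting branches with the distribution obtained by weighting each branch with its Born weight. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def branch_counting(n_tosses):
    """Fraction of branches with m 'up' results, one branch per sequence."""
    branches = list(product("UD", repeat=n_tosses))
    counts = Counter(seq.count("U") for seq in branches)
    return {m: counts[m] / len(branches) for m in range(n_tosses + 1)}

def born_weighting(n_tosses, p_up):
    """Total Born weight of the branches with m 'up' results."""
    weights = Counter()
    for seq in product("UD", repeat=n_tosses):
        m = seq.count("U")
        weights[m] += p_up ** m * (1 - p_up) ** (n_tosses - m)
    return {m: weights[m] for m in range(n_tosses + 1)}

# Branch counting never looks at the bias of the coin.
print("branch counting:", branch_counting(3))
print("Born, fair coin:", born_weighting(3, 0.5))
print("Born, 80-20 coin:", born_weighting(3, 0.8))
```

The set of branches, and hence the branch-counting frequencies, is the same for the fair and the biased coin; only the Born weights attached to the branches tell them apart.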

Now in the last post Per argued for MWI. I asked him to show what would happen if we flip a fair and an unfair coin three times, simply to run through his argument on an elementary example and not hide behind general equations. After some back and forth, Per computed the distribution \(\rho\) in the fair and unfair case (to match quantum mechanics predictions) but the point is that \(\rho\) must arise out of the relative frequencies and not be computed by hand. Because the relative frequencies are identical in the two cases, \(\rho\) must be injected by a different mechanism. His computation of \(\rho\) is the point where circularity is introduced in the explanation. If you look back in his post, this comes from his equation 5 which is derived from equation 3. Equation 3 assumes the Born rule and is the root cause of circularity in his argument. Per's equation 7 recovers the Born rule in the limit case after assuming the Born rule in equation 3 - q.e.d.

Sunday, June 25, 2017

Guest Post defending MWI

As promised, here is a guest post from Per Arve. I am not interjecting my opinion in the main text but I will ask questions in the comments section.

Due to the popularity of this post I am delaying the next post for a week.

The reason to abandon the orthodox interpretation of quantum mechanics is its incompleteness. Bohr and Heisenberg rejected the possibility of describing the measurement process as a physical process. This is encoded in Bohr's claim that the quantum world cannot be understood. Such an attitude served to avoid endless discussions about the weirdness of quantum mechanics and to divert attention to the description of microscopic physics with quantum mechanics. Well done! A limited theory is better than no theory.

But we should always try to find theories that describe a larger set of processes in a unified way. The work by Everett and the later development of decoherence theory by Zeh, Zurek and others have given us the elements to also describe the measurement process as a quantum mechanical process. Their analysis of the measurement process implies that the unitary quantum evolution leads to the emergence of separate new "worlds". The appearance of separate "worlds" can only be avoided if there is some mechanism that breaks unitarity.

The most well-known problem of Everett's interpretation is that of the derivation of the Born rule. I describe the solution of that problem here. (You can also check my article on the arxiv [1603.01625] Postulates for and measurements in Everett's quantum mechanics)

The main point is to prove that physicists experience the Born rule. This is done by taking an outside view of the parallel worlds created in a measurement situation. The question of what probability is from the perspective of an observer inside a particular branch is more a matter of philosophy than of science.

The natural way to find out where something is located is to test with some force and find out where we find resistance. The force should not be so strong that it modifies the system we want to probe. This corresponds to the first order perturbation of the energy due to the external potential U(x),

\(\Delta E =\int d^3 x {|\psi (x)|}^2 U(x)\)  (1)

This shows that \({|\psi(x)|}^2\) gives where the system is located. (Here, spin and similar indexes are omitted.)

The argument for the Born rule relies on the assumption that one may ignore the presence of the system in regions where the integrated value of the absolute square of the wave function is very small.

In order to have a well defined starting point I have formulated two postulates for Everett's quantum mechanics.

EQM1 The state is a complex function of positions and a discrete index j for spin etc,

\(\Psi = \psi_j (t, x_1, x_2, ...) \)  (2)

Its basic interpretation is that the density

\(\rho_j (t, x_1, x_2,...) = {|\psi_j (t, x_1, x_2, ...)|}^2 \)  (3)

answers where the system is in position, spin, etc.

It is absolutely square-integrable and normalized to one

\( \int \int \cdots dx_1 dx_2 \cdots \sum_j {|\psi_j (t, x_1, x_2, ...)|}^2 = 1\)  (4)

This requirement signifies that the system has to be somewhere, not everywhere. If the value of the integral is zero, the system doesn’t exist anywhere.

EQM2 There is a unitary time development of the state, e.g.,

\(i \partial_t \Psi = H\Psi \),

where H is the hermitian Hamiltonian. The term unitary signifies that the value of the left hand side in (4) is constant for any state (2).

Consider the typical measurement where something happens in a reaction and what comes out is collected in an array of detectors, for instance the Stern-Gerlach experiment. Each detector will catch particles that have a certain value of the quantity B we want to measure.

Write the state that enters the array of detectors as a sum of components that enter the individual detectors, \(|\psi \rangle = \sum c_b |b\rangle\), where b is one of the possible values of B. When that state has entered the detectors we can ask, where is it? The answer is that it is distributed over the individual detectors. The distribution is

\(\rho_b = {|c_b|}^2 \)  (5)

This is derived by integrating the density (3) over each detector, using the fact that each state \(|b\rangle\) has support only inside its own detector.

The interaction between \(|\psi \rangle\) and the detector array will cause decoherence. The total system of detector array and \(|\psi \rangle\) splits into separate "worlds" such that the different values b of the quantity B will belong to separate "worlds".

After repeating the measurement N times, the distribution that answers how many times the value \(b=u\) has been measured is

\(\rho(m:N | u)= b(N,m) {(\rho_u)}^m{(\rho_{\neg u})}^{N-m} \)  (6)

where \(b(N,m)\) is the binomial coefficient \(N\) over \(m\) and \(\rho_{\neg u}\) is the sum over all \(\rho_b\) except \(b=u\).

The relative frequency \(z=m/N\) is then given by

\(\rho(z|u) \approx \sqrt{N/(2\pi \rho_u \rho_{\neg u})} \exp( -N{(z-\rho_u)}^2/(2\rho_u \rho_{\neg u}) ) \)  (7)

This approaches a Dirac delta \(\delta(z - \rho_u)\). If the tails of (7) with low integrated value are ignored, we are left with a distribution with \(z \approx \rho_u\). This shows that the observer experiences a relative frequency close to the Born value. Reasonably, the observer will therefore believe in the Born rule.
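The concentration of the distribution around \(\rho_u\) can be checked numerically. The following is only a sketch: it evaluates the binomial distribution (6) and the Gaussian approximation (7) for a hypothetical value \(\rho_u = 0.8\), and sums the weight (6) of all outcomes whose relative frequency lies within a small window around \(\rho_u\). The function names and the window size are assumptions made for illustration.

```python
from math import comb, exp, pi, sqrt

def binomial_weight(n, m, rho_u):
    """Equation (6): weight of measuring b = u exactly m times out of n."""
    return comb(n, m) * rho_u ** m * (1 - rho_u) ** (n - m)

def gaussian_density(n, z, rho_u):
    """Equation (7): Gaussian approximation to the density of z = m / n."""
    var = rho_u * (1 - rho_u)
    return sqrt(n / (2 * pi * var)) * exp(-n * (z - rho_u) ** 2 / (2 * var))

def weight_near(n, rho_u, eps):
    """Total weight (6) of outcomes with |m/n - rho_u| <= eps."""
    return sum(binomial_weight(n, m, rho_u)
               for m in range(n + 1) if abs(m / n - rho_u) <= eps)

rho_u = 0.8  # hypothetical value of the density in detector u
for n in (10, 100, 1000):
    print(n, weight_near(n, rho_u, 0.05))

# Compare (6) and (7) at the peak; (7) is a density in z, so divide by n.
n = 1000
print(binomial_weight(n, 800, rho_u), gaussian_density(n, 0.8, rho_u) / n)
```

As N grows, the weight inside the window approaches one, which is the sense in which the tails of (7) carry a small integrated value.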

The palpability of the densities (6) and (7) may be seen by replacing the detectors by a mechanism that captures and holds the system at the different locations. Then, we can measure to what extent the system is at the different locations (4) using an external perturbation (1). In principle, also the distribution from N measurements is directly measurable if we consider N parallel experiments. The relative frequency distribution (7) is then also in principle a directly measurable quantity.

A physicist that believes in the Born rule will use that for statistical inference in quantum experiments. According to the analysis above, it will work just as well as we expect it to do using the Born rule in a single world theory.

A physicist who believes in a single world will view the Born rule as a law about probabilities. A many-worlder may view it as a rule that can be used for inference about quantum states as if the Born rule is about probabilities.

With my postulates, Everett's quantum mechanics describes the world as we see it. That is what should be discussed. Not whether it pleases anybody or not.

If the reader is interested in what to do in a quantum Russian roulette situation, I have not much to offer. How to decide your future seems to be a philosophical and psychological question. As a physicist, I don't feel obliged to help you with that.

Per Arve, Stockholm June 24, 2017