Anson Ho, <em>What is physical intuition?</em> (2022-01-30) - <a href="https://ansonwhho.github.io/2022/01/30/what-is-physical-intuition">ansonwhho.github.io</a>
<blockquote>
<p>How can qualitative physical understanding help us solve problems? When does it work, and when does it fail?</p>
</blockquote>
<h2 id="leveraging-physical-understanding">Leveraging physical understanding</h2>
<p>One thing that I’ve been interested in for some time is trying to understand solutions to complicated equations at an intuitive level. This is particularly relevant in physics because the equations that we encounter pertain to things in the natural world, and a good “understanding” of the equation necessitates a good grasp of the physical concepts. This is what <a href="https://en.wikipedia.org/wiki/Paul_Dirac">Paul Dirac</a>, one of the greatest physicists of the 20th century, had to say about understanding equations:</p>
<blockquote>
<p>“I consider that I understand an equation when I can predict the properties of its solutions, without actually solving it.”<br />
<a href="https://mathshistory.st-andrews.ac.uk/Biographies/Dirac/quotations/"><em>Paul Dirac</em></a></p>
</blockquote>
<p>What could this mean? How is it possible to understand the properties of an equation’s solutions without actually solving it? As a basic example of this, suppose I asked you to solve the following integral:</p>
\[I:= \int\limits_{0}^{\infty} \left( \frac{m}{2 \pi k_B T} \right)^{\frac{3}{2}} 4\pi v^2 e^{-\frac{mv^2}{2 k_B T}} \text{d}v\]
<p>If you don’t recognise this integrand, a reasonable approach would be to just go ahead with some algebra. In fact, if you know a bit about <a href="https://en.wikipedia.org/wiki/Gaussian_integral">Gaussian integrals</a>, this isn’t a particularly challenging problem. But before you demonstrate your finely-tuned integration skills, let’s pause for a moment - with a bit of extra information about the system, we can use our physical intuition to determine the answer to this integral <em>without any calculation</em>.</p>
<p>The first question to ask is, “what is this equation describing?”. If we inspect the terms, we find that $m$ is the mass of a particle, $k_B$ is the <a href="https://en.wikipedia.org/wiki/Boltzmann_constant">Boltzmann constant</a>, $T$ is the temperature of the gas, and $v$ is the speed of a particle. In addition, the integral goes from $0$ to $\infty$, integrating over the particle speeds.</p>
<p>If you haven’t already guessed, this describes a probability distribution over the speeds of particles in an <a href="https://en.wikipedia.org/wiki/Ideal_gas">ideal gas</a><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. Once we know this, the answer becomes clear! Imagine a box filled with particles bouncing around like billiard balls. These particles have different speeds, which we can describe using a distribution.</p>
<div style="text-align:center">
<img src="/images/2022/phys_IdealGas.png" alt="ideal-gas" width="300" />
</div>
<p>The integral tells us the probability that each particle in the gas has <em>any</em> speed from 0 to $\infty$, <a href="https://en.wikipedia.org/wiki/Probability_axioms#Second_axiom">which is clearly 1</a>. You can verify this by doing integration by parts<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
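As a numerical cross-check (not part of the physical argument, and using illustrative values for a nitrogen-like gas at room temperature), we can hand the integral to <code>scipy</code>:

```python
import numpy as np
from scipy.integrate import quad

# Maxwell-Boltzmann speed distribution for an ideal gas.
# The mass and temperature below are illustrative choices
# (roughly an N2 molecule at room temperature).
k_B = 1.380649e-23   # Boltzmann constant, J/K
m = 4.65e-26         # particle mass, kg
T = 300.0            # temperature, K

def p(v):
    """Probability density over particle speeds."""
    return (m / (2 * np.pi * k_B * T))**1.5 * 4 * np.pi * v**2 \
        * np.exp(-m * v**2 / (2 * k_B * T))

total, _ = quad(p, 0, np.inf)
print(total)  # ≈ 1.0, as the physical argument predicts
```

Changing $m$ or $T$ reshapes the distribution, but the integral stays at 1 - the normalisation doesn't care about the parameters, exactly as the probabilistic argument says it shouldn't.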
<p>With a touch of physical understanding we were able to get to the solution without actually solving it! The benefits of understanding what the equation describes don’t just stop there - it also allows us to check whether or not the integrand itself makes sense in the first place. Essentially, the question is this: we’ve solved the integral and gotten a reasonable answer, but why should the probability distribution take this form? To answer this, let’s break down the integrand $p(v)$:</p>
\[p(v) = \left( \frac{m}{2 \pi k_B T} \right)^{\frac{3}{2}} 4\pi v^2 e^{-\frac{mv^2}{2 k_B T}}.\]
<p>The probability that a gas particle has a speed that lies within a particular range is found (as we saw) by integrating over the speeds. This depends on two key variables:</p>
<ul>
<li>Particle mass $m$: intuitively we would expect that heavier particles should move slower than lighter particles.</li>
<li>Temperature $T$: roughly speaking, this is a measure of the average kinetic energy of the gas particles. If we increase the temperature, we’d expect each particle, on average, to be more likely to move faster.</li>
</ul>
<p>If you try and compare your physical intuition with what the equation is showing, it can be quite difficult to tell exactly what the behaviour will be like. Even if you recognise that the exponential term dominates the function’s behaviour at large speeds, <em>when</em> exactly it becomes the dominant contributor can be hard to discern. To try and get a better feel for this, let’s take a look at the graph of this.</p>
<iframe scrolling="no" title="Maxwell-Boltzmann Distribution" src="https://www.geogebra.org/material/iframe/id/vahaq49d/width/800/height/485/border/888888/sfsb/true/smb/false/stb/false/stbh/false/ai/false/asb/false/sri/false/rc/false/ld/false/sdz/false/ctl/false" width="750px" height="485px" style="border:0px;"> </iframe>
<p>If you increase the temperature then the peak of the distribution shifts to the right, meaning that the particles on average have a higher speed. You can try having a fiddle around with the controls in the GeoGebra applet and see if you can justify to yourself why this behaviour makes sense - for example, what happens if you increase the mass? Does this agree with what you expect?</p>
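If you want to experiment outside the applet, here is a small sketch that locates the peak of the distribution on a grid. The particle mass below is an assumed value for a nitrogen molecule; the analytic mode is $v_p = \sqrt{2 k_B T / m}$, so the peak should move right as $T$ rises and left as $m$ grows:

```python
import numpy as np

k_B = 1.380649e-23  # Boltzmann constant, J/K

def v_peak_numeric(m, T, v_max=5000.0, n=200_000):
    """Locate the mode of the Maxwell-Boltzmann speed distribution on a grid."""
    v = np.linspace(1e-3, v_max, n)
    # Normalisation constants don't affect the location of the mode
    p = v**2 * np.exp(-m * v**2 / (2 * k_B * T))
    return v[np.argmax(p)]

m_N2 = 4.65e-26  # kg, assumed nitrogen-like particle
cold, hot = v_peak_numeric(m_N2, 100.0), v_peak_numeric(m_N2, 900.0)
print(cold, hot)  # the peak shifts right at the higher temperature
```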
<p>From this simple example we can see that having a grasp of the physics gives an understanding of the situation that would allow us to make predictions about how the system should behave. Therefore, what I mean by having “physical understanding” or “physical intuition” is this: <em>having the ability to understand a system based on a mental physical model, as opposed to grinding through equations</em>.</p>
<h2 id="simplifying-assumptions">Simplifying assumptions</h2>
<p>This one has become a bit of a joke, even among physicists! The classic example is this: to determine how much heat gets radiated from a cow, the first simplifying step is to “imagine a spherical cow”<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, which is only “a bit” of a caricature of what cows actually look like.</p>
<p>Let’s continue the example from the previous section. Suppose we’re trying to model the behaviour of a real gas (say the box is filled with air) - what assumptions have we made? (Go back and have a look; the assumptions were introduced in rather sneaky fashion.)</p>
<p>If you’re a really careful reader, you’ll notice that the Maxwell-Boltzmann distribution describes the behaviour of an <em>ideal</em> gas. This is loaded with simplifications! For instance:</p>
<ul>
<li>Gas particles can be thought of as <a href="https://en.wikipedia.org/wiki/Point_particle">point particles</a>, such that they don’t take up any space</li>
<li>There are no interactions between particles apart from collisions</li>
<li>The particle collisions are <a href="https://en.wikipedia.org/wiki/Elastic_collision">perfectly elastic</a>, such that the total kinetic energy of the gas in the box stays the same</li>
</ul>
<p>Another simplification is that there are a sufficiently large number of particles, such that we can define properties like “temperature”, and model the system using a statistical distribution.</p>
<p>At the outset, these assumptions might seem quite sketchy - so why might they be reasonable?</p>
<ul>
<li><strong>Point particles</strong>: This assumption is actually made relative to the environment - gas particles are typically very small, on the order of $\approx 10^{-10}$ metres in diameter. In contrast, if we think of a “box” of particles, we’re typically thinking of things that are at macroscopic scales, on the order of a metre in length. This is a huge difference, and to us they seem basically like points.</li>
<li><strong>No interparticle forces</strong>: Related to the previous assumption, is that the particles don’t form any intermolecular bonds. This seems pretty consistent with what one might expect from the <a href="https://www.feynmanlectures.caltech.edu/I_01.html">kinetic theory of matter</a>, where solids have strong interparticle bonds and gas particles have very weak ones (such that they can move around independently).</li>
<li><strong>Elastic collisions</strong>: If particles ever collide with the containing box, we’re assuming that the box never changes in energy and that the particles maintain their speed. If particles collide with each other, the total kinetic energy is the same before and after collisions. This lets us focus on the “average” behaviour of the system <em>over time</em><sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>.</li>
<li><strong>Many particles</strong>: Gases in general have lots of particles in them - for instance, at standard temperature and pressure there are close to $10^{22}$ particles per litre of oxygen<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>. This is nice if we want to take a statistical approach to analysing the system, as we’ve seen in the previous section.</li>
</ul>
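To make the “many particles” figure concrete, here is a quick back-of-the-envelope count using the ideal gas law $n/V = P/(k_B T)$ (a standard result, not derived in this post):

```python
# Rough count of gas particles per litre at standard temperature and pressure,
# via the ideal gas law n/V = P / (k_B * T).
k_B = 1.380649e-23        # Boltzmann constant, J/K
P = 101_325.0             # pressure, Pa (1 atm)
T = 273.15                # temperature, K

per_m3 = P / (k_B * T)    # particles per cubic metre
per_litre = per_m3 * 1e-3 # a litre is 10^-3 cubic metres
print(f"{per_litre:.2e}") # ~2.7e22, consistent with "close to 10^22" per litre
```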
<p>In essence, these assumptions make things much easier to handle <em>without losing too much explanatory power</em>.</p>
<p>I think the extent of the simplifications probably depends on how well you understand the physical system that you’re interested in. If you have no idea where to start, I think it really does make sense to just simplify things and then “see how things go”. This can often yield very good results, because not <em>every</em> physical characteristic of the system you’re interested in is super relevant. If the simplification only leaves behind useful stuff, then you probably already have a pretty good approximation for how things work in reality. This is why classical Newtonian gravity can be highly useful, even though <em>technically</em> General Relativity gives a better description of reality. Relativity gives us more explanatory power overall, but it’s a lot more fiddly and simply not necessary in everyday situations.</p>
<p>A corollary of this is that breaking your previous simplifying assumptions is plausibly a great way to make progress - in fact, I think a lot of modern physics research is in this vein. For instance, the fourth assumption from earlier assumes that the box has many gas particles (e.g. $n \approx 10^{22}$). If we imagine the spectrum of systems of particles that physicists tend to think about, this would be at one extreme, and at the other extreme we have systems with very few (e.g. $n = 1,2$) particles. Both of these cases are well understood, with statistics helping in the former and the basic laws of physics easily applying in the latter. What’s challenging is the cases <em>in between</em>, with say $n = 20$ to $50$ particles - now we’re suddenly at the forefront of condensed matter, where researchers deal with <a href="https://en.wikipedia.org/wiki/Quantum_dot">quantum dots</a>, which contain a comparable number of electrons. In this sense, simplifications can make our lives a lot easier, and also suggest further paths for investigation.</p>
<h2 id="dimensional-analysis">Dimensional Analysis</h2>
<p>Of course, even when we’re armed with information about the system, knowing the properties of the solutions isn’t always an easy thing to do! Fortunately, we have more tricks up our sleeve. In the previous example we started off by looking at what the symbols describe - we can add a level of nuance to this by <a href="https://en.wikipedia.org/wiki/Dimensional_analysis">looking at their dimensions</a> (e.g. momentum $p$ has dimensions of $\text{mass} * \text{length} * \text{time}^{-1} $). A stronger form of this involves looking at the <em>units</em> (e.g. one possible unit of momentum is $\text{kilograms} * \text{metres} / \text{second}$).</p>
<p>This can be used if we want to check that our equations make sense, by comparing the dimensions on the left and right hand sides of each equation<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>. Looking at units is also very important, not just in exams but also in ensuring that you don’t <a href="https://en.wikipedia.org/wiki/Mars_Climate_Orbiter">accidentally crash Mars Climate Orbiters</a> (dear reader, please use metric units in science).</p>
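As a toy illustration of this kind of checking (my own sketch, not a standard library), dimensions can be tracked as tuples of $(\text{mass}, \text{length}, \text{time})$ exponents: multiplying quantities adds exponents, and a valid equation must have identical tuples on both sides.

```python
# Toy dimension tracker: each quantity's dimensions are a tuple of
# (mass, length, time) exponents. Multiplying quantities adds exponents,
# and raising to a power multiplies them.
def dim_mul(*dims):
    return tuple(sum(exps) for exps in zip(*dims))

def dim_pow(dim, n):
    return tuple(n * e for e in dim)

MASS, LENGTH, TIME = (1, 0, 0), (0, 1, 0), (0, 0, 1)
VELOCITY = dim_mul(LENGTH, dim_pow(TIME, -1))

# momentum = mass * velocity  ->  mass * length * time^-1
momentum = dim_mul(MASS, VELOCITY)
print(momentum)  # (1, 1, -1)

# Sanity-check an equation: kinetic energy (1/2) m v^2 vs. work F * d
energy = dim_mul(MASS, dim_pow(VELOCITY, 2))
force = dim_mul(MASS, LENGTH, dim_pow(TIME, -2))
work = dim_mul(force, LENGTH)
print(energy == work)  # True: the equation is dimensionally consistent
```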
<p>In some cases we can do even better - rather than just using dimensional analysis as a sanity check, we can use it to predict what our solutions might look like. A great example of this is given by John Adam in <em><a href="https://press.princeton.edu/books/paperback/9780691127965/mathematics-in-nature">Mathematics in Nature</a></em>, which makes a dimensional argument as to why the sky should be blue.</p>
<p>Why the sky is blue seems like a surprisingly difficult question to answer, so let’s try and break it down a bit. What we do know is that light from the sun comes in a spectrum, which to the naked eye looks roughly white. Through some process of interaction with the particles in the atmosphere, the light that reaches our eyes is blue.</p>
<p>What is this interaction? If we take a random patch of atmosphere and zoom in really close, we’ll find that it consists of lots of small gas molecules, mainly nitrogen and oxygen. As we discussed earlier, these have diameters on the order of $10^{-10}$ metres. At the same time, visible light from the sun has a wavelength of about $10^{-7}$ metres - about a thousand times larger. Unlike in the previous section, things start to become a little ambiguous - should we still treat these gas molecules as point particles?</p>
<div style="text-align:center">
<img src="/images/2022/phys_rayleigh-scattering.png" alt="rayleigh-scattering" width="400" />
</div>
<p>The important consideration here is that these particles are likely to interact with light through charge separation (since light is an oscillating electromagnetic field): positive charges get pushed in the direction of the electric field, and negative charges in the opposite direction. This forms an oscillating <em>dipole</em>, which emits radiation. As a result, one might expect the wavelength to be quite important here - the shorter the wavelength of the incident sunlight, the higher the oscillation frequency.</p>
<div style="text-align:center">
<img src="/images/2022/phys_emitted-radiation-intensity.png" alt="emitted-radiation-intensity" width="500" />
</div>
<p>One might also guess that this increased frequency results in emitted radiation of a higher frequency. To check this, let’s think about the intensity of the emitted radiation that an observer sees. Intuitively, this probably depends on several factors:</p>
<ul>
<li>The number of particles $N$ - we can think of these as forming a “composite particle” in 3D space consisting of the particles that scatter light into the observer’s eyes. Let’s call the volume of this composite particle $V$</li>
<li>The distance from the composite particle to the observer $r$</li>
<li>The wavelength of incident light $\lambda$</li>
<li>The intensity of incident light $I_0$</li>
</ul>
<p>Let’s rewrite the above in mathematical form - if the emitted radiation has intensity $I$, then it is a function of the above factors:</p>
\[I = f(V, r, \lambda, I_0).\]
<p>Now comes the dimensional analysis. Suppose the dependence takes a power-law form with unknown exponents (the dependence on $I_0$ must be linear, since doubling the incident intensity doubles the scattered intensity):</p>
\[I \propto V^\alpha r^\beta \lambda^\gamma I_0.\]
<p>If we do dimensional analysis on this, then we notice that $V$, $r$ and $\lambda$ have dimensions of $[\text{Length}]^3$, $[\text{Length}]$ and $[\text{Length}]$ respectively. The two intensities have the same dimensions, and thus we have</p>
\[[\text{Length}]^{3\alpha}[\text{Length}]^\beta[\text{Length}]^\gamma = [\text{Length}]^0.\]
<p>What we need now are two extra pieces of physical information that tell us what values these constants $\alpha, \beta, \gamma$ take.</p>
<ul>
<li>First, we know that the amplitude of the scattered radiation is proportional to the number of scatterers (electric fields superpose linearly), which is in turn proportional to $V$. Since intensity is proportional to the square of the amplitude, we have $\alpha = 2$</li>
<li>Second, we should expect the radiation intensity to decrease inversely as the square of the distance (i.e. as $\frac{1}{r^2}$). Why? <em>Far away from the dipole</em>, the emitted radiation looks like it’s spread out over a sphere - the intensity of this light decreases at the rate at which the area increases, and since the area of the sphere is $4\pi r^2$ the result follows. Thus $\beta = -2$.</li>
</ul>
<p>Substituting these into the dimensional constraint above, $3\alpha + \beta + \gamma = 0$, gives $\gamma = -4$, and thus $I \propto \lambda^{-4}$.</p>
<p>Now we have everything we need to answer the original question. We’ve just shown that light of a longer wavelength tends to have scattered light of a lower intensity, and this dependence is strong - slightly longer wavelengths are probably going to lead to significantly weaker emitted radiation. At the same time, we also know blue light has a shorter wavelength, so this relation tells us that blue light is scattered much more strongly in the atmosphere than other colours, and thus the sky appears blue<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup>.</p>
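To put numbers on the $\lambda^{-4}$ dependence (with representative wavelengths for blue and red light that are my own assumed values):

```python
# The I ∝ λ^-4 scaling makes the blue/red contrast concrete.
# The wavelengths are representative (assumed) values for visible light.
blue_nm, red_nm = 450.0, 650.0
ratio = (red_nm / blue_nm) ** 4
print(f"blue is scattered ~{ratio:.1f}x more strongly than red")
```

A modest difference in wavelength becomes a severalfold difference in scattered intensity, which is exactly the "strong dependence" the argument relies on.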
<p>I think this argument is quite remarkable! The usual approach requires slogging through Maxwell’s equations, but the dimensional argument circumvents these tedious steps - who would’ve thought?</p>
<h2 id="checking-equations">Checking equations</h2>
<p>Another way of getting a better grip on an opaque-seeming equation is to look at its behaviour at certain extremes, and use our physical understanding to see if it makes sense. Consider a charged circular disc in the $x-y$ plane, centred at the origin with radius $a$ and uniformly charged with charge density (charge per unit area) $\sigma$.</p>
<div style="text-align:center">
<img src="/images/2022/phys_charged-disk.png" alt="charged-disk-taylor-expansion" width="400" />
</div>
<p>Now suppose I tell you that the electric field at a point $(0,0,z)$ on the $z$-axis is</p>
\[E(0,0,z) = \frac{\sigma}{2 \epsilon} \left\{1 - \frac{z}{\sqrt{a^2 + z^2}} \right\},\]
<p>where $\epsilon$ is the <a href="https://en.wikipedia.org/wiki/Permittivity">permittivity</a> (which for the purposes of this example you can think of as just a constant).</p>
<p>We now apply a series of sanity checks:</p>
<ul>
<li>First of all, the electric field should be symmetric with respect to the $z$ axis, because the charges in the system are too. Thus it must be pointing along the $z$ axis, and the equation should have no dependence on $x$ or $y$. Check!</li>
<li>If the disc is positively charged, then the electric field should point in the positive $z$ direction for $z &gt; 0$, and in the negative direction for $z &lt; 0$. For $z &gt; 0$, whether this holds is completely determined by the charge density $\sigma$, since the term in curly brackets is always positive when $a &gt; 0$. When the plate is positively charged, $\sigma &gt; 0$, so $E(0,0,z) &gt; 0$ as expected, and we can do a similar analysis in the other cases. Check!</li>
<li>As $z \to 0$, we have $E(0,0,z) \to \frac{\sigma}{2\epsilon}$ - this matches our expectations for the electric field from an infinite charged plane, and is what the field should “look like” locally, very close to the disc<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup>. Check!</li>
<li>As $z \to \infty$, the term in the curly brackets approaches 0, and thus the electric field tends to zero. This again matches our expectations, because the electric field from a charged plate infinitely far away should be close to zero. Check!</li>
</ul>
<p>We can also try to be more nuanced with our analysis and describe what happens for large but <em>finite</em> $z$. The trick is to do a Taylor series expansion<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup>, which you can show to be</p>
\[E(0,0,z) = \frac{\sigma}{2 \epsilon} \left\{1 - \left\{ 1 - \frac{1}{2} \left( \frac{a}{z} \right)^2 + ...\right\} \right\}.\]
<p>If we only pay attention to the lowest order terms (higher order terms will go to zero since $z$ is large<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">10</a></sup>), then we see that</p>
\[E(0,0,z) \approx \frac{\sigma a^2}{4 \epsilon z^2} = \frac{\sigma \pi a^2}{4 \pi \epsilon z^2} = \frac{Q}{4 \pi \epsilon z^2},\]
<p>which is just what Coulomb’s law gives for the electrostatic field from a point charge $Q$. This makes sense, because when we’re very far away from the disc, such that $|z| \gg a$, the disc looks like a point. Check!</p>
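If you prefer a numerical check of these limits, here is a sketch with placeholder values for $\sigma$, $\epsilon$ and $a$ (chosen purely for illustration):

```python
import numpy as np

# Numerical sanity checks on the on-axis field of a uniformly charged disc,
# E(z) = (sigma / 2 eps) * (1 - z / sqrt(a^2 + z^2)), for z > 0.
# sigma, eps and a are placeholder values chosen for illustration.
sigma, eps, a = 1.0, 1.0, 1.0

def E(z):
    return sigma / (2 * eps) * (1 - z / np.sqrt(a**2 + z**2))

# Near the disc: the field approaches that of an infinite charged plane,
# sigma / (2 eps) = 0.5 with these values
print(E(1e-6))

# Far away: the field approaches Coulomb's law for a point charge
# Q = sigma * pi * a^2
Q = sigma * np.pi * a**2
z = 1e3
print(E(z), Q / (4 * np.pi * eps * z**2))  # the two agree closely
```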
<p>Thus the equation has passed all of our sanity checks, and it seems pretty likely that it correctly describes the electric field at $(0,0,z)$. We could also have applied other techniques that we’ve already seen, like dimensional analysis, to this problem. While these “tricks” don’t guarantee that your solution is correct, in my experience they catch mistakes surprisingly often, and are typically worth at least a brief thought.</p>
<h2 id="symmetries-and-abstraction">Symmetries and abstraction</h2>
<p>Another really powerful tool in a physicist’s toolbox is to look at the symmetries of a system, which can tell you what form your equations should take. Let’s take the example of the temperature of a metal rod:</p>
<div style="text-align:center">
<img src="/images/2022/phys_metal-rod.png" alt="metal-rod" width="400" />
</div>
<p>Suppose we heat the rod at the centre, which we’ll label as $x = 0$. What’s going to happen to the rod next? Even if you’ve never thought about this problem before, you probably have some intuition for this. The heat spreads over some period of time, and probably eventually evens out. Moreover, you <em>never</em> see this process running in reverse - the rod never goes from having a uniform temperature to being hot at a single point.</p>
<p>If we also think in terms of the <em>spatial behaviour</em>, then if we have a homogeneous material, there shouldn’t be any particular preference for the heat to spread in any direction (in the 1D case, heat shouldn’t preferentially spread either to the left $x < 0$ or to the right $x > 0$).</p>
<p>How do we translate what we’ve just said into an equation? The claims in the last two paragraphs are really statements about the <em>rates of change</em> of the system under one variable; i.e. partial derivatives. If we call the temperature of the rod $T$, which is a function of the time $t$ and the position $x$, then we’re interested in forming a partial differential equation with $T, \frac{\partial T}{\partial t}, \frac{\partial T}{\partial x}, \frac{\partial^2 T}{\partial t^2}, \frac{\partial^2 T}{\partial x^2}, \frac{\partial^2 T}{\partial t \partial x}, …$</p>
<p>Which of these terms can we rule out? We argued that $T(t)$ shouldn’t behave the same as $T(-t)$, which means the equation must involve time derivatives of odd order - i.e. $\frac{\partial T}{\partial t}, \frac{\partial^3 T}{\partial t^3},$ and so on. Similarly, we argued that “left” or “right” has no physical significance, so at any instant $T(x) = T(-x)$ <sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">11</a></sup>. If we want this property to hold, then we rule out all spatial derivatives of odd order, leaving us with $T, \frac{\partial^2 T}{\partial x^2}$, and so on.</p>
<p>Should we expect to see a term that depends directly on the temperature distribution $T$ in the PDE? Since we’re describing heat <em>transfer</em>, what’s important is <em>differences</em> in temperature, so the value of $T$ itself shouldn’t matter, and we can discard this term too.</p>
<p>Now we appeal to simplicity, and consider the most basic possible scenario that satisfies the previous conditions. That is, we want to relate the lowest-order derivatives allowed by our symmetry arguments - $\frac{\partial T}{\partial t}$ (odd in time) and $\frac{\partial^2 T}{\partial x^2}$ (even in space) - in a PDE. This gives us</p>
\[\frac{\partial T}{\partial t} = D \frac{\partial^2 T}{\partial x^2},\]
<p>where $D$ is a constant.</p>
<p>If you’re familiar with differential equations, you’ll recognise this as the <em>heat equation</em> or <em>diffusion equation</em> (here $D$ is known as the <em>diffusion constant</em>). This describes particles in a gas, electrons in semiconductors, and more!</p>
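To see this behaviour concretely, here is a minimal numerical sketch (an explicit finite-difference scheme; the grid sizes, diffusion constant, and initial spike are illustrative choices of mine):

```python
import numpy as np

# Explicit finite-difference sketch of the 1D heat equation
# dT/dt = D * d^2T/dx^2, with a spike of heat at the centre of the rod.
# Grid sizes, D, and the initial condition are illustrative choices.
D, dx, dt = 1.0, 0.1, 0.004      # dt < dx^2 / (2D) for numerical stability
x = np.linspace(-5.0, 5.0, 101)  # rod from x = -5 to x = +5
T = np.zeros_like(x)
T[len(x) // 2] = 100.0           # heat the rod at x = 0

for _ in range(500):
    # second spatial derivative via central differences
    lap = (np.roll(T, 1) + np.roll(T, -1) - 2 * T) / dx**2
    lap[0] = lap[-1] = 0.0       # crude insulating boundary
    T = T + dt * D * lap

# The spike spreads out, its peak drops, and the profile stays
# symmetric about the centre, just as the symmetry arguments demand
print(T.max(), bool(np.allclose(T, T[::-1])))
```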
<p>Let’s reflect on what we just did. Through symmetry arguments, we were able to “derive” (albeit non-rigorously) an equation that has a ton of explanatory power<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">12</a></sup>. The symmetries that we considered were:</p>
<ul>
<li><strong>Time-reversal symmetry</strong>: The system is fundamentally the same if $t \mapsto -t$ (which doesn’t apply to the metal rod)</li>
<li><strong>Parity</strong>: The system is fundamentally the same if we reflect space around the origin, i.e. if $x \mapsto -x$ (which applies to the metal rod)</li>
<li><strong>Isotropy</strong>: The system is fundamentally the same in all directions - this is subtly different from the previous one, in that we’re <em>rotating</em> space rather than <em>reflecting</em> it. The difference is perhaps better illustrated by going up to 2D - imagine a metal <em>plate</em> rather than a metal rod, in the $x-y$ plane. Then if the origin is heated, the heat spreads evenly in all directions. An isotropic system would “look the same” even if the <em>axes were rotated</em>, and a system with parity would “look the same” if <em>one of the axes were flipped</em>. In this case, the diffusion equation would have $\frac{\partial^2}{\partial x^2}$ be replaced by a Laplacian.</li>
</ul>
<p>Many PDEs in physics can be described using similar symmetry arguments, like the <a href="https://en.wikipedia.org/wiki/Schr%C3%B6dinger_equation">Schrödinger</a> and <a href="https://en.wikipedia.org/wiki/Wave_equation">wave equations</a>. I encourage you to try seeing how these arguments apply in those circumstances too.</p>
<p>If you’re a mathematician, then you’ll probably recognise this as a form of <em>abstraction</em> - we’re removing the details that are only specific to a particular situation, and leaving behind the “essence” of the physical system; general rules that describe how similar systems behave. This is very common in theoretical physics, where seemingly disparate facts turn out to be related! However, we’re now starting to drift a bit too far off topic, so I’ll leave discussion of this for another post.</p>
<h2 id="caveats">Caveats</h2>
<h3 id="applicability">Applicability</h3>
<p>If you’re a mathematician (and especially if you’re a <em>pure</em> mathematician), you may be complaining that the above methods don’t work very well in many cases, and aren’t good enough to produce proofs. Classic counterexamples are rife in real analysis and topology, after all! If we’re just thinking qualitatively and geometrically, then coming up with a function that is continuous everywhere <em>and</em> differentiable nowhere is pretty hard - one might even doubt that such a function exists. In 1872, however, Karl Weierstrass shocked the mathematical community by discovering just such a function - the infamous <em><a href="https://en.wikipedia.org/wiki/Weierstrass_function">Weierstrass function</a></em>:</p>
\[f(x) = \sum_{n=0}^{\infty} a^n \cos (b^n \pi x),\]
<p>where $0 < a < 1$, $b$ is an odd positive integer and $ab > 1 + \frac{3}{2} \pi$. This is a type of self-similar fractal curve, and really sucks if you’re leaning heavily on geometric intuitions. I mean, just take a look at the function!</p>
<div style="text-align:center">
<img src="/images/2022/phys_weierstrass-function.png" alt="weierstrass-function" width="400" />
</div>
<center>
<p>
Image source: <a href="https://en.wikipedia.org/wiki/Weierstrass_function#/media/File:WeierstrassFunction.svg">Wikipedia</a>
</p>
</center>
<p>What makes things worse is that in the realm of physics, you don’t even need to turn to such pathological examples to find a lack of rigour. In the department of physics, swapping the order of integration is something you can always do, Taylor expansions always exist, and Dirac deltas are functions. At this point, “physical intuition” becomes akin to “hand-waving”.</p>
<p>A similar failure mode is when intuition fails not because of a lack of mathematical rigour, but because the physical system itself is intrinsically unintuitive. Our physical intuitions are heavily influenced by our experiences and what we’re familiar with - perhaps we need to turn to other, more abstract kinds of intuition; an intuition gained simply by thinking about the equations a lot.</p>
<p>Despite these problems, I claim that physical intuition can still be useful. It really does just happen to be the case that many things in the physical world are sufficiently well-behaved that the aforementioned abuses of mathematics work, and have a ton of explanatory power<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">13</a></sup>. Using physical intuition can help you make sense of phenomena in a deeper way, and guide lines of research.</p>
<h3 id="uniqueness">Uniqueness</h3>
<p>Another objection you might have is that physical intuition isn’t all that unique - other methods could give similar results, and analogous forms of reasoning exist in other fields. For instance, chess players tend to talk conceptually about “positional imbalances”, like “I need counterplay, so I’m going to go for a queenside minority attack”. Humans don’t decide on moves via <a href="https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning">alpha-beta tree search</a> (or at least most don’t). If these are true, then what’s the case for using “physical intuition” rather than these other methods?</p>
<p>I think this is a pretty good point. It’s not immediately clear to me what the difference between different kinds of intuition is, probably because “physical intuition” is a rather nebulous concept. It seems to encapsulate geometric thinking, “imagining what’s going to happen”, determining the most important features of a system, and having a feel for what’s right or wrong - but these are just <em>my</em> impressions, and I think active physics researchers may tell you something different. If you have any thoughts on this, I’d love to hear about it!</p>
<h2 id="conclusions">Conclusions</h2>
<p>In this post we’ve seen a couple of techniques by which one can leverage physical understanding in order to make progress on challenging problems:</p>
<ul>
<li><strong>Understanding what the equation is describing</strong> and <strong>making simplifying assumptions</strong>, as in the ideal gas</li>
<li><strong>Checking dimensions</strong>, as in Rayleigh scattering</li>
<li><strong>Checking solutions</strong> and <strong>Taylor expansions</strong>, as in the uniformly charged disc</li>
<li><strong>Looking for symmetries</strong>, as in the heat equation</li>
</ul>
<p>These have their pitfalls, but I think they can still be useful in many cases where 100% rigour isn’t necessary.</p>
<p>How might one go about developing these intuitions? I’m hesitant to give too much advice here, because I don’t consider myself to be much of an expert in this regard. However, <em>prima facie</em> it seems pretty reasonable that the main way to gain this intuition is to deliberately practise the techniques mentioned above - check your dimensions, think about the symmetries of the problem, and so on. If you’re feeling up to it, I’d encourage you to try and come up with examples of situations where the above techniques can be helpful too - I’ve listed a few relatively technical ones in the footnotes which you’re welcome to dig into deeper<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">14</a></sup>. I imagine this is somewhat akin to how you might develop intuitions in maths, for instance by modifying theorem statements to try and understand why the conditions of the theorem are as stated.</p>
<p>I would love it if somebody with a deeper understanding of physics wrote a more advanced version of this post. It would also be nice if a mathematician wrote the dual version of this post, i.e. “what is mathematical intuition?”, targeted towards physicists. Perhaps a future version of me will write this, but no guarantees!</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>If you know a bit of physics, you might recognise this as the distribution of speeds of particles in an ideal gas called the <a href="https://en.wikipedia.org/wiki/Maxwell%E2%80%93Boltzmann_distribution">Maxwell-Boltzmann distribution</a>. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>There’s actually a bit of a subtlety here - if the probability distribution were multiplied by the total number of particles in the box (usually denoted $N$), then integrating over this would give the total number of gas particles rather than 1. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>I’m not sure how funny this really is - I tend to think that the “jokes” I’ve come across in maths and physics aren’t all that funny. Allegedly, “bra-ket” notation is supposed to be funny, and “donut $\cong$ mug” is too. Maybe I’m just not thinking about this the right way! <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>For the dynamical systems enthusiasts, this is very closely related to the <a href="https://en.wikipedia.org/wiki/Ergodic_hypothesis">Ergodic hypothesis</a>. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>A mole of oxygen at STP corresponds to about 22 litres of gas. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>This has also saved me a few times in exams, where I’ve made some mistake manipulating symbols. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>This technically isn’t sufficient, because this would also imply that the sky ought to be purple, rather than blue! What resolves things is that (1) human eyes are most sensitive to light closer to green, so we naturally see blue more strongly than purple, and (2) sunlight has a <a href="https://en.wikipedia.org/wiki/Sunlight">peak intensity at a wavelength that is close to green and blue</a>. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>Technically this comes from a result that can easily be derived from Gauss’ law, but I’ll not go into this here. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>This of course assumes that the function is analytic, which is sloppy. It turns out that this is generally a pretty helpful thing to do in physics! More on this in the conclusion. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>One might worry that this is too much hand-waving - I think this is a very valid concern, which I’ll address in the final section. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:11" role="doc-endnote">
<p>You could think of this as saying that $T$ is even with respect to $x$, but I don’t want to think this way because (as we’ll later see) we can generalise this argument to higher dimensions. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:12" role="doc-endnote">
<p>I first heard about this argument from Dr Chris Hooley in my Mathematics for Physicists lectures at university, and was absolutely amazed! <a href="#fnref:12" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:13" role="doc-endnote">
<p>As a side philosophical note, I’m not really sure how to think about <em>why</em> things are so well-behaved, especially with anthropic considerations. I’d be very curious if anyone reading this has some thoughts! <a href="#fnref:13" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:14" role="doc-endnote">
<p>I think “understanding what the equation says” and “checking dimensions” are pretty universally applicable, so I’ll focus on the other three techniques. For “Taylor expanding the result”, try <a href="https://en.wikipedia.org/wiki/Landau_theory">Landau Theory</a>. Many examples of “looking for symmetries” are present in electromagnetism problems - e.g. justify why the electrostatic field from a uniformly charged plane is itself uniform. “Making simplifying assumptions” is present in most theories! An example of this is in my article on the <a href="/2021/01/23/the-physics-of-rainbows.html">physics of rainbows</a>, where I try to illustrate the shifts in these theories, with increasing levels of technical rigour. <a href="#fnref:14" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>Anson HoHow can qualitative physical understanding help us solve problems? When does it work and fail?Dynamical Billiards2021-03-22T00:00:00+00:002021-03-22T00:00:00+00:00https://ansonwhho.github.io/2021/03/22/dynamical-billiards<blockquote>
<p>What happens if you play billiards on an elliptical table, with no friction?
<!--more--></p>
</blockquote>
<p>One way of defining an ellipse is in terms of two points, each of which is called a <em>focus</em> point. The ellipse is then the locus of all points such that the sum of the distances from these two foci is always a constant. You can visualise it like this: put a loop of string around pins located at the foci, then pull the string taut at one point using a pencil. If you slide the pencil while keeping the string tight, the shape that you trace out is an ellipse.</p>
<div style="text-align:center">
<img src="https://images.squarespace-cdn.com/content/v1/553cf0fbe4b080029b4970d7/1430315032812-AU54XBN1OLL63JQU81EN/ke17ZwdGBToddI8pDm48kNOna8qFQZjTtF51_AT4fPh7gQa3H78H3Y0txjaiv_0fDoOvxcdMmMKkDsyUqMSsMWxHk725yiiHCCLfrh8O1z5QHyNOqBUUEtDDsRWrJLTmnhdptcuU1alwky_sWs380orDl0W6eyIWC7ENBy2Bpz1aUhVsZmvWoH3YOzDr0hB2/image-asset.jpeg" alt="drawing-ellipses" width="400" />
</div>
<center>
<p>
Image source: <a href="http://www.loop-the-game.com/snoop">Alex Through the Looking Glass</a>
</p>
</center>
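<p>This “constant string length” property is easy to verify numerically. A minimal sketch, with assumed semi-axes \(a = 2\) and \(b = 1\):</p>

```python
import numpy as np

a, b = 2.0, 1.0                    # assumed semi-major and semi-minor axes
c = np.sqrt(a**2 - b**2)           # the two foci sit at (-c, 0) and (+c, 0)

theta = np.linspace(0, 2 * np.pi, 1000)
x, y = a * np.cos(theta), b * np.sin(theta)   # points around the ellipse

d1 = np.hypot(x + c, y)            # distance to the focus at (-c, 0)
d2 = np.hypot(x - c, y)            # distance to the focus at (+c, 0)
print(np.allclose(d1 + d2, 2 * a))  # True: the sum is the same everywhere
```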
<h1 id="the-reflective-property-of-ellipses">The reflective property of ellipses</h1>
<p>What’s so cool about this definition? One thing that I think it brings to the table is that it allows us to intuitively understand a curious property of ellipses. Suppose you want to play a game of billiards (or pool, or snooker, or whatever takes your fancy), but instead of playing on a rectangular table, you play it on an elliptical table. What happens if, when you hit the billiard ball, it passes through one of the focus points of the elliptical table?</p>
<p>Regardless of the initial direction, after passing through one focus, the billiard ball reflects off the ellipse and passes through the other focus.</p>
<div style="text-align:center">
<img src="/images/2021/billiards1_Initial.png" alt="initial-path" width="400" />
</div>
<p><em>Suppose you hit a billiard ball at \(P_0\), sending it through the focus point \(F_2\) to \(P_1\). What happens next?</em></p>
<p>But why stop there? What happens after the ball bounces off the elliptical table a second time?</p>
<div style="text-align:center">
<img src="/images/2021/billiards2_Reflect.png" alt="second-time" width="400" />
</div>
<p>And again…</p>
<div style="text-align:center">
<img src="/images/2021/billiards3_ReflectTwice.png" alt="third-time" width="400" />
</div>
<p>Each time the ball passes through one of the foci, it reflects off the elliptical table and passes through the other focus. Let’s take this further - what happens if we keep doing this?</p>
<div style="text-align:center">
<img src="/images/2021/billiards4_ConvergeToAxis.png" alt="converge-to-horizontal" width="400" />
</div>
<p>We can see that after many bounces, the trajectory of the ball converges to the horizontal.</p>
<p>This poses the question: <em>why does this happen?</em> There are many ways to attack this problem, some of which are heftier than others (coordinate geometry!). If you’re familiar with Fermat’s principle of least time, then you can directly apply it here but with a billiard ball instead of light rays. In this case, we’re trying to get from the focus \(F_2\) to \(F_1\), and we’re subject to the constraint that the ball must also touch the elliptical table. But we know from the aforementioned definition of the ellipse that any such path must have exactly the same length! Since every path has the same length, every path is stationary, and a stationary path is precisely one that obeys the law of reflection. So no matter the direction in which the billiard ball passes through \(F_2\), its bounce off the table is a true reflection that sends it through \(F_1\) – which is the reflective property of ellipses.</p>
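<p>If you’d like to see the reflective property in action numerically, here’s a minimal sketch in Python (the semi-axes \(a = 2\), \(b = 1\) are assumed for illustration): fire the ball from a point on the ellipse through \(F_2\), reflect it once off the boundary, and check that the outgoing path passes through \(F_1\).</p>

```python
import numpy as np

a, b = 2.0, 1.0                          # assumed semi-axes of the table
c = np.sqrt(a**2 - b**2)                 # focal distance
F1, F2 = np.array([-c, 0.0]), np.array([c, 0.0])

def next_hit(p, d):
    """Forward intersection of the ray p + t*d with the ellipse boundary."""
    A = (d[0] / a)**2 + (d[1] / b)**2
    B = 2 * (p[0] * d[0] / a**2 + p[1] * d[1] / b**2)
    C = (p[0] / a)**2 + (p[1] / b)**2 - 1   # = 0 when p lies on the ellipse
    t = max(np.roots([A, B, C]).real)       # the strictly positive root
    return p + t * d

def reflect(d, p):
    """Specular reflection of direction d off the boundary at point p."""
    n = np.array([p[0] / a**2, p[1] / b**2])   # normal direction at p
    n /= np.linalg.norm(n)
    return d - 2 * np.dot(d, n) * n

p = np.array([0.0, b])                   # start at the top of the ellipse
d = (F2 - p) / np.linalg.norm(F2 - p)    # aim through the focus F2

p = next_hit(p, d)                       # first bounce point
d = reflect(d, p)                        # reflected direction

# Perpendicular distance from the reflected ray to the other focus F1
dist = np.linalg.norm(F1 - p - np.dot(F1 - p, d) * d)
print(dist)   # ≈ 0: the ball heads straight for F1
```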
<h1 id="caustic-curves">Caustic curves</h1>
<p>Interestingly, when the ball doesn’t pass through a focus but its path stays outside the segment joining the two foci, the successive chords are all tangent to a smaller elliptical caustic curve that shares the same foci as the elliptical table, and so these are <em>confocal</em> ellipses.</p>
<div style="text-align:center">
<img src="/images/2021/billiards5_ConfocalEllipse.png" alt="confocal-ellipse" width="400" />
</div>
<p>What if we change things slightly and have the ball initially pass between the two foci? In this case we instead get a confocal <em>hyperbola</em> as the caustic, which is really interesting.</p>
<div style="text-align:center">
<img src="/images/2021/billiards6_Hyperbola.png" alt="confocal-hyperbola" width="400" />
</div>
<p>Another interesting thing to note is that the caustic ellipse belongs to the same family of <em>confocal</em> conics as the table (conics sharing the same pair of foci). If the elliptical table has the equation</p>
<p>\begin{equation}
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,
\end{equation}
where \(a\) and \(b\) are the semi-major and semi-minor axes respectively, then the confocal caustic has equation</p>
<p>\begin{equation}
\frac{x^2}{a^2 + \lambda} + \frac{y^2}{b^2 + \lambda} = 1,
\end{equation}
where \(\lambda\) is a constant. When both denominators are positive, i.e. \(b^2 + \lambda > 0\), the caustic is an ellipse. If instead \(-a^2 < \lambda < -b^2\), then it is a confocal hyperbola.</p>
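<p>One can check directly that these really are confocal: the focal distance of each family member is \(\sqrt{(a^2 + \lambda) - (b^2 + \lambda)} = \sqrt{a^2 - b^2}\), independent of \(\lambda\). A quick numerical sketch, with assumed semi-axes \(a = 2\) and \(b = 1\):</p>

```python
import numpy as np

a, b = 2.0, 1.0                        # assumed semi-axes of the table
lams = (-0.5, 0.0, 1.0)                # any lam with b**2 + lam > 0 gives an ellipse
cs = [np.sqrt((a**2 + lam) - (b**2 + lam)) for lam in lams]
print(cs)   # every member has focal distance sqrt(a**2 - b**2) ≈ 1.732
```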
<p>If we play around with the eccentricity of the ellipse (by bringing the two foci together), we reach the special case of the circle. In this case, we get a concentric circle as a caustic, as you might expect.</p>
<div style="text-align:center">
<img src="/images/2021/billiards7_ConcentricCircle.png" alt="concentric-circle" width="400" />
</div>
<h1 id="simulation-and-resources">Simulation and Resources</h1>
<p>I’ve made a GeoGebra applet that shows this for 50 reflections in the ellipse, which you can play around with. You can find more information on my <a href="https://www.geogebra.org/m/euzqtn5p">GeoGebra page</a> or in my <a href="https://github.com/spectroscopycafe/geogebra-scripts">GitHub repository</a>.</p>
<h2 id="geogebra">GeoGebra</h2>
<iframe scrolling="no" title="Dynamical Billiards in an Ellipse" src="https://www.geogebra.org/material/iframe/id/zfbv39nj/width/750/height/550/border/888888/sfsb/true/smb/false/stb/false/stbh/false/ai/false/asb/false/sri/true/rc/false/ld/true/sdz/false/ctl/false" width="800px" height="550px" style="border:0px;"> </iframe>
<p>You can change the eccentricity of the ellipse, and also hide or show the caustic curve (in blue) to see how things change. The initial trajectory of the billiard ball is uniquely determined by the positions of \(P_0\) and \(P_1\).</p>
<h2 id="resources">Resources</h2>
<p>There are also other situations that could be investigated, such as the high-symmetry case where the eccentricity becomes zero, and analysing periodic orbits using a group/number theoretic approach. If you’re interested in finding out more, I recommend:</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=4KHCuXN2F3I">Elliptical Pool Table</a> (Numberphile video)</li>
<li>The Birkhoff billiards page at <a href="https://www.dynamical-systems.org/billiard/info.html">dynamical-systems.org</a></li>
<li><a href="http://people.math.harvard.edu/~knill/teaching/math118/118_dynamicalsystems.pdf">Math 188r Dynamical systems</a> lecture notes by Oliver Knill</li>
</ul>
<hr />Anson HoWhat happens if you play billiards on an elliptical table, with no friction?Conway’s Game of Life2021-02-04T00:00:00+00:002021-02-04T00:00:00+00:00https://ansonwhho.github.io/2021/02/04/conways-game-of-life<blockquote>
<p>Implementing Conway’s Game of Life in Python</p>
</blockquote>
<!--more-->
<p>This is a simulation of the late <a href="https://en.wikipedia.org/wiki/John_Horton_Conway">John Conway’s</a> <a href="https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life">Game of Life</a> that I made using Python 3 and Pygame. I’d heard of this game many times before, but I’d never gotten round to learning more about it until recently.</p>
<p>The GitHub repo for this project can be found <a href="https://github.com/spectroscopycafe/conway-game-of-life">here</a>. I drew a lot of inspiration for how to start the project from Robert Heaton’s blog post, which you can find <a href="https://robertheaton.com/2018/07/20/project-2-game-of-life/">here</a>.</p>
<h1 id="conways-game-of-life">Conway’s Game of Life</h1>
<p>Conway’s Game of Life isn’t really a “game” in the conventional sense of the word because there are no players involved, making it a <a href="https://en.wikipedia.org/wiki/Zero-player_game">zero-player game</a>. This might not seem very interesting on the surface, but the Game of Life has managed to grab the <a href="https://www.youtube.com/watch?v=E8kUJL04ELA">same level of attention (or more)</a> as some of Conway’s other discoveries. So what makes it so intriguing?</p>
<p>One of the reasons for this is that the rules of Life are very simple, but somehow lead to really complex and unpredictable (not a technical term)<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> behaviour. There are only three simple rules of Life (see next section), but some of the <a href="http://pentadecathlon.com/lifeNews/index.php">patterns</a> that people have come up with are truly remarkable.</p>
<p>Another reason why people find the Game of Life interesting is that it is able to <a href="https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/turing-machine/one.html">simulate any computer algorithm</a> that can be carried out by a Turing Machine (assuming the <a href="https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis">Church-Turing thesis</a>) – in other words, Life is Turing complete. Paul Rendell has even <a href="https://www.ics.uci.edu/~welling/teaching/271fall09/Turing-Machine-Life.pdf">managed to do this</a>!</p>
<h1 id="description">Description</h1>
<p>The basic rules of the game are (from <a href="https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life">Wikipedia</a>):</p>
<blockquote>
<ol>
<li>Any live cell with two or three live neighbours survives.</li>
<li>Any dead cell with three live neighbours becomes a live cell.</li>
<li>All other live cells die in the next generation. Similarly, all other dead cells stay dead.</li>
</ol>
</blockquote>
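<p>These three rules translate almost line-for-line into NumPy. The sketch below is a minimal stand-alone version, not the Pygame implementation from the repo; note that it assumes a wrap-around (toroidal) grid, which may differ from the boundary handling in the repo:</p>

```python
import numpy as np

def life_step(grid):
    """One generation of Life on a 2D array of 0s and 1s (wrap-around edges)."""
    # Count live neighbours by summing the eight shifted copies of the grid
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Rules 1 and 3: live cells need 2 or 3 neighbours; rule 2: births need exactly 3
    return (((grid == 1) & np.isin(neighbours, (2, 3))) |
            ((grid == 0) & (neighbours == 3))).astype(int)

# A "blinker" oscillates with period 2, so two steps return it to its start
blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1
print(np.array_equal(life_step(life_step(blinker)), blinker))  # True
```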
<p>The game starts off in a <em>dead state</em> by default, where a blank grid is displayed. The option is provided to set the state to a <em>random state</em>, or a particular initial state can be set up by clicking on desired squares (i.e. <em>mouse editing</em>). To run or move to the next iteration, mouse editing must be disabled. “Runs” or automatic iterations can be paused, and after enabling mouse editing the new state can then be modified.</p>
<h1 id="the-simulation">The Simulation</h1>
<p>You can try out my implementation of the Game of Life <a href="https://github.com/spectroscopycafe/conway-game-of-life">here</a>. Below are some snapshots of the simulation:</p>
<div style="text-align:center">
<img src="/images/2021/GoL1_GliderGun.png" alt="glider-gun" width="400" />
</div>
<center>
<p>
<a href="https://en.wikipedia.org/wiki/Gun_(cellular_automaton)">Gosper’s glider gun</a> periodically shoots gliders down to the bottom right of the screen
</p>
</center>
<div style="text-align:center">
<img src="/images/2021/GoL2_Random.png" alt="start-state" width="400" />
</div>
<center>
<p>
A <a href="https://en.wikipedia.org/wiki/Pseudorandom_number_generator">pseudo-randomly</a> generated start state
</p>
</center>
<div style="text-align:center">
<img src="/images/2021/GoL3_Iteration.png" alt="108-iterations" width="400" />
</div>
<center>
<p>
Most cells have died after 108 iterations
</p>
</center>
<h1 id="controls">Controls</h1>
<ul>
<li><code class="language-plaintext highlighter-rouge">SPACE</code>: Enable/disable mouse editing</li>
<li><code class="language-plaintext highlighter-rouge">RIGHT_ARROW</code>: Next iteration of Life. <em>Requires disabled mouse editing.</em></li>
<li><code class="language-plaintext highlighter-rouge">P</code>: Play/pause automatic iterations. <em>Requires disabled mouse editing.</em></li>
<li><code class="language-plaintext highlighter-rouge">D</code>: Dead state, or an empty board. <em>Requires disabled mouse editing.</em></li>
<li><code class="language-plaintext highlighter-rouge">R</code>: Random state. <em>Requires disabled mouse editing.</em></li>
</ul>
<p>When mouse editing is enabled, squares can be clicked on to change their state. Live cells are blue and dead cells are white. Other features like the board size, rate of iteration, colours, etc. can be changed in the “constants.py” file – see the <a href="https://github.com/ansonwhho/conway-game-of-life">GitHub repo</a> for this.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Note that this isn’t a technical definition – in principle, the entire future of Life could be predicted on the basis of a known initial state, given enough computation. What I mean by “unpredictable” is more along the lines of “doesn’t match what we intuitively expect to see”. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>Anson HoImplementing Conway’s Game of Life in PythonThe Physics of Rainbows2021-01-23T00:00:00+00:002021-01-23T00:00:00+00:00https://ansonwhho.github.io/2021/01/23/the-physics-of-rainbows<blockquote>
<p>A little simulation of how rainbows are formed</p>
</blockquote>
<!--more-->
<h1 id="description">Description</h1>
<p>This simulation shows how a rainbow is formed from the perspective of geometric optics. Due to the wavelength dependence of refractive index, white light from the Sun disperses into a spectrum of colours. A single internal reflection forms what is known as the primary bow, whereas two internal reflections form the secondary bow. Note that these are not total internal reflections, since that would imply that the light gets trapped in the drop forever!</p>
<p>The option is provided to show refraction out of the droplet at the 1st internal reflection point. This highlights the question of why the rainbow needs internal reflections to be formed, and can be understood in terms of maximum/minimum points. Specifically, one aims to find the stationary point of:</p>
\[D_k(i) = k(\pi - 2r) + 2(i - r)\]
<p>where \(D_k\) is the angular deviation of an exiting ray after the \(k^{th}\) internal reflection, \(i\) is the angle of incidence, and \(r\) is the angle of refraction. At such a stationary point, the change in angular deviation is small, leading to a higher concentration of rays exiting the drop at the angle in question and forming the rainbow. An analogous stationary point does not exist for light exiting the drop without any internal reflections, so there is no angular region of increased ray concentration due to these rays, i.e. no rainbow.</p>
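<p>We can locate this stationary point numerically from the formula above. A quick sketch for the primary bow (\(k = 1\)), taking \(n = 1.33\) as in the simulation:</p>

```python
import numpy as np

n = 1.33  # refractive index of water, as in the simulation

def deviation(i, k=1):
    """Angular deviation D_k(i) after k internal reflections (angles in radians)."""
    r = np.arcsin(np.sin(i) / n)          # Snell's law: sin(i) = n sin(r)
    return k * (np.pi - 2 * r) + 2 * (i - r)

# Scan incidence angles and locate the minimum-deviation (stationary) point
i = np.linspace(1e-6, np.pi / 2 - 1e-6, 100_000)
D_min = np.degrees(deviation(i).min())

print(D_min)        # ≈ 137.5 degrees
print(180 - D_min)  # ≈ 42.5 degrees: the familiar rainbow angle
```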
<h1 id="the-simulation">The Simulation</h1>
<h2 id="assumptions">Assumptions</h2>
<p>There are two key physical assumptions made:</p>
<ol>
<li><strong>Raindrops are spherical</strong>: This is a good assumption when raindrops are small since surface tension effects dominate over other deforming forces. It’s also quite helpful, because it makes the geometry and calculations a lot simpler! As the drops get larger, like closer to the bottom of a rainbow (since the drops have fallen further and are more likely to coalesce), they become more susceptible to air resistance. This causes the drops to flatten, with roughly circular horizontal cross sections but noncircular vertical ones, and is a key reason why the bottom of the rainbow is brighter than the top.</li>
<li><strong>Droplet size is irrelevant</strong>: This is based on thinking of light as rays and builds on the previous assumption, given the symmetry of the drop. This means that we can apply simple rules of geometric optics (reflection and refraction) and makes the problem a lot more computationally manageable.</li>
</ol>
<p>Given these assumptions, we can take advantage of the problem symmetry by taking the incident light to be horizontal and varying the vertical displacement (impact parameter) of the ray. An alternative formulation would be to parametrise each ray by its angle of incidence, with the angle of refraction then fixed by Snell’s law, as in the equation above.</p>
<h2 id="simulation">Simulation</h2>
<iframe scrolling="no" title="Rainbows and Rays" src="https://www.geogebra.org/material/iframe/id/ywzfbqvp/width/742/height/646/border/888888/sfsb/true/smb/false/stb/false/stbh/false/ai/false/asb/false/sri/true/rc/false/ld/false/sdz/false/ctl/false" width="742px" height="646px" style="border:0px;"> </iframe>
<h2 id="controls">Controls</h2>
<ul>
<li><strong>Horizontal axis</strong> (<em>Not shown by default</em>): Horizontal line drawn through the centre of the drop.</li>
<li><strong>Impact parameter</strong> (<em>Set to 0.68 by default</em>): The vertical displacement of the incident ray from the horizontal axis, expressed as a decimal fraction of the drop radius (from 0 to 1 in magnitude). Negative values represent impingement on the bottom half of the drop.</li>
<li><strong>Refractive index (\(n\))</strong> (<em>\(n = 1.33\) by default</em>): The refractive index of the raindrop. For simplicity, the refractive index of air is taken to be 1.</li>
<li><strong>Spread</strong>: Used to help visualise the dispersion of light in the drop – in reality this effect is fairly small.</li>
<li><strong>Secondary bow</strong> (<em>Not shown by default</em>): Shows the rainbow that is formed after two internal reflections.</li>
<li><strong>Refraction at 1st internal reflection point</strong> (<em>Not shown by default</em>): Shows light that exits the drop without internal reflections – this is typically left out of diagrams because it doesn’t contribute to forming the rainbow, but I thought it would be instructive to see why this is the case.</li>
</ul>
<p>It should be noted that this amount of spread may not necessarily be representative of a real rainbow, but is nevertheless useful as a visual aid. Note also that when the refractive index \(n\) is set to 1, the simulation still shows some dispersion, when it should really show undispersed white light.</p>Anson HoA little simulation of how rainbows are formedHow to travel upwards in time2021-01-16T00:00:00+00:002021-01-16T00:00:00+00:00https://ansonwhho.github.io/2021/01/16/how-to-travel-upwards-in-time<blockquote>
<p>What does language reveal about how we think about time? Why do we look <em>forward</em> to the future, and <em>back</em> into the past? And how does this vary across languages?</p>
</blockquote>
<!--more-->
<h1 id="long-time-no-see">Long Time No See</h1>
<p>Time is strange. We can’t see or hear it directly, and <a href="https://plato.stanford.edu/entries/time/">philosophers seem to be having a hard time</a> pinning down what it actually is. Regardless of how abstract it is, it at least feels really real – the years pass by faster and faster, and deadlines are always just around the corner.</p>
<p>For such an abstract concept, it helps to have some way of thinking about it that makes it more tractable. Clocks and calendars are really good at this, because they allow us to visualise time spatially. But what about words? How do we talk about time?</p>
<p>Here’s where things start to get interesting. If you do a bit of searching, it’s not hard to find examples where we talk about time as if it had spatial attributes. For instance, we might arrange to meet a friend in an hour, as if an “hour” were something that we could be “inside”. We can also be on time, work from nine to six, and reach home at seven. In English, there’s <a href="https://www.preposterousuniverse.com/podcast/2020/03/23/89-lera-boroditsky-on-language-thought-space-and-time/">no term specifically devoted to measuring temporal quantities</a>, like “duratiousness”. Instead, we say that things take a “short” or “long” time.</p>
<p>So it seems that we often speak of time using spatial metaphors. This in itself is a curious idea, but just for the fun of it, let’s do perhaps a bit more thinking than the situation deserves!</p>
<h1 id="a-relatively-good-time">A Relatively Good Time</h1>
<p>All this talk about space and time seems to suggest that we can try and map things out – what does our language suggest about our mental models of time? To be more precise, <a href="https://en.wikipedia.org/wiki/Multiple_time_dimensions">let’s draw a 1-dimensional time axis</a><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>
<p><img src="/images/2021/Time1_Axis.png" alt="A simple 1D time axis" />
Fig. 1: A simple 1D time axis</p>
<p>From our experience, we might expect it to look something like this, with past, present and future. But wait – words like “present” are defined relative to an observer, so we’ll need to draw in a person as well.</p>
<p>One way to do this would be to have the person move along the time axis, with <a href="https://onlinelibrary.wiley.com/doi/abs/10.1111/1467-9329.00191">the past constantly behind and the future constantly ahead</a>. This assumes a stationary time axis, but what if we’ve gotten it the wrong way round? Perhaps we’re standing at this point called the “present”, and time is flowing towards us!</p>
<p><img src="/images/2021/Time2_EgovsTime.png" alt="Two perspectives of time" />
Fig. 2: Two perspectives of time</p>
<p>How do we know which of these two views is correct? To test this, try and answer the following question:</p>
<blockquote>
<p>Next Wednesday’s meeting is moved forward two days. Which day is the meeting now?</p>
</blockquote>
<p>If you think that the answer is obvious, I encourage you to ask your friends for their opinions. Trust me, you’ll be surprised!</p>
<p>Most answers are either “Monday” or “Friday”, and <a href="https://doi.org/10.1111/1467-9280.00434">the original psychology experiment</a> that asked this question found a 50-50 split in the responses. This certainly came as a surprise to me, because I was adamant that the correct answer would be “Friday”. So what’s behind this result?</p>
<p>In the study, the authors suggest that different people have different mental representations of time. If you thought that the answer was “Monday”, then you see yourself as stationary and time as flowing towards you, i.e. the <em>time-moving perspective</em>. Moving the meeting “forward” then means moving it further in the direction it was already heading: “forward” is defined relative to the meeting.</p>
<p><img src="/images/2021/Time3_MonFri.png" alt="Is the meeting now on Monday or Friday?" />
Fig 3: Is the meeting now on Monday or Friday?</p>
<p>If you’re like me and thought that the meeting would be shifted to Friday, then you instead see time as a stationary axis that you move along, i.e. the <em>ego-moving perspective</em>. In this case, “forward” is the direction that the observer “moves” along: it is defined relative to the observer.</p>
<p>The researchers further found that they could change the proportion of answers from participants using spatial cues. For instance, subjects that rode an office chair across a room were more likely to answer “Friday”, because this primes the brain to think of us as moving through a fixed absolute space. Conversely, those who pulled an office chair toward them with a rope were more likely to answer Monday, with the environment “moving” past us and priming the time-moving perspective.</p>
<p><img src="/images/2021/Time4_RideRope.png" alt="Changing our mental representations of time using spatial cues" />
Fig. 4: Changing our mental representations of time using spatial cues</p>
<p>So depending on how you want to look at it, both “Monday” and “Friday” could be correct. What this implies is that the way we think about time is very closely tied to how we think about space, and also changes with context. If our brains have been primed appropriately, we may be more likely to adopt an ego-moving perspective and vice versa. Good luck making it to the meeting on time!</p>
<h1 id="how-to-travel-upwards-in-time">How to Travel Upwards in Time</h1>
<p><img src="/images/2021/Time5_FutureAhead.png" alt="Representing the future as ahead and the past behind" />
Fig. 5: Representing the future as ahead and the past behind</p>
<p>Another thing that we could consider is the direction of the observer relative to the time axis. We’ve assumed that the future “lies ahead”, but what if it isn’t? Why don’t we say that the “future lies behind”, for example?</p>
<p><img src="/images/2021/Time6_PastAhead.png" alt="Seeing the past in front of us" />
Fig. 6: Seeing the past in front of us</p>
<p>Believe it or not, this is exactly how the Aymara people seem to visualise time, where the word <a href="https://www.theguardian.com/science/2005/feb/24/4"><em>q”ipa</em></a>, roughly meaning “behind” or “the back”, is used to talk about the future. We also see similar features in other languages. The Māori proverb, <a href="https://journals.sagepub.com/doi/full/10.1177/1463949116677923"><em>Kia whakatōmuri te haere whakamua</em></a> translates to “I walk backwards into the future with my eyes fixed on my past”. In Malagasy, past events are referred to with words that connote forwardness. <a href="https://doi.org/10.1016/0147-1767(95)00004-U">An interesting explanation</a> is offered for this: the past and present are known and so exists “before one’s eyes”, but the future is as yet unseen and so lies behind.</p>
<p>But why stop here? What if the time axis isn’t horizontal? In <a href="https://doi.org/10.3724/SP.J.1041.2012.01015">Chinese languages</a>, “last year” is literally said as <em>上年</em> “up year”, and next year is <em>下年</em> “down year”. There’s one more example that I find especially intriguing. Thus far we’ve seen examples where the axes are oriented relative to the observer, but what if the time axis were absolute and fixed?</p>
<p><img src="/images/2021/Time7_TimePerspectives.png" alt="A vertical time axis (left) and an absolute time axis (right)" />
Fig. 7: A vertical time axis (left) and an absolute time axis (right)</p>
<p>For the aboriginal Pormpuraaw people, the time axis is fixed according to the cardinal directions. Unlike in English, where we talk about space in relative terms like “left” and “right”, <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3428806/">Pormpuraawan languages are inextricably linked to absolute space</a><sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>. To be able to communicate, you would need to be constantly aware of where East and West are! In a study by Lera Boroditsky and Alice Gaby, the authors write:</p>
<blockquote>
<p>In Kuuk Thaayorre (one of the languages included in this study), to say hello, one says, “Where are you going?” and an appropriate response would be, “a long way to the south-southwest.” Thus, if you do not know which way is which, you literally cannot get past hello.</p>
</blockquote>
<p>Their mental representations of time seemed also to resemble the way that they thought about space, with the past in the East and the future in the West, perhaps relating to the apparent rising and setting of the Sun.</p>
<p>So when we say things like “the future lies ahead”, or “I’ve put the past behind me”, we’re really using a very particular way of visualising things that may not be common across different cultures. Different people can look at the same world in myriad ways, and it can be all too easy to treat our views as gospel.</p>
<h1 id="much-time-no-see">Much Time No See?</h1>
<p>For most of this article, I’ve talked about how our understanding of time is often achieved using a metaphor with space. While there are many examples to support this view, it is worth noting that this is by no means a hard-and-fast rule, and that many exceptions exist.</p>
<p>For instance, if you’re a Spanish speaker, you’re more likely to say <em>mucho tiempo</em> or “much time” instead of “long time”, and similar patterns can be found in Greek and Italian. Rather than picturing time with spatial axes and lengths, it is interpreted as something with volume.</p>
<p>Many of the patterns we’ve seen are also heavily context dependent. Chinese speakers don’t only use vertical space-time metaphors – while “up year” might refer to “last year” with a vertical time axis, <em>前年</em> “front year” means “the year before last year” with a horizontal time axis. Linguists seem to have different interpretations of how to reconcile these differing metaphors, and in general it can be quite hard to give consistent explanations across multiple contexts.</p>
<p>There are countless other interesting questions that we could examine if we dive deeper, like the effect of writing direction and the mental representations of time in bilinguals and polyglots. But these are probably best left for another day. What I hope I’ve shown through this article is that even for something as universal as time, there is a diverse set of ways that we can think about it. In English, we have a habit of speaking of time in terms of front-back spatial metaphors, but maybe there’s more to that than meets the eye.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Perhaps you’re wondering why time ought to be constrained to a 1-dimensional axis. Why not 2? Or 3? Or maybe 2.5? Frankly, I’m not sure, but for the sake of argument let’s just assume a single past-present-future axis. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Of course, physicists now know that thinking in terms of absolute space and time isn’t quite right, but let’s ignore that for the time being! <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>Anson HoWhat does language reveal about how we think about time? Why do we look forward to the future, and back into the past? And how does this vary across languages?Look out for the N-rays!2020-09-29T00:00:00+00:002020-09-29T00:00:00+00:00https://ansonwhho.github.io/2020/09/29/look-out-for-the-n-rays<blockquote>
<p>How does confirmation bias affect physics?</p>
</blockquote>
<!--more-->
<h1 id="the-discovery-of-n-rays">The Discovery of N-rays</h1>
<p><em>France, 1904</em><br />
Professor René Blondlot is about to make physics history.</p>
<p>Just last year, he <a href="https://archive.org/details/nrayscollectiono00blonrich/page/n29/mode/2up">announced his discovery</a> of a previously unknown part of the electromagnetic spectrum: <a href="https://www.wired.com/2014/09/fantastically-wrong-n-rays/">N-rays</a>. Allegedly, these rays are <a href="https://doi.org/10.1119/1.11342">emitted by almost all objects</a> and <a href="https://doi.org/10.1119/1.10643">have a host of bizarre properties</a>, like the ability to change the brightness of sparks and improve the resolution of distant objects.</p>
<p>News of his discovery spread rapidly throughout the physics community, but many were unable to reproduce his experiments. Amidst the resulting scepticism, <a href="https://www.theguardian.com/education/2004/sep/02/research.highereducation4">Nature sent Professor Robert W. Wood to Blondlot’s lab</a> to investigate things further.</p>
<p>This is where we are now – a dimly lit physics laboratory at the University of Nancy. Robert Wood watches with scrutiny as Blondlot and his lab assistant run through their demonstration. The alleged N-rays are split using an aluminium prism and observed using phosphorescent paint on a screen, which should <a href="https://www.skeptic.com/eskeptic/10-10-13/">glow ever so slightly</a> when the rays impinge upon it. In order to see the effects, the room has to be very dark, and the eyes of the observer sensitive to even the most minute changes. If Blondlot is right and N-rays do indeed exist, a distinct spectrum will be observed. With everything meticulously set up, all that is left to do is to let physics do its thing.</p>
<p>Wood watches with anticipation and sees something truly remarkable: absolutely nothing. Blondlot, on the other hand, is absolutely convinced that the spectrum is distinctly visible, and he attributes Wood’s inability to see the pattern to a <a href="https://zenodo.org/record/1429443">“lack of sensitiveness in the eye”</a>. In light of these assertions, Wood proposes a follow-up test: he would adjust the aluminium prism, and Blondlot would have to tell when the changes were being made and how the prism was aligned.</p>
<p><img src="/images/2020/Nray1_Experiment.png" alt="Schematic of Blondlot's experiment" /></p>
<p>Secretly though, Wood removes the prism altogether, ruining any chance of the spectrum being observed. But neither Blondlot nor his assistant detect any change – they remain convinced that they see the same pattern, <a href="https://doi.org/10.2307/27757473">utterly oblivious to what has happened</a>.</p>
<p>Wood’s conclusions are quite self-explanatory:</p>
<blockquote>
<p>“I am not only unable to report a single observation which appeared to indicate the existence of the rays, but left with a very firm conviction that the few experimenters who have obtained positive results have been in some way deluded.”</p>
<p>Robert W. Wood, in his paper <em><a href="https://zenodo.org/record/1429443">The n-Rays</a></em></p>
</blockquote>
<h1 id="confirmation-bias">Confirmation Bias</h1>
<p>Coming back to the present, we now know that the purported N-rays were just a <a href="https://doi.org/10.1119/1.2186333">figment of the imagination</a>. This is true not just of Blondlot and his assistant, but also of the other physicists who claimed to have replicated his experimental findings. How could all of these scientists have believed so firmly in the existence of non-existent rays? The answer is <a href="https://www.globalcognition.org/confirmation-bias-3-cures/">confirmation bias</a> – the very delusion that Wood alluded to. It involves favouring information that conforms to our beliefs while ignoring evidence to the contrary.</p>
<p>The reason for this bias is not entirely clear, but it likely stems from our <a href="https://effectiviology.com/confirmation-bias/#How_the_confirmation_bias_affects_people">natural impulses</a>, such as our tendency to prefer coherent narratives in thinking. This has been shown in <a href="https://www.simplypsychology.org/confirmation-bias.html#imp">MRI studies</a>, where observed brain activity suggests active yet unintentional attempts to minimise cognitive dissonance. This leads to both avoidance of contradicting information because we do not like to be wrong, and seeking supporting evidence because we like to be correct.</p>
<p>In physics, there are two main ways in which this happens. The first of these is the <a href="https://effectiviology.com/confirmation-bias/#How_the_confirmation_bias_affects_people">biased search for information</a>. In Blondlot’s experiment, he was explicitly looking for a minuscule change in the luminosity of phosphorescent paint. It appears that he tried so hard that he managed to force it into (his) existence!</p>
<p>The second way is through the <a href="https://doi.org/10.1006/obhd.1993.1044">biased interpretation of information</a>. This could be through assigning more weight to certain results, or by simply ignoring evidence that supports alternative explanations. As previously mentioned, one of the apparent features of the N-rays was that they could change the brightness of sparks. It turns out that these changes were in fact due to normal, random fluctuations that sparks usually undergo regardless of any intervention. However, when Blondlot observed such fluctuations, he quickly <a href="https://doi.org/10.2307/27757473">attributed them to the effect of the N-rays</a>. This just goes to show that our expectations can really distort how we view things.</p>
<h1 id="what-should-i-do">What should I do?</h1>
<p>The damage can be great if confirmation bias is ignored. Blondlot’s own reputation was brought to its knees, and he <a href="https://www.wired.com/2014/09/fantastically-wrong-n-rays/">never recovered</a> from the resulting public humiliation. Allowing biases to remain unchecked during the research process can also diminish both the quality of collected data and the accompanying analysis. Imagine having to repeat an experiment from scratch all because of this… ugh! It can even <a href="https://nautil.us/issue/24/error/the-trouble-with-scientists">sustain false theories despite contradicting evidence</a>.</p>
<p>Eradicating the bias, however, is not always a simple process. Consider the following question:</p>
<p><strong>Q: Your physics experiment stubbornly refuses to obey the laws of physics. What should you say? (You may choose one or more answers.)</strong><br />
(A) “It’s probably just random error”<br />
(B) “I’ll fake the data… Who could possibly know?”<br />
(C) “What have I done wrong this time?”<br />
(D) “Physics is broken!”</p>
<p>Options A to C all describe situations of confirmation bias – in some sense, a denial of the existing evidence. Option D would therefore be the “correct” option, right? As you can probably tell, things are not so simple. For instance, if the physical law that is violated is very well established experimentally, there is probably more reason to believe that something has gone wrong than to declare “the end of physics as we know it”. Of course, there is also some chance that the law is indeed wrong, but this would be very unlikely and the chances would have to be <a href="https://doi.org/10.1037/h0076157">weighed accordingly</a>. You could therefore imagine how difficult it might be to suggest that a widely accepted law is flawed in some way.</p>
<p>Although it is practically impossible to eliminate confirmation bias completely, there are fortunately still some things that we can do to mitigate it.</p>
<ul>
<li>List ideas as to why the hypothesis might be incorrect. It can also be helpful to list alternative hypotheses and to see whether these are consistent with findings. Knowing about these other possibilities can help limit how much we search for evidence that only supports a particular view.</li>
<li>Constantly ask yourself whether you are processing information in a biased fashion. What is your first reaction to a book titled “The Moon Landing Hoax”? Why do you have that particular impression? Would it be a good idea to look at what the book says? Doing this does not mean that you have to agree with whatever you see; rather, it means being aware of your reasoning and identifying weaknesses in how you see information.</li>
</ul>
<p>Beyond the individual level, there are also other ways of combatting confirmation bias. One of these is scientific peer review, although it is important to note that this process is <a href="https://doi.org/10.1007/bf01173636">itself prone to the bias</a>! It is also useful to attempt to replicate the original experiment. As a matter of fact, this is what eventually spelled doom for Blondlot’s N-ray theory – the consistent and independent failure to replicate his results. These methods are far from being a cure-all, however.</p>
<p>With the benefit of hindsight, it can be very easy to dismiss the N-ray debacle as a singular case of delusion, and to think that this is not relevant to ourselves. But it is important to realise that no one is infallible. Everybody is susceptible to confirmation bias, and if we’re not careful, there’s no telling whether or not we might see N-rays of our own.</p>Anson HoHow does confirmation bias affect physics?