# Particle Physics Planet

## January 20, 2018

### Emily Lakdawalla - The Planetary Society Blog

## January 19, 2018

### Christian P. Robert - xi'an's og

**T**oday, I received the Norse Farce cups I had designed with the help of Thomas! While just as easy to replicate on sites like Vistaprint, I have a few left in case some Og’s readers are interested!

### Lubos Motl - string vacua and pheno

Even Edward Measure knew that dark matter was a bad fit.

While young Sheldon was finally bought a computer (while Missy got a plastic pony and their father bought some beer which almost destroyed his marriage), the old Sheldon took pictures of himself with his baby, some work in string theory.

I wonder how many of you can isolate the paper(s) that contain the same (or similar) diagrams and equations as Sheldon's new work. ;-)

Two screenshots are embedded in this blog post. Click on them to magnify them. A third one is here, too.

Incidentally, in order to focus on work, Sheldon had to hire his old bedroom where he no longer lives. He acted pleasantly which drove Leonard up the wall. Meanwhile, Rajesh met a blonde in the planetarium and had sex with her. It turned out she was married, her husband was unhappy about the relationship, Rajesh restored that broken marriage, and almost dated the husband afterwards. ;-)

by Luboš Motl (noreply@blogger.com) at January 19, 2018 07:10 PM

### Peter Coles - In the Dark

After spending a big chunk of yesterday afternoon chatting about the cosmic microwave background, yesterday evening I remembered a time when I was trying to explain some of the related concepts to an audience of undergraduate students. As a lecturer you find from time to time that various analogies come to mind that you think will help students understand the physical concepts underpinning what’s going on, and that you hope will complement the way they are developed in a more mathematical language. Sometimes these seem to work well during the lecture, but only afterwards do you find out they didn’t really serve their intended purpose. Sadly it sometimes turns out that they can also confuse rather than enlighten…

For instance, the two key ideas behind the production of the cosmic microwave background are recombination and the consequent decoupling of matter and radiation. In the early stages of the Big Bang there was a hot plasma consisting mainly of protons and electrons in an intense radiation field. Since it was extremely hot back then the plasma was more-or-less fully ionized, which is to say that the equilibrium for the formation of neutral hydrogen atoms via

$$p + e^{-} \rightleftharpoons \mathrm{H} + \gamma$$

lay firmly to the left hand side. The free electrons scatter radiation very efficiently via Compton scattering

$$\gamma + e^{-} \rightarrow \gamma + e^{-}$$

thus establishing thermal equilibrium between the matter and the radiation field. In effect, the plasma is *opaque* so that the radiation field acquires an accurate black-body spectrum (as observed). As long as the rate of collisions between electrons and photons remains large the radiation temperature adjusts to that of the matter and equilibrium is preserved because matter and radiation are in good thermal contact.

Eventually, however, the temperature falls to a point at which electrons begin to bind with protons to form hydrogen atoms. When this happens the efficiency of scattering falls dramatically and as a consequence the matter and radiation temperatures are no longer coupled together, i.e. decoupling occurs; collisions can no longer keep everything in thermal equilibrium. The matter in the Universe then becomes transparent, and the radiation field propagates freely as a kind of relic of the time that it was last in thermal equilibrium. We see that radiation now, heavily redshifted, as the cosmic microwave background.
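To put rough numbers on "the temperature falls to a point", here is a minimal sketch based on the Saha equation, which treats recombination as if it stayed in thermal equilibrium throughout. That is an approximation brought in for illustration (real recombination lags behind equilibrium), and the baryon-to-photon ratio `ETA` and the pure-hydrogen plasma are assumptions, not values taken from the discussion above. The function names are hypothetical.

```python
import math

# Physical constants (rounded), with energies in eV.
K_B_EV = 8.617333e-5   # Boltzmann constant, eV per kelvin
M_E_C2_EV = 510998.95  # electron rest energy, eV
E_ION_EV = 13.605693   # binding energy of ground-state hydrogen, eV
ZETA_3 = 1.2020569     # zeta(3), enters through the photon number density

# ASSUMPTION: baryon-to-photon ratio, chosen as a typical quoted value.
ETA = 6.0e-10


def saha_rhs(temperature_k, eta=ETA):
    """Right-hand side S of the Saha relation x^2 / (1 - x) = S.

    x is the fraction of hydrogen that is ionized, for a pure-hydrogen
    plasma (an assumption) in thermal equilibrium at the given temperature.
    """
    kt = K_B_EV * temperature_k
    prefactor = math.pi ** 2 / (2.0 * ZETA_3 * eta)
    thermal = (M_E_C2_EV / (2.0 * math.pi * kt)) ** 1.5
    return prefactor * thermal * math.exp(-E_ION_EV / kt)


def ionized_fraction(temperature_k, eta=ETA):
    """Solve x^2 + S x - S = 0 for the root lying between 0 and 1."""
    s = saha_rhs(temperature_k, eta)
    if s == 0.0:
        return 0.0
    # Algebraically equal to (-S + sqrt(S^2 + 4 S)) / 2, but numerically stable.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / s))


for t in (6000, 5000, 4000, 3500, 3000):
    print(f"T = {t:5d} K  ->  ionized fraction = {ionized_fraction(t):.4f}")
```

Under these assumptions the ionized fraction drops from essentially one to essentially zero over a fairly narrow range of temperature, which is why the loss of thermal contact can be described as a reasonably sharp event.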

So far, so good, but I’ve always thought that everyday analogies are useful to explain physics like this so I thought of the following.

When people are young and energetic, they interact very extensively with everyone around them and that process allows them to keep in touch with all the latest trends in clothing, music, books, and so on. As you get older you don’t get about so much, and may even get married (which is just like recombination, not only in that it involves the joining together of previously independent entities, but also in the sense that it dramatically reduces their cross-section for interaction with the outside world). As time goes on changing trends begin to pass you by and eventually you become a relic, surrounded by records and books you acquired in the past when you were less introverted, and wearing clothes that went out of fashion years ago.

I’ve used this analogy in the past and students generally find it quite amusing even if it has modest explanatory value. I wasn’t best pleased, however, when a few years ago I set an examination question which asked the students to explain the processes of recombination and decoupling. One answer said

Decoupling explains the state of Prof. Coles’s clothes.

Anyhow, I’m sure there’s more than one reader out there who has had a similar experience with an analogy that wasn’t perhaps as instructive as hoped or which came back to bite you. Feel free to share through the comments box…

## January 18, 2018

### Christian P. Robert - xi'an's og

**T**his afternoon, Alexander Ly is defending his PhD thesis at the University of Amsterdam. While I cannot attend the event, I want to celebrate the event and a remarkable thesis around the Bayes factor [even though we disagree on its role!] and the *Jeffreys’s Amazing Statistics Program* (!), otherwise known as JASP. Plus commend the coolest thesis cover I ever saw, made by the JASP graphical designer Viktor Beekman and representing Harold Jeffreys leading empirical science workers in the best tradition of socialist realism! Alexander wrote a post on the JASP blog to describe the thesis, the cover, and his plans for the future. Congratulations!

### Emily Lakdawalla - The Planetary Society Blog

### Lubos Motl - string vacua and pheno

On his self-serving blog, Carroll promoted his own preprint, *Beyond Falsifiability: Normal Science in a Multiverse* (arXiv, Jan 2018).

Well, once I streamline them, his claims are straightforward. Even though we can't see outside the cosmic horizon – beyond the observable Universe – all the grand physical or cosmological theories still unavoidably have something to say about those invisible realms. These statements are scientifically interesting and they're believed to be more correct if the corresponding theories make correct predictions, are simpler or prettier explanations of the existing facts, and so on.

There's a clear risk that my endorsement could be misunderstood. Well, I think that Sean Carroll's actual papers about the Universe belong among the bad ones. So while I say it's right and legitimate to be intrigued by all these questions, propose potential answers, and claim that some evidence has strengthened some answers and weakened others, it doesn't mean that I actually like the way Carroll is using this freedom.

In particular, his papers that depend on his completely wrong understanding of the probability calculus – and that promote as ludicrously wrong concepts as the Boltzmann brains – are rather atrocious as argued in dozens of TRF blog posts.

Peter W*it isn't quite hysterical but unsurprisingly, he still trashes Carroll's papers. According to W*it, Carroll is arguing against a straw man – the "naive Popperazism" – while he ignores the actual criticism which is that the evaluation of these multiverse and related theories (and even all of string theory, according to W*it) isn't just hard: it's impossible. The straw man is the claim that "things that can't be observed should never be discussed in science". W*it asserts that he has never made this claim; well, I would disagree because that's what he has said a few times and what he wanted his fans to believe very many times.

But let's ignore that the straw man isn't quite a straw man. Let's discuss W*it's claim that it's impossible to validate the multiverse-like cosmological theories even in principle. Is it impossible?

Well, it just isn't impossible. The literature is full of – correct and wrong – arguments saying that one theory or one model is more likely or less likely because it implies or follows from some results or assumptions that are rather empirically successful or unsuccessful. I found it necessary to say that the literature sometimes contains wrong claims of this type as well. But they're wrong claims of the right type. The authors are still trying to do science properly – and many other scientists do it properly and it's clearly possible to do it properly, even in the presence of the multiverse.

As Carroll correctly says, all this work still derives the scientific authority from abduction, Bayesian inference, and empirical success. For example, Jack Sarfatti has a great scientific authority because he was abducted by the extraterrestrial aliens. ;-) OK, no, this isn't the "abduction" that Carroll talks about. Carroll recommends "abduction" as a buzzword to describe the "scientific inference leading the scientists to the best explanation". So "abduction" is really a special kind of inference or induction combined with some other methods and considerations that are common in theoretical physics and a longer explanation may be needed – and there would surely be disagreements about details.

If you're a physics student who knows how to do physics properly, you don't need to know whether someone calls it inference, induction, or abduction!

But it's possible to do science even in the presence of unobservable objects and realms that are needed by the theory. The theory still deals with observable quantities as well. And if the agreement with the observed properties of the observable entities logically implies the need for some particular unobserved entities and their properties, well, then the latter are experimentally supported as well – although they are unobservable directly, they're indirectly supported because they seem necessary for the right explanation of the things that are observable.

Also, W*it observes that "some theoretical papers predict CMB patterns, others don't". But even if one proposes a new class of theories or a paradigm that makes no specific observable CMB or other predictions, it may still be a well-defined, new, clever class of theories and the particular *models* constructed within this new paradigm will make such CMB predictions. Because the discovery of the class or the paradigm or the idea is a *necessary condition* for finding the new models that make (new and perhaps more accurate) testable predictions, it's clearly a vital part of science as well – despite the particular papers' making no observable predictions!

Peter W*it has never done real science in his life so he can't even imagine that this indirect reasoning and "abduction" – activities that most of the deep enough theoretical physics papers were always all about – is possible at all. He's just a stupid, hostile layman leading an army of similar mediocre bitter jihadists in their holy war against modern physics.

There's another aspect of W*it's criticism I want to mention. At some moment, he addresses Carroll's "another definition of science":

Carroll: The best reason for classifying the multiverse as a straightforwardly scientific theory is that we don’t have any choice. This is the case for any hypothesis that satisfies two criteria:

- It might be true.
- Whether or not it is true affects how we understand what we observe.

Well, I am not 100% certain it's right to say that we can't avoid the multiverse. On the other hand, I understand the case for the multiverse and I surely agree that physics is full of situations in which "we don't have a choice" is the right conclusion.

Peter W*it doesn't like Carroll's quote above because it also allows "supreme beings" as a part of science. I don't see what those exact sentences have to do with "supreme beings" – why would an honest person who isn't a demagogue suddenly start to talk about "supreme beings" in this context? Nevertheless, I see a clear difference in W*it's recipe for what science should look like and it's the following principle:

Whatever even remotely smells of "supreme beings", "God", or any other concept that has been labeled blasphemous ;-) by W*it has to be banned in science.

W*it hasn't articulated this principle clearly – because he doesn't have the skills to articulate *any* ideas clearly. But one may prove that he has actually *applied* this principle thousands of times. Apologies but this principle is incompatible with honest science that deals with deep questions.

Important discoveries in theoretical physics may totally contradict and be the "opposite" of stories about "supreme beings" (or any other "unpopular" concepts); but they may also resemble the stories about "supreme beings" in any way – in a way that simply cannot be "constrained" by any pre-existing assumptions. The correct theories of physics must really be allowed to be *anything*. Any idea, however compatible with Hitler's or Stalin's or W*it's ideology or political interests, must be allowed to "run" as a candidate for the laws of physics.

W*it clearly denies this basic point – that science is an open arena without taboos where all proposed ideas must compete fairly. He wants science to be just a servant that rationalizes answers that were predetermined by subpar pseudointellectuals such as himself and their not terribly intelligent prejudices.

That's not what real good science looks like. In real good science, answers are only determined once a spectrum of hypotheses is proposed, they are compared, and one of them succeeds in the empirical and logical tests much more impressively than others. Only when that's done, the big statements about "what properties the laws of physics should have" can be made authoritatively. W*it is making them from the beginning, before he actually does or learns any science, and that's not how a scientist may approach these matters.

If the vertices in the Feynman diagram were found to be generalized prayers to a supreme being, and if the corresponding scattering amplitudes could be interpreted as responses of the supreme being that generalize God's response to Christian prayers ;-), then it's something that physicists would have to seriously study. I don't really propose such a scenario seriously and my example is meant to be a satirical exaggeration (well, even if such an analogy were possible, I still think it would also be possible to ignore it and avoid the Christian jargon completely). But I am absolutely serious about the spirit. Whether something sounds unacceptable or ludicrous to people with lots of prejudices should never be used as an argument against a scientific model, theory, framework, or paradigm. (See Milton Friedman's F-twist for a strengthened version of that claim.)

That's why the influence of subpar pseudointellectuals such as W*it on science must remain strictly at zero if science is supposed to remain scientific – and avoid the deterioration into a slut whose task is to improve the image of a pre-selected master ideology or philosophy in the eyes of the laymen. Just like it was wrong for the Catholic church to demand that science serves the church or its God, it was also wrong to demand science to serve the Third Reich or the Aryan race or the communist regime, and it is wrong to demand that science must serve the fanatical atheists or West's leftists in general.

P.S.: In a comment, W*it wrote:

Rod Deyo,

Polchinski provided a reductio ad absurdum argument against the Bayesianism business in a paper for the same proceedings as the Carroll one. He calculated a Bayesian probability of “over 99.7%” for string theory, and 94% for the multiverse.

99.5% of this stuff written by W*it is composed of bullšit because Joe Polchinski was 97% serious.

Polchinski realizes that none of these values is "canonical" or "independent of various choices" and he likes to say (and explicitly wrote in his explanation of his usage of the Bayesian inference) that the Bayesian reasoning isn't the main point – physics is the main point – but he simply wanted to be quantitative about his beliefs and these are the actual fair subjective probabilities for the propositions he ended up with. That's completely analogous to the number 10% by which Polchinski once quantified his subjective belief that a cosmic string would be experimentally observed in some period of time (I forgot whether it was "ever" or "before some deadline"). I have repeatedly written similar numbers expressing my beliefs. It makes some sense. We don't need to talk in this way but we may and it's sometimes useful.

So Polchinski hasn't provided any argument against the Bayesian inference. He has pretty much seriously used the Bayesian inference in a somewhat unusual setup.
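For readers who want to see the arithmetic behind such subjective figures, here is a minimal sketch of Bayesian updating in odds form: the prior odds are multiplied by one likelihood ratio per piece of evidence. The function name and every number below are hypothetical illustrations, assuming the pieces of evidence are independent; they are not Polchinski's inputs, and a different prior or different ratios would give a different answer, which is the sense in which the result is not "canonical".

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability using independent likelihood ratios.

    Each ratio is P(evidence | hypothesis) / P(evidence | not hypothesis).
    The independence of the pieces of evidence is an assumption.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical inputs: an even prior and three modestly favourable arguments.
print(posterior_probability(0.5, [3.0, 2.0, 1.5]))

# The same evidence starting from a sceptical prior gives a lower figure.
print(posterior_probability(0.1, [3.0, 2.0, 1.5]))
```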

by Luboš Motl (noreply@blogger.com) at January 18, 2018 08:33 PM

### Tommaso Dorigo - Scientificblogging

### Peter Coles - In the Dark

Since the Bayeux Tapestry (which, being stitched rather than woven, is an embroidery rather than a tapestry) is in the news I thought I’d share some important information about the insight this article gives us into 11th century hairstyles.

As you know the Bayeux ~~Tapestry~~ Embroidery concerns the events leading up to the Battle of Hastings between the Saxons (who originated in what is now a part of Germany) led by Harold Godwinson (who had relatives from Denmark and Sweden) and the Normans (who lived at the time in what is now France, but who came originally from Scandinavia).

Most chronicles of this episode leave out the important matter of the hair of the protagonists, and I feel that it is important to correct this imbalance here.

Throughout the Bayeux Untapestry, the Saxons are shown with splendid handlebar moustaches, exemplified by Harold Godwinson himself:

This style of facial hair was obviously *de rigueur* among Saxons. The Normans on the other hand appeared to be clean-shaven, not only on the front of their heads but also on the back:

This style of *coiffure* looks like it must have been somewhat difficult to maintain, but during the Battle of Hastings would mostly have been hidden under helmets.

With a decisive advantage in facial hair one wonders how the Saxons managed to lose the battle, but I can’t help thinking the outcome would have been different had they had proper beards.

### Emily Lakdawalla - The Planetary Society Blog

## January 17, 2018

### Christian P. Robert - xi'an's og

**A** question that came out on X validated today kept me busy for most of the day! It relates to an earlier question on the best unbiased nature of a maximum likelihood estimator, to which I pointed out the simple case of the Normal variance when the estimate is not unbiased (but improves the mean square error). Here, the question is whether or not the maximum likelihood estimator of a location parameter, when corrected for its bias, is the best unbiased estimator (in the sense of the minimal variance). The question is quite interesting in that it links to the mathematical statistics of the 1950’s, of Charles Stein, Erich Lehmann, Henry Scheffé, and Debabrata Basu. For instance, if there exists a complete sufficient statistic for the problem, then there exists a best unbiased estimator of the location parameter, by virtue of the Lehmann-Scheffé theorem (it is also a consequence of Basu’s theorem). And the existence is pretty limited in that outside the two exponential families with location parameter, there is no other distribution meeting this condition, I believe. However, even if there is no complete sufficient statistic, there may still exist best unbiased estimators, as has been shown. But Lehmann and Scheffé in their magisterial 1950 Sankhya paper exhibit a counter-example, namely the U(θ-1,θ+1) distribution, since no non-constant function of θ allows for a best unbiased estimator.

Looking in particular at the location parameter of a Cauchy distribution, I realised that the Pitman best equivariant estimator is unbiased as well [for all location problems] and hence dominates the (equivariant) maximum likelihood estimator which is unbiased in this symmetric case. However, as detailed in a nice paper of Gabriela Freue on this problem, I further discovered that there is no uniformly minimal variance estimator and no uniformly minimal variance unbiased estimator! (And that the Pitman estimator enjoys a closed form expression, as opposed to the maximum likelihood estimator.) This sounds a bit paradoxical but simply means that there exist different unbiased estimators whose variance functions are not ordered and hence not comparable, both between them and with the variance of the Pitman estimator.
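As a small numerical illustration, the Pitman estimator of a location parameter can be written as the posterior mean of θ under a flat prior, that is, the ratio of ∫θ∏f(xᵢ−θ)dθ to ∫∏f(xᵢ−θ)dθ. The sketch below approximates both integrals with a plain Riemann sum for a Cauchy sample with known unit scale. It does not use the closed-form expression mentioned above; the function names, the grid settings and the sample values are all hypothetical choices made for this example.

```python
import math


def cauchy_likelihood(theta, sample):
    """Likelihood of location theta for i.i.d. Cauchy data with unit scale."""
    return math.prod(1.0 / (math.pi * (1.0 + (x - theta) ** 2)) for x in sample)


def pitman_estimate(sample, half_width=200.0, steps=40000):
    """Approximate the Pitman estimator by a Riemann sum over a grid of theta.

    The grid is tied to the sample range, so shifting the data shifts the
    estimate by the same amount (equivariance). Meant for small samples only:
    the product of densities underflows when the sample gets large.
    """
    lo = min(sample) - half_width
    hi = max(sample) + half_width
    h = (hi - lo) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        weight = cauchy_likelihood(theta, sample)
        numerator += theta * weight
        denominator += weight
    # The grid step h cancels in the ratio of the two sums.
    return numerator / denominator


sample = [-1.3, 0.2, 0.9, 4.0, 7.5]  # hypothetical data
print(pitman_estimate(sample))
print(pitman_estimate([x + 3.0 for x in sample]))  # shifted by 3
```

The second call shifts every observation by 3, and the estimate moves by the same amount up to floating-point error, which is the equivariance property that defines the class in which the Pitman estimator is optimal.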

### Sean Carroll - Preposterous Universe

I have a backlog of fun papers that I haven’t yet talked about on the blog, so I’m going to try to work through them in reverse chronological order. I just came out with a philosophically-oriented paper on the thorny issue of the scientific status of multiverse cosmological models:

Beyond Falsifiability: Normal Science in a Multiverse

Sean M. Carroll

Cosmological models that invoke a multiverse – a collection of unobservable regions of space where conditions are very different from the region around us – are controversial, on the grounds that unobservable phenomena shouldn’t play a crucial role in legitimate scientific theories. I argue that the way we evaluate multiverse models is precisely the same as the way we evaluate any other models, on the basis of abduction, Bayesian inference, and empirical success. There is no scientifically respectable way to do cosmology without taking into account different possibilities for what the universe might be like outside our horizon. Multiverse theories are utterly conventionally scientific, even if evaluating them can be difficult in practice.

This is well-trodden ground, of course. We’re talking about the cosmological multiverse, not its very different relative the Many-Worlds interpretation of quantum mechanics. It’s not the best name, as the idea is that there is only one “universe,” in the sense of a connected region of space, but of course in an expanding universe there will be a horizon past which it is impossible to see. If conditions in far-away unobservable regions are very different from conditions nearby, we call the collection of all such regions “the multiverse.”

There are legitimate scientific puzzles raised by the multiverse idea, but there are also fake problems. Among the fakes is the idea that “the multiverse isn’t science because it’s unobservable and therefore unfalsifiable.” I’ve written about this before, but shockingly not everyone immediately agreed with everything I have said.

Back in 2014 the *Edge* Annual Question was “What Scientific Theory Is Ready for Retirement?”, and I answered Falsifiability. The idea of falsifiability, pioneered by philosopher Karl Popper and adopted as a bumper-sticker slogan by some working scientists, is that a theory only counts as “science” if we can envision an experiment that could potentially return an answer that was utterly incompatible with the theory, thereby consigning it to the scientific dustbin. Popper’s idea was to rule out so-called theories that were so fuzzy and ill-defined that they were compatible with literally anything.

As I explained in my short write-up, it’s not so much that falsifiability is completely wrong-headed, it’s just not quite up to the difficult task of precisely demarcating the line between science and non-science. This is well-recognized by philosophers; in my paper I quote Alex Broadbent as saying

It is remarkable and interesting that Popper remains extremely popular among natural scientists, despite almost universal agreement among philosophers that – notwithstanding his ingenuity and philosophical prowess – his central claims are false.

If we care about accurately characterizing the practice and principles of science, we need to do a little better — which philosophers work hard to do, while some physicists can’t be bothered. (I’m not blaming Popper himself here, nor even trying to carefully figure out what precisely he had in mind — the point is that a certain cartoonish version of his views has been elevated to the status of a sacred principle, and that’s a mistake.)

After my short piece came out, George Ellis and Joe Silk wrote an editorial in *Nature*, arguing that theories like the multiverse served to undermine the integrity of physics, which needs to be defended from attack. They suggested that people like me think that “elegance [as opposed to data] should suffice,” that sufficiently elegant theories “need not be tested experimentally,” and that I wanted “to weaken the testability requirement for fundamental physics.” All of which is, of course, thoroughly false.

Nobody argues that elegance should suffice — indeed, I explicitly emphasized the importance of empirical testing in my very short piece. And I’m not suggesting that we “weaken” anything at all — I’m suggesting that we physicists treat the philosophy of science with the intellectual care that it deserves. The point is not that falsifiability used to be the right criterion for demarcating science from non-science, and now we want to change it; the point is that it never was, and we should be more honest about how science is practiced.

Another target of Ellis and Silk’s ire was Richard Dawid, a string theorist turned philosopher, who wrote a provocative book called *String Theory and the Scientific Method*. While I don’t necessarily agree with Dawid about everything, he does make some very sensible points. Unfortunately he coins the term “non-empirical theory confirmation,” which was an extremely bad marketing strategy. It *sounds like* Dawid is saying that we can confirm theories (in the sense of demonstrating that they are true) without using any empirical data, but he’s not saying that at all. Philosophers use “confirmation” in a much weaker sense than that of ordinary language, to refer to any considerations that could increase our credence in a theory. Of course there are *some* non-empirical ways that our credence in a theory could change; we could suddenly realize that it explains more than we expected, for example. But we can’t simply declare a theory to be “correct” on such grounds, nor was Dawid suggesting that we could.

In 2015 Dawid organized a conference on “Why Trust a Theory?” to discuss some of these issues, which I was unfortunately not able to attend. Now he is putting together a volume of essays, both from people who were at the conference and some additional contributors; it’s for that volume that this current essay was written. You can find other interesting contributions on the arXiv, for example from Joe Polchinski, Eva Silverstein, and Carlo Rovelli.

Hopefully with this longer format, the message I am trying to convey will be less amenable to misconstrual. Nobody is trying to change the rules of science; we are just trying to state them accurately. The multiverse is scientific in an utterly boring, conventional way: it makes definite statements about how things are, it has explanatory power for phenomena we do observe empirically, and our credence in it can go up or down on the basis of both observations and improvements in our theoretical understanding. Most importantly, it might be *true*, even if it might be difficult to ever decide with high confidence whether it is or not. Understanding how science progresses is an interesting and difficult question, and should not be reduced to brandishing bumper-sticker mottos to attack theoretical approaches to which we are not personally sympathetic.

### Tommaso Dorigo - Scientificblogging

### Emily Lakdawalla - The Planetary Society Blog

### Peter Coles - In the Dark

There’s a new paper on the arXiv by Sean Carroll called *Beyond Falsifiability: Normal Science in a Multiverse*. The abstract is:

Cosmological models that invoke a multiverse – a collection of unobservable regions of space where conditions are very different from the region around us – are controversial, on the grounds that unobservable phenomena shouldn’t play a crucial role in legitimate scientific theories. I argue that the way we evaluate multiverse models is precisely the same as the way we evaluate any other models, on the basis of abduction, Bayesian inference, and empirical success. There is no scientifically respectable way to do cosmology without taking into account different possibilities for what the universe might be like outside our horizon. Multiverse theories are utterly conventionally scientific, even if evaluating them can be difficult in practice.

I’ve added a link to ‘abduction’ lest you think it has something to do with aliens!

I haven’t had time to read all of it yet, but thought I’d share it here because it concerns a topic that surfaces on this blog from time to time. I’m not a fan of the multiverse because (in my opinion) most of the arguments trotted out in its favour are based on very muddled thinking. On the other hand, I’ve never taken seriously any of the numerous critiques of the multiverse idea based on the Popperian criterion of falsifiability because (again, in my opinion) falsifiability has very little to do with the way science operates.

Anyway, Sean’s papers are always interesting to read so do have a look if this topic interests you. And feel free to comment through the box below.

## January 16, 2018

### Emily Lakdawalla - The Planetary Society Blog

### CERN Bulletin

**Early in the new year, the Staff Association wishes you and your loved ones its best wishes for a happy and healthy New Year 2018, as well as individual and collective success. May it be filled with satisfaction in both your professional and private life.**

**A difficult start**

The results of the election of the new Staff Council were published on 20th November 2017 in Echo N° 281.

The process of renewing the Staff Council proceeded very well: candidates in numbers, from all departments, ranks and categories (staff, fellows and associates); and the turnout rate in this election is up compared to previous elections ... something to be celebrated and congratulated.

In accordance with the statutes of the Staff Association, the new Staff Council shall, at its first meeting, elect an Executive Committee comprising a “Bureau” with four statutory posts: President, Vice-President, Secretary and Treasurer.

However, while the composition of the Executive Committee was easily established, the appointment of the “Bureau” proved to be more complicated, for too few delegates were ready to get involved at this level in the current context.

Chief among the reasons put forth is the impossibility for many delegates of devoting at least 50% of their working time to the Staff Association; considering the workload linked to professional activities and the chronic shortage of personnel in many services at CERN, this argument is well known within the Staff Association.

Secondly, fears were raised about putting one’s career “on hold” by spending more time with the Staff Association. More time devoted to the Staff Association can mean reduced recognition of merit. This argument is new and worrying to say the least.

Finally, and here we come to the heart of the issue, several delegates consider that they cannot trust our interlocutors in the consultation process (“Concertation” in French). This feeling certainly follows the difficult management of some issues, but also stems from a difference in the understanding, between the Management and the Staff Association, of the principle of consultation which governs our relations.

**Crisis Executive Committee**

The Staff Council, in its meeting of 5 December 2017, finally elected and established a __Crisis Executive Committee__ for a three-month period ending 31 March 2018, with the main objective of finding a resolution to the “Nursery and School” issue (see Echo N° 282 of 11/12/2017).

The composition of this interim crisis committee is as follows:

**A year of challenges**

From the very start, 2018 is therefore a year of challenges for the Staff Association and for the consultation process (“Concertation”) with the Management.

The Staff Association reaffirms its will to work in a climate of trust and good faith, two necessary elements of a fruitful and constructive consultation process.

We wish you once again a happy new year 2018!

### CERN Bulletin

**Mr Bernard Dormy, Chair of TREF (Tripartite Employment Conditions Forum) (see Echo N° 242), completed his term of office at the end of 2017.**

**The Staff Association wanted to talk with him about CERN and its personnel and, among other things, about the concertation model.**

This publication is also an opportunity for the Staff Association to pay tribute to Mr Dormy for the commitment he has shown since 2003, the year he joined TREF as a French delegate. His time in the forum must have particularly appealed to him, since Mr Dormy served as Vice-Chair of TREF from 2007 to 2011, then as Chair from 2012 to 2017.

During his chairmanship, Mr Dormy always ensured that concertation took place under the best possible conditions, in a constructive spirit of mutual respect. He also placed particular emphasis on diversity at CERN: not a single TREF meeting went by without an item on diversity and on progress in this area.

Before moving on to the questions and answers, the Staff Association would like to thank Mr Dormy, who has, as he has often told us, a deep respect for CERN’s personnel. In return, we send him our most respectful and warm regards.

**Mr Dormy, what were your first contacts with CERN and what memories do you have of them?**

I joined the French delegation to CERN’s TREF fifteen years ago, and its Finance Committee a few years later. But my first contact with CERN goes back much further. At secondary school, a physics teacher with a gift for inspiring vocations in fundamental research had persuaded us to visit CERN if we ever found ourselves in Geneva; when the time came, I unfortunately did not manage to do so, and had to make do with a postcard bought in town. As for my vocation, despite a definite taste for science, I later had to settle for HEC and ENA.

My second contact dates back to the 1980s. The scientific director for the humanities at CNRS, whose deputy I was, used to recount an exchange with a famous physicist during a CNRS Council meeting:

- Mr N., with a crumb from your accelerators, I could keep all the laboratories in my sector running.
- Mr P., I am well brought up; I never drop crumbs.

Amusing, certainly, but fortunately I found more open-mindedness among the scientists I later worked with, even those who use very large instruments such as CERN.

Finally, a third contact was hearing about a dialogue between a minister who seemed interested only in applied research (in industry if possible) and another great physicist. To the question “Minister, do you know who invented the web?”, he reportedly replied “Bill Gates, of course”. Enough to make you want to get to know the birthplace of the web from the inside at last!

**What did you “discover” at CERN?**

My concrete discovery of CERN within TREF was first of all that of multilateralism. It is often described as the art of compromise between differing positions, which is somewhat reductive, since it often happens, as I have seen at TREF and in the Finance Committee, that unanimous agreement is reached without any need to construct a middle position acceptable to all. But for me, multilateralism is above all the discovery that ways of thinking, and especially of expressing oneself, can differ quite a lot from one State to another, even when those States share the same view of the world. What strikes some of them as a rather blunt formulation may at the same time be seen by others as unclear and convoluted. What is now part of everyday behaviour in some countries, such as the place given to women in society, still requires proactive policy in others. This makes chairing TREF fascinating and, I sincerely believe, quickly leads one to experience this diversity more as an enrichment than as a constraint.

**How would you define concertation at CERN?**

The consultation of personnel in large public or private organisations takes, or has taken, very different forms across CERN’s Member States, ranging from the collection of a simple opinion, which often carries little weight in the final decision, all the way to co-management. Concertation as practised at CERN seems to me to strike a balance between these two extremes. To sum it up, I would describe it as the search for a common position between the Staff Association, the CERN Management and its Member States, each having been informed beforehand and in good faith of the various aspects of the matter and having had the opportunity to compare its position with those of the others. Like the Standing Concertation Committee (SCC; CCP in French), TREF plays a significant role in this process.

During my first sessions as a TREF delegate, I was struck by the diversity of the delegates’ professional backgrounds: beyond their personal specialisations, they did not all share the same foundation of basic knowledge. I myself arrived in a world that was almost entirely new to me, armed only with my memories of international civil service law learned at university and, I hope, a little common sense. Yet sharing the same body of information ahead of the debates is essential to the proper functioning of TREF. That is why Jean-Marc Saint-Viteux and I decided to present CERN and its environment (especially its economic environment), the workings of TREF and the history of its decisions, as well as the legal framework of the concertation process, particularly during the Five-Yearly Review of salaries and employment conditions. For the past six years, this basic information has been offered to all new Member State delegates, so that everyone shares the same level of information before TREF sessions.

**How would you compare concertation at CERN with the processes in place in other organisations?**

Comparisons are tempting, especially for someone who, like me, has had the good fortune to chair the administrative and finance committees of two other large research infrastructures, currently under construction in Darmstadt and Lund. With experience, I think one should refrain from making them.

CERN is an international organisation whose Council lays down the law applicable to its personnel, subject to oversight by an international court. In Germany and Sweden, I encountered organisations whose personnel are governed by national collective agreements. The role of staff consultation bodies there is therefore limited to applying the rules and does not extend to drafting them. Their links with the Council and Finance Committee do exist, but are by nature more limited than at CERN.

CERN is therefore unusual in this respect, like any international organisation. But it is itself an unusual international organisation: the oldest of the large international scientific infrastructures in operation. Hence an organisational culture not found elsewhere at such a level of development. A place where the personnel describe themselves as “CERNois” more than as French, German, Polish and so on is not something you come across every day. Like our neighbours in Vaud, the people of CERN could say “y’en a pas comme nous” (“there’s no one like us”).

**What do you think of CERN’s personnel and of the work done by the Staff Association?**

It is a platitude to say that CERN, and the considerable investments that have been and will be made there, would be nothing without the women and men who make it up. But it is sometimes good to repeat platitudes, because delegates to CERN’s various bodies constantly have to strike a balance between the demands of keeping within the budgets allocated by the Member States and those of a personnel policy that attracts the best people and offers them good working conditions.

In this context, I am inclined to answer the question “What is the Staff Association for?” with one simple sentence: “it is there to serve the members of the personnel”. One concrete example of its services, which contributes to CERN’s attractiveness: on entering the campus, you pass on your right a nursery and a kindergarten, which are run by the Staff Association. Another example: by its mere presence at TREF, the Staff Association helps everyone not to forget that personnel must not be seen simply as a “cost”, and to remember that there are real men and real women behind the general label of “personnel”.

**In your view, what is the future of CERN?**

I am tempted to answer with the quip attributed to Niels Bohr (and taken up by Pierre Dac): “in science, prediction is difficult, especially about the future”. Who could have predicted that the principles of the web, devised to facilitate shared access to data from research laboratories, would lead to such profound changes in our contemporary societies? Let us be modest, and place our trust in research, including the most fundamental research, which is, to borrow the vocabulary of economists, something like the venture capital of our States.

That does not rule out making wishes. As far as personnel are concerned, TREF members know how much I hope to see the role of women grow in science, and particularly in large research infrastructures. It took some 60 years to see a woman become Director-General of CERN, more than twenty years for a woman to chair TREF, and scarcely less to see a woman in the Staff Association’s delegation to it. The fact that these appointments are highlighted shows that they are regarded as out-of-the-ordinary events. For them to become unremarkable news in future, a change in mindsets would have to take place. I am convinced that this will not be achieved through coercion, and I fully support CERN’s rejection of so-called positive discrimination policies, which in the long run work against the very women they are meant to help. I believe, on the contrary, that continuous education can help everyone to regard it as normal to choose one’s colleagues solely on the basis of their competence. That is why I asked for a report on the place of women in the Organization, and more broadly on diversity policy, to be given at every TREF meeting. This brings us back to my opening point: this form of diversity, too, is an opportunity for all, not a constraint.

### Peter Coles - In the Dark

On my way to the airport yesterday I heard the sad news of the death, at the age of just 59, of the footballer Cyrille Regis. I’ll leave it to those more qualified to post full obituaries of the man – I couldn’t possibly do justice to him as a player and a person – and will confine myself to one memory that remains strong in my mind.

While I was a student at Cambridge there was a University branch of the Newcastle United Supporters Club. This was mainly for social gatherings but, during term time, and when the game was within reach of a day trip we hired a coach or minibus and went to Newcastle United’s away games. Our team had just been promoted to the old First Division at the end of the 1983/4 season and we all wanted to see as much as possible of them in the top flight.

And so it came to pass that on 13th October 1984 we went by coach to Highfield Road to see Coventry City versus Newcastle United. It wasn’t a great game. In fact, it had been picked as the featured match on *Match of the Day* that Saturday night. When we got back to Cambridge and settled in the JCR to watch it we heard Jimmy Hill (who presented the show in those days) announce that they were joining the action mid-way through the second half. The first half had not been deemed worthy of transmission.

Despite the generally low quality of the game, there was one star who was easily the best player on the field and that was Cyrille Regis, who even eclipsed the little magician Peter Beardsley, whom the away fans had come to watch. Powerfully built, with a good turn of speed and excellent in the air despite not being particularly tall, Cyrille Regis proved a constant handful for Newcastle’s central defenders, winning just about every contested header and beating them for pace seemingly at will. In the second half Glenn Roeder stopped even bothering to challenge for the ball in the air as he knew Regis had the beating of him. Newcastle, however, played five at the back for away games in those days and they managed to stop him scoring.

The game ended 1-1, with Peter Beardsley scoring for Newcastle from the penalty spot in front of the away supporters and Kenny Hibbitt scoring for Coventry.

An away draw in the First Division was an acceptable result but, unhappily, the memories I have of the match are blighted by what I recall of the actions of some of the Newcastle United supporters who shouted racist abuse and threw bananas onto the pitch whenever Regis came within range. Their behaviour was disgraceful. In mitigation there were only a few doing this – probably a couple of dozen among 4000-odd travelling supporters – and many of the rest of us shouted at them to shut the f**k up. But the fact that there were any at all is bad enough. It ruined the day for me, and left me feeling deeply ashamed, but as far as I could tell Cyrille Regis just ignored it; this sort of thing probably happened every time he played. How he managed to keep his composure I’ll never know.

Those of us who have never experienced racist abuse can’t really imagine what it must be like to be on the receiving end. The dignity of men like Cyrille Regis in the face of this sort of thing speaks volumes about his strength of character. Above all, he tried to silence the racists by concentrating on his game and being an outstanding player.

All this was over thirty years ago and we like to think that racism is nowadays far less of an issue in football. I rarely go to live games now, so I can’t really comment on how crowd behaviour has or hasn’t changed. However, judging by the comments of black players, racism is still endemic; it’s just that most of the racists refrain from some of the more overt displays of obnoxious behaviour – such as throwing bananas – because they would (rightly) get the perpetrators ejected from the ground. Dealing with the symptoms, however, doesn’t cure the disease.

It seems that even Peter Beardsley (who played in the match I mentioned above and is now, at 54, Newcastle United’s Under-23 coach) has been accused by young players of bullying and racist comments. He denies the allegations, and is on leave while the charges are investigated. I’m not going to prejudge what the outcome of those investigations will be, but his case is a reminder – as if we needed it – that racism hasn’t gone away.


## January 15, 2018

### Lubos Motl - string vacua and pheno

**Presenting such papers as revolutions in physics is a full-blown scam**

The most recent text on Backreaction is titled Superfluid dark matter gets seriously into business. At this moment, this popular text celebrates a November 2017 preprint by Justin Khoury and two co-authors which added some technicalities to Khoury's program that's been around for some three years.

Justin Khoury is a cosmologist who is well-known for his work on colliding branes cosmologies, chameleon fields, and a few other topics. You should also search Google Scholar for Justin Khoury superfluid. You will find several papers – the most famous of which has 62 citations at this moment. That's fine but far fewer than Khoury's most famous papers, which are safely above 1,000 citations. The "revolutionary" November 2017 paper on the "superfluid dark matter" only has one self-citation so far.

Hossenfelder's popular text ends up with this short paragraph:

“I consider this one of the most interesting developments in the foundations of physics I have seen in my lifetime. Superfluid dark matter is without doubt a pretty cool idea.”

These are big words. Is there some substance for such big words? Well, I could imagine there could be and 1% of the time, I could get slightly excited about the idea. But 99% of the time, I feel certain that there is no conceivable justification for such big words, and not even a justification for words that would be 90% smaller.

Superfluid dark matter is supposed to be a hybrid of the "cold dark matter" paradigm which is the standard way to explain the anomalies in the rotation curves of galaxies and "corresponding" aspects of the expansion of the Universe; and the "modified gravity" which tries to modify the equations of gravity, fails to provide us with a satisfactory picture of physics and cosmology, but could be a "simpler" theory that intriguingly explains some universal phenomenological laws that seem to be obeyed even though "cold dark matter" has no explanation for them.

OK, according to superfluid dark matter, the Universe is filled with some low-viscosity fluid, a superfluid, and it acts like dark matter. But a standardized description of the dynamics within this fluid may also be interpreted as "modified gravity".

It seems like a plausible combination of approaches but the devil is in the details. However, what I find extremely weird is the idea that this rough paradigm is enough for a revolution in cosmology or physics. You know, the "anomalous" galactic rotation curves are either explained with the help of some new matter – which may carry some variable entropy density and which is assumed not to be visible in the telescopes – or without it. This is a Yes/No question. So if there's some extra matter which is a superfluid, it's still some extra matter – in other words, it must be considered an example of dark matter. After all, even superfluid dark matter has to have some microscopic behavior which may be studied by local experiments – it must be composed of some (probably new) particle species.

The Universe must still allow the idealized "empty space" phenomena that have been measured extremely accurately and incorporated into the state-of-the-art theories of particle physics. For this reason, whether or not someone (e.g. Erik Verlinde) gets completely lost in vague, almost religious musings saying that the "spacetime might be a fluid", any "dark matter superfluid" or anything of that sort simply *has to be* some extra matter added on top of the things we know to exist. Any such dark matter may also be captured by some macroscopic, "hydrodynamic or aerodynamic" equations, and if the dark matter is a superfluid, they may have some special features.

(The empty space might in principle be a "fluid" but if the entropy density were nonzero and variable, the conflict with the tests of relativity would be almost unavoidable because such a fluid would be nothing else than a variation of the aether even though, in this case, it wouldn't be the luminiferous aether but rather the lumo-prohibiting aether. Lumo is light, not only in Esperanto, just to be sure. The entropy density, along with an entropy flux, is a 4-vector and its nonzero value breaks the Lorentz invariance. So any matter with some entropy density does so which is bad. A Lorentz-covariant spacetime fluid could in principle exist but it would have to be a new dual description of string/M-theory and it's clearly hopeless to dream about any Lorentz-covariant "fluid" without a glimpse of evidence of such a connection to string/M-theory.)

But because every dark matter model has such emergent, "hydrodynamic" field equations, I think it's just wrong to sell the "dark matter superfluid" as a totally new paradigm. These authors still add dark matter; and they must still decide whether Einstein's equations hold at the fundamental classical level. One may spread lots of hype about a "revolution" but at the end, it's just another technical model of dark matter, like e.g. the ultralight axion model by Witten et al.

Note that Witten et al. have employed an extremely modest, technical language – which is appropriate despite the fact that their proposal is clever and attractive. This approach is so different from the approach of Ms Hossenfelder.

I don't think that the "superfluid dark matter" papers contain something that would make their reading irresistible. But I find the "framing" of these superfluid dark matter papers in the media and the blogosphere – and the "framing" of many other papers – more important and highly problematic. It seems utterly inconceivable to me that an honest yet competent physicist could consider these papers "one of the most interesting developments in her lifetime".

When you look at the response (followups) by the other physicists and cosmologists, these papers don't even make it to top 100 in the year when they were published. Especially because I know quite something about Ms Hossenfelder, it seems vastly more likely that she has a completely different agenda when she overhypes such papers. What is it?

She has written at least one paper about these MOND and Verlinde issues – the 300th most important derivative paper commenting on the 101st most influential paper in a year ;-) – and she simply has personal incentives to make the world think that this kind of work is very interesting even though it is not. She is working on similar things because she doesn't have the skills (and vitality) needed to work on more interesting and deeper things. She says "it is most interesting and cool" but she really means "its fame is beneficial for her".

The financial news servers (e.g. SeekingAlpha) usually require the authors to disclose their positions in assets that they discuss in their articles. That has good reasons. Someone's being long or short may reduce his or her integrity and encourage him or her to write positive or negative things about the asset. The readers have the right not to be scammed in an easy way – which is why fair publishers insist on informing the readers whether there could be a clash of interests. One should expect the scientific integrity to be much deeper than the integrity of the journalists in the financial media. Sadly, it isn't so these days. Self-serving scammers such as Ms Hossenfelder face no restrictions – they are free to fool and delude everybody and lots of the people in the media want to be fooled and be parts of this scam because they're as unethical as Ms Hossenfelder herself.

Readers should learn how to use Google Scholar to acquire some rough clue about the importance of a paper or idea as evaluated by the body of the genuine scientists. If the folks learned how to use this "simple branch of Google", they could instantly figure out that 99% of the hype is probably rubbish (well, of course, this method isn't waterproof so there would be false positives as well as false negatives). It's too bad that almost no laymen – and, in fact, almost no journalists – are doing these things. So they're constantly drowning in hype and in a superfluid of fairy-tales that overhype papers that are either average or totally wrong.

Self-serving, fake scientists such as Sabine Hossenfelder are obviously the main drivers that propagate this fog and misinformation.

P.S.: In an older popular article about the topic, one at Aeon.CO, Hossenfelder emphasized the point that superfluidity represents a quantum behavior across the Universe. This assertion – which is just another way to add the hype – is really a deep distortion of the issues. A superfluid is nicely described by a *classical* field theory. Some of the fields seem to behave like the wave function but because this is a macroscopic limit of many particles in the same state, it is really a classical limit, with no minimal uncertainty etc., so the function of the spacetime coordinates isn't a wave function and shouldn't be called a wave function. It is a classical field. The classical limit isn't really any different in the case of a superfluid and in the case of electromagnetism or any other pair of a quantum field theory and its corresponding classical field theory!

by Luboš Motl (noreply@blogger.com) at January 15, 2018 05:39 PM

### Peter Coles - In the Dark

It has been a very busy weekend but yesterday afternoon I took time out to visit St David’s Hall in Cardiff to hear the Orchestra of Welsh National Opera conducted by Tomáš Hanus in a programme of music by Beethoven, Mendelssohn and Dvořák. I’ve noticed that many of the international concerts that are a regular part of Cardiff life have been moved from weekday evenings to weekend afternoons. No doubt that is for commercial reasons. I have to admit that I’m not a big fan of matinee concerts, but as it happens I’m not going to be available for many of the weekday evening concerts for the foreseeable future so I thought I’d give this one a go. The programme was a middle-of-the-road bums-on-seats affair, but if it brings people into the concert hall that is a good thing and it was nice to see a big crowd, including a sizeable contingent of schoolchildren, there to enjoy the show.

First up we had a favourite piece of mine, Beethoven’s Egmont overture, inspired by the story of Lamoral, Count of Egmont, whose execution in 1568 sparked an uprising against Spanish occupation that eventually led to the independence of the Netherlands. It’s a stirring, dramatic work, ideal for opening a concert programme. I thought the tempo was a bit slow at the start, which made the increase in speed towards the end a little jarring, but otherwise it was well played by the full orchestra, arranged with six double-basses right at the back of the stage facing the conductor with the brass either side. That was very effective at generating a rich dark sonority both in this piece and in the Dvořák later on.

The next item was a very familiar work indeed, the Violin Concerto in E minor by Felix Mendelssohn. This is perhaps best known for its lovely second movement (in which the key changes to C major) but the other two movements are really innovative and virtuosic. In the wrong hands the slow movement can be horribly schmaltzy but Norwegian soloist Henning Kraggerud managed to bring out its beauty without wallowing in its romanticism. It was a very fine performance, warmly appreciated by the audience. Henning Kraggerud treated us to an encore in the form of an intriguing piece by a musician previously unknown to me, Ole Bull, a fellow Norwegian and a contemporary of Mendelssohn.

After the wine break the main event was another familiar piece, the Symphony No. 9 (“From the New World”) by Antonín Dvořák, a piece full of nostalgia for his Czech homeland written while the composer was living in America. It’s a piece I’ve heard very many times but it still manages to stay fresh, and yesterday’s performance was full of colour and verve. Tomáš Hanus (himself Czech) chose this piece as a tribute to an old friend who passed away last year, and it was played with great passion.

I’d heard all the pieces in this programme many times, both in concert and on record, but they all stand up to repeated listening, simply because they’re so very good. I do like to hear new works – and do wish the programming at St David’s Hall were a little more adventurous – but they do have to make ends meet and there’s in any case much to enjoy in the standard repertoire, especially when it’s played by a fine orchestra. Such pieces can fall flat when you get the feeling that the musicians themselves are a bit bored with them, but that emphatically wasn’t the case yesterday.

It will soon be time for Welsh National Opera’s new season, with a new production of Verdi’s *La Forza del Destino* alongside revivals of *Tosca* and *Don Giovanni*. It’s going to be tricky to see them all, but I’ll give it a go!

### CERN Bulletin

**Cosmos**

KOLI

**15 to 26 January 2018, CERN Meyrin, Main Building**

(*Nébuleuse d'Orion* [Orion Nebula], KOLI)

KOLI, an established artist and graduate of the Academy of Fine Arts in Tirana, has lived in Switzerland for 26 years, where he has taken part in many group exhibitions and held 10 very successful solo exhibitions. His current work revels in colour and texture that reach towards the highest spheres: the cosmos!

Winner of a first prize at a group exhibition organised by the Italian consulate, he settled by the lake in the canton of Vaud, where he has now lived for 13 years.

For more information and access requests: staff.association@cern.ch | Tel: 022 766 37 38

### CERN Bulletin

**YCC 50th anniversary & Swiss SU Championship 2018: there’s a lot going on in the club!**

For those of you who wonder how the YCC operates at CERN, the simple answer is that it is made up of passionate members who care about the club’s operations. The YCC had reached almost 400 members by the close of 2017 and is looking forward to bringing more members on board to experience the adrenaline of the wind! The YCC is not only a place to learn how to sail; it is also an international community that gathers throughout the year at other social events.

There’s nothing better than spending the summer on the lake, learning how to rig and sail a boat, getting to know different people during YCC practices and getting a tan before gathering for a drink in the port!

When you’re on a boat you need to trust your crew no matter how big it is, especially in strong-wind conditions. It is thanks to this that relationships and friendships begin at the YCC; at least, that is my personal experience. I have had the pleasure of meeting exceptional people who are now becoming not only sailing partners, but also friends!

The YCC community is ever-growing and ever-evolving, and this year we celebrate the YCC’s 50th anniversary! It has been 50 years since the club was founded and its boats began leaving Versoix port to populate the lake all summer long. The committee is planning several events to celebrate and publicise it!

In order to share this great news there will be a brand new logo for the YCC designed by one of the members! Stay tuned for the reveal!

One of the most important events on the radar is the 2018 Swiss SU Championship, which will be organized by the YCC in collaboration with CNV (Club Nautique de Versoix). This is the premier event for all SU categories across Switzerland. There’s a strong team at YCC taking care of the organization of the championship, which will begin on 7 July. The YCC will not only ensure the championship is organized with attention to detail, but will also take part with a few SU boats! All YCC members are welcome to take part.

We hope you’re looking forward to the beginning of the season. For those who are missing the lake already, there are a few winter regattas coming up (SU, J; see the external regattas section of the website). Don’t hesitate to sign up and participate!

### CERN Bulletin

**Wednesday 17 January 2018 at 20:00**

CERN Council Chamber

**Memories of Murder**

**Directed by Joon-ho Bong**

South Korea, 2003, 132 minutes

In a small Korean province in 1986, three detectives struggle with the case of multiple young women being found raped and murdered by an unknown culprit.

Original version Korean; English subtitles

### The n-Category Cafe

**Guest post by Heiko Gimperlein and Magnus Goffeng.**

The magnitude of a metric space was born, nearly ten years ago, on this blog, although it went by the name of cardinality back then. There has been much development since (for instance, see Tom Leinster and Mark Meckes’ survey of what was known in 2016). Basic geometric questions about magnitude, however, remain open even for compact subsets of $\mathbb{R}^n$: Tom Leinster and Simon Willerton suggested that magnitude could be computed from intrinsic volumes, and the algebraic origin of magnitude created hopes for an inclusion-exclusion principle.

In this sequence of three posts we would like to discuss our recent article, which is about asymptotic geometric content in the magnitude function and also how it relates to scattering theory.

For “nice” compact domains in $\mathbb{R}^n$ we prove an asymptotic variant of Leinster and Willerton’s conjecture, as well as an asymptotic inclusion-exclusion principle. Starting from ideas by Juan Antonio Barceló and Tony Carbery, our approach connects the magnitude function with ideas from spectral geometry, heat kernels and the Atiyah-Singer index theorem.

We will also address the location of the poles in the complex plane of the magnitude function. For example, here is a plot of the poles and zeros of the magnitude function of the $21$-dimensional ball.

We thank Simon for inviting us to write this post and also for his paper on the magnitude of odd balls as the computations in it rescued us from some tedious combinatorics.

The plan for the three café posts is as follows:

1. State the recent results on the asymptotic behaviour as a metric space is scaled up and on the meromorphic extension of the magnitude function.

2. Discuss the proof in the toy case of a compact domain $X\subseteq \mathbb{R}$ and indicate how it generalizes to arbitrary odd dimension.

3. Consider the relationship of the methods to geometric analysis and potential ramifications; also state some open problems that could be interesting.

### Asymptotic results

As you may recall, the magnitude $\mathrm{mag}(X)$ of a finite subset $X\subseteq \mathbb{R}^n$ is easy to define: let $w:X\to \mathbb{R}$ be a function which satisfies

$$\sum_{y\in X} \mathrm{e}^{-\mathrm{d}(x,y)}\, w(y) = 1 \quad \text{for all}\ x \in X.$$

Such a function is called a weighting. Then the magnitude is defined as the sum of the weights:

$$\mathrm{mag}(X) = \sum_{x \in X} w(x).$$
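To make the definition concrete, here is a minimal Python sketch that solves the weighting equation for a finite set of points and sums the weights. The function names `solve_linear` and `magnitude`, the pure-Python Gaussian elimination and the sample points are illustrative assumptions, not code from the article.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("matrix is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def magnitude(points):
    """Magnitude of a finite subset of R^n with the Euclidean metric.

    points: list of coordinate tuples, e.g. [(0.0,), (1.0,)].
    """
    # Matrix with entries exp(-d(x, y)), as in the weighting equation.
    zeta = [[math.exp(-math.dist(p, q)) for q in points] for p in points]
    weights = solve_linear(zeta, [1.0] * len(points))
    return sum(weights)

# Illustrative sample: two points at distance 1.
print(magnitude([(0.0,), (1.0,)]))
```

For two points at distance $d$ the system can be solved by hand, giving $\mathrm{mag} = 2/(1+\mathrm{e}^{-d})$, which is a convenient check on the sketch.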

For a compact subset $X$ of $\mathbb{R}^n$, Mark Meckes shows that all reasonable definitions of magnitude are equal to what you get by taking the supremum of the magnitudes of all finite subsets of $X$:

$$\mathrm{mag}(X) = \sup \{\mathrm{mag}(\Xi) : \Xi \subset X \ \text{finite} \}.$$

Unfortunately, few explicit computations of the magnitude for a compact subset of $\mathbb{R}^n$ are known.

Instead of the magnitude of an individual set $X$, it proves fruitful to study the magnitude of dilates $R\cdot X$ of $X$, for $R > 0$. Here the dilate $R\cdot X$ means the space $X$ with the metric scaled by a factor of $R$. We can vary $R$ and this gives rise to the **magnitude function** of $X$:

$$\mathcal{M}_X\colon (0,\infty)\to \mathbb{R};\quad \mathcal{M}_X(R) := \mathrm{mag}(R\cdot X)\quad\text{for}\ R > 0.$$
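As an illustration of the supremum formula and of the magnitude function together, the following Python sketch evaluates $\mathcal{M}_\Xi(R)$ for a finite set $\Xi$ of evenly spaced points in $[0,1]$ and prints it next to $1+R/2$, the known magnitude of an interval of length $R$. The helper names, the choice of 60 sample points and the Gauss-Jordan solver are assumptions made for this example only.

```python
import math

def magnitude(points):
    """Sum of the weighting of a finite Euclidean point set (Gauss-Jordan solve)."""
    n = len(points)
    # Augmented system [Z | 1] with Z[i][j] = exp(-d(x_i, x_j)).
    m = [[math.exp(-math.dist(p, q)) for q in points] + [1.0] for p in points]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * pv for v, pv in zip(m[r], m[col])]
    # After elimination the last column holds the weights.
    return sum(row[n] for row in m)

def magnitude_function(points, scale):
    """M_X(R) = mag(R . X): scaling coordinates by R scales every distance by R."""
    dilated = [tuple(scale * c for c in p) for p in points]
    return magnitude(dilated)

# Hypothetical sample: 60 evenly spaced points approximating [0, 1].
segment = [(i / 59,) for i in range(60)]
for scale in (0.5, 1.0, 4.0):
    print(scale, magnitude_function(segment, scale), 1 + scale / 2)
```

The finite values stay below $1+R/2$ and approach it as more points are added, in line with the supremum characterisation. For $k$ evenly spaced points with spacing $s$ the magnitude has the closed form $1+(k-1)\tanh(s/2)$, which gives an exact check.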

Tom and Simon conjectured a relation between the magnitude function of $X$ and its intrinsic volumes $(V_i(X))_{i=0}^n$. The intrinsic volumes of subsets of $\mathbb{R}^n$ generalize notions such as volume, surface area and Euler characteristic, with $V_n(X)=\text{vol}_n(X)$ and $V_0(X)=\chi(X)$.

**Convex Magnitude Conjecture.** Suppose $X \subseteq \mathbb{R}^n$ is compact and convex. Then the magnitude function is a polynomial whose coefficients involve the intrinsic volumes:

$$\mathcal{M}_X(R) = \sum_{i=0}^n \frac{V_i(X)}{i! \,\omega_i} R^i,$$ where $V_i(X)$ is the $i$-th intrinsic volume of $X$ and $\omega_i$ the volume of the $i$-dimensional ball.

The conjecture was disproved by Barceló and Carbery (see also this post). They computed the magnitude function of the $5$-dimensional ball $B_5$ to be the rational function $$\mathcal{M}_{B_5}(R)=\frac{R^5}{5!} +\frac{3R^5+27R^4+105R^3+216R+72}{24(R+3)}.$$ In particular, the magnitude function is not even a polynomial for $B_5$. Also, the coefficient of $R^4$ in the asymptotic expansion of $\mathcal{M}_{B_5}(R)$ as $R \to \infty$ does not agree with the conjecture.
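The claim that this function is not a polynomial can be checked with exact arithmetic: the denominator $R+3$ divides the numerator only if the numerator vanishes at $R=-3$. The short Python sketch below takes the formula exactly as quoted above; the function names are made up for this illustration.

```python
from fractions import Fraction

def numerator(r):
    """Numerator 3R^5 + 27R^4 + 105R^3 + 216R + 72 of the rational part, as quoted."""
    return 3 * r**5 + 27 * r**4 + 105 * r**3 + 216 * r + 72

def magnitude_b5(r):
    """M_{B_5}(R) as quoted in the post, evaluated exactly on rationals."""
    r = Fraction(r)
    return r**5 / 120 + numerator(r) / (24 * (r + 3))

# Value at R = 0 (the quoted formula is regular there).
print(magnitude_b5(0))
# The numerator does not vanish at R = -3, so R + 3 does not cancel.
print(numerator(-3))
```

Since the numerator is non-zero at $R=-3$, the quoted expression has a genuine simple pole there and cannot reduce to a polynomial.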

Nevertheless, for any smooth, compact **domain** in odd-dimensional Euclidean space, $X\subseteq\mathbb{R}^n$ (in other words, the closure of an open bounded subset with smooth boundary), for $n=2m-1$, our main result shows that a modified form of the conjecture holds asymptotically as $R \to \infty$.

**Theorem A.** Suppose $n=2m-1$ and that $X\subseteq \mathbb{R}^n$ is a smooth domain.

1. There exists an asymptotic expansion of the magnitude function: $$\mathcal{M}_X(R) \sim \frac{1}{n!\,\omega_n}\sum_{j=0}^\infty c_j(X) R^{n-j} \quad \text{as}\ R\to \infty$$ with coefficients $c_j(X)\in\mathbb{R}$ for $j=0,1,2,\ldots$.

2. The first three coefficients are given by $$\begin{aligned} c_0(X)&=\text{vol}_n(X),\\ c_1(X)&=m\,\text{vol}_{n-1}(\partial X),\\ c_2(X)&=\frac{m^2}{2} (n-1)\int_{\partial X} H \ \mathrm{d}S, \end{aligned}$$ where $H$ is the mean curvature of $\partial X$. (The coefficient $c_0$ was computed by Barceló and Carbery.) If $X$ is convex, these coefficients are proportional to the intrinsic volumes $V_{n}(X)$, $V_{n-1}(X)$ and $V_{n-2}(X)$ respectively.

3. For $j\geq 1$, the coefficient $c_j(X)$ is determined by the second fundamental form $L$ and covariant derivative $\nabla_{\partial X}$ of $\partial X$: $c_j(X)$ is an integral over $\partial X$ of a universal polynomial in the entries of $\nabla_{\partial X}^k L$, $0 \leq k\leq j-2$. The total number of covariant derivatives appearing in each term of the polynomial is $j-2$.

4. The previous part implies an asymptotic inclusion-exclusion principle: if $A$, $B$ and $A \cap B \subset \mathbb{R}^n$ are smooth, compact domains, $$\mathcal{M}_{A \cup B}(R) - \mathcal{M}_A(R) - \mathcal{M}_B(R) + \mathcal{M}_{A \cap B}(R) \to 0 \quad \text{as}\ R \to \infty$$ faster than $R^{-N}$ for all $N$.
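As a quick consistency check on the first two coefficients (an illustration added here, not a computation from the article), one can compare them with two magnitude functions known in closed form: the interval of length $\ell$, whose magnitude is $1+\ell/2$, and the $3$-ball, for which Barceló and Carbery found $\frac{R^3}{3!}+R^2+2R+1$.

$$\begin{aligned} n=1,\ m=1,\ X=[-1,1]:&\quad \omega_1=2,\quad c_0=\text{vol}_1(X)=2,\quad c_1=1\cdot\#(\partial X)=2,\quad c_2=0,\\ &\quad \mathcal{M}_X(R)\sim\frac{1}{1!\,\omega_1}\bigl(c_0R+c_1\bigr)=R+1=1+\frac{2R}{2},\\ n=3,\ m=2,\ X=B_3:&\quad \omega_3=\frac{4\pi}{3},\quad c_0=\frac{4\pi}{3},\quad c_1=2\cdot 4\pi=8\pi,\\ &\quad \mathcal{M}_X(R)\sim\frac{1}{3!\,\omega_3}\bigl(c_0R^3+c_1R^2+\dots\bigr)=\frac{R^3}{3!}+R^2+\dots \end{aligned}$$

In both cases the leading terms agree with the closed forms. For the interval, $c_2$ vanishes because of the factor $n-1$.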

If you’re not familiar with the second fundamental form, you should think of it as the container for curvature information of the boundary relative to the ambient Euclidean space. Since Euclidean space is flat, any curvature invariant of the boundary (satisfying reasonable symmetry conditions) will only depend on the second fundamental form and its derivatives.

Note that part 4 of the theorem does not imply that an asymptotic inclusion-exclusion principle holds for all $A$ and $B$, even if $A$ and $B$ are smooth, since the intersection $A \cap B$ usually is not smooth. In fact, it seems unlikely that an asymptotic inclusion-exclusion principle holds for general $A$ and $B$ without imposing curvature conditions, for example by means of assuming convexity of $A$ and $B$.

The key idea of the short proof relates the computation of the magnitude to classical techniques from geometric and semiclassical analysis, applied to a reformulated problem already studied by Meckes and by Barceló and Carbery. Meckes proved that the magnitude can be computed from the solution to a partial differential equation in the exterior domain $\mathbb{R}^n\setminus X$ with prescribed values in $X$. A careful analysis by Barceló and Carbery refined Meckes’ results and expressed the magnitude by means of the solution to a boundary value problem. We refer to this boundary value problem as the “Barceló-Carbery boundary value problem” below.

### Meromorphic extension of the magnitude function

Intriguingly, we find that the magnitude function $\mathcal{M}_X$ extends meromorphically to complex values of the scaling factor $R$. The meromorphic extension was noted by Tom for finite metric spaces and was observed in all previously known examples.

**Theorem B.** Suppose $n=2m-1$ and that $X\subseteq \mathbb{R}^n$ is a smooth domain.

1. The magnitude function $\mathcal{M}_X$ admits a meromorphic continuation to the complex plane.

2. The poles of $\mathcal{M}_X$ are generalized scattering resonances, and each sector $\{z : |\arg(z)| < \theta \}$ with $\theta < \frac{\pi}{2}$ contains at most finitely many of them (all of them outside $\{z : |\arg(z)| < \frac{\pi}{n+1}\}$).

The ordinary notion of scattering resonances comes from studying waves scattering at a compact obstacle $X\subseteq \mathbb{R}^n$. A **scattering resonance** is a pole of the meromorphic extension of the solution operator $(R^2-\Delta)^{-1}$ to the Helmholtz equation on $\mathbb{R}^n\setminus X$, with suitable boundary conditions. These resonances determine the long-time behaviour of solutions to the wave equation and are well studied in geometric analysis as well as throughout physics. The Barceló-Carbery boundary value problem is a higher order version of this problem and studies solutions to $(R^2-\Delta)^{m}u=0$ outside $X$. In dimension $n=1$ (i.e. $m=1$), the Barceló-Carbery problem coincides with the Helmholtz problem, and the poles of the magnitude function are indeed scattering resonances. As in scattering theory, one might hope to find detailed structure in the location of the poles. A plot of the poles and zeros in case of the $21$-dimensional ball is given at the top of the post.

The second part of this theorem is sharp. In fact, the poles do not need to lie in any half plane. Using the techniques of Barceló and Carbery we observe that the magnitude function of the 3-dimensional spherical shell $X=(2B_3)\setminus B_3^\circ$ is not rational and contains an infinite discrete sequence of poles which approaches the curve $\mathrm{Re}(R)= \log(|\mathrm{Im}(R)|)$ as $\mathrm{Re}(R) \to \infty$. Here’s a plot of the poles of $\mathcal{M}_{X}$ with $|\mathrm{Im}(R)| < 30$.

The magnitude function of $X$ is given by $$\mathcal{M}_X(R)=(7/6)R^3 +5R^2 + 2R + 2 + \frac{e^{-2R}(R^2+1)+2R^3-3R^2+2R-1}{\sinh 2R -2R}.$$
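Because the numerator of the fraction is entire, poles of this $\mathcal{M}_X$ can only occur at zeros of the denominator $\sinh 2R - 2R$. The Python sketch below locates one such zero by Newton's method. The starting guess `1.4 + 3.7j` and the function names are assumptions made for illustration, and the code does not check whether the numerator also vanishes there, so the result is a candidate pole only.

```python
import cmath

def denominator(r):
    """Denominator sinh(2R) - 2R of the quoted formula."""
    return cmath.sinh(2 * r) - 2 * r

def denominator_prime(r):
    """Derivative of the denominator with respect to R."""
    return 2 * cmath.cosh(2 * r) - 2

def newton(r, steps=50, tol=1e-14):
    """Plain Newton iteration for a zero of the denominator."""
    for _ in range(steps):
        step = denominator(r) / denominator_prime(r)
        r -= step
        if abs(step) < tol:
            break
    return r

# Hypothetical starting guess in the upper right quadrant.
root = newton(1.4 + 3.7j)
print(root, abs(denominator(root)))
```

Other zeros can be explored by changing the starting guess. Note that $R=0$ is also a zero of the denominator, but the numerator vanishes there to the same order, so it is not a pole.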

[EDIT: The above formula has been corrected, following comments below.]

Our techniques extend to compact domains with a $C^k$ boundary, as long as $k$ is large enough. In this case, the asymptotic inclusion-exclusion principle takes the form that $$\mathcal{M}_{A \cup B}(R) - \mathcal{M}_A(R) - \mathcal{M}_B(R) + \mathcal{M}_{A \cap B}(R) \to 0 \quad \text{as}\ R \to \infty$$ faster than $R^{-N}$ for an $N=N(k)$.

by willerton (S.Willerton@sheffield.ac.uk) at January 15, 2018 03:35 AM

## January 14, 2018

### John Baez - Azimuth

This article is very interesting:

• Ed Yong, Brain cells share information with virus-like capsules, *Atlantic*, January 12, 2018.

Your brain needs a protein called Arc. If you have trouble making this protein, you’ll have trouble forming new memories. The neuroscientist Jason Shepherd noticed something weird:

He saw that these Arc proteins assemble into hollow, spherical shells that look uncannily like viruses. “When we looked at them, we thought: What are these things?” says Shepherd. They reminded him of textbook pictures of HIV, and when he showed the images to HIV experts, they confirmed his suspicions. That, to put it bluntly, was a huge surprise. “Here was a brain gene that makes something that looks like a virus,” Shepherd says.

That’s not a coincidence. The team showed that Arc descends from an ancient group of genes called gypsy retrotransposons, which exist in the genomes of various animals, but can behave like their own independent entities. They can make new copies of themselves, and paste those duplicates elsewhere in their host genomes. At some point, some of these genes gained the ability to enclose themselves in a shell of proteins and leave their host cells entirely. That was the origin of retroviruses—the virus family that includes HIV.

It’s worth pointing out that gypsy is the name of a *specific kind* of retrotransposon. A retrotransposon is a gene that can make copies of itself by first transcribing itself from DNA into RNA and then converting itself back into DNA and inserting itself at other places in your chromosomes.

About 40% of your genes are retrotransposons! They seem to mainly be ‘selfish genes’, focused on their own self-reproduction. But some are also useful to you.

So, Arc genes are the evolutionary cousins of these viruses, which explains why they produce shells that look so similar. Specifically, Arc is closely related to a viral gene called gag, which retroviruses like HIV use to build the protein shells that enclose their genetic material. Other scientists had noticed this similarity before. In 2006, one team searched for human genes that look like gag, and they included Arc in their list of candidates. They never followed up on that hint, and “as neuroscientists, we never looked at the genomic papers so we didn’t find it until much later,” says Shepherd.

I love this because it confirms my feeling that viruses are deeply entangled with our evolutionary past. Computer viruses are just the latest phase of this story.

As if that wasn’t weird enough, other animals seem to have independently evolved their own versions of Arc. Fruit flies have Arc genes, and Shepherd’s colleague Cedric Feschotte showed that these descend from the same group of gypsy retrotransposons that gave rise to ours. But flies and back-boned animals co-opted these genes independently, in two separate events that took place millions of years apart. And yet, both events gave rise to similar genes that do similar things: Another team showed that the fly versions of Arc also sends RNA between neurons in virus-like capsules. “It’s exciting to think that such a process can occur twice,” says Atma Ivancevic from the University of Adelaide.

This is part of a broader trend: Scientists have in recent years discovered several ways that animals have used the properties of virus-related genes to their evolutionary advantage. Gag moves genetic information between cells, so it’s perfect as the basis of a communication system. Viruses use another gene called env to merge with host cells and avoid the immune system. Those same properties are vital for the placenta—a mammalian organ that unites the tissues of mothers and babies. And sure enough, a gene called syncytin, which is essential for the creation of placentas, actually descends from env. Much of our biology turns out to be viral in nature.

Here’s something I wrote in 1998 when I was first getting interested in this business:

**RNA reverse transcribing viruses**

RNA reverse transcribing viruses are usually called retroviruses. They have a single-stranded RNA genome. They infect animals, and when they get inside the cell’s nucleus, they copy themselves into the DNA of the host cell using reverse transcriptase. In the process they often cause tumors, presumably by damaging the host’s DNA.

Retroviruses are important in genetic engineering because they raised for the first time the possibility that RNA could be transcribed into DNA, rather than the reverse. In fact, some of them are currently being deliberately used by scientists to add new genes to mammalian cells.

Retroviruses are also important because AIDS is caused by a retrovirus: the human immunodeficiency virus (HIV). This is part of why AIDS is so difficult to treat. Most usual ways of killing viruses have no effect on retroviruses when they are latent in the DNA of the host cell.

From an evolutionary viewpoint, retroviruses are fascinating because they blur the very distinction between host and parasite. Their genome often contains genetic information derived from the host DNA. And once they are integrated into the DNA of the host cell, they may take a long time to reemerge. In fact, so-called endogenous retroviruses can be passed down from generation to generation, indistinguishable from any other cellular gene, and evolving along with their hosts, perhaps even from species to species! It has been estimated that up to 1% of the human genome consists of endogenous retroviruses! Furthermore, not every endogenous retrovirus causes a noticeable disease. Some may even help their hosts.

It gets even spookier when we notice that once an endogenous retrovirus lost the genes that code for its protein coat, it would become indistinguishable from a long terminal repeat (LTR) retrotransposon—one of the many kinds of “junk DNA” cluttering up our chromosomes. Just how much of us is made of retroviruses? It’s hard to be sure.

For my whole article, go here:

It’s about the mysterious subcellular entities that stand near the blurry border between the living and the non-living—like viruses, viroids, plasmids, satellites, transposons and prions. I need to update it, since a lot of new stuff is being discovered!

Jason Shepherd’s new paper has a few other authors:

• Elissa D. Pastuzyn, Cameron E. Day, Rachel B. Kearns, Madeleine Kyrke-Smith, Andrew V. Taibi, John McCormick, Nathan Yoder, David M. Belnap, Simon Erlendsson, Dustin R. Morado, John A.G. Briggs, Cédric Feschotte and Jason D. Shepherd, The neuronal gene Arc encodes a repurposed retrotransposon gag protein that mediates intercellular RNA transfer, *Cell* **172** (2018), 275–288.

### ZapperZ - Physics and Physicists

Now, he’s suddenly moving from the fringes of physics to the limelight. Northwestern University in Evanston, Illinois, is about to open a first-of-its-kind research institute dedicated to just his sort of small-scale particle physics, and Gabrielse will be its founding director.

The move signals a shift in the search for new physics. Researchers have dreamed of finding subatomic particles that could help them to solve some of the thorniest remaining problems in physics. But six years’ worth of LHC data have failed to produce a definitive detection of anything unexpected.

More physicists are moving in Gabrielse’s direction, with modest set-ups that can fit in standard university laboratories. Instead of brute-force methods such as smashing particles, these low-energy experimentalists use precision techniques to look for extraordinarily subtle deviations in some of nature’s most fundamental parameters. The slightest discrepancy could point the way to the field’s future.

Again, I salute very much this type of endeavor, but I dislike the tone of the title of the article, and I'll tell you why.

In science, and especially physics, there is seldom something that has been verified, found, or discovered using just ONE experimental technique or detection method. For example, in the discovery of the Top quark, both CDF and D0 detectors at Fermilab had to agree. In the discovery of the Higgs, both ATLAS and CMS had to agree. In trying to show that something is a superconductor, you not only measure the resistivity, but also magnetic susceptibility.

In other words, you require many different types of verification, and the more the better or the more convincing it becomes.

While these table-top experiments are very ingenious, they will NOT replace the big colliders. No one in their right mind will tell CERN to "step aside", other than the author of this article. There are discoveries or parameters of elementary particles that these table-top experiments can study more efficiently than the LHC, but there are also plenty of regions of parameter space that the LHC can probe and that can't be easily reached by these table-top experiments. They all complement each other!

People who don't know any better, or don't know the intricacies of how experiments are done or how knowledge is gathered, will get the impression that because of these table-top experiments, facilities like the LHC will no longer be needed. I hate to think that this is the "take-home" message that many people will get.

Zz.

by ZapperZ (noreply@blogger.com) at January 14, 2018 03:25 PM

## January 13, 2018

### Clifford V. Johnson - Asymptotia

Over on instagram (@asymptotia – and maybe here too, not sure) I’ll be posting some images of developmental drawings I did for The Dialogues, sometimes alongside the finished panels in the book. It is often very interesting to see how a finished thing came to be, and so that’s why … Click to continue reading this post

The post Process for The Dialogues appeared first on Asymptotia.

## January 12, 2018

### Lubos Motl - string vacua and pheno

In *Demons, Sunday School and Prime Numbers* (S01E11), Young Sheldon's mother finds out he plays a demonic game, Dungeons and Dragons, and he is fooled into attending the Sunday School. He reads and learns the Bible and other religions and ultimately establishes his own, math-based religion that teaches that prime numbers make you feel good. He has one (stupid kid) follower. I was actually teaching similar religions at that age.

Meanwhile, in the S11E13 episode of The Big Bang Theory, *The Solo Oscillation*, Howard is almost replaced by geologist Bert in the rock band (also featuring Rajesh). The folks discuss various projects and it turns out that Sheldon has nothing serious to work on. Recall that almost 4 years ago, Sheldon Cooper left string theory. But you can't really leave string theory.

The wives got replaced, Leonard and Amy were revisiting their school projects, and Sheldon spent some quality pizza time with Penny. Penny figured out that Sheldon's recent projects didn't excite him. They were largely about dark matter. He didn't start to work on dark matter because it excited him. Instead, it was everywhere (e.g. in this fake news on Hossenfelder's blog yesterday), and it served as "rebound science", using Penny's jargon. The term describes someone whom she has dated to feel pretty again.

So instead of calculating the odds that Sheldon's mother meets an old friend (that theme was clearly analogous to Richard Feynman's calculations of the wobbling plates), Sheldon was explaining string theory to a collaborative Penny again. Different elementary particle species are different energy eigenstates of a string vibrating in a higher-dimensional space, Penny was de facto taught.

Note the beautiful whiteboard posted at the top – click the picture to zoom it in. It contains a picture of standing waves; the Nambu-Goto action; and the transformation to the Polyakov action on the world sheet, with the superstringy fermions included. Then he writes down the beta-function on the world sheet, finding out that its vanishing imposes Einstein's equations in spacetime. The beta-function even has the stress-energy tensor from the Maxwell field so it's not just the vacuum Einstein's equations that are derived.

What are the odds that you could derive the correct spacetime field equations from a totally different starting point like vibrating strings, Penny is asked? Even Penny understands it can't be a coincidence. Which other TV show discusses the beta-function on the world sheet, its spacetime consequences, and the implications of these consequences for the rational assessment of the validity of string theory? Even 99% of science writers who claim to have some understanding of theoretical physics don't have a clue what the world sheet beta-function is and means!

Penny introduces a natural idea – from the layman's viewpoint – about "what can be done with strings". You can make knots out of strings. But lines may only get "knitted" in 3 dimensions. 4 dimensions is already too much, Sheldon warns her. "Unless..." Penny pretends that she's on the verge of discovering a loophole, and thus forces Sheldon to discover the loophole by himself. Unless you consider the "knots" involving the whole world sheets (and/or membranes and other branes or their world volumes).

Well, just to be sure, I think that "knots involving fundamental strings or branes" still don't play an important role in string theory – and there are good reasons for that. To calculate the knot invariants, you need to know the shape of the curve (strings in this case) very accurately, and this sort of goes against the basic principles and lore of string theory as a theory of quantum gravity (in some sense, there is roughly a Planck length uncertainty about all locations, so all finite-energy states of nearby vibrating strings are almost unavoidably weird superpositions of states with different knot invariants).

But that doesn't mean that I haven't been fascinated by "knots" made out of strings and branes and their physical meaning. For example, I and Ori Ganor wrote a paper about knitting of fivebranes. If you have two parallel M5-branes in M-theory (it's not a coincidence that their dimension is about 1/2 of the spacetime dimension!), you may knit them in a new way and the knot invariant may be interpreted as the number of membranes (M2-branes) stretched in between the two (or more) M5-branes. It's a higher-dimensional extension of the "Skyrmions".

The TV show makes it sound as if the discovery of some knitting of strings and branes might become "the next revolution" in string theory and therefore physics and science, too. Well, I have some doubts. Topological solitons like that are important – but they have become a part of the toolkit that is being applied in many different ways. They're not a shocking revolutionary idea that is likely to change everything.

On the other hand, knot theory and lots of geometric and physical effects that have nice interpretations are often naturally embedded within string theory. String theory has provided us with a perfectly physically consistent and complete incarnation of so many physical effects that were known to exist in geometry or condensed matter physics or other fields that a brilliant person simply *cannot fail* to be excited. And I do think that some proper treatment of monodromy operators in the stringy spacetime does hold the key to the next revolution in quantum gravity.

It will be good if The Big Bang Theory ends this pointless intellectual castration of Sheldon, who was supposed to work, like some average astrophysicists and cosmologists, on some uninteresting and often fishy *ad hoc* dark matter projects for almost 4 years now. It's natural for Sheldon to get back to serious work – to look for a theory of everything – and I sincerely hope that Penny will continue to be helpful in these efforts. (Well, I still think that the claims that Sheldon will "have to" share his Nobel with Penny are exaggerated LOL.)

Good luck to him and her – and to me, too.

by Luboš Motl (noreply@blogger.com) at January 12, 2018 09:00 AM

### Clifford V. Johnson - Asymptotia

Back where? In front of a classroom teaching quantum field theory, that is. It is a wonderful, fascinating, and super-important subject, and it has been a while since I've taught it. I actually managed to dig out some pretty good notes for the last time I taught it. (Thank you, my inner pack rat for keeping those notes and putting them where I could find them!) They'll be a helpful foundation. (Aren't they beautiful by the way? Those squiggly diagrams are called Feynman diagrams.)

Important? Quantum field theory (QFT) is perhaps one of the most remarkable [...] Click to continue reading this post

The post Nice to be Back… appeared first on Asymptotia.

## January 11, 2018

### ZapperZ - Physics and Physicists

Zz.

by ZapperZ (noreply@blogger.com) at January 11, 2018 08:58 PM

## January 10, 2018

### Clifford V. Johnson - Asymptotia

It seems appropriate somehow that there's an extensive interview with me in the LA Times with Deborah Netburn about my work on the book. Those of you who have read it might have recognised some of the landscape in one of the stories as looking an awful lot like downtown Los Angeles, and if you follow the conversation and pay attention to your surroundings, you see that they pass a number of LA Landmarks, ultimately ending up very close to the LA Times Building, itself a landmark!

(In the shot above, you see a bit of the Angel's Flight railway.)

Anyway, I hope you enjoy the interview! We talk a lot about the motivations for making the book, about drawing, and - most especially - the issue of science being for everyone...

*[For those of you trying to get the book, note that although it is showing out of stock at Amazon, go ahead and place your order. Apparently they are getting the book and shipping it out constantly, even though it might not stop showing as out of stock. Also, check your local bookstores... Several Indys and branches of Barnes and Noble do have copies on their shelves. (I've checked.) Or they can order it for you. Also, the publisher's site is another source. They are offering a 50% discount as thank you for being patient while they restock. There's a whole new batch of books being printed and that will soon help make it easier to grab.]*

-cvj Click to continue reading this post

The post An LA Times Piece… appeared first on Asymptotia.

## January 09, 2018

### Tommaso Dorigo - Scientificblogging

### Lubos Motl - string vacua and pheno

The Wire India has interviewed Princeton's string theorist Nati Seiberg who is just visiting India:

Interview: ‘There’s No Conflict Between Lack of Evidence of String Theory and Work Being Done on It’

They cover lots of questions and the interview is rather interesting.

**Spoilers: beware.**

The interview took place at Bengaluru. They explain Seiberg is an important theoretical physicist – e.g. a 2016 Dirac medal laureate. Sandhya Ramesh asks him to define string theory – Seiberg says it's a theory meant to be a TOE that keeps on transforming, it will probably be transforming, and the progress is very exciting.

Seiberg is asked the question from the title: How should you reconcile the absence of an experimental proof with the work on string theory? There is nothing to reconcile, the latter doesn't need the former. There are numerous reasons why people keep on researching string theory, e.g. its consequences for adjacent fields.

He is also asked how he imagines higher-dimensional objects. It's hard for him, too. When answering a question about the role of interdisciplinary research, Seiberg importantly says that there is no "string theory approach to climate science" but sometimes the collaboration on the borders of disciplines is fruitful. SUSY could have been found, it wasn't found, and it may be useful to build bigger colliders. Seiberg knows nothing about the politics of begging for the big funds.

Seiberg is asked about alternative contenders running against string/M-theory and his answer is that he doesn't know of any.

Suddenly the journalist asks about the recent results on gauge theories and global symmetries and their implications on the paradigms in condensed matter physics. So unsurprisingly, Seiberg is surprised because the question betrays someone's IQ that is some 40 IQ points above the average journalist. The roles get reversed, Seiberg asks: Where did you get this question? The answer is that the journalist got it from his editor. Seiberg is impressed, and so am I. Maybe the editor just read The Reference Frame recently to improve the questions his colleagues ask. ;-)

Yesterday, Seiberg gave a talk in India that was about related questions but he didn't recommend the "public" to attend the talk because it would be a somewhat technical, although not too technical, talk. OK, he said some basic things about symmetries of faces, supersymmetry, and supersymmetry's diverse implications aside from the discovery of superpartner particles (that hasn't materialized yet).

He praises Indian string theorists – I agree with those sentiments. Seiberg rejects recommendations to give advice on what people should work on and to deal with the public more often – because "he's not good at it". He addresses another great question, one about naturalness, and says that the strongest "around the corner" edition of naturalness has been disproved by the LHC null results and the assumptions that went into it have to be reassessed.

Also, Seiberg doesn't know where the work will be done. LIGO is interesting. When asked about the number of string theorists, he says that it's small enough for everyone to know everybody else and it's wonderful. He was offered the meme that India has good weather and that this is a reason to visit the country, but he visits India because of the colleagues.

by Luboš Motl (noreply@blogger.com) at January 09, 2018 05:49 AM

## January 07, 2018

### John Baez - Azimuth

Johannes Kepler loved geometry, so of course he was fascinated by Platonic solids. His early work *Mysterium Cosmographicum*, written in 1596, includes pictures showing how the 5 Platonic solids correspond to the 5 elements:

*Five* elements? Yes, besides earth, air, water and fire, he includes a fifth element that doesn’t feel the Earth’s gravitational pull: the ‘quintessence’, or ‘aether’, from which heavenly bodies are made.

In the same book he also tried to use the Platonic solids to explain the orbits of the planets:

The six planets are Mercury, Venus, Earth, Mars, Jupiter and Saturn. And the tetrahedron and cube, in case you’re wondering, sit outside the largest sphere shown above. You can see them in another picture from Kepler’s book:

These ideas may seem goofy now, but studying the exact radii of the planets’ orbits led him to discover that these orbits aren’t circular: they’re *ellipses!* By 1619 this led him to what we call Kepler’s laws of planetary motion. And those, in turn, helped Newton verify Hooke’s hunch that the force of gravity goes as the inverse square of the distance between bodies!

In honor of this, the problem of a particle orbiting in an inverse square force law is called the **Kepler problem**.

So, I’m happy that Greg Egan, Layra Idarani and I have come across a solid mathematical connection between the Platonic solids and the Kepler problem.

But this involves a detour into the 4th dimension!

It’s a remarkable fact that the Kepler problem has not just the expected conserved quantities—energy and the 3 components of angular momentum—but also 3 more: the components of the Runge–Lenz vector. To understand those extra conserved quantities, go here:

• Greg Egan, The ellipse and the atom.

Noether proved that conserved quantities come from symmetries. Energy comes from time translation symmetry. Angular momentum comes from rotation symmetry. Since the group of rotations in 3 dimensions, called **SO(3)**, is itself 3-dimensional, it gives 3 conserved quantities, which are the 3 components of angular momentum.

None of this is really surprising. But if we take the angular momentum *together with* the Runge–Lenz vector, we get 6 conserved quantities—and these turn out to come from the group of rotations in 4 dimensions, **SO(4)**, which is itself 6-dimensional. The obvious symmetries in this group just *rotate* a planet’s elliptical orbit, while the unobvious ones can also *squash* or *stretch* it, changing the eccentricity of the orbit.

(To be precise, all this is true only for the ‘bound states’ of the Kepler problem: the circular and elliptical orbits, not the parabolic or hyperbolic ones, which work in a somewhat different way. I’ll only be talking about bound states in this post!)
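The conservation claims above can be checked numerically. What follows is only a minimal sketch, not anything from the posts linked here: it assumes units with mass `m = 1` and force constant `k = 1`, an arbitrarily chosen bound initial condition, and a hand-written fourth-order Runge–Kutta integrator. The Runge–Lenz vector is taken in the common convention `A = p × L − m k r/|r|`; all function names are made up for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def conserved(r, p, m=1.0, k=1.0):
    """Return (energy, angular momentum L, Runge-Lenz vector A)."""
    rn = norm(r)
    energy = sum(x * x for x in p) / (2 * m) - k / rn
    L = cross(r, p)
    pxL = cross(p, L)
    A = tuple(pxL[i] - m * k * r[i] / rn for i in range(3))
    return energy, L, A

def deriv(state, m=1.0, k=1.0):
    """Hamilton's equations for the inverse square force: (dr/dt, dp/dt)."""
    r, p = state[:3], state[3:]
    rn3 = norm(r) ** 3
    return tuple(x / m for x in p) + tuple(-k * x / rn3 for x in r)

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the 6-component state (r, p)."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Arbitrary bound orbit (negative energy): position then momentum.
state = (1.0, 0.0, 0.0, 0.0, 0.8, 0.3)
E0, L0, A0 = conserved(state[:3], state[3:])

for _ in range(20000):          # several orbital periods
    state = rk4_step(state, 1e-3)

E1, L1, A1 = conserved(state[:3], state[3:])
print("energy drift:     ", abs(E1 - E0))
print("L drift:          ", max(abs(a - b) for a, b in zip(L0, L1)))
print("Runge-Lenz drift: ", max(abs(a - b) for a, b in zip(A0, A1)))
```

All three drifts should come out at the level of integration error, which is what one expects if the seven quantities are conserved along the orbit. The length of `A` divided by `m k` is the eccentricity, so the "unobvious" symmetries that change the eccentricity are exactly the ones that move `A` around.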

Why should the Kepler problem have symmetries coming from rotations in 4 dimensions? This is a fascinating puzzle—we know a lot about it, but I doubt the last word has been spoken. For an overview, go here:

• John Baez, Mysteries of the gravitational 2-body problem.

This SO(4) symmetry applies not only to the *classical mechanics* of the inverse square force law, but also the *quantum mechanics!* Nobody cares much about the quantum mechanics of two particles attracting gravitationally via an inverse square force law—but people care a lot about the quantum mechanics of hydrogen atoms, where the electron and proton attract each other via their electric field, which also obeys an inverse square force law.

So, let’s talk about hydrogen. And to keep things simple, let’s pretend the proton stays fixed while the electron orbits it. This is a pretty good approximation, and experts will know how to do things exactly right. It requires only a slight correction.

It turns out that wavefunctions for bound states of hydrogen can be reinterpreted as functions on the 3-sphere, S^{3}. The sneaky SO(4) symmetry then becomes obvious: it just rotates this sphere! And the Hamiltonian of the hydrogen atom is closely connected to the Laplacian on the 3-sphere. The Laplacian has eigenspaces of dimensions *n*^{2} where *n* = 1,2,3,…, and these correspond to the eigenspaces of the hydrogen atom Hamiltonian. The number *n* is called the principal quantum number, and the hydrogen atom’s energy is proportional to -1/*n*^{2}.

If you don’t know all this jargon, don’t worry! All you need to know is this: if we find an eigenfunction of the Laplacian on the 3-sphere, it will give a state where the hydrogen atom has a definite energy. And if this eigenfunction is invariant under some subgroup of SO(4), so will this state of the hydrogen atom!
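The dimension count *n*^{2} can be checked in two independent ways. The sketch below is an illustration under assumptions not spelled out in the post: it ignores electron spin, it counts hydrogen states by the usual quantum numbers ℓ = 0, …, *n* − 1 and *m* = −ℓ, …, ℓ, and it uses the standard fact that the eigenfunctions of the Laplacian on the 3-sphere are restrictions of harmonic homogeneous polynomials in 4 variables, with degree *k* = *n* − 1. The function names are hypothetical.

```python
from math import comb

def hydrogen_degeneracy(n):
    """Number of (l, m) pairs at principal quantum number n (spin ignored)."""
    return sum(2 * l + 1 for l in range(n))

def s3_harmonics_dimension(k):
    """Dimension of harmonic homogeneous polynomials of degree k in 4 variables.

    All homogeneous polynomials of degree k, minus those of degree k - 2
    (the image of the Laplacian).
    """
    return comb(k + 3, 3) - comb(k + 1, 3)

for n in range(1, 8):
    print(n, hydrogen_degeneracy(n), s3_harmonics_dimension(n - 1), n * n)
```

The three numbers printed after each value of *n* agree, which is the counting behind the statement that the two families of eigenspaces correspond.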

The biggest finite subgroup of SO(4) is the rotational symmetry group of the 600-cell, a wonderful 4-dimensional shape with 120 vertices and 600 tetrahedral faces. The rotational symmetry group of this shape has a whopping 7,200 elements! And here is a marvelous moving image, made by Greg Egan, of an eigenfunction of the Laplacian on S^{3} that’s invariant under this 7,200-element group:

We’re seeing the wavefunction on a moving slice of the 3-sphere, which is a 2-sphere. This wavefunction is actually real-valued. Blue regions are where this function is positive, yellow regions where it’s negative—or maybe the other way around—and black is where it’s almost zero. When the image fades to black, our moving slice is passing through a 2-sphere where the wavefunction is almost zero.

For a full explanation, go here:

• Greg Egan, In the chambers with seven thousand symmetries, 2 January 2018.

Layra Idarani has come up with a complete classification of *all* eigenfunctions of the Laplacian on S^{3} that are invariant under this group… or more generally, eigenfunctions of the Laplacian on a sphere of *any* dimension that are invariant under the even part of *any* Coxeter group. For the details, go here:

• Layra Idarani, SG-invariant polynomials, 4 January 2018.

All that is a continuation of a story whose beginning is summarized here:

• John Baez, Quantum mechanics and the dodecahedron.

So, there’s a lot of serious math under the hood. But right now I just want to marvel at the fact that we’ve found a wavefunction for the hydrogen atom that not only has a well-defined energy, but is also invariant under this 7,200-element group. This group includes the usual 60 rotational symmetries of a dodecahedron, but also other much less obvious symmetries.

I don’t have a good picture of what these less obvious symmetries do to the wavefunction of a hydrogen atom. I understand them a bit better *classically*—where, as I said, they squash or stretch an elliptical orbit, changing its eccentricity while not changing its energy.

We can have fun with this using the old quantum theory—the approach to quantum mechanics that Bohr developed with his colleague Sommerfeld from 1920 to 1925, before Schrödinger introduced wavefunctions.

In the old Bohr–Sommerfeld approach to the hydrogen atom, the quantum states with specified energy, total angular momentum and angular momentum about a fixed axis were drawn as elliptical orbits. In this approach, the symmetries that squash or stretch elliptical orbits are a bit easier to visualize:

This picture by Pieter Kuiper shows some orbits at the 5th energy level, *n = 5*: namely, those with different eigenvalues of the total angular momentum, ℓ.

While the old quantum theory was superseded by the approach using wavefunctions, it’s possible to make it mathematically rigorous for the hydrogen atom. So, we can draw elliptical orbits that rigorously correspond to a basis of wavefunctions for the hydrogen atom. So, I believe we can draw the orbits corresponding to the basis elements whose linear combination gives the wavefunction shown as a function on the 3-sphere in Greg’s picture above!

We should get a bunch of ellipses forming a complicated picture with dodecahedral symmetry. This would make Kepler happy.

As a first step in this direction, Greg drew the collection of orbits that results when we take a circle and apply all the symmetries of the 600-cell:

For more details, read this:

• Greg Egan, Kepler orbits with the symmetries of the 600-cell.

### Postscript

To do this really right, one should learn a bit about ‘old quantum theory’. I believe people have been getting it a bit wrong for quite a while—starting with Bohr and Sommerfeld!

If you look at the ℓ = 0 orbit in the picture above, it’s a long skinny ellipse. But I believe it really should be *a line segment straight through the proton*: that’s what an orbit with *no* angular momentum looks like.

There’s a paper about this:

• Manfred Bucher, Rise and fall of the old quantum theory.

Matt McIrvin had some comments on this:

This paper from 2008 is a kind of thing I really like: an exploration of an old, incomplete theory that takes it further than anyone actually did at the time.

It has to do with the Bohr-Sommerfeld “old quantum theory”, in which electrons followed definite orbits in the atom, but these were quantized–not all orbits were permitted. Bohr managed to derive the hydrogen spectrum by assuming circular orbits, then Sommerfeld did much more by extending the theory to elliptical orbits with various shapes and orientations. But there were some problems that proved maddeningly intractable with this analysis, and it eventually led to the abandonment of the “orbit paradigm” in favor of Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics, what we know as modern quantum theory.

The paper argues that the old quantum theory was abandoned prematurely. Many of the problems Bohr and Sommerfeld had came not from the orbit paradigm per se, but from a much simpler bug in the theory: namely, their rejection of orbits in which the electron moves entirely radially and goes right through the nucleus! Sommerfeld called these orbits “unphysical”, but they actually correspond to the s orbital states in the full quantum theory, with zero angular momentum. And, of course, in the full theory the electron in these states does have some probability of being inside the nucleus.

So Sommerfeld’s orbital angular momenta were always off by one unit. The hydrogen spectrum came out right anyway because of the happy accident of the energy degeneracy of certain orbits in the Coulomb potential.

I guess the states they really should have been rejecting as “unphysical” were Bohr’s circular orbits: no radial motion would correspond to a certain zero radial momentum in the full theory, and we can’t have that for a confined electron because of the uncertainty principle.

## January 06, 2018

### Jon Butterworth - Life and Physics

Book review in Publishers’ Weekly.

*Butterworth (Most Wanted Particle), a CERN alum and professor of physics at University College London, explains everything particle physics from antimatter to Z bosons in this charming trek through a landscape of “the otherwise invisible.” His accessible narrative cleverly relates difficult concepts, such as wave-particle duality or electron spin, in bite-size bits. Readers become explorers on Butterworth’s metaphoric map…* **Read more.**

### Jon Butterworth - Life and Physics

## January 05, 2018

### ZapperZ - Physics and Physicists

As you read this, notice all the "background knowledge" that one must have to be able to know how well certain things are known, and what are the assumptions and uncertainties in each of the methods and values that we use. All of these need to be known, and people using them must be aware of them.

Compare that to the decisions we make every day on things we accept in social policies and politics.

Zz.

by ZapperZ (noreply@blogger.com) at January 05, 2018 09:01 PM

### ZapperZ - Physics and Physicists

I have highlighted a number of CP-violation experiments on here, which is something mentioned in the article. But it is nice to have a layman-type summary of the baryo-lepton-genesis ideas that are floating out there.

Zz.

by ZapperZ (noreply@blogger.com) at January 05, 2018 09:01 PM

## January 03, 2018

### Clifford V. Johnson - Asymptotia

Happy New Year!

Yesterday, the NPR affiliate KCRW's Press Play broadcast an interview with me. I spoke with the host Madeleine Brand about my non-fiction graphic novel about science, and several other things that came up on the spur of the moment. Rather like one of the wide-ranging conversations in the book itself, come to think of it...

This was a *major* interview for me because I've been a huge fan of Madeleine for many years, going back to her NPR show Day to Day (which I still [...] Click to continue reading this post

The post Press Play! appeared first on Asymptotia.

## January 02, 2018

### Tommaso Dorigo - Scientificblogging

Venice is a wonderful city and quite a special place, if you ask me. A city with a millenary history, crammed with magnificent palaces and churches. A place where one could write a book about every stone. Walking through the maze of narrow streets or making one's way through a tight network of canals is an unforgettable experience, but living there for decades is something else - it makes you a part of it. I feel I own the place, in some way. So why did I leave it?

## January 01, 2018

### ZapperZ - Physics and Physicists

An employee in Perth, Australia, used the metallic packaging from a snack to shield his device, which has a GPS that tracks his whereabouts. He then went golfing... many times, during his work hours.

The tribunal found that the packet was deliberately used to operate as an elaborate “Faraday cage” - an enclosure which can block electromagnetic fields - and prevented his employer knowing his location. The cage set-up was named after English scientist Michael Faraday, who in 1836 observed that a continuous covering of conductive material could be used to block electromagnetic fields.

Now, if it works for his device, it should work to shield our credit cards as an RFID shield, don't you think? There's no reason to buy those expensive wallets or credit-card envelopes. Next time you have Cheetos or potato chips, save those bags and wrap your wallet with them! :)

Zz.

by ZapperZ (noreply@blogger.com) at January 01, 2018 11:16 PM

### The n-Category Cafe

In the comments last time, a conversation got going about *$p$-adic*
entropy. But here I’ll return to the original subject: entropy *modulo
$p$*. I’ll answer the question:

Given a “probability distribution” mod $p$, that is, a tuple $$\pi = (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n$$ summing to $1$, what is the right definition of its entropy $$H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}?$$

How will we know when we’ve got the right definition? As I explained last time, the acid test is whether it satisfies the chain rule

$$H_p(\gamma \circ (\pi^1, \ldots, \pi^n)) = H_p(\gamma) + \sum_{i = 1}^n \gamma_i H_p(\pi^i).$$

This is supposed to hold for all $\gamma = (\gamma_1, \ldots, \gamma_n) \in \Pi_n$ and $\pi^i = (\pi^i_1, \ldots, \pi^i_{k_i}) \in \Pi_{k_i}$, where $\Pi_n$ is the hyperplane

$$\Pi_n = \{ (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n : \pi_1 + \cdots + \pi_n = 1 \},$$

whose elements we’re calling “probability distributions” mod $p$. And if
God is smiling on us, $H_p$ will be essentially the *only* quantity
that satisfies the chain rule. Then we’ll know we’ve got the right
definition.

Black belts in functional equations will be able to use the chain rule and
nothing else to work out what $H_p$ must be. But the rest of us might like
an extra clue, and we have one in the definition of *real* Shannon entropy:

$$ H_{\mathbb{R}}(\pi) = - \sum_{i : \pi_i \neq 0} \pi_i \log \pi_i. $$

Now, we saw last time that there is no logarithm mod $p$; that is, there is no nontrivial group homomorphism

$$ (\mathbb{Z}/p\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}. $$

But there *is* a next-best thing: a homomorphism

$$ (\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}. $$

This is called the Fermat quotient $q_p$, and it’s given by

$$ q_p(n) = \frac{n^{p - 1} - 1}{p} \in \mathbb{Z}/p\mathbb{Z}. $$

Let’s go through why this works.

The elements of $(\mathbb{Z}/p^2\mathbb{Z})^\times$ are the congruence classes mod $p^2$ of the integers not divisible by $p$. Fermat’s little theorem says that whenever $n$ is not divisible by $p$,

$$ \frac{n^{p - 1} - 1}{p} $$

is an integer. This, or rather its congruence class mod $p$, is the Fermat quotient. The congruence class of $n$ mod $p^2$ determines the congruence class of $n^{p - 1} - 1$ mod $p^2$, and it therefore determines the congruence class of $(n^{p - 1} - 1)/p$ mod $p$. So, $q_p$ defines a function $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$. It’s a pleasant exercise to show that it’s a homomorphism. In other words, $q_p$ has the log-like property

$$ q_p(m n) = q_p(m) + q_p(n) $$

for all integers $m, n$ not divisible by $p$.
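For readers who like to see such things numerically, here is a minimal Python sketch of the Fermat quotient. The function name `fermat_quotient` and the choice $p = 7$ are mine, purely for illustration; the checks are brute-force sanity tests, not a proof.

```python
def fermat_quotient(n: int, p: int) -> int:
    """Return q_p(n) = (n^(p-1) - 1)/p as a residue mod p (p prime, p not dividing n)."""
    if n % p == 0:
        raise ValueError("n must not be divisible by p")
    # n^(p-1) reduced mod p^2 is congruent to 1 mod p, so subtracting 1
    # leaves a multiple of p, and the quotient is well defined mod p.
    return ((pow(n, p - 1, p * p) - 1) // p) % p


p = 7
units = [n for n in range(1, p * p) if n % p != 0]

# Log-like property: q_p(mn) = q_p(m) + q_p(n) in Z/pZ.
assert all(
    fermat_quotient(m * n, p) == (fermat_quotient(m, p) + fermat_quotient(n, p)) % p
    for m in units
    for n in units
)

# The value depends only on the class of n mod p^2 ...
assert all(fermat_quotient(n + p * p, p) == fermat_quotient(n, p) for n in units)

# ... but not merely on the class of n mod p: 2 and 9 agree mod 7, yet differ here.
assert fermat_quotient(2, p) != fermat_quotient(2 + p, p)
```

The last assertion is the reason the domain has to be $(\mathbb{Z}/p^2\mathbb{Z})^\times$ rather than $(\mathbb{Z}/p\mathbb{Z})^\times$.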

In fact, it’s essentially unique as such. Any other homomorphism $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$ is a scalar multiple of $q_p$. (This follows from the classical theorem that the group $(\mathbb{Z}/p^2\mathbb{Z})^\times$ is cyclic.) It’s just like the fact that up to a scalar multiple, the real logarithm is the unique measurable function $\log : (0, \infty) \to \mathbb{R}$ such that $\log(x y) = \log x + \log y$, but here there’s nothing like measurability complicating things.

So: $q_p$ functions as a kind of logarithm. Given a mod $p$ probability distribution $\pi = (\pi_1, \ldots, \pi_n) \in \Pi_n$, we might therefore guess that the right definition of its entropy is

$$ - \sum_{i : \pi_i \neq 0} \pi_i q_p(a_i), $$

where $a_i$ is an integer representing $\pi_i \in \mathbb{Z}/p\mathbb{Z}$.

However, this doesn’t work. It depends on the choice of representatives $a_i$.
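To see the failure concretely, here is a small Python sketch with $p = 3$ and the distribution $(2, 2)$, which sums to $1$ mod $3$. The helper names `fermat_quotient` and `naive_entropy` are hypothetical, chosen for this illustration only.

```python
def fermat_quotient(n: int, p: int) -> int:
    """q_p(n) = (n^(p-1) - 1)/p reduced mod p, for n not divisible by the prime p."""
    return ((pow(n, p - 1, p * p) - 1) // p) % p


def naive_entropy(reps, p):
    """The tempting guess: -sum of a_i * q_p(a_i) over representatives a_i not divisible by p."""
    return -sum(a * fermat_quotient(a, p) for a in reps if a % p != 0) % p


p = 3
# (2, 2) and (5, 2) represent the same distribution mod 3, since 5 = 2 + 3.
first = naive_entropy([2, 2], p)
second = naive_entropy([5, 2], p)
assert first != second  # same distribution, different "entropy"
```

Replacing the representative $2$ by $5$ changes the answer, so the naive formula does not define a function on $\Pi_n$.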

To get the right answer, we’ll look at real entropy in a slightly different way. Define $\partial_{\mathbb{R}} : [0, 1] \to \mathbb{R}$ by

$$ \partial_{\mathbb{R}}(x) = \begin{cases} - x \log x & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} $$

Then $\partial_{\mathbb{R}}$ has the derivative-like property

$$ \partial_{\mathbb{R}}(x y) = x \partial_{\mathbb{R}}(y) + \partial_{\mathbb{R}}(x) y. $$

A *linear* map with this property is called a derivation, so it’s
reasonable to call $\partial_{\mathbb{R}}$ a **nonlinear derivation**.

The observation that $\partial_{\mathbb{R}}$ is a nonlinear derivation turns out to be quite useful. For instance, real entropy is given by

$$ H_{\mathbb{R}}(\pi) = \sum_{i = 1}^n \partial_{\mathbb{R}}(\pi_i) $$

($\pi \in \Pi_n$), and verifying the chain rule for $H_{\mathbb{R}}$ is done most neatly using the derivation property of $\partial_{\mathbb{R}}$.

An equivalent formula for real entropy is

$$ H_{\mathbb{R}}(\pi) = \sum_{i = 1}^n \partial_{\mathbb{R}}(\pi_i) - \partial_{\mathbb{R}}\biggl( \sum_{i = 1}^n \pi_i \biggr). $$

This is a triviality: $\sum \pi_i = 1$, so $\partial_{\mathbb{R}}\bigl( \sum \pi_i \bigr) = 0$, so this is the same as the previous formula. But it’s also quite suggestive: $H_{\mathbb{R}}(\pi)$ measures the extent to which the nonlinear derivation $\partial_{\mathbb{R}}$ fails to preserve the sum $\sum \pi_i$.

Now let’s try to imitate this in $\mathbb{Z}/p\mathbb{Z}$. Since $q_p$ plays a similar role to $\log$, it’s natural to define

$$ \partial_p(n) = -n q_p(n) = \frac{n - n^p}{p} $$

for integers $n$ not divisible by $p$. But the last expression makes sense
even if $n$ *is* divisible by $p$. So, we can define a function

$$ \partial_p : \mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} $$

by $\partial_p(n) = (n - n^p)/p$. (This is called a $p$-derivation.) It’s easy to check that $\partial_p$ has the derivative-like property

$$ \partial_p(m n) = m \partial_p(n) + \partial_p(m) n. $$
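Here is a quick Python sanity check of these claims for $p = 5$. It is only a sketch: the name `p_derivation` is my own, and exhaustively testing one small prime is of course no substitute for the two-line algebraic proof.

```python
def fermat_quotient(n: int, p: int) -> int:
    """q_p(n) = (n^(p-1) - 1)/p reduced mod p, for n not divisible by the prime p."""
    return ((pow(n, p - 1, p * p) - 1) // p) % p


def p_derivation(n: int, p: int) -> int:
    """d_p(n) = (n - n^p)/p reduced mod p; defined for every integer n."""
    # Reducing n^p mod p^2 changes n - n^p by a multiple of p^2,
    # which does not affect the quotient mod p.
    return ((n - pow(n, p, p * p)) // p) % p


p = 5
residues = range(p * p)

# Derivative-like (Leibniz) property in Z/pZ.
assert all(
    p_derivation(m * n, p) == (m * p_derivation(n, p) + p_derivation(m, p) * n) % p
    for m in residues
    for n in residues
)

# Agreement with -n * q_p(n) whenever n is not divisible by p.
assert all(
    p_derivation(n, p) == (-n * fermat_quotient(n, p)) % p
    for n in residues
    if n % p != 0
)

# Well defined on Z/p^2Z.
assert all(p_derivation(n + p * p, p) == p_derivation(n, p) for n in residues)
```

One value worth noting is $\partial_p(p) = 1 - p^{p-1} \equiv 1 \pmod p$, which is part of why $\partial_p$ is thought of as "differentiating with respect to $p$".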

And now we arrive at the long-awaited definition. The **entropy mod $p$**
of $\pi = (\pi_1, \ldots, \pi_n)$ is

$$ H_p(\pi) = \sum_{i = 1}^n \partial_p(a_i) - \partial_p\biggl( \sum_{i = 1}^n a_i \biggr), $$

where $a_i \in \mathbb{Z}$ represents $\pi_i \in \mathbb{Z}/p\mathbb{Z}$. This is independent of the choice of representatives $a_i$. And when you work it out explicitly, it gives

$$ H_p(\pi) = \frac{1}{p} \biggl( 1 - \sum_{i = 1}^n a_i^p \biggr). $$

Just as in the real case, $H_p$ satisfies the chain rule, which is most easily shown using the derivation property of $\partial_p$.
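The three claims just made (independence of representatives, the explicit formula, and the chain rule) can be spot-checked by machine. The Python sketch below does so on random inputs. All function names are hypothetical, and I am assuming the usual meaning of composition from the previous post, namely that $\gamma \circ (\pi^1, \ldots, \pi^n)$ lists the products $\gamma_i \pi^i_j$.

```python
import random


def p_derivation(n: int, p: int) -> int:
    """d_p(n) = (n - n^p)/p reduced mod p."""
    return ((n - pow(n, p, p * p)) // p) % p


def entropy_mod_p(reps, p):
    """H_p from integer representatives a_i whose sum is congruent to 1 mod p."""
    if sum(reps) % p != 1:
        raise ValueError("representatives must sum to 1 mod p")
    total = sum(p_derivation(a, p) for a in reps) - p_derivation(sum(reps), p)
    return total % p


def explicit_formula(reps, p):
    """The closed form (1 - sum of a_i^p)/p, reduced mod p."""
    return ((1 - sum(a ** p for a in reps)) // p) % p


def random_distribution(k, p, rng):
    """A random element of Pi_k, given by representatives in [0, p)."""
    head = [rng.randrange(p) for _ in range(k - 1)]
    return head + [(1 - sum(head)) % p]


def compose(gamma, pis):
    """Assumed composition: the flat list of products gamma_i * pi^i_j."""
    return [g * x for g, pi in zip(gamma, pis) for x in pi]


rng = random.Random(0)
for p in (2, 3, 5, 7):
    for _ in range(200):
        gamma = random_distribution(rng.randint(1, 4), p, rng)
        pis = [random_distribution(rng.randint(1, 4), p, rng) for _ in gamma]

        # Shifting representatives by multiples of p changes nothing.
        shifted = [a + p * rng.randint(-5, 5) for a in gamma]
        assert entropy_mod_p(shifted, p) == entropy_mod_p(gamma, p)
        assert explicit_formula(shifted, p) == entropy_mod_p(gamma, p)

        # Chain rule.
        lhs = entropy_mod_p(compose(gamma, pis), p)
        rhs = (
            entropy_mod_p(gamma, p)
            + sum(g * entropy_mod_p(pi, p) for g, pi in zip(gamma, pis))
        ) % p
        assert lhs == rhs
```

Random testing proves nothing, but it is a cheap guard against sign errors when transcribing the definition.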

Before I say any more, let’s have some examples.

In the real case, the uniform distribution $u_n = (1/n, \ldots, 1/n)$ has entropy $\log n$. Mod $p$, this distribution only makes sense if $p$ does not divide $n$ (otherwise $1/n$ is undefined); but assuming that, we do indeed have $H_p(u_n) = q_p(n)$, as we’d expect.

When we take our prime $p$ to be $2$, a probability distribution $\pi$ is just a sequence of bits like $(0, 0, 1, 0, 1, 1, 1, 0, 1)$ with an odd number of $1$s. Its entropy $H_2(\pi) \in \mathbb{Z}/2\mathbb{Z}$ turns out to be $0$ if the number of $1$s is congruent to $1$ mod $4$, and $1$ if the number of $1$s is congruent to $3$ mod $4$.

What about distributions on two elements? In other words, let $\alpha \in \mathbb{Z}/p\mathbb{Z}$ and put $\pi = (\alpha, 1 - \alpha)$. What is $H_p(\pi)$?

It takes a bit of algebra to figure this out, but it’s not too hard, and the outcome is that for $p \neq 2$, $$ H_p(\alpha, 1 - \alpha) = \sum_{r = 1}^{p - 1} \frac{\alpha^r}{r}. $$ This function was, in fact, the starting point of Kontsevich’s note, and it’s what he called the $1\tfrac{1}{2}$-logarithm.
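All three examples are easy to test by brute force. The Python sketch below uses the explicit formula $H_p(\pi) = \frac{1}{p}\bigl(1 - \sum a_i^p\bigr)$; the function names are mine, and the three-argument `pow(n, -1, p)` used for modular inverses needs Python 3.8 or later.

```python
from itertools import product


def fermat_quotient(n: int, p: int) -> int:
    """q_p(n) = (n^(p-1) - 1)/p reduced mod p, for n not divisible by the prime p."""
    return ((pow(n, p - 1, p * p) - 1) // p) % p


def entropy_mod_p(residues, p):
    """H_p via the explicit formula, using the representatives in [0, p)."""
    reps = [a % p for a in residues]
    if sum(reps) % p != 1:
        raise ValueError("entries must sum to 1 mod p")
    return ((1 - sum(a ** p for a in reps)) // p) % p


# Uniform distribution: H_p(u_n) = q_p(n) whenever p does not divide n.
for p in (3, 5, 7, 11):
    for n in range(1, 30):
        if n % p != 0:
            inverse = pow(n, -1, p)  # 1/n in Z/pZ
            assert entropy_mod_p([inverse] * n, p) == fermat_quotient(n, p)

# p = 2: the entropy is decided by the number of 1s mod 4.
for bits in product((0, 1), repeat=9):
    ones = sum(bits)
    if ones % 2 == 1:
        assert entropy_mod_p(list(bits), 2) == (0 if ones % 4 == 1 else 1)

# Two-element distributions for odd p: the truncated series sum of alpha^r / r.
for p in (3, 5, 7, 11, 13):
    for alpha in range(p):
        series = sum(pow(alpha, r, p) * pow(r, -1, p) for r in range(1, p)) % p
        assert entropy_mod_p([alpha, 1 - alpha], p) == series
```

For instance, with $p = 5$ and $\alpha = 2$ both sides come out as $4$.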

We’ve now succeeded in finding a definition of entropy mod $p$ that
satisfies the chain rule. That’s not quite enough, though. In principle,
there could be *loads* of things satisfying the chain rule, in which case,
what special status would ours have?

But in fact, up to the inevitable constant factor, our definition of
entropy mod $p$ is the *one and only* definition satisfying the chain rule:

**Theorem** Let $(I: \Pi_n \to \mathbb{Z}/p\mathbb{Z})_{n \geq 1}$ be a sequence of functions. Then $I$ satisfies the chain rule if and only if $I = c H_p$ for some $c \in \mathbb{Z}/p\mathbb{Z}$.

This is precisely analogous to the characterization theorem for real entropy, except that in the real case some analytic condition on $I$ has to be imposed (continuity in Faddeev’s theorem, and measurability in the stronger theorem of Lee). So, this is excellent justification for calling $H_p$ the entropy mod $p$.

I’ll say nothing about the proof except the following. In Faddeev’s
theorem over $\mathbb{R}$, the tricky part of the proof involves the fact
that the sequence $(\log n)_{n \geq 1}$ is *not* uniquely characterized up
to a constant factor by the equation $\log(m n) = \log m + \log n$; to make
that work, you have to introduce some analytic condition. Over
$\mathbb{Z}/p\mathbb{Z}$, the tricky part involves the fact that the domain
of the “logarithm” (Fermat quotient) is not $\mathbb{Z}/p\mathbb{Z}$, but
$\mathbb{Z}/p^2\mathbb{Z}$. So, analytic difficulties are replaced by
number-theoretic difficulties.

Kontsevich didn’t actually write down a definition of entropy mod $p$ in his two-and-a-half page note. He did exactly enough to show that there must be a unique sensible such definition… and left it there! Of course he could have worked it out if he’d wanted to, and maybe he even did, but he didn’t write it up here.

Anyway, let’s return to the quotation from Kontsevich that I began my first post with:

Conclusion: If we have a random variable $\xi$ which takes finitely many values with all probabilities in $\mathbb{Q}$ then we can define not only the transcendental number $H(\xi)$ but also its “residues modulo $p$” for almost all primes $p$ !

In the notation of these posts, he’s saying the following. Let

$$ \pi = (\pi_1, \ldots, \pi_n) $$

be a real probability distribution in which each $\pi_i$ is rational.
There are only finitely many primes that divide one or more of the
denominators of $\pi_1, \ldots, \pi_n$. For primes $p$ *not* belonging to
this exceptional set, we can interpret $\pi$ as a probability distribution
in $\mathbb{Z}/p\mathbb{Z}$. We can therefore take its mod $p$ entropy,
$H_p(\pi)$.

Kontsevich is playfully suggesting that we view $H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}$ as the residue class mod $p$ of $H_{\mathbb{R}}(\pi) \in \mathbb{R}$.

There is more to this than meets the eye! Different real probability distributions can have the same real entropy, so there’s a question of consistency. Kontsevich’s suggestion only makes sense if

$$ H_{\mathbb{R}}(\pi) = H_{\mathbb{R}}(\pi') \implies H_p(\pi) = H_p(\pi'). $$

And this is true! I have a proof, though I’m not convinced it’s optimal. Does anyone see an easy argument for this?
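Here is one concrete instance of the consistency statement, as a Python sketch. The example is my own choice: the uniform distribution on $8$ points and the product of $(1/2, 1/4, 1/4)$ with itself are different distributions, yet both have real entropy $3 \log 2$. The helper names are hypothetical, and a handful of primes is merely an illustration of the claim.

```python
from fractions import Fraction
from math import isclose, log


def real_entropy(dist):
    """Shannon entropy -sum of x log x over the nonzero probabilities."""
    return -sum(float(x) * log(float(x)) for x in dist if x != 0)


def reduce_mod_p(dist, p):
    """Interpret rationals with denominators prime to p as elements of Z/pZ."""
    return [x.numerator * pow(x.denominator, -1, p) % p for x in dist]


def entropy_mod_p(residues, p):
    """H_p via the explicit formula (1 - sum of a_i^p)/p, reduced mod p."""
    reps = [a % p for a in residues]
    if sum(reps) % p != 1:
        raise ValueError("entries must sum to 1 mod p")
    return ((1 - sum(a ** p for a in reps)) // p) % p


sigma = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
product_dist = [a * b for a in sigma for b in sigma]  # 9 outcomes
uniform_dist = [Fraction(1, 8)] * 8                   # 8 outcomes

# Same real entropy, namely 3 log 2 ...
assert isclose(real_entropy(product_dist), real_entropy(uniform_dist))
assert isclose(real_entropy(uniform_dist), 3 * log(2))

# ... and the same entropy mod p for each odd prime tried.
for p in (3, 5, 7, 11, 13):
    assert entropy_mod_p(reduce_mod_p(product_dist, p), p) == entropy_mod_p(
        reduce_mod_p(uniform_dist, p), p
    )
```

In this particular case the agreement also follows from the chain rule: both entropies mod $p$ equal $3 q_p(2) = q_p(8)$. The general statement needs more than that.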

Let’s write $\mathcal{H}^{(p)}$ for the set of real numbers of the form $H_{\mathbb{R}}(\pi)$, where $\pi$ is a real probability distribution whose probabilities $\pi_i$ can all be expressed as fractions with denominator not divisible by $p$. We’ve just seen that there’s a well-defined map

$$ [.] : \mathcal{H}^{(p)} \to \mathbb{Z}/p\mathbb{Z} $$

defined by

$$ [H_{\mathbb{R}}(\pi)] = H_p(\pi). $$

For $x \in \mathcal{H}^{(p)} \subseteq \mathbb{R}$, we view $[x]$ as the congruence class mod $p$ of $x$. This notion of “congruence class” even behaves something like the ordinary notion, in the sense that $[.]$ preserves addition.

(We can even go a bit further. Accompanying the characterization theorem
for entropy mod $p$, there is a characterization theorem for information
loss mod $p$, strictly analogous to the theorem that John Baez, Tobias
Fritz and I proved over $\mathbb{R}$. I won’t review that stuff here,
but the point is that an information loss is a *difference* of entropies,
and this enables us to define the congruence class mod $p$ of the
*difference* of two elements of $\mathcal{H}^{(p)}$. The same additivity holds.)

There’s just one more thing. In a way, the definition of entropy mod $p$ is unsatisfactory. In order to define it, we had to step outside the world of $\mathbb{Z}/p\mathbb{Z}$ by making arbitrary choices of representing integers, and then we had to show that the definition was independent of those choices. Can’t we do it directly?

In fact, we can. It’s a well-known miracle about finite fields $K$ that
*any* function $K \to K$ is a polynomial. It’s a slightly less well-known
miracle that any function $K^n \to K$, for any $n \geq 0$, is also a
polynomial.

Of course, multiple polynomials can induce the same function. For
instance, the polynomials $x^p$ and $x$ induce the same function
$\mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$. But it’s still
possible to make a uniqueness statement. Given a function $F : K^n \to K$,
there’s a *unique* polynomial $f \in K[x_1, \ldots, x_n]$ that induces $F$
and is of degree less than the order of $K$ in each variable separately.

So, there must be a polynomial representing entropy, of degree less than $p$ in each variable. And as it turns out, it’s this one:

$$<semantics>{H}_{p}({\pi}_{1},\dots ,{\pi}_{n})=-\sum _{\begin{array}{c}0\le {r}_{1},\dots ,{r}_{n}<p:\\ {r}_{1}+\cdots +{r}_{n}=p\end{array}}\frac{{\pi}_{1}^{{r}_{1}}\cdots {\pi}_{n}^{{r}_{n}}}{{r}_{1}!\cdots {r}_{n}!}.<annotation\; encoding="application/x-tex">\; H\_p(\backslash pi\_1,\; \backslash ldots,\; \backslash pi\_n)\; =\; -\; \backslash sum\_\{\backslash substack\{0\; \backslash leq\; r\_1,\; \backslash ldots,\; r\_n\; \backslash lt\; p:\backslash \backslash r\_1\; +\; \backslash cdots\; +\; r\_n\; =\; p\}\}\; \backslash frac\{\backslash pi\_1^\{r\_1\}\; \backslash cdots\; \backslash pi\_n^\{r\_n\}\}\{r\_1!\; \backslash cdots\; r\_n!\}.\; </annotation></semantics>$$

You can check that when $n = 2$, this is in fact the same polynomial $\sum_{r = 1}^{p - 1} \pi_1^r/r$ as we met before: Kontsevich’s $1\tfrac{1}{2}$-logarithm.
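The check suggested above can be sketched numerically. This is a minimal sketch, assuming we evaluate both formulas on every pair $(\alpha, 1 - \alpha)$ in $\mathbb{Z}/p\mathbb{Z}$ for a few small odd primes; the function names are hypothetical, chosen only for this illustration.

```python
from itertools import product
from math import factorial


def inv(a, p):
    """Multiplicative inverse of a modulo the prime p (needs Python 3.8+)."""
    return pow(a, -1, p)


def entropy_mod_p(pi, p):
    """Evaluate the polynomial formula for H_p(pi_1, ..., pi_n) in Z/pZ."""
    total = 0
    for rs in product(range(p), repeat=len(pi)):  # every 0 <= r_i < p
        if sum(rs) != p:
            continue
        term = 1
        for x, r in zip(pi, rs):
            term = term * pow(x, r, p) * inv(factorial(r) % p, p) % p
        total += term
    return -total % p


def truncated_log(a, p):
    """Evaluate sum_{r=1}^{p-1} a^r / r in Z/pZ."""
    return sum(pow(a, r, p) * inv(r, p) for r in range(1, p)) % p


def formulas_agree(p):
    """Compare the two formulas on every distribution (a, 1 - a) mod p."""
    return all(entropy_mod_p((a, (1 - a) % p), p) == truncated_log(a, p)
               for a in range(p))
```

Calling `formulas_agree(p)` for a small odd prime `p` compares the two expressions as functions on $\mathbb{Z}/p\mathbb{Z}$, which is a weaker but easier test than comparing coefficients.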

It’s striking that this direct formula for entropy modulo a prime looks quite unlike the formula for real entropy,

$$H_\mathbb{R}(\pi) = - \sum_{i : \pi_i \neq 0} \pi_i \log \pi_i.$$

It’s also striking that in the case $n = 2$, the formula for real entropy is

$$H_\mathbb{R}(\alpha, 1 - \alpha) = - \alpha \log \alpha - (1 - \alpha) \log(1 - \alpha),$$

whereas mod $p$, we get

$$H_p(\alpha, 1 - \alpha) = \sum_{r = 1}^{p - 1} \frac{\alpha^r}{r},$$

which is a truncation of the Taylor series of $-\log(1 - \alpha)$. And yet, the characterization theorems for entropy over $\mathbb{R}$ and over $\mathbb{Z}/p\mathbb{Z}$ are strictly analogous.

As I see it, there are two or three big open questions:

Entropy over $\mathbb{R}$ can be understood, interpreted and applied in many ways. How can we understand, interpret or apply entropy mod $p$?

Entropy over $\mathbb{R}$ and entropy mod $p$ are defined in roughly analogous ways, and uniquely characterized by strictly analogous theorems. Is there a common generalization? That is, can we unify the two definitions and characterization theorems, perhaps proving a theorem about entropy over suitable fields?

by leinster (Tom.Leinster@gmx.com) at January 01, 2018 01:12 PM

## December 31, 2017

### John Baez - Azimuth

This is an expanded version of my G+ post, which was a watered-down version of Greg Egan’s G+ post and the comments on that. I’ll start out slow, and pick up speed as I go.

### Quantum mechanics meets the dodecahedron

In quantum mechanics, the position of a particle is not a definite thing: it’s described by a ‘wavefunction’. This says how probable it is to find the particle at any location… but it also contains other information, like how probable it is to find the particle moving at any *velocity*.

Take a hydrogen atom, and look at the wavefunction of the electron.

**Question 1.** Can we make the electron’s wavefunction have all the rotational symmetries of a dodecahedron—that wonderful Platonic solid with 12 pentagonal faces?

Yes! In fact it’s too easy: you can make the wavefunction look like whatever you want.

So let’s make the question harder. Like everything else in quantum mechanics, angular momentum can be uncertain. In fact you can never make all 3 components of angular momentum take definite values simultaneously! However, there are lots of wavefunctions where the *magnitude* of an electron’s angular momentum is completely definite.

This leads naturally to the next question, which was first posed by Gerard Westendorp:

**Question 2.** Can an electron’s wavefunction have a definite magnitude for its angular momentum while having all the rotational symmetries of a dodecahedron?

Yes! And there are *infinitely many ways for this to happen!* This is true even if we neglect the radial dependence of the wavefunction—that is, how it depends on the distance from the proton. Henceforth I’ll always do that, which lets us treat the wavefunction as a function on a sphere. And by the way, I’m also ignoring the electron’s spin! So, whenever I say ‘angular momentum’ I mean *orbital* angular momentum: the part that depends only on the electron’s position and velocity.

Question 2 has a trivial solution that’s too silly to bother with. It’s the spherically symmetric wavefunction! That’s invariant under *all* rotations. The real challenge is to figure out the simplest nontrivial solution. Egan figured it out, and here’s what it looks like:

The rotation here is just an artistic touch. Really the solution should be just sitting there, or perhaps changing colors while staying the same shape.

In what sense is this the simplest nontrivial solution? Well, the magnitude of the angular momentum is equal to

$$\hbar \sqrt{\ell(\ell+1)}$$

where the number $\ell$ is *quantized*: it can only take values 0, 1, 2, 3,… and so on.

The trivial solution to Question 2 has $\ell = 0$. The first nontrivial solution has $\ell = 6$. Why 6? That’s where things get interesting. We can get it using the 6 lines connecting opposite faces of the dodecahedron!

I’ll explain later how this works. For now, let’s move straight on to a harder question:

**Question 3.** What’s the smallest choice of $\ell$ where we can find *two linearly independent* wavefunctions that both have the same $\ell$ and both have all the rotational symmetries of a dodecahedron?

It turns out to be $\ell = 30$. And Egan created an image of a wavefunction oscillating between these two possibilities!

But we can go a lot further:

**Question 4.** For each $\ell$, how many linearly independent functions on the sphere have that value of $\ell$ and all the rotational symmetries of a dodecahedron?

For $\ell$ ranging from 0 to 29 there are either none or one. There are none for these numbers:

1, 2, 3, 4, 5, 7, 8, 9, 11, 13, 14, 17, 19, 23, 29

and one for these numbers:

0, 6, 10, 12, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28

The pattern continues as follows. For $\ell$ ranging from 30 to 59 there are either one or two. There is one for these numbers:

31, 32, 33, 34, 35, 37, 38, 39, 41, 43, 44, 47, 49, 53, 59

and two for these numbers:

30, 36, 40, 42, 45, 46, 48, 50, 51, 52, 54, 55, 56, 57, 58

The numbers in these two lists are just 30 more than the numbers in the first two lists! And it continues on like this forever: there’s always one more linearly independent solution for $\ell + 30$ than there is for $\ell$.

**Question 5.** What’s special about these numbers from 0 to 29?

0, 6, 10, 12, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28

You don’t need to know tons of math to figure this out—but I guess it’s a sort of weird pattern-recognition puzzle unless you know which patterns are likely to be important here. So I’ll give away the answer.

Here’s the answer: these are the numbers below 30 that can be written as sums of the numbers 6, 10 and 15.
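This claim is easy to test by brute force. The following is a small sketch (the function name `sums_below` is made up for this illustration) that collects every number below a limit that can be built by repeatedly adding 6, 10 or 15, starting from 0.

```python
def sums_below(limit, parts=(6, 10, 15)):
    """Numbers below `limit` that are sums of the given parts, repeats allowed."""
    reachable = {0}  # the empty sum
    for n in range(1, limit):
        # n is a sum of parts exactly when removing one part leaves such a sum
        if any(n - part in reachable for part in parts):
            reachable.add(n)
    return sorted(reachable)


print(sums_below(30))
```

Comparing the printed list with the list of values that have one solution gives a quick sanity check on the pattern.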

But the real question is *why?* Also: what’s so special about the number 30?

The short, cryptic answer is this. The dodecahedron has 6 axes connecting the centers of opposite faces, 10 axes connecting opposite vertices, and 15 axes connecting the centers of opposite edges. The least common multiple of these numbers is 30.

But this requires more explanation!

For this, we need more math. You may want to get off here. But first, let me show you the solutions for $\ell = 6$, $\ell = 10$ and $\ell = 15$, as drawn by Greg Egan. I’ve already shown you $\ell = 6$, which we could call the **quantum dodecahedron**:

Here is $\ell = 10$, which looks like a **quantum icosahedron**:

And here is $\ell = 15$:

Maybe this deserves to be called a **quantum Coxeter complex**, since the Coxeter complex for the group of rotations and reflections of the dodecahedron looks like this:

### Functions with icosahedral symmetry

The dodecahedron and icosahedron have the same symmetries, but for some reason people talk about the icosahedron when discussing symmetry groups, so let me do that.

So far we’ve been looking at the *rotational* symmetries of the icosahedron. These form a group called $\mathrm{A}_5$, or $\mathrm{I}$ for short, with 60 elements. We’ve been looking for certain functions on the sphere that are invariant under the action of this group. To get them all, we’ll first get ahold of all polynomials on $\mathbb{R}^3$ that are invariant under the action of this group. Then we’ll restrict these to the sphere.

To save time, we’ll use the work of Claude Chevalley. He looked at *rotation and reflection* symmetries of the icosahedron. These form the group $\mathrm{A}_5 \times \mathbb{Z}/2$, also known as $\mathrm{H}_3$, but let’s call it $\hat{\mathrm{I}}$ for short. It has 120 elements, but never confuse it with two other groups with 120 elements: the symmetric group on 5 letters, and the binary icosahedral group.

Chevalley found all polynomials on $\mathbb{R}^3$ that are invariant under the action of this bigger group $\hat{\mathrm{I}}$. These invariant polynomials form an algebra, and Chevalley showed that this algebra is freely generated by 3 homogeneous polynomials:

• $P(x,y,z) = x^2 + y^2 + z^2$ of degree 2.

• $Q(x,y,z)$ of degree 6. To get this we take the dot product of $(x,y,z)$ with each of the 6 vectors joining antipodal vertices of the icosahedron, and multiply them together.

• $R(x,y,z)$ of degree 10. To get this we take the dot product of $(x,y,z)$ with each of the 10 vectors joining antipodal face centers of the icosahedron, and multiply them together.

So, linear combinations of products of these give all polynomials on $\mathbb{R}^3$ invariant under all rotation and reflection symmetries of the icosahedron.

But we want the polynomials that are invariant under just *rotational* symmetries of the icosahedron! To get all these, we need an extra generator:

• $S(x,y,z)$ of degree 15. To get this we take the dot product of $(x,y,z)$ with each of the 15 vectors joining antipodal edge centers of the icosahedron, and multiply them together.

You can check that this is invariant under rotational symmetries of the icosahedron. But unlike our other polynomials, this one is not invariant under reflection symmetries! Because 15 is an odd number, $S$ switches sign under ‘total inversion’, that is, replacing $(x,y,z)$ with $-(x,y,z)$. This is a product of three reflection symmetries of the icosahedron.
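Here is one way to run that check numerically. This is only a sketch under assumptions of my own: the icosahedron is placed with vertices at the cyclic permutations of $(0, \pm 1, \pm\varphi)$, the axis vectors are left unnormalized, and the names `Q`, `R`, `S` for the degree 6, 10 and 15 products are labels used for this illustration. With this placement, cyclically permuting the coordinates is one rotational symmetry of the icosahedron.

```python
from itertools import combinations
from math import isclose, prod, sqrt

PHI = (1 + sqrt(5)) / 2

# The 12 vertices: cyclic permutations of (0, ±1, ±PHI).
VERTICES = [v
            for a in (1, -1) for b in (PHI, -PHI)
            for v in ((0, a, b), (a, b, 0), (b, 0, a))]


def dist2(p, q):
    """Squared distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def centre(points):
    """Average of a collection of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def axes(points):
    """Keep one point from each antipodal pair (first nonzero coordinate > 0)."""
    return [p for p in points if tuple(round(c, 9) for c in p) > (0, 0, 0)]


# Edges join vertices at distance 2; faces are triples of mutually adjacent vertices.
EDGES = [e for e in combinations(VERTICES, 2) if isclose(dist2(*e), 4)]
FACES = [f for f in combinations(VERTICES, 3)
         if all(isclose(dist2(p, q), 4) for p, q in combinations(f, 2))]

VERTEX_AXES = axes(VERTICES)                      # 6 axes
FACE_AXES = axes([centre(f) for f in FACES])      # 10 axes
EDGE_AXES = axes([centre(e) for e in EDGES])      # 15 axes


def invariant(axis_list):
    """The polynomial v -> product of the dot products of v with the axes."""
    return lambda v: prod(sum(x * a for x, a in zip(v, axis)) for axis in axis_list)


Q, R, S = invariant(VERTEX_AXES), invariant(FACE_AXES), invariant(EDGE_AXES)
```

Evaluating `S` at a point `v`, at its cyclic rotation `(v[1], v[2], v[0])` and at `-v` shows the behaviour described above: the rotation leaves the value alone, while total inversion flips its sign. The even-degree products `Q` and `R` are unchanged by both.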

Thanks to Egan’s extensive computations, I’m completely convinced that $P, Q, R$ and $S$ generate the algebra of all $\mathrm{I}$-invariant polynomials on $\mathbb{R}^3$. I’ll take this as a fact, even though I don’t have a clean, human-readable proof. But someone must have proved it already. Do you know where?

Since we now have 4 polynomials on $\mathbb{R}^3$, they must obey a relation. Egan figured it out. The exact coefficients depend on some normalization factors used in defining $Q, R$ and $S$. Luckily the details don’t matter much. All we’ll really need is that this relation expresses $S^2$ in terms of the other generators. And this fact is easy to see without any difficult calculations!

How? Well, we’ve seen $S$ is unchanged by rotations, while it changes sign under total inversion. So, the most any rotation or reflection symmetry of the icosahedron can do to $S$ is change its sign. This means that $S^2$ is invariant under all these symmetries. So, by Chevalley’s result, it must be a polynomial in $P, Q$ and $R$.

So, we now have a nice description of the $\mathrm{I}$-invariant polynomials on $\mathbb{R}^3$, in terms of generators and relations. Each of these gives an $\mathrm{I}$-invariant function on the sphere. And Leo Stein, a postdoc at Caltech who has a great blog on math and physics, has kindly created some images of these.

The polynomial $P$ is spherically symmetric so it’s too boring to draw. The polynomial $Q$, of degree 6, looks like this when restricted to the sphere:

Since it was made by multiplying linear functions, one for each axis connecting opposite vertices of an icosahedron, it shouldn’t be surprising that we see blue blobs centered at these vertices.

The polynomial $R$, of degree 10, looks like this:

Here the blue blobs are centered on the icosahedron’s 20 faces.

Finally, here’s $S$, of degree 15:

This time the blue blobs are centered on the icosahedron’s 30 edges.

Now let’s think a bit about functions on the sphere that arise from polynomials on $\mathbb{R}^3$. Let’s call them **algebraic functions** on the sphere. They form an algebra, and it’s just the algebra of polynomials on $\mathbb{R}^3$ modulo the relation $P = 1$, where $P = x^2 + y^2 + z^2$, since the sphere is the set $\{P = 1\}$.

It makes no sense to talk about the ‘degree’ of an algebraic function on the sphere, since the relation $P = 1$ equates polynomials of different degree. What makes sense is the number $\ell$ that I was talking about earlier!

The group $\mathrm{SO}(3)$ acts by rotation on the space of algebraic functions on the sphere, and we can break this space up into irreducible representations of $\mathrm{SO}(3)$. It’s a direct sum of irreps, one of each ‘spin’ $\ell = 0, 1, 2, \dots$

So, we can’t talk about the degree of a function on the sphere, but we can talk about its $\ell$ value. On the other hand, it’s very convenient to work with homogeneous polynomials on $\mathbb{R}^3$, which have a definite degree, and these *restrict to* functions on the sphere. How can we relate the degree and the quantity $\ell$?

Here’s one way. The polynomials on $\mathbb{R}^3$ form a graded algebra. That means it’s a direct sum of vector spaces consisting of homogeneous polynomials of fixed degree, and if we multiply two homogeneous polynomials their degrees add. But the algebra of polynomials restricted to the sphere is merely a filtered algebra.

What does this mean? Let $A$ be the algebra of all algebraic functions on the sphere, and let $A_\ell \subset A$ consist of those that are restrictions of polynomials of degree $\le \ell$. Then:

1) $A_\ell \subseteq A_{\ell+1}$

and

2) $\displaystyle A = \bigcup_{\ell = 0}^\infty A_\ell$

and

3) if we multiply a function in $A_\ell$ by one in $A_m$, we get one in $A_{\ell + m}$.

That’s what a filtered algebra amounts to.

But starting from a filtered algebra, we can get a graded algebra! It’s called the associated graded algebra.

To do this, we form

$$G_\ell = A_\ell / A_{\ell - 1}$$

and let

$$G = \bigoplus_{\ell = 0}^\infty G_\ell.$$

Then $G$ has a product where multiplying a guy in $G_\ell$ and one in $G_m$ gives one in $G_{\ell + m}$. So, it’s indeed a graded algebra! For the details, see Wikipedia, which manages to make it look harder than it is. The basic idea is that we multiply in $A$ and then ‘ignore terms of lower degree’. That’s what $A_\ell / A_{\ell - 1}$ is all about.

Now I want to use two nice facts. First, $G_\ell$ is the spin-$\ell$ representation of $\mathrm{SO}(3)$. Second, there’s a natural map from any filtered algebra to its associated graded algebra, which is an isomorphism of vector spaces (though not of algebras). So, we get a natural isomorphism of vector spaces

$$A \cong G = \bigoplus_{\ell = 0}^\infty G_\ell$$

from the algebraic functions on the sphere to the direct sum of all the spin-$\ell$ representations!

Now to the point: because this isomorphism is *natural*, it commutes with symmetries, so we can also use it to study algebraic functions on the sphere that are invariant under a group of linear transformations of $\mathbb{R}^3$.

Before tackling the group we’re really interested in, let’s try the group of rotation *and reflection* symmetries of the icosahedron, $\hat{\mathrm{I}}$. As I mentioned, Chevalley worked out the algebra of polynomials on $\mathbb{R}^3$ that are invariant under this bigger group. It’s a graded commutative algebra, and it’s free on three generators: $P$ of degree 2, $Q$ of degree 6, and $R$ of degree 10.

Starting from here, to get the algebra of $\hat{\mathrm{I}}$-invariant algebraic functions on the sphere, we mod out by the relation $P = 1$. This gives a filtered algebra which I’ll call $A^{\hat{\mathrm{I}}}$. (It’s common to use a superscript with the name of a group to indicate that we’re talking about the stuff that’s invariant under some action of that group.) From this we can form the associated graded algebra

$$G^{\hat{\mathrm{I}}} = \bigoplus_{\ell = 0}^\infty G_\ell^{\hat{\mathrm{I}}}$$

where

$$G_\ell^{\hat{\mathrm{I}}} = A_\ell^{\hat{\mathrm{I}}} / A_{\ell - 1}^{\hat{\mathrm{I}}}.$$

If you’ve understood everything I’ve been trying to explain, you’ll see that $G_\ell^{\hat{\mathrm{I}}}$ is the space of all functions on the sphere that transform in the spin-$\ell$ representation and are invariant under the rotation and reflection symmetries of the icosahedron.

But now for the fun part: what is this space like? By the work of Chevalley, the algebra $A^{\hat{\mathrm{I}}}$ is spanned by products

$$P^p Q^q R^r$$

but since we have the relation $P = 1$, and no other relations, it has a basis given by products

$$Q^q R^r.$$

So, the space $A_\ell^{\hat{\mathrm{I}}}$ has a basis of products like this whose degree is $\le \ell$, meaning

$$6q + 10r \le \ell.$$

Thus, the space we’re really interested in:

$$G_\ell^{\hat{\mathrm{I}}} = A_\ell^{\hat{\mathrm{I}}} / A_{\ell - 1}^{\hat{\mathrm{I}}}$$

has a basis consisting of equivalence classes

$$[Q^q R^r]$$

where

$$6q + 10r = \ell.$$

So, we get:

**Theorem 1.** The dimension of the space of functions on the sphere that lie in the spin-$\ell$ representation of $\mathrm{SO}(3)$ and are invariant under the rotation and reflection symmetries of the icosahedron equals the number of ways of writing $\ell$ as an unordered sum of 6’s and 10’s.

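Let’s see how this goes, writing $Q$ and $R$ for the generators of degree 6 and 10:

$\ell = 0$: dimension 1, with basis $[1]$

$\ell = 1$: dimension 0

$\ell = 2$: dimension 0

$\ell = 3$: dimension 0

$\ell = 4$: dimension 0

$\ell = 5$: dimension 0

$\ell = 6$: dimension 1, with basis $[Q]$

$\ell = 7$: dimension 0

$\ell = 8$: dimension 0

$\ell = 9$: dimension 0

$\ell = 10$: dimension 1, with basis $[R]$

$\ell = 11$: dimension 0

$\ell = 12$: dimension 1, with basis $[Q^2]$

$\ell = 13$: dimension 0

$\ell = 14$: dimension 0

$\ell = 15$: dimension 0

$\ell = 16$: dimension 1, with basis $[QR]$

$\ell = 17$: dimension 0

$\ell = 18$: dimension 1, with basis $[Q^3]$

$\ell = 19$: dimension 0

$\ell = 20$: dimension 1, with basis $[R^2]$

$\ell = 21$: dimension 0

$\ell = 22$: dimension 1, with basis $[Q^2 R]$

$\ell = 23$: dimension 0

$\ell = 24$: dimension 1, with basis $[Q^4]$

$\ell = 25$: dimension 0

$\ell = 26$: dimension 1, with basis $[Q R^2]$

$\ell = 27$: dimension 0

$\ell = 28$: dimension 1, with basis $[Q^3 R]$

$\ell = 29$: dimension 0

$\ell = 30$: dimension 2, with basis $[Q^5], [R^3]$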

So, the story starts out boring, with long gaps. The odd numbers are completely uninvolved. But it heats up near the end, and reaches a thrilling climax at $\ell = 30$. At this point we get *two* linearly independent solutions, because 30 is the least common multiple of 6 and 10, the degrees of $Q$ and $R$.

It’s easy to see that from here on the story ‘repeats’ with period 30, with the dimension for each even $\ell$ growing by 1 each time:

$$\dim(G_{\ell + 30}^{\hat{\mathrm{I}}}) = \dim(G_\ell^{\hat{\mathrm{I}}}) + 1 \qquad (\ell \text{ even}).$$
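The counting in Theorem 1 is simple enough to automate. A minimal sketch follows; the function name is hypothetical. It counts pairs $(q, r)$ of non-negative integers with $6q + 10r = \ell$.

```python
def dim_full_icosahedral(ell):
    """Number of ways to write ell as an unordered sum of 6's and 10's."""
    # For each number q of 6's, the remainder must be a whole number of 10's.
    return sum(1 for q in range(ell // 6 + 1) if (ell - 6 * q) % 10 == 0)


for ell in range(31):
    d = dim_full_icosahedral(ell)
    if d:
        print(ell, d)
```

Running the same function on $\ell$ and $\ell + 30$ for even $\ell$ shows the dimension growing by 1 with each period. For odd $\ell$ it stays 0, since sums of 6’s and 10’s are always even.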

Now, finally, we are ready to tackle Question 4 from the first part of this post: for each $\ell$, how many linearly independent functions on the sphere have that value of $\ell$ and all the rotational symmetries of a dodecahedron?

We just need to repeat our analysis with $\mathrm{I}$, the group of rotational symmetries of the dodecahedron, replacing the bigger group $\hat{\mathrm{I}}$.

We start with the algebra of polynomials on $\mathbb{R}^3$ that are invariant under $\mathrm{I}$. As we’ve seen, this is a graded commutative algebra with *four* generators: $P, Q, R$ as before, but also $S$ of degree 15. To make up for this extra generator there’s an extra relation, which expresses $S^2$ in terms of the other generators.

Starting from here, to get the algebra of $\mathrm{I}$-invariant algebraic functions on the sphere, we mod out by the relation $P = 1$. This gives a filtered algebra I’ll call $A^{\mathrm{I}}$. Then we form the associated graded algebra

$$G^{\mathrm{I}} = \bigoplus_{\ell = 0}^\infty G_\ell^{\mathrm{I}}$$

where

$$G_\ell^{\mathrm{I}} = A_\ell^{\mathrm{I}} / A_{\ell - 1}^{\mathrm{I}}.$$

What we really want to know is the dimension of $G_\ell^{\mathrm{I}}$, since this is the space of functions on the sphere that transform in the spin-$\ell$ representation and are invariant under the rotational symmetries of the icosahedron.

So, what’s this space like? The algebra $A^{\mathrm{I}}$ is *spanned* by products

$$P^p Q^q R^r S^s$$

but since we have the relation $P = 1$, and a relation expressing $S^2$ in terms of other generators, it has a *basis* given by products

$$Q^q R^r S^s$$

where

$$s = 0, 1.$$

So, the space $A_\ell^{\mathrm{I}}$ has a basis of products like this whose degree is $\le \ell$, meaning

$$6q + 10r + 15s \le \ell$$

and

$$s = 0, 1.$$

Thus, the space we’re really interested in:

$$G_\ell^{\mathrm{I}} = A_\ell^{\mathrm{I}} / A_{\ell - 1}^{\mathrm{I}}$$

has a basis consisting of equivalence classes

$$[Q^q R^r S^s]$$

where

$$6q + 10r + 15s = \ell$$

and

$$s = 0, 1.$$

So, we get:

**Theorem 2.** The dimension of the space of functions on the sphere that lie in the spin-$\ell$ representation of $\mathrm{SO}(3)$ and are invariant under the rotational symmetries of the icosahedron equals the number of ways of writing $\ell$ as an unordered sum of 6’s, 10’s and at most one 15.

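Let’s work out these dimensions explicitly, and see how the extra generator $S$ changes the story! Since it has degree 15, it contributes some solutions for odd values of $\ell$. But when we reach the magic number 30, this extra generator loses its power: $S^2$ has degree 30, but it’s a linear combination of other things. As before, $Q$ and $R$ are the generators of degree 6 and 10.

$\ell = 0$: dimension 1, with basis $[1]$

$\ell = 1$: dimension 0

$\ell = 2$: dimension 0

$\ell = 3$: dimension 0

$\ell = 4$: dimension 0

$\ell = 5$: dimension 0

$\ell = 6$: dimension 1, with basis $[Q]$

$\ell = 7$: dimension 0

$\ell = 8$: dimension 0

$\ell = 9$: dimension 0

$\ell = 10$: dimension 1, with basis $[R]$

$\ell = 11$: dimension 0

$\ell = 12$: dimension 1, with basis $[Q^2]$

$\ell = 13$: dimension 0

$\ell = 14$: dimension 0

$\ell = 15$: dimension 1, with basis $[S]$

$\ell = 16$: dimension 1, with basis $[QR]$

$\ell = 17$: dimension 0

$\ell = 18$: dimension 1, with basis $[Q^3]$

$\ell = 19$: dimension 0

$\ell = 20$: dimension 1, with basis $[R^2]$

$\ell = 21$: dimension 1, with basis $[QS]$

$\ell = 22$: dimension 1, with basis $[Q^2 R]$

$\ell = 23$: dimension 0

$\ell = 24$: dimension 1, with basis $[Q^4]$

$\ell = 25$: dimension 1, with basis $[RS]$

$\ell = 26$: dimension 1, with basis $[Q R^2]$

$\ell = 27$: dimension 1, with basis $[Q^2 S]$

$\ell = 28$: dimension 1, with basis $[Q^3 R]$

$\ell = 29$: dimension 0

$\ell = 30$: dimension 2, with basis $[Q^5], [R^3]$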

From here on the story ‘repeats’ with period 30, with the dimension growing by 1 each time:

$$\dim(G_{\ell + 30}^{\mathrm{I}}) = \dim(G_\ell^{\mathrm{I}}) + 1.$$
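One way to check this periodicity is through a generating function. Counting sums of 6’s, 10’s and at most one 15 is the same as reading off the coefficients of $(1 + t^{15}) / ((1 - t^6)(1 - t^{10}))$. The sketch below expands that series by hand; the function name is an illustrative choice, not something from Egan’s computations.

```python
def series_coefficients(n_terms):
    """Coefficients of (1 + t^15) / ((1 - t^6)(1 - t^10)) up to t^(n_terms - 1)."""
    coeffs = [0] * n_terms
    coeffs[0] = 1
    if n_terms > 15:
        coeffs[15] = 1  # numerator 1 + t^15: the degree-15 generator used at most once
    for degree in (6, 10):
        # Dividing by (1 - t^degree) allows any number of parts of this size.
        for n in range(degree, n_terms):
            coeffs[n] += coeffs[n - degree]
    return coeffs


dims = series_coefficients(60)
print([ell for ell in range(30) if dims[ell] == 1])
print([ell for ell in range(30, 60) if dims[ell] == 2])
```

The two printed lists can be compared directly with the lists given in the first part of the post.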

So, we’ve more or less proved everything that I claimed in the first part. So we’re done!

### Postscript

But I can’t resist saying a bit more.

First, there’s a very different and somewhat easier way to compute the dimensions in Theorems 1 and 2. It uses the theory of characters, and Egan explained it in a comment on the blog post on which this is based.

Second, if you look in these comments, you’ll also see a lot of material about harmonic polynomials on $\mathbb{R}^3$, that is, those obeying the Laplace equation. These polynomials are very nice when you’re trying to decompose the space of functions on the sphere into irreps of $\mathrm{SO}(3)$. The reason is that the *harmonic* homogeneous polynomials of degree $\ell$, when restricted to the sphere, give you exactly the spin-$\ell$ representation!

If you take *all* homogeneous polynomials of degree $\ell$ and restrict them to the sphere you get a lot of ‘redundant junk’. You get the spin-$\ell$ rep, plus the spin-$(\ell - 2)$ rep, plus the spin-$(\ell - 4)$ rep, and so on. The reason is the polynomial

$$P = x^2 + y^2 + z^2$$

and its powers: if you have a polynomial living in the spin-$\ell$ rep and you multiply it by $P$, you get another one living in the spin-$\ell$ rep, but you’ve boosted the degree by 2.
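A quick dimension count supports this decomposition. The sketch below uses two standard facts: homogeneous polynomials of degree $d$ in three variables form a space of dimension $(d+1)(d+2)/2$, and the spin-$\ell$ representation has dimension $2\ell + 1$. The function names are only illustrative.

```python
def dim_homogeneous(degree):
    """Dimension of the space of homogeneous polynomials of this degree in x, y, z."""
    return (degree + 1) * (degree + 2) // 2


def dim_spin(ell):
    """Dimension of the spin-ell representation."""
    return 2 * ell + 1


def spins_in_degree(degree):
    """Spins appearing when degree-d polynomials are restricted to the sphere."""
    return list(range(degree, -1, -2))  # d, d - 2, d - 4, ...


for d in range(6):
    print(d, dim_homogeneous(d), [dim_spin(l) for l in spins_in_degree(d)])
```

For each degree the dimensions of the listed spin representations add up to the dimension of the space of homogeneous polynomials, as the decomposition requires.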

Layra Idarani pointed out that this is part of a nice general theory. But I found all this stuff slightly distracting when I was trying to prove Theorems 1 and 2 assuming that we had explicit presentations of the algebras of $\hat{\mathrm{I}}$- and $\mathrm{I}$-invariant polynomials on $\mathbb{R}^3$. So, instead of introducing facts about harmonic polynomials, I decided to use the ‘associated graded algebra’ trick. This is a more algebraic way to ‘eliminate the redundant junk’ in the algebra of polynomials and chop the space of functions on the sphere into irreps of $\mathrm{SO}(3)$.

Also, Egan and Idarani went ahead and considered what happens when we replace the icosahedron by another Platonic solid. It’s enough to consider the cube and tetrahedron. These cases are actually subtler than the icosahedron! For example, when we take the dot product of $(x,y,z)$ with each of the 3 vectors joining antipodal face centers of the cube, and multiply them together, we get a polynomial that’s not invariant under rotations of the cube! Up to a constant it’s just $xyz$, and this changes sign under some rotations.

People call this sort of quantity, which gets multiplied by a number under transformations instead of staying the same, a **semi-invariant**. The reason we run into semi-invariants for the cube and tetrahedron is that their rotational symmetry groups, $\mathrm{S}_4$ and $\mathrm{A}_4$, have nontrivial abelianizations, namely $\mathbb{Z}/2$ and $\mathbb{Z}/3$. The abelianization of $\mathrm{A}_5$ is trivial.

Egan summarized the story as follows:

Just to sum things up for the cube and the tetrahedron, since the good stuff has ended up scattered over many comments:

For the cube, we define:

A of degree 4 from the cube’s vertex-axes, a full invariant

B of degree 6 from the cube’s edge-centre-axes, a semi-invariant

C of degree 3 from the cube’s face-centre-axes, a semi-invariant

We have full invariants:

A of degree 4

C² of degree 6

BC of degree 9

B² can be expressed in terms of A, C and P, so we never use it, and we use BC at most once.

So the number of copies of the trivial rep of the rotational symmetry group of the cube in spin ℓ is the number of ways to write ℓ as an unordered sum of 4, 6 and at most one 9.

For the tetrahedron, we embed its vertices as four vertices of the cube. We then define:

V of degree 4 from the tet’s vertices, a full invariant

E of degree 3 from the tet’s edge-centre axes, a full invariant

And the B we defined for the embedding cube serves as a full invariant of the tet, of degree 6.

B² can be expressed in terms of V, E and P, so we use B at most once.

So the number of copies of the trivial rep of the rotational symmetry group of the tetrahedron in spin ℓ is the number of ways to write ℓ as a sum of 3, 4 and at most one 6.

All of this stuff reminds me of a baby version of the theory of modular forms. For example, the algebra of modular forms is graded by ‘weight’, and it’s the free commutative algebra on a guy of weight 4 and a guy of weight 6. So, the dimension of the space of modular forms of weight $k$ is the number of ways of writing $k$ as an unordered sum of 4’s and 6’s. Since the least common multiple of 4 and 6 is 12, we get a pattern that ‘repeats’, in a certain sense, mod 12. Here I’m talking about the simplest sort of modular forms, based on the group $\mathrm{SL}_2(\mathbb{Z})$. But there are lots of variants, and I have the feeling that this post is secretly about some sort of variant based on *finite* subgroups of $\mathrm{SL}(2,\mathbb{C})$ instead of infinite discrete subgroups.
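The modular forms count can be compared with the classical dimension formula for level-one modular forms, which for even $k \ge 0$ gives $\lfloor k/12 \rfloor$ when $k \equiv 2 \pmod{12}$ and $\lfloor k/12 \rfloor + 1$ otherwise. The following sketch uses hypothetical function names and checks that the two agree.

```python
def dim_by_counting(k):
    """Number of ways to write k as an unordered sum of 4's and 6's."""
    return sum(1 for a in range(k // 4 + 1) if (k - 4 * a) % 6 == 0)


def dim_by_formula(k):
    """Classical dimension of the space of weight-k modular forms for SL_2(Z)."""
    if k < 0 or k % 2:
        return 0
    return k // 12 if k % 12 == 2 else k // 12 + 1


for k in range(0, 26, 2):
    print(k, dim_by_counting(k), dim_by_formula(k))
```

The ‘repeats mod 12’ pattern shows up as the dimension rising by 1 each time $k$ increases by 12.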

There’s a lot more to say about all this, but I have to stop or I’ll never stop. Please ask questions if you want me to say more!

## December 30, 2017

### Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

If anyone had suggested a few years ago that I would forgo a snowsports holiday in the Alps for a week’s research, I would probably not have believed them. Yet here I am, sitting comfortably in the library of the *Dublin Institute for Advanced Studies*.

*The School of Theoretical Physics at the Dublin Institute for Advanced Studies*

It’s been a most satisfying week. One reason is that a change truly is as good as a rest – after a busy teaching term, it’s very enjoyable to spend some time in a quiet spot, surrounded by books on the history of physics. Another reason is that one can accomplish an astonishing amount in one week’s uninterrupted study. That said, I’m not sure I could do this all year round, I’d miss the teaching!

As regards a resolution for 2018, I’ve decided to focus on getting a book out this year. For some time, I have been putting together a small introductory book on the big bang theory, based on a public lecture I give to diverse audiences, from amateur astronomers to curious taxi drivers. The material is drawn from a course I teach at both *Waterford Institute of Technology* and *University College Dublin* and is almost in book form already. The UCD experience is particularly useful, as the module is aimed at first-year students from all disciplines.

Of course, there are already plenty of books out there on this topic. My students have a comprehensive reading list, which includes classics such as *A Brief History of Time* (Hawking), *The First Three Minutes* (Weinberg) and *The Big Bang* (Singh). However, I regularly get feedback to the effect that the books are too hard (Hawking) or too long (Singh) or out of date (Weinberg). So I decided a while ago to put together my own effort; a useful exercise if nothing else comes of it.

In particular, I intend to take a historical approach to the story. I’m a great believer in the ‘how-we-found-out’ approach to explaining scientific theories (think for example of that great BBC4 documentary on the discovery of oxygen). My experience is that a historical approach allows the reader to share the excitement of discovery and makes technical material much easier to understand. In addition, much of the work of the early pioneers remains relevant today. The challenge will be to present a story that is also concise – that’s the hard part!

## December 28, 2017

### Life as a Physicist

Every Christmas I try to do some sort of project. Something new. Sometimes it turns into something real, and lasts for years. Sometimes it goes nowhere. Normally, I have an idea of what I’m going to attempt – usually it has been bugging me for months and I can’t wait till break to get it started. This year, I had none.

But, I arrived home at my parent’s house in New Jersey and there it was waiting for me. The house is old – more than 200 years old – and the steam furnace had just been replaced. For those of you unfamiliar with this method of heating a house: it is noisy! The furnace boils water, and the steam is forced up through the pipes to cast iron radiators. The radiators hiss through valves as the air is forced up – an iconic sound from my childhood. Eventually, after traveling sometimes four floors, the super hot steam reaches the end of a radiator and the valve shuts off. The valves are cool – heat sensitive! The radiator, full of hot steam, then warms the room – and rather effectively.

The bane of this system, however, is that it can leak. And you have no idea where the leak is in the whole house! The only way you know: the furnace reservoir needs refilling too often. So… the problem: how to detect that the reservoir needs refilling? Especially with this new modern furnace which can automatically refill its reservoir.

Me: Oh, look, there is a little LED that comes on when the automatic refilling system comes on! I can watch that! Dad: Oh, look, there is a little light that comes on when the water level is low. We can watch that.

Dad’s choice of tools: a wifi cam that is triggered by noise. Me: A Raspberry Pi 3, a photo-resistor, and a capacitor. Hahahaha. Game on!

What’s funny? Neither of us has detected a water-refill since we started this project. In the first picture at the right you can see both of our devices – in the foreground taped to the gas input line is the CAM watching the water refill light through a mirror, and in the background (look for the yellow tape) is the Pi taped to the refill controller (and the capacitor and sensor hanging down looking at the LED on the bottom of the box).

I chose the Pi because I’ve used it once before – for a Spotify end-point. But never for anything that it is designed for. An Arduino is almost certainly better suited to this – but I wasn’t confident that I could get it up and running in the 3 days I had to make this (including time for ordering and shipping of all parts from Amazon). It was a lot of fun! And consumed a bunch of time. “Hey, where is Gordon? He needs to come for Christmas dinner!” “Wait, are you working on Christmas day?” – for once I could answer that last one with an honest no! Hahaha.

I learned a bunch:

- I had to solder! It has been a loooong time since I’ve done that. My first graduate student, whom I made learn how to solder before I let him graduate, would have laughed at how rusty my skills were!
- I was surprised to learn, at the start, that the Pi has no analog to digital converter. I stole a quick and dirty trick that lots of people have used to get around this problem: time how long it takes to charge a capacitor up with a photoresistor. This is probably the biggest source of noise in my system, but does for crude measurements.
- I got to write all my code in Python. Even interrupt handling (ok, no callbacks, but still!)
- The Pi, by default, runs a full build of Linux. Also, Python 3! I made full use of this – all my code is in Python, and a bit in bash to help it get going. I used things like cron and pip – they were either there, or trivial to install. Really, for this project, I was never conscious of the Pi being anything less than a full computer.
- At first I tried to write auto detection code – that would see any changes in the light levels and write them to a file… which was then served on a simple nginx webserver (seriously – that was about 2 lines of code to install). But the noise in the system plus the fact that we’ve not had a fill so I don’t know what my signal looks like yet… So, that code will have to be revised.
- In the end, I have to write a file with the raw data in it, and analyze that – at least, until I know what an actual signal looks like. So… how to get that data off the Pi – especially given that I can’t access it anymore now that I’ve left New Jersey? In the end I used some Python code to push the files to OneDrive. Other than figuring out how to deal with OAuth2, it was really easy (and I’m still not done fighting the authentication battle). What will happen if/when it fails? Well… I’ve recorded the commands my Dad will have to execute to get the new authentication files down there. Hopefully there isn’t going to be an expiration!
- To analyze the raw data I’ve used some new tools I’ve recently learned at work: numpy and Jupyter notebooks. They allow me to produce a plot like this one. The dip near the left hand side of the plot is my Dad shining the flashlight at my sensors to see if I could actually see anything. The joker.
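
The capacitor-timing trick from the list above can be illustrated without any hardware. Here is a minimal sketch that models an ideal RC circuit; the 1 µF capacitor, the 3.3 V supply, the 1.65 V switching threshold and the two resistances are all assumptions chosen for illustration, not values from the actual build, and the function names are hypothetical.

```python
import math

def charge_time(resistance_ohm, capacitance_f=1e-6, v_supply=3.3, v_threshold=1.65):
    """Seconds for an ideal RC circuit to charge from 0 V up to v_threshold."""
    return -resistance_ohm * capacitance_f * math.log(1.0 - v_threshold / v_supply)

def resistance_from_time(seconds, capacitance_f=1e-6, v_supply=3.3, v_threshold=1.65):
    """Invert charge_time: estimate the photoresistor's resistance from a measured time."""
    return -seconds / (capacitance_f * math.log(1.0 - v_threshold / v_supply))

# Assumed resistances: a photoresistor in bright light versus in the dark.
bright_seconds = charge_time(1_000)
dark_seconds = charge_time(100_000)
print(f"bright: {bright_seconds * 1e3:.3f} ms, dark: {dark_seconds * 1e3:.3f} ms")
```

The point of the trick is that the charging time is proportional to the resistance, so a digital pin plus a timer stands in for the missing analog-to-digital converter. On a real Pi the timing jitter of the operating system adds noise on top of this idealized picture.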

Pretty much the only thing I’d used before was Linux, and some very simple things with an older Raspberry Pi 2. If anyone is on the fence about this – I’d definitely recommend trying it out. It is very easy and there are 1000’s of web pages with step by step instructions for most things you’ll want to do!

### John Baez - Azimuth

There are still a few more things I want to say about the 600-cell. Last time I described the ‘compound of five 24-cells’. David Richter built a model of this, projected from 4 dimensions down to 3:

It’s nearly impossible to tell from this picture, but it’s five 24-cells inscribed in the 600-cell, with each vertex of the 600-cell being the vertex of just one of these five 24-cells. The trick for constructing it is to notice that the vertices of the 600-cell form a *group* sitting in the sphere of unit quaternions, and to find a 24-cell whose vertices form a *subgroup*.

The left cosets of a subgroup $H \subset G$ are the sets

$$gH = \{gh : \; h \in H\}$$

They look like copies of $H$ ‘translated’, or in our case ‘rotated’, inside $G$. Every point of $G$ lies in exactly one coset.

In our example there are five cosets. Each is the set of vertices of a 24-cell inscribed in the 600-cell. Every vertex of the 600-cell lies in exactly one of these cosets. This gives our ‘compound of five 24-cells’.

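It turns out this trick is part of a family of three tricks, each of which gives a nice compound of 4d regular polytopes. While I’ve been avoiding coordinates, I think they’ll help get the idea across now. Here’s a nice description of the 120 vertices of the 600-cell. We take these points:

$$\left(\pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2}\right), \qquad (\pm 1, 0, 0, 0), \qquad \tfrac{1}{2}\left(\pm \Phi, \pm 1, \pm \tfrac{1}{\Phi}, 0\right)$$

and all those obtained by *even* permutations of the coordinates, where $\Phi = (\sqrt{5}+1)/2$ is the golden ratio. So, we get

$$2^4 = 16$$

points of the first kind,

$$2 \cdot 4 = 8$$

points of the second kind, and

$$2^3 \cdot 4!/2 = 96$$

points of the third kind, for a total of

$$16 + 8 + 96 = 120$$

points.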
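
As a quick sanity check on this counting, here is a minimal Python sketch that enumerates the vertices. It assumes the standard coordinates for the 600-cell, namely $(\pm\frac12, \pm\frac12, \pm\frac12, \pm\frac12)$, $(\pm 1, 0, 0, 0)$ and $\frac12(\pm\Phi, \pm 1, \pm 1/\Phi, 0)$ up to even permutations, with $\Phi$ the golden ratio; the function names are just illustrative.

```python
from itertools import permutations, product
from math import sqrt

PHI = (sqrt(5) + 1) / 2  # the golden ratio

def is_even(perm):
    """A permutation is even when it has an even number of inversions."""
    n = len(perm)
    return sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def vertices_600_cell():
    """Return the three kinds of vertices as sets of 4-tuples."""
    first = {tuple(s / 2 for s in signs) for signs in product((1, -1), repeat=4)}
    second = {tuple(float(s) if i == axis else 0.0 for i in range(4))
              for axis in range(4) for s in (1, -1)}
    third = set()
    for s1, s2, s3 in product((1, -1), repeat=3):
        base = (s1 * PHI / 2, s2 / 2, s3 / (2 * PHI), 0.0)
        for perm in permutations(range(4)):
            if is_even(perm):
                # Round so that equal points compare equal despite float error.
                third.add(tuple(round(base[p], 9) for p in perm))
    return first, second, third

first, second, third = vertices_600_cell()
print(len(first), len(second), len(third), len(first | second | third))
```

Permuting the coordinates of the first two kinds of points gives nothing new, so only the third kind needs the even-permutation loop.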

The 16 points of the first kind are the vertices of a 4-dimensional **hypercube**, the 4d analogue of a cube:

The 8 points of the second kind are the vertices of a 4-dimensional **orthoplex**, the 4d analogue of an octahedron:

The hypercube and orthoplex are dual to each other. Taking both their vertices together we get the 16 + 8 = 24 vertices of the **24-cell**, which is self-dual:

The hypercube, orthoplex and 24-cell are regular polytopes, as is the 600-cell.

Now let’s think of any point in 4-dimensional space as a quaternion:

$$(a,b,c,d) = a + bi + cj + dk$$

If we do this, we can check that the 120 vertices of the 600-cell form a *group* under quaternion multiplication. As mentioned in Part 1, this group is called the **binary icosahedral group** or $2\mathrm{I}$, because it’s a double cover of the rotational symmetry group of an icosahedron (or dodecahedron).

We can also check that the 24 vertices of the 24-cell form a group under quaternion multiplication. As mentioned in Part 1, this is called the **binary tetrahedral group** or $2\mathrm{T}$, because it’s a double cover of the rotational symmetry group of a tetrahedron.

All this is old news. But it’s even easier to check that the 8 vertices of the orthoplex form a group under quaternion multiplication: they’re just

$$\pm 1, \; \pm i, \; \pm j, \; \pm k$$

This group is often called the **quaternion group** or $\mathrm{Q}$. It too is a double cover of a group of rotations! The 180° rotations about the $x$, $y$ and $z$ axes square to 1 and commute with each other; up in the double cover of the rotation group (the unit quaternions, or $\mathrm{SU}(2)$) they give elements that square to -1 and anticommute with each other.

Furthermore, the 180° rotations about the $x$, $y$ and $z$ axes are symmetries of a regular tetrahedron! This is easiest to visualize if you inscribe the tetrahedron in a cube thus:

So, up in the double cover of the 3d rotation group we get a chain of subgroups

$$\mathrm{Q} \subset 2\mathrm{T} \subset 2\mathrm{I}$$

which explains why we’re seeing an orthoplex inscribed in a 24-cell inscribed in a 600-cell! This explanation is more satisfying to me than the one involving coordinates.

Alas, I don’t see how to understand the hypercube inscribed in the 24-cell in quite this way, since the hypercube is not a subgroup of the unit quaternions. It certainly wasn’t in the coordinates I gave before—but worse, there’s no way to rotate the hypercube so that it becomes a subgroup. There must be something interesting to say here, but I don’t know it. So, I’ll forget the hypercube for now.

Instead, I’ll use group theory to do something nice with the orthoplex.

First, look at the orthoplexes sitting inside the 24-cell! We’ve got an 8-element subgroup of a 24-element group, the quaternion group inside the binary tetrahedral group:

$$\mathrm{Q} \subset 2\mathrm{T}$$

so it has three right cosets, each forming the vertices of an orthoplex inscribed in the 24-cell. So, we get a **compound of three orthoplexes**: a way of partitioning the vertices of the 24-cell into those of three orthoplexes.

Second, look at the orthoplexes sitting inside the 600-cell! We’ve got an 8-element subgroup of a 120-element group, the quaternion group inside the binary icosahedral group:

$$\mathrm{Q} \subset 2\mathrm{I}$$

so it has 15 right cosets, each forming the vertices of an orthoplex inscribed in the 600-cell. So, we get a **compound of 15 orthoplexes**: a way of partitioning the vertices of the 600-cell into those of 15 orthoplexes.

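And third, these fit nicely with what we saw last time: the 24-cells sitting inside the 600-cell! We saw a 24-element subgroup of a 120-element group, the binary tetrahedral group inside the binary icosahedral group:

$$2\mathrm{T} \subset 2\mathrm{I}$$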

so it has 5 right cosets, each forming the vertices of a 24-cell inscribed in the 600-cell. That gave us the **compound of five 24-cells**: a way of partitioning the vertices of the 600-cell into those of five 24-cells.
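
The three coset counts above can be checked numerically. Here is a rough Python sketch, assuming the standard coordinates for the 600-cell (with $(a,b,c,d)$ read as the quaternion $a + bi + cj + dk$) and using rounded floating-point tuples as a stand-in for exact arithmetic; all function and variable names are hypothetical.

```python
from itertools import permutations, product
from math import sqrt

PHI = (sqrt(5) + 1) / 2

def key(q):
    """Round coordinates so nearly equal floating-point quaternions compare equal."""
    return tuple(round(x, 6) for x in q)

def qmul(p, q):
    """Quaternion product, with (a, b, c, d) standing for a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def is_even(perm):
    n = len(perm)
    return sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

orthoplex = {key(tuple(float(s) if i == axis else 0.0 for i in range(4)))
             for axis in range(4) for s in (1, -1)}
hypercube = {key(tuple(s / 2 for s in signs)) for signs in product((1, -1), repeat=4)}
others = {key(tuple(base[p] for p in perm))
          for s1, s2, s3 in product((1, -1), repeat=3)
          for base in [(s1 * PHI / 2, s2 / 2, s3 / (2 * PHI), 0.0)]
          for perm in permutations(range(4)) if is_even(perm)}
cell24 = orthoplex | hypercube
cell600 = cell24 | others

def is_closed(points):
    """True if the product of any two points is again one of the points."""
    return all(key(qmul(p, q)) in points for p in points for q in points)

def right_cosets(subgroup, group):
    """The distinct sets Hg = {hg : h in H} as g ranges over the group."""
    return {frozenset(key(qmul(h, g)) for h in subgroup) for g in group}

print(is_closed(orthoplex), is_closed(cell24), is_closed(cell600))
print(len(right_cosets(orthoplex, cell24)),
      len(right_cosets(orthoplex, cell600)),
      len(right_cosets(cell24, cell600)))
```

Since multiplying by a unit quaternion is a rotation of 4d space, each coset is a rotated copy of the subgroup, which is why each one is again the vertex set of an orthoplex or a 24-cell. The same `is_closed` test applied to `hypercube` fails, in line with the remark that the hypercube's vertices are not a subgroup.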

There are some nontrivial counting problems associated with each of these three compounds. David Roberson has already solved most of these.

1) How many ways are there of inscribing an orthoplex in a 24-cell?

2) How many ways are there of inscribing a compound of three orthoplexes in a 24-cell?

3) How many ways are there of inscribing an orthoplex in a 600-cell? David used a computer to show there are 75. Is there a nice human-understandable argument?

4) How many ways are there of inscribing a compound of 15 orthoplexes in a 600-cell? David used a computer to show there are 280. Is there a nice human-understandable argument?

5) How many ways are there of inscribing a 24-cell in a 600-cell? David used a computer to show there are 25. Is there a nice human-understandable argument?

6) How many ways are there of inscribing a compound of five 24-cells in a 600-cell? David used a computer to show there are 10. Is there a nice human-understandable argument? (It’s pretty easy to prove that 10 is a lower bound.)

For those who prefer visual delights to math puzzles, here is a model of the compound of 15 orthoplexes, cleverly projected from 4 dimensions down to 3, made by David Richter and some friends:

It took four people 6 hours to make this! Click on the image to learn more about this amazing shape, and explore David Richter’s pages to see more compounds.

So far my tale has not encompassed the **120-cell**, which is the dual of the 600-cell. This has 600 vertices and 120 dodecahedral faces:

Unfortunately, like the hypercube, the vertices of the 120-cell cannot be made into a subgroup of the unit quaternions. I’ll need some other idea to think about them in a way that I enjoy. But the 120-cell is amazing because *every regular polytope in 4 dimensions can be inscribed in the 120-cell*.

For example, we can inscribe the orthoplex in the 120-cell. Since the orthoplex has 8 vertices while the 120-cell has 600, and

$$600 = 8 \times 75$$

we might hope for a compound of 75 orthoplexes whose vertices, taken together, are those of the 120-cell. And indeed it exists… and David Richter and his friends have built a model!

### Image credits

You can click on any image to see its source. The photographs of models of the compound of five 24-cells and the compound of 15 orthoplexes are due to David Richter and friends. The shiny ball-and-strut pictures of the tetrahedron in the cube and the 120-cells were made by Tom Ruen using Robert Webb’s Stella software and placed on Wikicommons. The 2d projections of the hypercube, orthoplex and 24-cell were made by Tom Ruen and placed into the public domain on Wikicommons.

## December 26, 2017

### Tommaso Dorigo - Scientificblogging

I offer three questions below, and you are welcome to think any or all of them over today and tomorrow. In two days I will give my answer, explain the underlying physics a bit, and comment on your own answers, if you have been capable of typing them despite your skyrocketing glycemic index.

## December 24, 2017

### John Baez - Azimuth

This is a compound of five tetrahedra:

It looks like a scary, almost random way of slapping together 5 regular tetrahedra until you realize what’s going on. A regular dodecahedron has 20 vertices, while a regular tetrahedron has 4. Since 20 = 4 × 5, you can try to partition the dodecahedron’s vertices into the vertices of five tetrahedra. And it works!

The result is the **compound of five tetrahedra**. It comes in two mirror-image forms.

I want to tell you about a 4-dimensional version of the same thing. Amazingly, the 4-dimensional version arises from studying the *symmetries* of the 3-dimensional thing you see above! The symmetries of the tetrahedron give a 4-dimensional regular polytope called the ‘24-cell’, while the symmetries of the dodecahedron give one called the ‘600-cell’. And there’s a way of partitioning the vertices of the 600-cell into 5 sets, each being the vertices of a 24-cell! So, we get a ‘compound of five 24-cells’.

To see how this works, we need to think about symmetries.

Any rotational symmetry of the dodecahedron acts to permute the tetrahedra in our compound of five tetrahedra. We can only get *even* permutations this way, but we can get *any* even permutation. Furthermore, knowing this permutation, we can tell what rotation we did. So, by the marvel of mathematical reasoning, the rotational symmetry group of the dodecahedron must be the alternating group

$$\mathrm{A}_5$$

On the other hand, the rotational symmetry group of the tetrahedron is $\mathrm{A}_4$, since any rotation gives an even permutation of the 4 vertices of the tetrahedron.

If we pick any tetrahedron in our compound of five tetrahedra, its rotational symmetries give rotational symmetries of the dodecahedron. So, these symmetries form a subgroup of $\mathrm{A}_5$ that is isomorphic to $\mathrm{A}_4$. There are exactly 5 such subgroups, one for each tetrahedron.

So, to the eyes of a group theorist, the tetrahedra in our compound of five tetrahedra are just the $\mathrm{A}_4$ subgroups of $\mathrm{A}_5$. The group $\mathrm{A}_5$ acts on itself by conjugation, and this action permutes these 5 subgroups. Indeed, it acts to give all the even permutations: after all, it’s $\mathrm{A}_5$!

So, *the compound of five tetrahedra has dissolved into group theory, with each figure becoming a group, and the big group acting to permute its own subgroups!*
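
This conjugation action is easy to play with on a computer. The sketch below is only an illustration under a simplifying assumption: it models the rotation group abstractly as the even permutations of five labels 0 to 4 (one label per tetrahedron), and takes the subgroup attached to a tetrahedron to be the even permutations fixing its label. The names are hypothetical, and the code checks how conjugation permutes these five subgroups; it does not prove there are no other subgroups of this kind.

```python
from itertools import permutations

def is_even(perm):
    """A permutation is even when it has an even number of inversions."""
    n = len(perm)
    return sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def compose(p, q):
    """Return 'p after q': the permutation sending i to p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Model the rotation group as the even permutations of five tetrahedron labels.
A5 = [p for p in permutations(range(5)) if is_even(p)]

# For each tetrahedron t, the elements mapping t to itself: a 12-element subgroup.
subgroups = {t: frozenset(p for p in A5 if p[t] == t) for t in range(5)}
label_of = {H: t for t, H in subgroups.items()}

def conjugate(g, subgroup):
    """The subgroup g H g^{-1}."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g, h), g_inv) for h in subgroup)

def induced_permutation(g):
    """How conjugation by g permutes the five subgroups."""
    return tuple(label_of[conjugate(g, subgroups[t])] for t in range(5))

print(len(A5), sorted(len(H) for H in subgroups.values()))
print(all(induced_permutation(g) == g for g in A5))
```

In this model, conjugating by `g` sends the subgroup of tetrahedron `t` to that of tetrahedron `g[t]`, so the permutation of subgroups induced by `g` is `g` itself.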

All this is just the start of a longer story about compounds of Platonic solids:

• Dodecahedron with 5 tetrahedra, *Visual Insight*, 15 May 2015.

But only recently did I notice how this story generalizes to four dimensions. Just as we can inscribe a compound of five tetrahedra in the dodecahedron, we can inscribe a compound of five 24-cells in a 600-cell!

Here’s how it goes.

The rotational symmetry group of a tetrahedron is contained in the group of rotations in 3d space:

$$\mathrm{A}_4 \subset \mathrm{SO}(3)$$

so it has a double cover, the **binary tetrahedral group**

$$2\mathrm{T} \subset \mathrm{SU}(2)$$

and since we can see $\mathrm{SU}(2)$ as the unit quaternions, the elements of the binary tetrahedral group are the vertices of a 4d polytope! This polytope obviously has 24 vertices, twice the number of elements in $\mathrm{A}_4$, but less obviously, it also has 24 octahedral faces, so it’s called the **24-cell**:

Similarly, the rotational symmetry group of a dodecahedron is contained in the group of rotations in 3d space:

$$\mathrm{A}_5 \subset \mathrm{SO}(3)$$

so it has a double cover, usually called the **binary icosahedral group**

$$2\mathrm{I} \subset \mathrm{SU}(2)$$

and since we can see $\mathrm{SU}(2)$ as the unit quaternions, the elements of the binary icosahedral group are the vertices of a 4d polytope! This polytope obviously has 120 vertices, twice the number of elements in $\mathrm{A}_5$, but less obviously, it has 600 tetrahedral faces, so it’s called the **600-cell**:

Each way of making $\mathrm{A}_4$ into a subgroup of $\mathrm{A}_5$ gives a way of making the binary tetrahedral group $2\mathrm{T}$ into a subgroup of the binary icosahedral group $2\mathrm{I}$ … and thus a way of inscribing the 24-cell in the 600-cell!

Next, since 120 = 24 × 5, you can try to partition the 600-cell’s vertices into the vertices of five 24-cells. And it works!

And it’s easy: just take a binary tetrahedral subgroup $2\mathrm{T}$ of the binary icosahedral group $2\mathrm{I}$, and consider the cosets of this subgroup. Each coset gives the vertices of a 24-cell inscribed in the 600-cell, and there are 5 of these cosets, all disjoint.

So, we get a **compound of five 24-cells**, whose vertices are those of the 600-cell.

This leads to another question: how many ways can we fit a compound of five 24-cells into a 600-cell?

The answer is 10. It’s easy to get ahold of 10, so the hard part is proving there are no more. Coxeter claims it’s true in the footnote in Section 14.3 of his *Regular Polytopes*, in which he apologizes for criticizing someone else who earlier claimed it was true:

Thus Schoute (6, p. 231) was right when he said the 120 vertices of {3,3,5} belong to five {3,4,3}’s in ten different ways. The disparaging remark in the second footnote to Coxeter 4, p. 337, should be deleted.

I believe neither of these references has a proof! David Roberson has verified it using Sage, as explained in his comment on a previous post. But it would still be nice to find a human-readable proof.

To see why there are *at least* 10 ways to stick a compound of five 24-cells in the 600-cell, go here:

• John Baez, How many ways can you inscribe five 24-cells in a 600-cell, hitting all its vertices?, *MathOverflow*, 15 December 2017.

**Puzzle.** Are five of these ways mirror-image versions of the other five?

### Image credits

You can click on any image to see its source. The first image of the compound of five tetrahedra was made using Robert Webb’s Stella software and placed on Wikicommons. The rotating compound of five tetrahedra in a dodecahedron was made by Greg Egan and donated to my blog *Visual Insight*. The rotating 24-cell and 600-cell were made by Jason Hise and put into the public domain on Wikicommons.

### The n-Category Cafe

When you try to quantize 10-dimensional supergravity theories, you are led to some theories involving strings. These are fairly well understood, because the worldsheet of a string is 2-dimensional, so string theories can be studied using 2-dimensional conformal quantum field theories, which are mathematically tractable.

When you try to quantize 11-dimensional supergravity, you are led to a theory involving 2-branes and 5-branes. People call it M-theory, because while it seems to have magical properties, our understanding of it is still murky — because it involves these higher-dimensional membranes. They have 3- and 6-dimensional worldsheets, respectively. So, precisely formulating M-theory seems to require understanding certain quantum field theories in 3 and 6 dimensions. These are bound to be tougher than 2d quantum field theories… tougher to make mathematically rigorous, for example… but even worse, until recently people didn’t know what either of these theories *were!*

In 2008, Aharony, Bergman, Jafferis and Maldacena figured out the 3-dimensional theory: it’s a supersymmetric Chern–Simons theory coupled to matter in a way that makes it no longer a topological quantum field theory, but still conformally invariant. It’s now called the ABJM theory. This discovery led to the ‘M2-brane mini-revolution’, as various puzzles about M-theory got solved.

The 6-dimensional theory has been much more elusive. It’s called the (0,2) theory. It should be a 6-dimensional conformal quantum field theory. But its curious properties got people thinking that it *couldn’t arise from any Lagrangian* — a serious roadblock, given how physicists normally like to study quantum field theories. But people have continued avidly seeking it, and not just for its role in a potential ‘theory of everything’. Witten and others have shown that if it existed, it would shed new light on Khovanov duality and geometric Langlands correspondence! The best introduction is here:

- Edward Witten, Geometric Langlands from six dimensions, 2009.

In a recent interview with *Quanta* magazine, Witten called this elusive 6-dimensional theory “the pinnacle”:

Q: I’ve heard about the mysterious (2,0) theory, a quantum field theory describing particles in six dimensions, which is dual to M-theory describing strings and gravity in seven-dimensional AdS space. Does this (2,0) theory play an important role in the web of dualities?

A: Yes, that’s the pinnacle. In terms of conventional quantum field theory without gravity, there is nothing quite like it above six dimensions. From the (2,0) theory’s existence and main properties, you can deduce an incredible amount about what happens in lower dimensions. An awful lot of important dualities in four and fewer dimensions follow from this six-dimensional theory and its properties. However, whereas what we know about quantum field theory is normally from quantizing a classical field theory, there’s no reasonable classical starting point of the (2,0) theory. The (2,0) theory has properties [such as combinations of symmetries] that sound impossible when you first hear about them. So you can ask why dualities exist, but you can also ask why is there a 6-D theory with such and such properties? This seems to me a more fundamental restatement.

Indeed, it sits atop a terrifying network of field theories in various lower dimensions:

Now, maybe, *maybe* this theory has been found:

- Christian Saemann, Lennart Schmidt, An M5-brane model.

As Urs cautiously and wisely wrote:

If this holds water, it will be big.

Here’s the abstract:

Abstract. We present an action for a six-dimensional superconformal field theory containing a non-abelian tensor multiplet. All of the ingredients of this action have been available in the literature. We bring these pieces together by choosing the string Lie 2-algebra as a gauge structure, which we motivated in previous work. The kinematical data contains a connection on a categorified principal bundle, which is the appropriate mathematical description of the parallel transport of self-dual strings. Our action can be written down for each of the simply laced Dynkin diagrams, and each case reduces to a four-dimensional supersymmetric Yang-Mills theory with corresponding gauge Lie algebra. Our action also reduces nicely to an M2-brane model which is a deformation of the ABJM model.

My own interest in this is purely self-centered. I hope this theory holds water — I hope it continues to pass various tests it needs to pass to be the elusive (0,2) theory — because it uses ideas from higher gauge theory, and in particular the string Lie 2-algebra!

This is a ‘categorified Lie algebra’ that one can construct starting from any Lie algebra with an invariant inner product. It was first found (though not under this name) by Alissa Crans, who was then working on her thesis with me:

- Alissa Crans, *Lie 2-Algebras*, Ph.D. thesis, U.C. Riverside, 2004.

The idea was published here:

- John Baez and Alissa Crans, Higher-dimensional algebra VI: Lie 2-algebras, *TAC* **12** (2004), 492–528.

In 2005, together with Danny Stevenson and Urs Schreiber, we connected the string Lie 2-algebra to central extensions of loop groups and the ‘string group’:

- John C. Baez, Alissa S. Crans, Danny Stevenson, Urs Schreiber, From loop groups to 2-groups, *Homotopy, Homology and Applications* **9** (2007), 101–135.

though our paper was published only after a struggle. Subsequently Urs worked out a much better understanding of how Lie $n$-algebras appear in higher gauge theory and string theory. In 2012, together with Domenico Fiorenza and Hisham Sati, he began working out how string Lie 2-algebras are related to the 5-branes in M-theory:

- Domenico Fiorenza, Hisham Sati, Urs Schreiber, Multiple M5-branes, string 2-connections, and 7d nonabelian Chern-Simons theory, *ATMP* **18** (2014), 229–321.

They focused on the 7-dimensional Chern–Simons theory which should be connected to the elusive 6-dimensional (0,2) theory via the AdS/CFT correspondence. The new work by Saemann and Schmidt goes further by making an explicit proposal for the Lagrangian of the 6-dimensional theory.

For more details, read Urs’ blog article on G+.

All this leaves me feeling excited but also bemused. When I started thinking about 2-groups and Lie 2-algebras, I knew right away that they should be related to the parallel transport of 1-dimensional objects — that is, strings. But not being especially interested in string theory, I had no inkling of how, exactly, the Lie 2-algebras that Alissa Crans and I discovered might play a role in that subject. That began becoming clear in our paper with Danny and Urs. But then I left the subject, moving on to questions that better fit my real interests.

From tiny seeds grow great oaks. Should I have stuck with the string Lie 2-algebra and helped it grow? Probably not. But sometimes I wish I had.

## December 23, 2017

### Clifford V. Johnson - Asymptotia

This is one of the best interviews I've done about #thedialoguesbook so far. Eric Newman is an excellent interviewer, and for the first half of the Los Angeles Review of Books (LARB) radio hour we covered science and its intersection with art, culture, philosophy, religion, politics, and more!

You can listen to it here.

-cvj

The post LARB Radio Hour Interview! appeared first on Asymptotia.

## December 22, 2017

### The n-Category Cafe

Here’s a draft of a little thing I’m writing for the *Newsletter of the London Mathematical Society*. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct $\mathrm{E}_8$. One uses a subring of the quaternions called the ‘icosians’, while the other uses Du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

(Dedicated readers of this blog may recall that I was struggling with the second construction in July. David Speyer helped me a lot, but I got distracted by other work and the discussion fizzled. Now I’ve made more progress… but I’ve realized that the details would never fit in the *Newsletter*, so I’m afraid anyone interested will have to wait a bit longer.)

You can get a PDF version here:

• From the icosahedron to $\mathrm{E}_8$.

But blogs are more fun.

### From the Icosahedron to $\mathrm{E}_8$

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections.
Take, for example, the icosahedron, or more precisely the *regular* icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the $\mathrm{E}_8$ lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop. The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s *Elements* it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. In any event, it was known to Plato: in his *Timaeus*, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

$$(0, \pm 1, \pm \Phi)$$

and all those obtained from these by cyclic permutations of the coordinates, where

$$\Phi = \frac{\sqrt{5}+1}{2}$$

is the golden ratio. Thus, we can group the vertices into three orthogonal **golden rectangles**: rectangles whose proportions are $\Phi$ to 1.

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of $\mathrm{S}_5$. Moreover, this subgroup has 60 elements. After all, any rotation is determined by what it does to a chosen face of the icosahedron: it can map this face to any of the 20 faces, and it can do so in 3 ways. The rotational symmetry group of the icosahedron is therefore a 60-element subgroup of $\mathrm{S}_5$. Group theory then tells us that it must be the alternating group $\mathrm{A}_5$, the only subgroup of index 2 in $\mathrm{S}_5$.
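
The vertex coordinates $(0, \pm 1, \pm \Phi)$ and their cyclic permutations can be checked against the counts of 12 vertices and 30 edges with a short Python sketch. It relies on one assumption not spelled out in the text: with these coordinates the edge length is 2, so edges are the pairs of vertices at squared distance 4. The names are illustrative only.

```python
from itertools import combinations, product
from math import isclose, sqrt

PHI = (sqrt(5) + 1) / 2  # the golden ratio

def icosahedron_vertices():
    """The points (0, ±1, ±PHI) together with their cyclic permutations."""
    vertices = []
    for s1, s2 in product((1, -1), repeat=2):
        x, y, z = 0.0, s1 * 1.0, s2 * PHI
        vertices.extend([(x, y, z), (z, x, y), (y, z, x)])
    return vertices

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

vertices = icosahedron_vertices()

# Assumption: nearest neighbours sit at distance 2, i.e. squared distance 4.
edges = [(p, q) for p, q in combinations(vertices, 2)
         if isclose(squared_distance(p, q), 4.0)]

print(len(vertices), len(edges))
```

The four points with first coordinate 0 form one golden rectangle, with sides 2 and $2\Phi$, and the two cyclic shifts give the other two rectangles of the orthogonal triple.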

The $\mathrm{E}_8$ lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch *those* spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is “wiggle room”, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the $\mathrm{E}_8$ lattice!

We can also characterize the $\mathrm{E}_8$ lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices, but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the $\mathrm{E}_8$ lattice more explicitly. In suitable coordinates, it consists of vectors for which:

• the components are either all integers or all integers plus $\tfrac{1}{2}$, and

• the components sum to an even number.

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

$$\left( \begin{array}{rrrrrrrr}
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
-\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2}
\end{array} \right)$$

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or -1. Thus, any two of these vectors lie at an angle of either 90° or 120° from each other. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120° we get this pattern:

This is called the **$\mathrm{E}_8$ Dynkin diagram**. In the first part of our story we shall find the $\mathrm{E}_8$ lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related, but the relation remains mysterious, at least to me.
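
The claims about inner products of the rows are easy to verify by machine. Here is a small sketch using exact fractions; the rows are copied from the matrix above, while the variable names and the 0-based numbering of the dots are my own choices for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# The 8 rows of the matrix, in the order given.
rows = [
    [1, -1, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 0, 0, 0],
    [0, 0, 1, -1, 0, 0, 0, 0],
    [0, 0, 0, 1, -1, 0, 0, 0],
    [0, 0, 0, 0, 1, -1, 0, 0],
    [0, 0, 0, 0, 0, 1, -1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [-HALF] * 8,
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Matrix of all inner products between rows.
gram = [[dot(u, v) for v in rows] for u in rows]

# Connect two dots when the inner product is -1, i.e. the angle is 120 degrees.
edges = [(i, j) for i in range(8) for j in range(i + 1, 8) if gram[i][j] == -1]
degree = [sum(1 for e in edges if i in e) for i in range(8)]

print(edges)
print(degree)
```

The edge list forms a tree with a single dot of degree 3, whose three arms have 1, 2 and 4 dots: the shape of the diagram described above.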

### The Icosians

The quickest route from the icosahedron to $\mathrm{E}_8$ goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’, but the icosians can be reinterpreted as a lattice in 8 dimensions, and this is the $\mathrm{E}_8$ lattice [CS]. Let us see how this works. The quaternions, discovered by Hamilton, are a 4-dimensional algebra

$$\mathbb{H} = \{a + bi + cj + dk \colon \; a,b,c,d \in \mathbb{R}\}$$

with multiplication given as follows:

$$i^2 = j^2 = k^2 = -1,$$

$$ij = k = -ji \; \text{ and cyclic permutations}$$
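
To make these rules concrete, here is a minimal Python sketch that multiplies quaternions by extending the rules above bilinearly. Writing a quaternion as a tuple `(a, b, c, d)` and the names `TABLE`, `BASIS` and `qmul` are conventions chosen for this illustration.

```python
from itertools import product

# Products of basis elements, from i^2 = j^2 = k^2 = -1, ij = k = -ji
# and cyclic permutations. Each entry maps a pair to (sign, basis element).
TABLE = {
    ("1", "1"): (1, "1"), ("1", "i"): (1, "i"), ("1", "j"): (1, "j"), ("1", "k"): (1, "k"),
    ("i", "1"): (1, "i"), ("i", "i"): (-1, "1"), ("i", "j"): (1, "k"), ("i", "k"): (-1, "j"),
    ("j", "1"): (1, "j"), ("j", "i"): (-1, "k"), ("j", "j"): (-1, "1"), ("j", "k"): (1, "i"),
    ("k", "1"): (1, "k"), ("k", "i"): (1, "j"), ("k", "j"): (-1, "i"), ("k", "k"): (-1, "1"),
}
BASIS = ("1", "i", "j", "k")

def qmul(p, q):
    """Multiply quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    result = dict.fromkeys(BASIS, 0)
    for (x, px), (y, qy) in product(zip(BASIS, p), zip(BASIS, q)):
        sign, z = TABLE[(x, y)]
        result[z] += sign * px * qy
    return tuple(result[b] for b in BASIS)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

print(qmul(I, J), qmul(J, I), qmul(qmul(I, J), K))
```

Multiplying in the opposite order flips the sign of the result for distinct imaginary units, which is the anticommutation that makes the quaternions noncommutative.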

It is a normed division algebra, meaning that the norm

$$|a + bi + cj + dk| = \sqrt{a^2 + b^2 + c^2 + d^2}$$

obeys

$$|q q'| = |q| \, |q'|$$

for all $q, q' \in \mathbb{H}$. The unit sphere in $\mathbb{H}$ is thus a group, often called $\mathrm{SU}(2)$ because its elements can be identified with $2 \times 2$ unitary matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in $\mathbb{R}^3$ as a **purely imaginary** quaternion $x = bi + cj + dk$, and the quaternion $q x q^{-1}$ is then purely imaginary for any $q \in \mathrm{SU}(2)$. Indeed, this action gives a double cover

$$\alpha : \mathrm{SU}(2) \to \mathrm{SO}(3)$$

where $\mathrm{SO}(3)$ is the group of rotations of $\mathbb{R}^3$.
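The multiplication rules, the norm identity and the rotation action can be checked numerically. The following is a minimal Python sketch; the helper names `qmul`, `norm`, `conj` and `rotate` are illustrative choices, and a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`.

```python
from math import cos, sin, pi, sqrt

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    )

def norm(q):
    """Euclidean norm of a quaternion."""
    return sqrt(sum(t * t for t in q))

def conj(q):
    """Quaternionic conjugate; for a unit quaternion this is the inverse."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(q, x):
    """Act on x = (b, c, d) in R^3 by the unit quaternion q via q x q^{-1}.

    Returns the real part (which should vanish) and the rotated vector.
    """
    a, b, c, d = qmul(qmul(q, (0.0, *x)), conj(q))
    return a, (b, c, d)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# A quarter turn about the z axis, written as a unit quaternion.
q = (cos(pi / 4), 0.0, 0.0, sin(pi / 4))
real_part, image = rotate(q, (1.0, 0.0, 0.0))

# -q is a different point of SU(2) but gives the same rotation (double cover).
neg_q = tuple(-t for t in q)
real_part_neg, image_neg = rotate(neg_q, (1.0, 0.0, 0.0))
```

Here `image` comes out as the unit vector along the $y$ axis up to rounding, and `q` and `neg_q` act identically, which is the two-to-one behaviour of $\alpha$.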

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of $\mathrm{SO}(3)$, and take its double cover in $\mathrm{SU}(2)$. If we do this starting with the icosahedron, we see that the $60$-element group $\mathrm{A}_5 \subset \mathrm{SO}(3)$ is covered by a 120-element group $\Gamma \subset \mathrm{SU}(2)$, called the **binary icosahedral group**.

The elements of $\Gamma$ are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the ‘hypericosahedron’, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify $\mathbb{H}$ with $\mathbb{R}^4$, the elements of $\Gamma$ are the points

$$(\pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2})$$

$$(\pm 1, 0, 0, 0)$$

$$\tfrac{1}{2} (\pm \Phi, \pm 1, \pm 1/\Phi, 0),$$

and those obtained from these by even permutations of the coordinates. Since these points are closed under multiplication, if we take integral linear combinations of them we get a subring of the quaternions:

$$\mathbb{I} = \Big\{ \sum_{q \in \Gamma} a_q q : \; a_q \in \mathbb{Z} \Big\} \subset \mathbb{H}.$$

Conway and Sloane [CS] call this the ring of **icosians**. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form $a + bi + cj + dk$ where $a, b, c$, and $d$ live in the **golden field**

$$\mathbb{Q}(\sqrt{5}) = \{ x + \sqrt{5} y : \; x, y \in \mathbb{Q} \}$$

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For $q \in \mathbb{I}$ the usual quaternionic norm has

$$|q|^2 = x + \sqrt{5} y$$

for some rational numbers $x$ and $y$, but we can define a new norm on $\mathbb{I}$ by setting

$$\|q\|^2 = x + y$$

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this is none other than $\mathrm{E}_8$!
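It may help to see why $x + y$ really is the square of a norm. The following short computation is a sketch added for illustration; the symbols $u$ and $v$ are not in the original. Write one coordinate of $q$ as $a = u + \sqrt{5} v$ with $u, v \in \mathbb{Q}$. Then

$$a^2 = (u^2 + 5 v^2) + \sqrt{5} \, (2 u v),$$

so this coordinate contributes

$$(u^2 + 5 v^2) + 2 u v = (u + v)^2 + 4 v^2 \ge 0$$

to $\|q\|^2$, with equality only when $u = v = 0$. Summing over the four coordinates $a, b, c, d$ shows that $\|q\|^2$ is a positive definite quadratic form in the 8 rational numbers describing $q$. For example, every element of $\Gamma$ has $|q|^2 = 1$, so $x = 1$, $y = 0$ and $\|q\|^2 = 1$.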

### Klein’s Icosahedral Function

Not only is the $\mathrm{E}_8$ lattice hiding in the icosahedron; so is the $\mathrm{E}_8$ Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the $\mathrm{E}_8$ Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s *Lectures on the Icosahedron* [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, $\mathbb{C}\mathrm{P}^1$. He thus got the icosahedron’s symmetry group, $\mathrm{A}_5$, to act as conformal transformations of $\mathbb{C}\mathrm{P}^1$: indeed, rotations. He then found a rational function of one complex variable that is invariant under all these transformations. This function equals $0$ at the centers of the icosahedron’s faces, $1$ at the midpoints of its edges, and $\infty$ at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

$$\mathcal{I} : \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1.$$

Indeed, $\mathrm{A}_5$ acts on $\mathbb{C}\mathrm{P}^1$, and the quotient space $\mathbb{C}\mathrm{P}^1/\mathrm{A}_5$ is isomorphic to $\mathbb{C}\mathrm{P}^1$ again. The function $\mathcal{I}$ gives an explicit formula for the quotient map $\mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 \cong \mathbb{C}\mathrm{P}^1$.

Klein managed to reduce solving the quintic to the problem of solving the equation $\mathcal{I}(z) = w$ for $z$. A modern exposition of this result is Shurman’s *Geometry of the Quintic* [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to $\mathrm{E}_8$.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere $\mathbb{C}\mathrm{P}^1$ is the space of 1-dimensional linear subspaces of $\mathbb{C}^2$. Let us work directly with $\mathbb{C}^2$. While $\mathrm{SO}(3)$ acts on $\mathbb{C}\mathrm{P}^1$, this comes from an action of this group’s double cover $\mathrm{SU}(2)$ on $\mathbb{C}^2$. As we have seen, the rotational symmetry group of the icosahedron, $\mathrm{A}_5 \subset \mathrm{SO}(3)$, is double covered by the binary icosahedral group $\Gamma \subset \mathrm{SU}(2)$. To build an $\mathrm{A}_5$-invariant rational function on $\mathbb{C}\mathrm{P}^1$, we should thus look for $\Gamma$-invariant homogeneous polynomials on $\mathbb{C}^2$.

It is easy to construct three such polynomials:

• $V$, of degree $12$, vanishing on the 1d subspaces corresponding to icosahedron vertices.

• $E$, of degree $30$, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

• $F$, of degree $20$, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in $\mathbb{C}\mathrm{P}^1$, and each point in $\mathbb{C}\mathrm{P}^1$ is a 1-dimensional subspace of $\mathbb{C}^2$, so each icosahedron vertex determines such a subspace, and there is a linear function on $\mathbb{C}^2$, unique up to a constant factor, that vanishes on this subspace. The icosahedron has $12$ vertices, so we get $12$ linear functions this way. Multiplying them gives $V$, a homogeneous polynomial of degree $12$ on $\mathbb{C}^2$ that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives $E$, which has degree $30$ because the icosahedron has $30$ edges, and $F$, which has degree $20$ because the icosahedron has $20$ faces.

A bit of work is required to check that $V, E$ and $F$ are invariant under $\Gamma$, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both $F^3$ and $V^5$ have degree $60$, $F^3/V^5$ is homogeneous of degree zero, so it defines a rational function $\mathcal{I} : \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1$. This function is invariant under $\mathrm{A}_5$ because $F$ and $V$ are invariant under $\Gamma$. Since $F$ vanishes at face centers of the icosahedron while $V$ vanishes at vertices, $\mathcal{I} = F^3/V^5$ equals $0$ at face centers and $\infty$ at vertices. Finally, thanks to its invariance property, $\mathcal{I}$ takes the same value at every edge center, so we can normalize $V$ or $F$ to make this value $1$. Thus, $\mathcal{I}$ has precisely the properties required of Klein’s icosahedral function!

### The Appearance of $\mathrm{E}_8$

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and $V, E$, and $F$ obey a very pretty one, at least after we normalize them correctly:

$$V^5 + E^2 + F^3 = 0.$$
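A relation of this shape can be verified by machine for one concrete choice of invariants. The sketch below does not use the normalization of the text: it assumes Klein's classical forms $f$, $H$, $T$ of degrees 12, 20, 30, written for one standard placement of the icosahedron, which satisfy $T^2 + H^3 = 1728 f^5$. Rescaling $f$ by a fifth root of $-1728$ turns this into $V^5 + E^2 + F^3 = 0$ with $E = T$ and $F = H$. Polynomials in $x, y$ are stored as dictionaries `{(i, j): c}`, and all helper names are illustrative.

```python
from collections import defaultdict

def poly(terms):
    """Polynomial in x, y as {(i, j): c}, meaning the sum of c * x**i * y**j."""
    return {e: c for e, c in terms.items() if c != 0}

def add(p, q):
    """Sum of two polynomials."""
    out = defaultdict(int)
    for r in (p, q):
        for e, c in r.items():
            out[e] += c
    return poly(out)

def mul(p, q):
    """Product of two polynomials."""
    out = defaultdict(int)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return poly(out)

def power(p, n):
    """n-th power by repeated multiplication."""
    out = {(0, 0): 1}
    for _ in range(n):
        out = mul(out, p)
    return out

def scale(p, c):
    """Multiply every coefficient by the integer c."""
    return poly({e: c * k for e, k in p.items()})

def degrees(p):
    """Set of total degrees appearing in p (one element if p is homogeneous)."""
    return {i + j for (i, j) in p}

# Klein's classical forms (an assumption of this sketch, not the text's V, E, F).
f = poly({(11, 1): 1, (6, 6): 11, (1, 11): -1})
H = poly({(20, 0): -1, (0, 20): -1, (15, 5): 228, (5, 15): -228, (10, 10): -494})
T = poly({(30, 0): 1, (0, 30): 1, (25, 5): 522, (5, 25): -522,
          (20, 10): -10005, (10, 20): -10005})

# T^2 + H^3 - 1728 f^5, which should be the zero polynomial.
relation = add(add(power(T, 2), power(H, 3)), scale(power(f, 5), -1728))
```

All three terms are homogeneous of degree 60, and `relation` is the empty dictionary exactly when the identity holds.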

We could guess this relation simply by noting that each term must have the same degree. Every $\Gamma$-invariant polynomial on $\mathbb{C}^2$ is a polynomial in $V, E$ and $F$, and indeed

$$\mathbb{C}^2 / \Gamma \cong \{ (V, E, F) \in \mathbb{C}^3 : \; V^5 + E^2 + F^3 = 0 \}.$$

This complex surface is smooth except at $V = E = F = 0$, where it has a singularity. And hiding in this singularity is $\mathrm{E}_8$!

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface $S$ and an onto map

$$\pi : S \to \mathbb{C}^2/\Gamma$$

that is one-to-one away from the singularity. (More precisely, if $X$ is an algebraic variety with singular points $X_{\mathrm{sing}} \subset X$, $\pi : S \to X$ is a **resolution** of $X$ if $S$ is smooth, $\pi$ is proper, $\pi^{-1}(X - X_{\mathrm{sing}})$ is dense in $S$, and $\pi$ is an isomorphism between $\pi^{-1}(X - X_{\mathrm{sing}})$ and $X - X_{\mathrm{sing}}$. For more details see Lamotke’s book [L].)

There are many such resolutions, but one **minimal** resolution, meaning that all others factor uniquely through this one:

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere $\mathbb{C}\mathrm{P}^1$, one for each dot here:

Two of these $\mathbb{C}\mathrm{P}^1$s intersect in a point if their dots are connected by an edge: otherwise they are disjoint.

This amazing fact was discovered by Patrick Du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The $\mathrm{E}_8$ Dynkin diagram has ‘legs’ of lengths $5, 2$ and $3$:

On the other hand,

$$\mathrm{A}_5 \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f = 1 \rangle$$

where in terms of the rotational symmetries of the icosahedron:

• $v$ is a $1/5$ turn around some vertex of the icosahedron,

• $e$ is a $1/2$ turn around the center of an edge touching that vertex,

• $f$ is a $1/3$ turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain $v e f = 1$. To get a presentation of the binary icosahedral group we drop one relation:

$$\Gamma \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f \rangle$$
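The relations for $\mathrm{A}_5$ can be checked on explicit permutations. In the sketch below, the particular permutations chosen for `e` and `f` are an assumption made for illustration: they are abstract even permutations of five objects and are not tied to specific rotations of the icosahedron. The element `v` is then defined so that $v e f = 1$, and the code confirms that it has order 5 and that `e` and `f` generate a group of 60 elements.

```python
IDENTITY = (0, 1, 2, 3, 4)

def compose(p, q):
    """Return the permutation 'first q, then p' of {0, ..., 4}."""
    return tuple(p[q[n]] for n in range(5))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * 5
    for n, image in enumerate(p):
        inv[image] = n
    return tuple(inv)

def power(p, n):
    """n-fold composite of p with itself."""
    out = IDENTITY
    for _ in range(n):
        out = compose(out, p)
    return out

def generated_group(generators):
    """All products of the generators, found by exhaustive search."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

e = (1, 0, 3, 2, 4)          # the double transposition (0 1)(2 3), order 2
f = (2, 1, 4, 3, 0)          # the 3-cycle (0 2 4), order 3
v = inverse(compose(e, f))   # chosen so that v e f is the identity

GROUP = generated_group([e, f])
```

Dropping the requirement that the common value of $v^5$, $e^2$, $f^3$ and $vef$ be the identity is what enlarges this 60-element group to the 120-element group $\Gamma$.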

The dots in the $\mathrm{E}_8$ Dynkin diagram correspond naturally to conjugacy classes in $\Gamma$, not counting the conjugacy class of the central element $-1$. Each of these conjugacy classes, in turn, gives a copy of $\mathbb{C}\mathrm{P}^1$ in the minimal resolution of $\mathbb{C}^2/\Gamma$.

Not only the $\mathrm{E}_8$ Dynkin diagram, but also the $\mathrm{E}_8$ lattice, can be found in the minimal resolution of $\mathbb{C}^2/\Gamma$. Topologically, this space is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of $\mathbb{C}\mathrm{P}^1$ we have just seen, and it is a copy of the $\mathrm{E}_8$ lattice [KS].
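The intersection pattern can be turned into a matrix and examined directly. The sketch below builds the Cartan matrix of the $\mathrm{E}_8$ diagram, with 2 on the diagonal and $-1$ for each pair of dots joined by an edge, and computes its determinant exactly. The node numbering is an arbitrary choice made here. In the usual sign convention each sphere has self-intersection $-2$, so the intersection matrix is the negative of this one; that statement about conventions is background added for this sketch, not something spelled out in the text.

```python
from fractions import Fraction

# Nodes 0..6 form a chain and node 7 hangs off node 2, so the legs meeting
# at node 2 have lengths 3, 5 and 2 when node 2 itself is counted in each.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 7)]

def cartan_matrix(edges, n=8):
    """Matrix with 2 on the diagonal and -1 for each edge of the diagram."""
    m = [[2 if r == c else 0 for c in range(n)] for r in range(n)]
    for a, b in edges:
        m[a][b] = m[b][a] = -1
    return m

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(t) for t in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

CARTAN = cartan_matrix(EDGES)
DET = determinant(CARTAN)
```

The determinant comes out as 1, which says that the lattice spanned by the eight spheres is unimodular, as the $\mathrm{E}_8$ lattice must be.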

But let us turn to a more basic question: what is $\mathbb{C}^2/\Gamma$ like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

$$\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$$

where we let $\Gamma$ act by right multiplication on $\mathbb{H}$. So, it suffices to understand $\mathbb{H}/\Gamma$.

Next, note that sitting inside $\mathbb{H}/\Gamma$ are the points coming from the unit sphere in $\mathbb{H}$. These points form the 3-dimensional manifold $\mathrm{SU}(2)/\Gamma$, which is called the **Poincaré homology 3-sphere** [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to $\mathrm{E}_8$. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

$$\mathrm{SU}(2)/\Gamma \cong \mathrm{SO}(3)/\mathrm{A}_5.$$

The latter is just *the space of all icosahedra inscribed in the unit sphere in 3d space*, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of $\mathbb{H}/\Gamma$ coming from points in the unit sphere of $\mathbb{H}$. But every quaternion lies in *some* sphere centered at the origin of $\mathbb{H}$, of possibly zero radius. It follows that $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$ is the space of *all* icosahedra centered at the origin of 3d space: of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in $\mathbb{C}^2/\Gamma$. This is where $\mathrm{E}_8$ is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown, at least to me, how the two constructions of $\mathrm{E}_8$ from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group $\Gamma \subset \mathbb{H}$, took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the $\mathrm{E}_8$ lattice. Then we took $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the $\mathrm{E}_8$ lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.
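To make the first ingredient concrete, here is a minimal Python sketch that lists the 120 elements of $\Gamma$ as unit quaternions and checks numerically that they are closed under multiplication. The coordinate convention is an assumption on my part (one commonly quoted choice: the 24 Hurwitz units together with the even permutations of $\tfrac12(0, \pm 1, \pm\varphi^{-1}, \pm\varphi)$, where $\varphi$ is the golden ratio), and the helper names `qmul`, `key`, `is_even`, `binary_icosahedral_group` and `is_closed` are made up for illustration. It does not build the $\mathrm{E}_8$ lattice itself.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def qmul(a, b):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )


def key(q):
    """Rounded coordinates, used only to compare floating-point quaternions."""
    return tuple(round(x, 6) + 0.0 for x in q)


def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(
        1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def binary_icosahedral_group():
    """Return {rounded key: full-precision quaternion} for the assumed 120 units."""
    elems = {}
    # 8 units: +-1, +-i, +-j, +-k
    for idx in range(4):
        for s in (1.0, -1.0):
            q = [0.0] * 4
            q[idx] = s
            elems[key(q)] = tuple(q)
    # 16 units: (+-1 +-i +-j +-k) / 2
    for q in itertools.product((0.5, -0.5), repeat=4):
        elems[key(q)] = q
    # 96 units: even permutations of (0, +-1, +-1/phi, +-phi) / 2
    base = (0.0, 0.5, 1 / (2 * PHI), PHI / 2)
    for perm in itertools.permutations(range(4)):
        if not is_even(perm):
            continue
        for signs in itertools.product((1.0, -1.0), repeat=4):
            q = tuple(s * base[p] for s, p in zip(signs, perm))
            elems[key(q)] = q  # sign flips of the zero entry collapse here
    return elems


def is_closed(group):
    """Check that every pairwise product lands back in the set."""
    return all(
        key(qmul(a, b)) in group for a in group.values() for b in group.values()
    )


gamma = binary_icosahedral_group()
print(len(gamma), is_closed(gamma))
```

Full-precision coordinates are kept for the multiplication and rounding is used only for lookup, so that rounding errors do not accumulate in the closure test.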

#### Acknowledgements

I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.

#### Bibliography

[CS] J. H. Conway and N. J. A. Sloane, *Sphere Packings, Lattices and Groups*, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, *Proc. Camb. Phil. Soc.* **30**, 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, *Usp. Mat. Nauk.* **37** (1982), 139–159. Available at https://tinyurl.com/ybrn4pjq.

[Ki] A. Kirillov, *Quiver Representations and Quiver Varieties*, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, *Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree*, Trübner & Co., London, 1888. Available at https://archive.org/details/cu31924059413439.

[L] K. Lamotke, *Regular Solids and Isolated Singularities*, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available as arXiv:1308.0955.

[Sh] J. Shurman, *Geometry of the Quintic*, Wiley, New York, 1997. Available at http://people.reed.edu/~jerry/Quintic/quintic.html.

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in *Algebraic Geometry*, Lecture Notes in Mathematics **1008**, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, *Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E*, Master’s Thesis, University of Amsterdam, 2002. Available at http://math.ucr.edu/home/baez/joris_van_hoboken_platonic.pdf.

[V] M. Viazovska, The sphere packing problem in dimension 8, *Ann. Math.* **185** (2017), 991–1015. Available at https://arxiv.org/abs/1603.04246.

### The n-Category Cafe

Around 2008-9 we had several exchanges with Minhyong Kim here at the Café, in particular on his views of approaching number theory from a homotopic perspective, notably in the post Kim on Fundamental Groups in Number Theory. (See also the threads Afternoon Fishing and The Elusive Proteus.)

I even recall proposing a polymath project based on his ideas in Galois Theory in Two Variables. Something physics-like was in the air, and this seemed a good location with two mathematical physicists as hosts, John having extensively written on number theory in This Week’s Finds.

Nothing came of that, but it’s interesting to see Minhyong is very much in the news these days, including in a popular article in Quanta magazine, Secret Link Uncovered Between Pure Math and Physics.

The Quanta article has Minhyong saying:

“I was hiding it because for many years I was somewhat embarrassed by the physics connection,” he said. “Number theorists are a pretty tough-minded group of people, and influences from physics sometimes make them more skeptical of the mathematics.”

Café readers had an earlier alert from an interview I conducted with Minhyong, reported in Minhyong Kim in The Reasoner. There he was prepared to announce

The work that occupies me most right now, arithmetic homotopy theory, concerns itself very much with arithmetic moduli spaces that are similar in nature and construction to moduli spaces of solutions to the Yang-Mills equation.

Now his articles are appearing bearing explicit names such as ‘Arithmetic Chern-Simons theory’ (I and II), and today, we have Arithmetic Gauge Theory: A Brief Introduction.

What’s moved on in the intervening years from our side (‘our’ largely in the sense of the nLab) is an approach (very much due to Urs) to gauge field theory which looks to extract its abstract essence, and even to express this in the language of cohesive homotopy type theory, see nLab: geometry of physics. What I would love to know is how best to think of the deepest level of commonality between constructions deserving of the name ‘gauge theory’.

On the apparently non-physics side, who knows what depths might be reached if topological Langlands is ever worked out in stable homotopy theory, there being a gauge theoretic connection to geometric Langlands and even to the arithmetic version, as Minhyong remarks in his latest article:

We note also that the Langlands reciprocity conjecture … has as its goal the rewriting of arithmetic L-functions quite generally in terms of automorphic L-functions… it seems reasonable to expect the geometry of arithmetic gauge fields to play a key role in importing quantum field theoretic dualities to arithmetic geometry.

Perhaps the deepest idea would have to reflect the lack of uniformity in arithmetic. Minhyong writes in his latest paper about the action of $G_K = \mathrm{Gal}(\bar{K}, K)$:

The $G_K$-action is usually highly non-trivial, and this is a main difference from geometric gauge theory, where the gauge group tends to be constant over spacetime.

Even if orbifolds and singularities appear in the latter, maybe there’s still a difference. From a dilettantish wish to make sense of Buium and Borger’s nLab: arithmetic jet spaces, I had hoped that the geometric jet space constructions as beautifully captured by the nLab: jet comonad, might help. But arithmetic always seems a little obstructive, and one can’t quite reach the adjoint quadruples of the cohesive world: nLab: Borger’s absolute geometry. James Borger explained this to me as follows:

the usual, differential jet space of X can be constructed by gluing together jet spaces on open subsets, essentially because an infinitesimal arc can never leave an open subset. However, the analogous thing is not true for arithmetic jet spaces, because a Frobenius lift can jump outside an open set. So you can’t construct them by gluing local jet spaces together!

So plenty to explore. Where Minhyong speaks of arithmetic Euler-Lagrange equations, how does this compare with the jet comonadic version of Urs, outlined in Higher Prequantum Geometry II: The Principle of Extremal Action - Comonadically?

by david (d.corfield@kent.ac.uk) at December 22, 2017 12:59 AM

## December 15, 2017

### Andrew Jaffe - Leaves on the Line

It was announced this morning that the WMAP team has won the $3 million Breakthrough Prize. Unlike the Nobel Prize, which infamously is only awarded to three people each year, the Breakthrough Prize was awarded to the whole 27-member WMAP team, led by Chuck Bennett, Gary Hinshaw, Norm Jarosik, Lyman Page, and David Spergel, but including everyone through postdocs and grad students who worked on the project. This is great, and I am happy to send my hearty congratulations to all of them (many of whom I know well and am lucky to count as friends).

I actually knew about the prize last week as I was interviewed by Nature for an article about it. Luckily I didn’t have to keep the secret for long. Although I admit to a little envy, it’s hard to argue that the prize wasn’t deserved. WMAP was ideally placed to solidify the current standard model of cosmology, a Universe dominated by dark matter and dark energy, with strong indications that there was a period of cosmological inflation at very early times, which had several important observational consequences. First, it made the geometry of the Universe — as described by Einstein’s theory of general relativity, which links the contents of the Universe with its shape — flat. Second, it generated the tiny initial seeds which eventually grew into the galaxies that we observe in the Universe today (and the stars and planets within them, of course).

By the time WMAP released its first results in 2003, a series of earlier experiments (including MAXIMA and BOOMERanG, which I had the privilege of being part of) had gone much of the way toward this standard model. Indeed, about ten years ago one of my Imperial colleagues, Carlo Contaldi, and I wanted to make that comparison explicit, so we used what were then considered fancy Bayesian sampling techniques to combine the data from balloons and ground-based telescopes (which are collectively known as “sub-orbital” experiments) and compare the results to WMAP. We got a plot like the following (which we never published), showing the main quantity that these CMB experiments measure, called the power spectrum (which I’ve discussed in a little more detail here). The horizontal axis corresponds to the size of structures in the map (actually, its inverse, so smaller is to the right) and the vertical axis to how large the signal is on those scales.

As you can see, the suborbital experiments, en masse, had data at least as good as WMAP on most scales except the very largest (leftmost; this is because you really do need a satellite to see the entire sky) and indeed were able to probe smaller scales than WMAP (to the right). Since then, I’ve had the further privilege of being part of the Planck Satellite team, whose work has superseded all of these, giving much more precise measurements over all of these scales:

Am I jealous? Ok, a little bit.

But it’s also true, perhaps for entirely sociological reasons, that the community is more apt to trust results from a single, monolithic, very expensive satellite than an ensemble of results from a heterogeneous set of balloons and telescopes, run on (comparative!) shoestrings. On the other hand, the overall agreement amongst those experiments, and between them and WMAP, is remarkable.

And that agreement remains remarkable, even if much of the effort of the cosmology community is devoted to understanding the small but significant differences that remain, especially between one monolithic and expensive satellite (WMAP) and another (Planck). Indeed, those “real and serious” (to quote myself) differences would be hard to see even if I plotted them on the same graph. But since both are ostensibly measuring exactly the same thing (the CMB sky), any differences (even those much smaller than the error bars) must be accounted for, and almost certainly boil down to differences in the analyses or misunderstanding of each team’s own data. Somewhat more interesting are differences between CMB results and measurements of cosmology from other, very different, methods, but that’s a story for another day.

## December 14, 2017

### Robert Helling - atdotde

I would like to call it Summers' problem:

Let's have two real random variables $M$ and $F$ that are drawn according to two probability distributions $\rho_{M/F}(x)$ (for starters you may assume both to be Gaussians, possibly with different means and variances). Take $N$ draws from each and order the $2N$ results. What is the probability that the $k$ largest ones are all from $M$ rather than $F$? Express your results in terms of the $\rho_{M/F}(x)$. We are also interested in asymptotic results for $N$ large and $k$ fixed as well as $N$ and $k$ large but $k/N$ fixed.

Last bonus question: How many of the people that say that they hire only based on merit and end up with an all male board realise that by this they say that women are not as good by quite a margin?
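Before attempting the analytic answer, the probability in question can be explored numerically. The following is a minimal Monte Carlo sketch in Python, not a solution of the problem: the function name `prob_top_k_all_from_m`, the Gaussian parameters, the trial count and the seed are all assumptions chosen for illustration. It uses the observation that the $k$ largest of the $2N$ values all come from $M$ exactly when the $k$-th largest $M$ draw exceeds every $F$ draw.

```python
import math
import random


def prob_top_k_all_from_m(n, k, draw_m, draw_f, trials=20000, seed=0):
    """Estimate P(the k largest of 2n draws all come from M) by simulation.

    draw_m and draw_f take a random.Random instance and return one sample.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        m_sorted = sorted((draw_m(rng) for _ in range(n)), reverse=True)
        f_max = max(draw_f(rng) for _ in range(n))
        # The top k are all from M iff the k-th largest M beats the largest F.
        if m_sorted[k - 1] > f_max:
            hits += 1
    return hits / trials


n, k = 10, 3

# Sanity check: identical distributions, where symmetry gives C(n,k)/C(2n,k).
same = lambda rng: rng.gauss(0.0, 1.0)
estimate = prob_top_k_all_from_m(n, k, same, same)
exact_symmetric = math.comb(n, k) / math.comb(2 * n, k)
print(estimate, exact_symmetric)

# Hypothetical example: M has its mean shifted up by half a standard deviation.
shifted = lambda rng: rng.gauss(0.5, 1.0)
shifted_estimate = prob_top_k_all_from_m(n, k, shifted, same)
print(shifted_estimate)
```

When the two distributions coincide, every set of $k$ positions among the $2N$ ranks is equally likely, which is where the binomial ratio used as a check comes from. Running the sketch with different shifts and widths gives a feeling for how large a difference between the distributions is needed before an all-$M$ top $k$ stops being a rare event.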

by Robert Helling (noreply@blogger.com) at December 14, 2017 08:58 AM

## November 30, 2017

### Axel Maas - Looking Inside the Standard Model

by Axel Maas (noreply@blogger.com) at November 30, 2017 05:15 PM

## November 24, 2017

### Sean Carroll - Preposterous Universe

This year we give thanks for a simple but profound principle of statistical mechanics that extends the famous Second Law of Thermodynamics: the Jarzynski Equality. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, and the speed of light.)

The Second Law says that entropy increases in closed systems. But really it says that entropy *usually* increases; thermodynamics is the limit of statistical mechanics, and in the real world there can be rare but inevitable fluctuations around the typical behavior. The Jarzynski Equality is a way of quantifying such fluctuations, which is increasingly important in the modern world of nanoscale science and biophysics.

Our story begins, as so many thermodynamic tales tend to do, with manipulating a piston containing a certain amount of gas. The gas is of course made of a number of jiggling particles (atoms and molecules). All of those jiggling particles contain energy, and we call the total amount of that energy the internal energy *U* of the gas. Let’s imagine the whole thing is embedded in an environment (a “heat bath”) at temperature *T*. That means that the gas inside the piston starts at temperature *T*, and after we manipulate it a bit and let it settle down, it will relax back to *T* by exchanging heat with the environment as necessary.

Finally, let’s divide the internal energy into “useful energy” and “useless energy.” The useful energy, known to the cognoscenti as the (Helmholtz) free energy and denoted by *F*, is the amount of energy potentially available to do useful work. For example, the pressure in our piston may be quite high, and we could release it to push a lever or something. But there is also useless energy, which is just the entropy *S* of the system times the temperature *T.* That expresses the fact that once energy is in a highly-entropic form, there’s nothing useful we can do with it any more. So the total internal energy is the free energy plus the useless energy,

$$U = F + TS. \tag{1}$$

Our piston starts in a boring equilibrium configuration *a*, but we’re not going to let it just sit there. Instead, we’re going to push in the piston, decreasing the volume inside, ending up in configuration *b*. This squeezes the gas together, and we expect that the total amount of energy will go up. It will typically cost us energy to do this, of course, and we refer to that energy as the work $W_{ab}$ we do when we push the piston from *a* to *b*.

Remember that when we’re done pushing, the system might have heated up a bit, but we let it exchange heat *Q* with the environment to return to the temperature *T*. So three things happen when we do our work on the piston: (1) the free energy of the system changes; (2) the entropy changes, and therefore the useless energy; and (3) heat is exchanged with the environment. In total we have

$$W_{ab} = \Delta F_{ab} + T\Delta S_{ab} - Q_{ab}. \tag{2}$$

(There is no Δ*T*, because *T* is the temperature of the environment, which stays fixed.) The Second Law of Thermodynamics says that entropy increases (or stays constant) in closed systems. Our system isn’t closed, since it might leak heat to the environment. But really the Second Law says that the total of the last two terms on the right-hand side of this equation add up to a positive number; in other words, the increase in entropy will more than compensate for the loss of heat. (Alternatively, you can lower the entropy of a bottle of champagne by putting it in a refrigerator and letting it cool down; no laws of physics are violated.) One way of stating the Second Law for situations such as this is therefore

$$W_{ab} \geq \Delta F_{ab}. \tag{3}$$

The work we do on the system is greater than or equal to the change in free energy from beginning to end. We can make this inequality into an equality if we act as efficiently as possible, minimizing the entropy/heat production: that’s an *adiabatic* process, and in practical terms amounts to moving the piston as gradually as possible, rather than giving it a sudden jolt. That’s the limit in which the process is reversible: we can get the same energy out as we put in, just by going backwards.

Awesome. But the language we’re speaking here is that of classical thermodynamics, which we all know is the limit of statistical mechanics when we have many particles. Let’s be a little more modern and open-minded, and take seriously the fact that our gas is actually a collection of particles in random motion. Because of that randomness, there will be *fluctuations* over and above the “typical” behavior we’ve been describing. Maybe, just by chance, all of the gas molecules happen to be moving away from our piston just as we move it, so we don’t have to do any work at all; alternatively, maybe there are more than the usual number of molecules hitting the piston, so we have to do more work than usual. The Jarzynski Equality, derived 20 years ago by Christopher Jarzynski, is a way of saying something about those fluctuations.

One simple way of taking our thermodynamic version of the Second Law (3) and making it still hold true in a world of fluctuations is simply to say that it holds true on average. To denote an average over all possible things that could be happening in our system, we write angle brackets around the quantity in question. So a more precise statement would be that the *average* work we do is greater than or equal to the change in free energy:

$$\langle W_{ab}\rangle \geq \Delta F_{ab}. \tag{4}$$

(We don’t need angle brackets around Δ*F*, because *F* is determined completely by the equilibrium properties of the initial and final states *a* and *b*; it doesn’t fluctuate.) Let me multiply both sides by -1, which means we need to flip the inequality sign to go the other way around:

$$-\langle W_{ab}\rangle \leq -\Delta F_{ab}. \tag{5}$$

Next I will exponentiate both sides of the inequality. Note that this keeps the inequality sign going the same way, because the exponential is a monotonically increasing function; if *x* is less than *y*, we know that $e^x$ is less than $e^y$.

$$e^{-\langle W_{ab}\rangle} \leq e^{-\Delta F_{ab}}. \tag{6}$$

(More typically we will see the exponents divided by *kT*, where *k* is Boltzmann’s constant, but for simplicity I’m using units where *kT* = 1.)

Jarzynski’s equality is the following remarkable statement: in equation (6), if we exchange the exponential of the average work $e^{-\langle W\rangle}$ for the average of the exponential of the work $\langle e^{-W}\rangle$, we get a precise **equality**, not merely an inequality:

$$\langle e^{-W_{ab}}\rangle = e^{-\Delta F_{ab}}. \tag{7}$$

That’s the Jarzynski Equality: the average, over many trials, of the exponential of minus the work done, is equal to the exponential of minus the free-energy difference between the initial and final states. It’s a stronger statement than the Second Law, just because it’s an equality rather than an inequality.

In fact, we can *derive* the Second Law from the Jarzynski equality, using a math trick known as Jensen’s inequality. For our purposes, this says that the exponential of an average is less than or equal to the average of an exponential, $e^{\langle x\rangle} \leq \langle e^{x}\rangle$. Thus we immediately get

$$e^{-\langle W_{ab}\rangle} \leq \langle e^{-W_{ab}}\rangle = e^{-\Delta F_{ab}}, \tag{8}$$

as we had before. Then just take the log of both sides to get $\langle W_{ab}\rangle \geq \Delta F_{ab}$, which is one way of writing the Second Law.

So what does it mean? As we said, because of fluctuations, the work we needed to do on the piston will sometimes be a bit less than or a bit greater than the average, and the Second Law says that the average will be greater than the difference in free energies from beginning to end. Jarzynski’s Equality says there is a quantity, the exponential of minus the work, that averages out to be exactly the exponential of minus the free-energy difference. The function $e^{-W}$ is convex and decreasing as a function of *W*. A fluctuation where *W* is lower than average, therefore, contributes a greater shift to the average of $e^{-W}$ than a corresponding fluctuation where *W* is higher than average. To satisfy the Jarzynski Equality, we must have more fluctuations upward in *W* than downward in *W*, by a precise amount. So on average, we’ll need to do more work than the difference in free energies, as the Second Law implies.
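A toy numerical illustration may help here. The sketch below is in Python and assumes, purely for illustration, that the work values are Gaussian with mean `mu` and standard deviation `sigma` in units where *kT* = 1; this is not a simulation of an actual piston, and the function name `jarzynski_free_energy` is made up. For a Gaussian, the average of $e^{-W}$ is $e^{-\mu + \sigma^2/2}$, so the equality would give a free-energy difference of $\mu - \sigma^2/2$, which sits below the average work $\mu$ as the Second Law requires.

```python
import math
import random


def jarzynski_free_energy(work_samples):
    """Estimate Delta F (units of kT) from work samples via <exp(-W)> = exp(-Delta F)."""
    mean_exp = sum(math.exp(-w) for w in work_samples) / len(work_samples)
    return -math.log(mean_exp)


# Assumed toy work distribution: Gaussian with mean mu and width sigma.
mu, sigma = 2.0, 1.0
rng = random.Random(1)
samples = [rng.gauss(mu, sigma) for _ in range(200000)]

delta_f = jarzynski_free_energy(samples)
mean_work = sum(samples) / len(samples)
gaussian_prediction = mu - sigma**2 / 2

print(delta_f, gaussian_prediction, mean_work)
```

The estimate is dominated by the rare trials with unusually low work, which is the convexity argument above in numerical form. It is also why this kind of average converges slowly when the work distribution is broad.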

It’s a remarkable thing, really. Much of conventional thermodynamics deals with inequalities, with equality being achieved only in adiabatic processes happening close to equilibrium. The Jarzynski Equality is fully non-equilibrium, achieving equality no matter how dramatically we push around our piston. It tells us not only about the average behavior of statistical systems, but about the full ensemble of possibilities for individual trajectories around that average.

The Jarzynski Equality has launched a mini-revolution in nonequilibrium statistical mechanics, the news of which hasn’t quite trickled to the outside world as yet. It’s one of a number of relations, collectively known as “fluctuation theorems,” which also include the Crooks Fluctuation Theorem, not to mention our own Bayesian Second Law of Thermodynamics. As our technological and experimental capabilities reach down to scales where the fluctuations become important, our theoretical toolbox has to keep pace. And that’s happening: the Jarzynski equality isn’t just imagination, it’s been experimentally tested and verified. (Of course, I remain just a poor theorist myself, so if you want to understand this image from the experimental paper, you’ll have to talk to someone who knows more about Raman spectroscopy than I do.)

## November 16, 2017

## November 09, 2017

### Robert Helling - atdotde

By Original upload by en:User:Tbower - USGS animation A08, Public Domain, Link

In fact, that was only the last in a series of supercontinents, that keep forming and breaking up in the "supercontinent cycle".

By SimplisticReps - Own work, CC BY-SA 4.0, Link

So here is the question: I am happy with the idea of several (say $N$) plates roughly containing a continent each that are floating around on the magma driven by all kinds of convection processes in the liquid part of the earth. They are moving around in a pattern that looks to me to be pretty chaotic (in the non-technical sense) and of course for random motion you would expect that from time to time two of those collide and then maybe stick for a while.

Then it would be possible that also a third plate collides with the two but that would be a coincidence (like two random lines typically intersect but if you have three lines they would typically intersect in pairs but typically not in a triple intersection). But to form a supercontinent, you need all $N$ plates to miraculously collide at the same time. This order-$N$ process seems to be highly unlikely when random, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.

So, why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

One explanation could for example be that during those times, the center of mass of the earth is not in the symmetry center so the water of the oceans flows to one side of the earth and reveals the seabed on the opposite side of the earth. Then you would have essentially one big island. But this seems not to be the case as the continents (those parts that are above sea-level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side and the land on the other goes under water but the land masses actually move around to meet on one side.

I have already asked this question whenever I ran into people with a geosciences education but it is still open (and I have to admit that in a non-zero number of cases I failed to even make clear that an $N$-body collision needs an explanation). But I am sure you, my readers, know the answer or, even better, can come up with one.

by Robert Helling (noreply@blogger.com) at November 09, 2017 09:35 AM

## October 28, 2017

## October 24, 2017

### Andrew Jaffe - Leaves on the Line

The first direct detection of gravitational waves was announced in February of 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood by S Chandrasekhar in the early years of the 20th Century, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by similar physics to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.

(Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.

A series of other papers discuss those results in more detail, covering the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). As a cosmologist, I found the most exciting of the results to be the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “Standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose *intrinsic* brightness or size is known and so whose distances can be measured by observations of their *apparent* brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (further objects look dimmer, or here sound quieter); the expansion of the Universe also causes the frequency of the waves to decrease: this is the cosmological redshift that we observe in the spectra of distant objects’ light.

Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift $z = 0.009$, putting it about 40 megaparsecs away, relatively close on cosmological scales.

But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: $cz = H_0 d$, where $c$ is the speed of light, $z$ is the redshift just mentioned, and $d$ is the distance measured from the gravitational wave burst itself. This just leaves $H_0$, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the ladder (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the *Planck* Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/sec/Mpc whereas *Planck* gives 67.81 ± 0.92 km/sec/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.

Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), the error on the distance to the object blows up, due to the Bayesian marginalisation over this unknown parameter (just as the *Planck* measurement requires marginalization over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled, which adds further to the errors.

This procedure gives a final measurement of $70.0^{+12}_{-8.0}$ km/sec/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the *Planck* and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.

[Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]
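The numbers quoted in this post can be checked with a few lines of Python. This is only a rough sketch: it plugs the rounded redshift and distance into Hubble's law and compares the distance-ladder and *Planck* values assuming independent Gaussian errors added in quadrature. The function names are illustrative, and the naive estimate ignores the inclination and peculiar-velocity modelling that the real analysis has to do.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def hubble_constant(z, d_mpc):
    """Naive low-redshift estimate H0 = c*z/d, in km/s/Mpc."""
    return C_KM_S * z / d_mpc


def tension_sigma(a, sigma_a, b, sigma_b):
    """Gap between two measurements in units of their combined error.

    Assumes independent Gaussian errors, added in quadrature.
    """
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


# Rounded values quoted in the post: z = 0.009 and d of about 40 Mpc.
h0_naive = hubble_constant(z=0.009, d_mpc=40.0)

# Distance ladder (73.24 +/- 1.74) versus Planck (67.81 +/- 0.92).
ladder_vs_planck = tension_sigma(73.24, 1.74, 67.81, 0.92)

print(f"naive H0 from the rounded values: {h0_naive:.1f} km/s/Mpc")
print(f"distance ladder vs Planck: {ladder_vs_planck:.1f} sigma")
```

With these rounded inputs the naive estimate comes out at roughly 67 km/s/Mpc, comfortably inside the quoted error bars, and the ladder and *Planck* values differ by a little under three sigma.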

## October 17, 2017

### Matt Strassler - Of Particular Significance

Yesterday’s post on the results from the LIGO/VIRGO network of gravitational wave detectors was aimed at getting information out, rather than providing the pedagogical backdrop. Today I’m following up with a post that attempts to answer some of the questions that my readers and my personal friends asked me. Some wanted to understand better how to visualize what had happened, while others wanted more clarity on why the discovery was so important. So I’ve put together a post which (1) explains what neutron stars and black holes are and what their mergers are like, (2) clarifies why yesterday’s announcement was important (there were many reasons, which is why it’s hard to reduce it all to a single soundbite), and (3) answers some miscellaneous questions at the end.

First, a disclaimer: I am *not* an expert in the very complex subject of neutron star mergers and the resulting explosions, called kilonovas. These are much more complicated than black hole mergers. I am still learning some of the details. Hopefully I’ve avoided errors, but you’ll notice a few places where I don’t know the answers … yet. Perhaps my more expert colleagues will help me fill in the gaps over time.

Please, if you spot any errors, don’t hesitate to comment!! And feel free to ask additional questions whose answers I can add to the list.

**BASIC QUESTIONS ABOUT NEUTRON STARS, BLACK HOLES, AND MERGERS**

**What are neutron stars and black holes, and how are they related?**

Every atom is made from a tiny atomic nucleus, made of neutrons and protons (which are very similar), and loosely surrounded by electrons. Most of an atom is empty space, so it can, under extreme circumstances, be crushed — but only if every electron and proton convert to a neutron (which remains behind) and a neutrino (which heads off into outer space.) When a giant star runs out of fuel, the pressure from its furnace turns off, and it collapses inward under its own weight, creating just those extraordinary conditions in which the matter can be crushed. Thus: a star’s interior, with a mass one to several times the Sun’s mass, is all turned into a several-mile(kilometer)-wide ball of neutrons — the number of neutrons approaching a 1 with 57 zeroes after it.
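The "1 with 57 zeroes" figure is easy to check. The sketch below simply divides one solar mass by the mass of a single neutron; the constants are standard rounded values, the function name is illustrative, and the small correction from binding energy is ignored.

```python
M_SUN_KG = 1.989e30       # mass of the Sun in kilograms (rounded)
M_NEUTRON_KG = 1.675e-27  # mass of one neutron in kilograms (rounded)


def neutron_count(mass_in_suns):
    """Approximate number of neutrons in a ball of the given mass."""
    return mass_in_suns * M_SUN_KG / M_NEUTRON_KG


n_one_sun = neutron_count(1.0)
print(f"neutrons in one solar mass: {n_one_sun:.1e}")
```

For one solar mass this gives a little over 10^57, so a ball of one to several solar masses does indeed hold a number of neutrons of that order.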

If the star is big but not too big, the neutron ball stiffens and holds its shape, and the star explodes outward, blowing itself to pieces in what is called a core-collapse supernova. The ball of neutrons remains behind; this is what we call a neutron star. It’s a ball of the densest material that we know can exist in the universe: a pure atomic nucleus many miles (kilometers) across. It has a very hard surface; if you tried to go inside a neutron star, your experience would be a lot worse than running into a closed door at a hundred miles per hour.

If the star is very big indeed, the neutron ball that forms may immediately (or soon) collapse under its own weight, forming a black hole. A supernova may or may not result in this case; the star might just disappear. A black hole is very, very different from a neutron star. Black holes are what’s left when matter collapses irretrievably upon itself under the pull of gravity, shrinking down endlessly. While a neutron star has a surface that you could smash your head on, a black hole has no surface; it has an edge that is simply a point of no return, called a horizon. In Einstein’s theory, you can just go right through, as if passing through an open door. You won’t even notice the moment you go in. *[Note: this is true in Einstein’s theory. But there is a big controversy as to whether the combination of Einstein’s theory with quantum physics changes the horizon into something novel and dangerous to those who enter; this is known as the firewall controversy, and would take us too far afield into speculation.]* But once you pass through that door, you can never return.

Black holes can form in other ways too, but not those that we’re observing with the LIGO/VIRGO detectors.

**Why are their mergers the best sources for gravitational waves?**

One of the easiest and most obvious ways to make gravitational waves is to have two objects orbiting each other. If you put your two fists in a pool of water and move them around each other, you’ll get a pattern of water waves spiraling outward; this is in rough (*very* rough!) analogy to what happens with two orbiting objects, although, since the objects are moving in space, the waves aren’t in a material like water. They are waves in space itself.

To get powerful gravitational waves, you want objects each with a very big mass that are orbiting around each other at very high speed. To get the fast motion, you need the force of gravity between the two objects to be strong; and to get gravity to be as strong as possible, you need the two objects to be as close as possible (since, as Isaac Newton already knew, gravity between two objects grows stronger when the distance between them shrinks.) But if the objects are large, they can’t get too close; they will bump into each other and merge long before their orbit can become fast enough. So to get a really fast orbit, you need **two relatively small objects, each with a relatively big mass** — what scientists refer to as **compact objects**. Neutron stars and black holes are the most compact objects we know about. Fortunately, they do indeed often travel in orbiting pairs, and do sometimes, for a very brief period before they merge, orbit rapidly enough to produce gravitational waves that LIGO and VIRGO can observe.

**Why do we find these objects in pairs in the first place?**

Stars very often travel in pairs… they are called binary stars. They can start their lives in pairs, forming together in large gas clouds, or even if they begin solitary, they can end up pairing up if they live in large densely packed communities of stars where it is common for multiple stars to pass nearby. Perhaps surprisingly, their pairing can survive the collapse and explosion of either star, leaving two black holes, two neutron stars, or one of each in orbit around one another.

**What happens when these objects merge?**

Not surprisingly, there are three classes of mergers which can be detected: two black holes merging, two neutron stars merging, and a neutron star merging with a black hole. The first class was observed in 2015 (and announced in 2016), the second was announced yesterday, and it’s a matter of time before the third class is observed. The two objects may orbit each other for billions of years, very slowly radiating gravitational waves (an effect observed in the 70’s, leading to a Nobel Prize) and gradually coming closer and closer together. Only in the last day of their lives do their orbits really start to speed up. And just before these objects merge, they begin to orbit each other once per second, then ten times per second, then a hundred times per second. Visualize that if you can: objects a few dozen miles (kilometers) across, a few miles (kilometers) apart, each with the mass of the Sun or greater, orbiting each other *100 times each second*. It’s truly mind-boggling — a spinning dumbbell beyond the imagination of even the greatest minds of the 19th century. I don’t know any scientist who isn’t awed by this vision. It all sounds like science fiction. But it’s not.

**How do we know this isn’t science fiction?**

We know, if we believe Einstein’s theory of gravity (and I’ll give you a very good reason to believe in it in just a moment). Einstein’s theory predicts that such a rapidly spinning, large-mass dumbbell formed by two orbiting compact objects will produce a telltale pattern of ripples in space itself: gravitational waves. That pattern is both complicated and precisely predicted. In the case of black holes, the predictions go right up to and past the moment of merger, to the ringing of the larger black hole that forms in the merger. In the case of neutron stars, the instants just before, during and after the merger are more complex and we can’t yet be confident we understand them, but during the tens of seconds before the merger Einstein’s theory is very precise about what to expect. The theory further predicts how those ripples will cross the vast distances from where they were created to the location of the Earth, and how they will appear in the LIGO/VIRGO network of three gravitational wave detectors. The prediction of what to expect at LIGO/VIRGO thus involves not just one prediction but many: the theory is used to predict the existence and properties of black holes and of neutron stars, the detailed features of their mergers, the precise patterns of the resulting gravitational waves, and how those gravitational waves cross space. That LIGO/VIRGO have detected the telltale patterns of these gravitational waves, and that these wave patterns agree with Einstein’s theory in every detail, is the strongest evidence ever obtained that there is nothing wrong with Einstein’s theory when used in these combined contexts. That then in turn gives us confidence that our interpretation of the LIGO/VIRGO results is correct, confirming that black holes and neutron stars really exist and really merge.
*(Notice the reasoning is slightly circular… but that’s how scientific knowledge proceeds, as a set of detailed consistency checks that gradually and eventually become so tightly interconnected as to be almost impossible to unwind. Scientific reasoning is not deductive; it is inductive. We do it not because it is logically ironclad but because it works so incredibly well — as witnessed by the computer, and its screen, that I’m using to write this, and the wired and wireless internet and computer disk that will be used to transmit and store it.)*

**THE SIGNIFICANCE(S) OF YESTERDAY’S ANNOUNCEMENT OF A NEUTRON STAR MERGER**

What makes it difficult to explain the significance of yesterday’s announcement is that it consists of many important results piled up together, rather than a simple takeaway that can be reduced to a single soundbite. (That was also true of the black hole mergers announcement back in 2016, which is why I wrote a long post about it.)

So here is a list of important things we learned. No one of them, by itself, is earth-shattering, but each one is profound, and taken together they form a major event in scientific history.

**First confirmed observation of a merger of two neutron stars**: We’ve known these mergers must occur, but there’s nothing like being sure. And since these things are too far away and too small to see in a telescope, *the only way to be sure these mergers occur, and to learn more details about them, is with gravitational waves*. We expect to see many more of these mergers in coming years as gravitational wave astronomy increases in its sensitivity, and we will learn more and more about them.

**New information about the properties of neutron stars:** Neutron stars were proposed almost a hundred years ago and were confirmed to exist in the 60’s and 70’s. But their precise details aren’t known; we believe they are like a giant atomic nucleus, but they’re so vastly larger than ordinary atomic nuclei that we can’t be sure we understand all of their internal properties, and there are debates in the scientific community that can’t be easily answered… until, perhaps, now.

From the detailed pattern of the gravitational waves of this one neutron star merger, scientists have already learned two things. First, we confirm that Einstein’s theory correctly predicts the basic pattern of gravitational waves from orbiting neutron stars, as it does for orbiting and merging black holes. Unlike with black holes, however, there are more questions about what happens to neutron stars when they merge. The question of what happened to this pair after they merged is still open: did they form a neutron star, an unstable neutron star that, slowing its spin, eventually collapsed into a black hole, or a black hole straightaway?

But something important was already learned about the internal properties of neutron stars. The stresses of being whipped around at such incredible speeds would tear you and me apart, and would even tear the Earth apart. We know neutron stars are much tougher than ordinary rock, but how much more? If they were too flimsy, they’d have broken apart at some point during LIGO/VIRGO’s observations, and the simple pattern of gravitational waves that was expected would have suddenly become much more complicated. That didn’t happen until perhaps just before the merger. So scientists can use the simplicity of the pattern of gravitational waves to infer some new things about how stiff and strong neutron stars are. More mergers will improve our understanding. Again, *there is no other simple way to obtain this information.*

**First visual observation of an event that produces both immense gravitational waves and bright electromagnetic waves:** Black hole mergers aren’t expected to create a brilliant light display, because, as I mentioned above, they’re more like open doors to an invisible playground than they are like rocks, so they merge rather quietly, without a big bright and hot smash-up. But neutron stars are big balls of stuff, and so the smash-up can indeed create lots of heat and light of all sorts, just as you might naively expect. By “light” I mean not just visible light but all forms of electromagnetic waves, at all wavelengths (and therefore at all frequencies.) Scientists divide up the range of electromagnetic waves into categories. These categories are radio waves, microwaves, infrared light, visible light, ultraviolet light, X-rays, and gamma rays, listed from lowest frequency and largest wavelength to highest frequency and smallest wavelength. (Note that these categories and the dividing lines between them are completely arbitrary, but the divisions are useful for various scientific purposes. The **only** fundamental difference between yellow light, a radio wave, and a gamma ray is the wavelength and frequency; otherwise they’re exactly the same type of thing, a wave in the electric and magnetic fields.)

So if and when two neutron stars merge, we expect both gravitational waves and electromagnetic waves, the latter of many different frequencies created by many different effects that can arise when two huge balls of neutrons collide. But just because we expect them doesn’t mean they’re easy to see. These mergers are pretty rare — perhaps one every hundred thousand years in each big galaxy like our own — so the ones we find using LIGO/VIRGO will generally be very far away. If the light show is too dim, none of our telescopes will be able to see it.

But this light show was plenty bright. Gamma ray detectors out in space detected it instantly, confirming that the gravitational waves from the two neutron stars led to a collision and merger that produced very high frequency light. Already, that’s a first. It’s as though one had seen lightning for years but never heard thunder; or as though one had observed the waves from hurricanes for years but never observed one in the sky. Seeing both allows us a whole new set of perspectives; one plus one is often much more than two.

Over time — hours and days — effects were seen in visible light, ultraviolet light, infrared light, X-rays and radio waves. Some were seen earlier than others, which itself is a story, but each one contributes to our understanding of what these mergers are actually like.

**Confirmation of the best guess concerning the origin of “short” gamma ray bursts**: For many years, bursts of gamma rays have been observed in the sky. Among them, there seems to be a class of bursts that are shorter than most, typically lasting just a couple of seconds. They come from all across the sky, indicating that they come from distant intergalactic space, presumably from distant galaxies. Among other explanations, the most popular hypothesis concerning these short gamma-ray bursts has been that they come from merging neutron stars. *The only way to confirm this hypothesis is with the observation of the gravitational waves from such a merger.* That test has now been passed; it appears that the hypothesis is correct. That in turn means that we have, for the first time, both a good explanation of these short gamma ray bursts and, because we know how often we observe these bursts, a good estimate as to how often neutron stars merge in the universe.

**First distance measurement to a source using both a gravitational wave measure and a redshift in electromagnetic waves, allowing a new calibration of the distance scale of the universe and of its expansion rate:** The pattern over time of the gravitational waves from a merger of two black holes or neutron stars is complex enough to reveal many things about the merging objects, including a rough estimate of their masses and the orientation of the spinning pair relative to the Earth. The overall strength of the waves, combined with the knowledge of the masses, reveals how far the pair is from the Earth. That by itself is nice, but the real win comes when the discovery of the object using visible light, or in fact any light with frequency below gamma-rays, can be made. In this case, the galaxy that contains the neutron stars can be determined.

Once we know the host galaxy, we can do something really important. We can, by looking at the starlight, determine how rapidly the galaxy is moving away from us. For distant galaxies, the speed at which the galaxy recedes should be related to its distance because the universe is expanding.

How rapidly the universe is expanding has been recently measured with remarkable precision, but the problem is that there are two different methods for making the measurement, **and they disagree**. This disagreement is one of the most important problems for our understanding of the universe. Maybe one of the measurement methods is flawed, or maybe — and this would be much more interesting — the universe simply doesn’t behave the way we think it does.

What gravitational waves do is give us a third method: the gravitational waves directly provide the distance to the galaxy, and the electromagnetic waves directly provide the speed of recession. *There is no other way to make this type of joint measurement directly for distant galaxies.* The method is not accurate enough to be useful in just one merger, but once dozens of mergers have been observed, the average result will provide important new information about the universe’s expansion. When combined with the other methods, it may help resolve this all-important puzzle.
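To see why dozens of mergers help where a single one does not, here is a toy simulation in Python. Everything in it is an assumption made for illustration: a "true" expansion rate of 70 km/s/Mpc, a 15% Gaussian scatter on each single-merger estimate, and independent events that are simply averaged. The real analyses combine full probability curves rather than taking a plain mean.

```python
import math
import random
import statistics

TRUE_H0 = 70.0             # km/s/Mpc, assumed "true" value for the toy model
SINGLE_EVENT_ERROR = 0.15  # assumed fractional scatter of one merger's estimate


def simulate_h0_estimates(n_events, rng):
    """Draw one noisy expansion-rate estimate per simulated merger."""
    sigma = SINGLE_EVENT_ERROR * TRUE_H0
    return [rng.gauss(TRUE_H0, sigma) for _ in range(n_events)]


def expected_error(n_events):
    """Error on the average of n independent events: sigma / sqrt(n)."""
    return SINGLE_EVENT_ERROR * TRUE_H0 / math.sqrt(n_events)


rng = random.Random(2017)  # fixed seed so the run is repeatable
for n in (1, 10, 50):
    average = statistics.fmean(simulate_h0_estimates(n, rng))
    print(f"{n:3d} mergers: average {average:5.1f}, "
          f"expected error +/- {expected_error(n):.1f} km/s/Mpc")
```

Under these assumptions the expected error falls from about 10 km/s/Mpc for one merger to about 1.5 km/s/Mpc for fifty, which is the regime where the comparison with the other two methods starts to become interesting.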

**Best test so far of Einstein’s prediction that the speed of light and the speed of gravitational waves are identical**: Since gamma rays from the merger and the peak of the gravitational waves arrived within two seconds of one another after traveling 130 million years — that is, about 5 thousand million million seconds — we can say that the speed of light and the speed of gravitational waves are both equal to the cosmic speed limit to within one part in 2 thousand million million. *Such a precise test requires the combination of gravitational wave and gamma ray observations.*
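The arithmetic behind that bound can be sketched in a few lines of Python. The assumption, made here only for illustration, is that the gamma rays and the gravitational waves set off at essentially the same moment, so that the arrival delay divided by the total travel time gives the fractional difference in speed. The function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds


def fractional_speed_difference(delay_s, travel_time_yr):
    """Upper bound on |v_gw - c| / c from an arrival delay.

    If both signals cover the same distance, a small difference in speed
    shows up as delay / travel_time (to first order).
    """
    return delay_s / (travel_time_yr * SECONDS_PER_YEAR)


travel_s = 130e6 * SECONDS_PER_YEAR
bound = fractional_speed_difference(delay_s=2.0, travel_time_yr=130e6)

print(f"travel time: {travel_s:.1e} s")
print(f"speeds agree to one part in {1 / bound:.1e}")
```

The travel time comes out at about 4 × 10^15 seconds, and the two speeds agree to about one part in 2 × 10^15, matching the round figures above.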

**Efficient production of heavy elements confirmed**: It’s long been said that we are star-stuff, or stardust, and it’s been clear for a long time that it’s true. But there’s been a puzzle when one looks into the details. While it’s known that all the chemical elements from hydrogen up to iron are formed inside of stars, and can be blasted into space in supernova explosions to drift around and eventually form planets, moons, and humans, it hasn’t been quite as clear how the other elements with heavier atoms — atoms such as iodine, cesium, gold, lead, bismuth, uranium and so on — predominantly formed. Yes they can be formed in supernovas, but not so easily; and there seem to be more atoms of heavy elements around the universe than supernovas can explain. There are many supernovas in the history of the universe, but the efficiency for producing heavy chemical elements is just too low.

It was proposed some time ago that the mergers of neutron stars might be a suitable place to produce these heavy elements. Even though these mergers are rare, they might be much more efficient, because the nuclei of heavy elements contain lots of neutrons and, not surprisingly, a collision of two neutron stars would produce lots of neutrons in its debris, suitable perhaps for making these nuclei. A key indication that this is going on would be the following: if a neutron star merger could be identified using gravitational waves, and if its location could be determined using telescopes, then one would observe a pattern of light that would be characteristic of what is now called a “kilonova” explosion. *Warning: I don’t yet know much about kilonovas and I may be leaving out important details.* A kilonova is powered by the process of forming heavy elements; most of the nuclei produced are initially radioactive (i.e., unstable) and they break down by emitting high energy particles, including the particles of light (called photons) which are in the gamma ray and X-ray categories. The resulting characteristic glow would be expected to have a pattern of a certain type: it would be initially bright but would dim rapidly in visible light, with a long afterglow in infrared light. The reasons for this are complex, so let me set them aside for now. The important point is that this pattern was observed, confirming that a kilonova of this type occurred, and thus that, in this neutron star merger, enormous amounts of heavy elements were indeed produced. So we now have a lot of evidence, for the first time, that almost all the heavy chemical elements on and around our planet were formed in neutron star mergers. Again, *we could not know this if we did not know that this was a neutron star merger, and that information comes only from the gravitational wave observation.*

**MISCELLANEOUS QUESTIONS**

**Did the merger of these two neutron stars result in a new black hole, a larger neutron star, or an unstable rapidly spinning neutron star that later collapsed into a black hole?**

We don’t yet know, and maybe we won’t know. Some scientists involved appear to be leaning toward the possibility that a black hole was formed, but others seem to say the jury is out. I’m not sure what additional information can be obtained over time about this.

**If the two neutron stars formed a black hole, why was there a kilonova? Why wasn’t everything sucked into the black hole?**

Black holes aren’t vacuum cleaners; they pull things in via gravity just the same way that the Earth and Sun do, and don’t suck things in some unusual way. The only crucial thing about a black hole is that once you go in you can’t come out. But just as when trying to avoid hitting the Earth or Sun, you can avoid falling in if you orbit fast enough or if you’re flung outward before you reach the edge.

The point in a neutron star merger is that the forces at the moment of merger are so intense that one or both neutron stars are partially ripped apart. The material that is thrown outward in all directions, at an immense speed, somehow creates the bright, hot flash of gamma rays and eventually the kilonova glow from the newly formed atomic nuclei. Those details I don’t yet understand, but I know they have been carefully studied both with approximate equations and in computer simulations such as this one and this one. However, *the accuracy of the simulations can only be confirmed through the detailed studies of a merger, such as the one just announced.* It seems, from the data we’ve seen, that the simulations did a fairly good job. I’m sure they will be improved once they are compared with the recent data.

## October 16, 2017

### Sean Carroll - Preposterous Universe

Everyone is rightly excited about the latest gravitational-wave discovery. The LIGO observatory, recently joined by its European partner VIRGO, had previously seen gravitational waves from coalescing black holes. Which is super-awesome, but also a bit lonely — black holes are black, so we detect the gravitational waves and little else. Since our current gravitational-wave observatories aren’t very good at pinpointing source locations on the sky, we’ve been completely unable to say which galaxy, for example, the events originated in.

This has changed now, as we’ve launched the era of “multi-messenger astronomy,” detecting both gravitational and electromagnetic radiation from a single source. The event was the merger of two neutron stars, rather than black holes, and all that matter coming together in a giant conflagration lit up the sky in a large number of wavelengths simultaneously.

Look at all those different observatories, and all those wavelengths of electromagnetic radiation! Radio, infrared, optical, ultraviolet, X-ray, and gamma-ray — soup to nuts, astronomically speaking.

A lot of cutting-edge science will come out of this, see e.g. this main science paper. Apparently some folks are very excited by the fact that the event produced an amount of gold equal to several times the mass of the Earth. But it’s my blog, so let me highlight the aspect of personal relevance to me: using “standard sirens” to measure the expansion of the universe.

We’re already pretty good at measuring the expansion of the universe, using something called the cosmic distance ladder. You build up distance measures step by step, determining the distance to nearby stars, then to more distant clusters, and so forth. Works well, but of course is subject to accumulated errors along the way. This new kind of gravitational-wave observation is something else entirely, allowing us to completely jump over the distance ladder and obtain an independent measurement of the distance to cosmological objects. See this LIGO explainer.

The simultaneous observation of gravitational and electromagnetic waves is crucial to this idea. You’re trying to compare two things: the distance to an object, and the apparent velocity with which it is moving away from us. Usually velocity is the easy part: you measure the redshift of light, which is easy to do when you have an electromagnetic spectrum of an object. But with gravitational waves alone, you can’t do it — there isn’t enough structure in the spectrum to measure a redshift. That’s why the exploding neutron stars were so crucial; in this event, GW170817, we can for the first time determine the precise redshift of a distant gravitational-wave source.

Measuring the distance is the tricky part, and this is where gravitational waves offer a new technique. The favorite conventional strategy is to identify “standard candles” — objects for which you have a reason to believe you know their intrinsic brightness, so that by comparing to the brightness you actually observe you can figure out the distance. To discover the acceleration of the universe, for example, astronomers used Type Ia supernovae as standard candles.
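The standard-candle logic is just the inverse-square law run backwards. As a minimal sketch in Python, assuming a source that shines equally in all directions and ignoring dust and cosmological corrections, the distance follows from the known luminosity and the measured flux. The function name and the light-bulb numbers are illustrative only.

```python
import math


def distance_from_flux(luminosity_w, flux_w_m2):
    """Distance in metres from F = L / (4 * pi * d**2), solved for d."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))


# A 100 W source seen from 10 m produces this flux at the observer...
flux_at_10_m = 100.0 / (4.0 * math.pi * 10.0 ** 2)

# ...so measuring that flux and knowing the 100 W recovers the 10 m.
recovered = distance_from_flux(100.0, flux_at_10_m)

# A source that looks four times dimmer is twice as far away.
twice_as_far = distance_from_flux(100.0, flux_at_10_m / 4.0)

print(f"recovered distance: {recovered:.1f} m")
print(f"four times dimmer: {twice_as_far:.1f} m")
```

The whole method rests on knowing the intrinsic luminosity, which is exactly the part that has to be calibrated for supernovae.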

Gravitational waves don’t quite give you standard candles; every one will generally have a different intrinsic gravitational “luminosity” (the amount of energy emitted). But by looking at the precise way in which the source evolves — the characteristic “chirp” waveform in gravitational waves as the two objects rapidly spiral together — we can work out precisely what that total luminosity actually is. Here’s the chirp for GW170817, compared to the other sources we’ve discovered — much more data, almost a full minute!

So we have both distance and redshift, without using the conventional distance ladder at all! This is important for all sorts of reasons. An independent way of getting at cosmic distances will allow us to measure properties of the dark energy, for example. You might also have heard that there is a discrepancy between different ways of measuring the Hubble constant, which either means someone is making a tiny mistake or there is something dramatically wrong with the way we think about the universe. Having an independent check will be crucial in sorting this out. Just from this one event, we are able to say that the Hubble constant is 70 kilometers per second per megaparsec, albeit with large error bars (+12, -8 km/s/Mpc). That will get much better as we collect more events.

So here is my (infinitesimally tiny) role in this exciting story. The idea of using gravitational-wave sources as standard sirens was put forward by Bernard Schutz all the way back in 1986. But it’s been developed substantially since then, especially by my friends Daniel Holz and Scott Hughes. Years ago Daniel told me about the idea, as he and Scott were writing one of the early papers. My immediate response was “Well, you have to call these things ‘standard sirens.’” And so a useful label was born.

Sadly for my share of the glory, my Caltech colleague Sterl Phinney also suggested the name simultaneously, as the acknowledgments to the paper testify. That’s okay; when one’s contribution is this extremely small, sharing it doesn’t seem so bad.

By contrast, the glory attaching to the physicists and astronomers who pulled off this observation, and the many others who have contributed to the theoretical understanding behind it, is substantial indeed. Congratulations to all of the hard-working people who have truly opened a new window on how we look at our universe.

### Matt Strassler - Of Particular Significance

Gravitational waves are now the most important new tool in the astronomer’s toolbox. Already they’ve been used to confirm that large black holes — with masses ten or more times that of the Sun — and mergers of these large black holes to form even larger ones, are not uncommon in the universe. Today it goes a big step further.

It’s long been known that neutron stars, remnants of collapsed stars that have exploded as supernovas, are common in the universe. And it’s been known almost as long that sometimes neutron stars travel in pairs. (In fact that’s how gravitational waves were first discovered, indirectly, back in the 1970s.) Stars often form in pairs, and sometimes both stars explode as supernovas, leaving their neutron star relics in orbit around one another. Neutron stars are small: just ten or so kilometers (a few miles) across. According to Einstein’s theory of gravity, a pair of stars should gradually lose energy by emitting gravitational waves into space, and slowly but surely the two objects should spiral in on one another. Eventually, after many millions or even billions of years, they collide and merge into a larger neutron star, or into a black hole. This collision does two things.

- It makes some kind of brilliant flash of light — electromagnetic waves — whose details are only guessed at. Some of those electromagnetic waves will be in the form of visible light, while much of it will be in invisible forms, such as gamma rays.
- It makes gravitational waves, whose details are easier to calculate and which are therefore distinctive, but couldn’t have been detected until LIGO and VIRGO started taking data, LIGO over the last couple of years, VIRGO over the last couple of months.

It’s possible that we’ve seen the light from neutron star mergers before, but no one could be sure. Wouldn’t it be great, then, if we could see gravitational waves AND electromagnetic waves from a neutron star merger? It would be a little like seeing the flash and hearing the sound from fireworks: seeing and hearing is better than either one separately, with each one clarifying the other. *(Caution: scientists often speak as if detecting gravitational waves is like “hearing”. This is only an analogy, and a vague one! It’s not at all the same as acoustic waves that we can hear with our ears, for many reasons… so please don’t take it too literally.)* If we could do both, we could learn about neutron stars and their properties in an entirely new way.

Today, we learned that this has happened. LIGO, with the world’s first two gravitational-wave observatories, detected the waves from two merging neutron stars, 130 million light years from Earth, on August 17th. (Neutron star mergers last much longer than black hole mergers, so the two are easy to distinguish; and this one was so close, relatively speaking, that it was seen for a long while.) VIRGO, with the third detector, allows scientists to triangulate and determine roughly where mergers have occurred. VIRGO saw only a very weak signal, but that was extremely important, because it told the scientists *that the merger must have occurred in a small region of the sky where VIRGO has a relative blind spot*. That told scientists where to look.

The merger was detected for more than a full minute… to be compared with black holes whose mergers can be detected for less than a second. It’s not exactly clear yet what happened at the end, however! Did the merged neutron stars form a black hole or a neutron star? The jury is out.

At almost exactly the moment at which the gravitational waves reached their peak, a blast of gamma rays (electromagnetic waves of very high frequencies) was detected by a different scientific team, the one from FERMI. FERMI detects gamma rays from the distant universe every day, and a two-second gamma-ray-burst is not unusual. And INTEGRAL, another gamma ray experiment, also detected it. The teams communicated within minutes. The FERMI and INTEGRAL gamma ray detectors can only indicate the rough region of the sky from which their gamma rays originate, and LIGO/VIRGO together also only give a rough region. But the scientists saw those regions overlapped. The evidence was clear. And with that, astronomy entered a new, highly anticipated phase.

Already this was a huge discovery. Brief gamma-ray bursts have been a mystery for years. One of the best guesses as to their origin has been neutron star mergers. Now the mystery is solved; that guess is apparently correct. *(Or is it? Probably, but the gamma ray discovery is surprisingly dim, given how close it is. So there are still questions to ask.)*

Also confirmed by the fact that these signals arrived within a couple of seconds of one another, after traveling for over 100 million years from the same source, is that, indeed, the speed of light and the speed of gravitational waves are exactly the same — both of them equal to the cosmic speed limit, just as Einstein’s theory of gravity predicts.
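How tight is that comparison? A back-of-the-envelope sketch: divide the arrival-time difference by the total travel time. The two-second gap and the 130-million-year travel time are the round numbers used in this post; the sketch assumes, for simplicity, that both signals left the source at the same moment, which the real analyses do not take for granted.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year in seconds


def fractional_speed_difference(arrival_gap_s, travel_time_years):
    """Rough bound on |v_gw - c| / c from an arrival gap over a long journey.

    If two signals cover the same distance and arrive arrival_gap_s apart,
    their speeds differ by at most about gap / travel_time (to first order).
    """
    travel_time_s = travel_time_years * SECONDS_PER_YEAR
    return arrival_gap_s / travel_time_s


bound = fractional_speed_difference(arrival_gap_s=2.0, travel_time_years=130e6)
print(f"Speeds agree to within about {bound:.1e} of the speed of light")
```

That works out to a few parts in ten thousand trillion, which is why a single event was enough to rule out many alternative theories of gravity in which the two speeds differ.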

Next, these teams quickly told their astronomer friends to train their telescopes in the general area of the source. Dozens of telescopes, from every continent and from space, and looking for electromagnetic waves at a huge range of frequencies, pointed in that rough direction and scanned for anything unusual. *(A big challenge: the object was near the Sun in the sky, so it could be viewed in darkness only for an hour each night!)* Light was detected! At all frequencies! The object was very bright, making it easy to find the galaxy in which the merger took place. The brilliant glow was seen in gamma rays, ultraviolet light, infrared light, X-rays, and radio. (Neutrinos, particles that can serve as another way to observe distant explosions, were not detected this time.)

And with so much information, so much can be learned!

Most important, perhaps, is this: from the pattern of the spectrum of light, the conjecture seems to be confirmed that the mergers of neutron stars are important sources, perhaps the dominant one, for many of the heavy chemical elements (iodine, iridium, cesium, gold, platinum, and so on) that are forged in the intense heat of these collisions. It used to be thought that the same supernovas that form neutron stars in the first place were the most likely source. But now it seems that this second stage of neutron star life (merger, rather than birth) is just as important. That’s fascinating, because neutron star mergers are much more rare than the supernovas that form them. There’s a supernova in our Milky Way galaxy every century or so, but it’s tens of millennia or more between these “kilonovas”, created in neutron star mergers.

If there’s anything disappointing about this news, it’s this: almost everything that was observed by all these different experiments was predicted in advance. Sometimes it’s more important and useful when some of your predictions fail completely, because then you realize how much you have to learn. Apparently our understanding of gravity, of neutron stars, and of their mergers, and of all sorts of sources of electromagnetic radiation that are produced in those mergers, is even better than we might have thought. But fortunately there are a few new puzzles. The X-rays were late; the gamma rays were dim… we’ll hear more about this shortly, as NASA is holding a second news conference.

Some highlights from the second news conference:

- New information about neutron star interiors, which affects how large they are and therefore how exactly they merge, has been obtained
- The first ever visual-light image of a gravitational wave source, from the Swope telescope, at the outskirts of a distant galaxy; the galaxy’s center is the blob of light, and the arrow points to the explosion.

- The theoretical calculations for a kilonova explosion suggest that debris from the blast should rather quickly block the visual light, so the explosion dims quickly in visible light — but infrared light lasts much longer. The observations by the visible and infrared light telescopes confirm this aspect of the theory; and you can see evidence for that in the picture above, where four days later the bright spot is both much dimmer and much redder than when it was discovered.
- Estimate: the total mass of the gold and platinum produced in this explosion is vastly larger than the mass of the Earth.
- Estimate: these neutron stars were formed about 10 or so billion years ago. They’ve been orbiting each other for most of the universe’s history, and ended their lives just 130 million years ago, creating the blast we’ve so recently detected.
- Big Puzzle: all of the previous gamma-ray bursts seen up to now have always shone in ultraviolet light and X-rays as well as gamma rays. But X-rays didn’t show up this time, at least not initially. This was a big surprise. It took 9 days for the Chandra telescope to observe X-rays, too faint for any other X-ray telescope. Does this mean that the two neutron stars created a black hole, which then created a jet of matter that points not quite directly at us but off-axis, and shines by illuminating the matter in interstellar space? This had been suggested as a possibility twenty years ago, but this is the first time there’s been any evidence for it.
- One more surprise: it took 16 days for radio waves from the source to be discovered, with the Very Large Array, the most powerful existing radio telescope. The radio emission has been growing brighter since then! As with the X-rays, this seems also to support the idea of an off-axis jet.
- Nothing quite like this gamma-ray burst has been seen (or rather, recognized) before. When a gamma ray burst doesn’t have an X-ray component showing up right away, it simply looks odd and a bit mysterious. It’s harder to observe than most bursts, because without a jet pointing right at us, its afterglow fades quickly. Moreover, a jet pointing at us is bright, so it blinds us to the more detailed and subtle features of the kilonova. But this time, LIGO/VIRGO told scientists that “Yes, this is a neutron star merger”, leading to detailed study from all electromagnetic frequencies, including patient study over many days of the X-rays and radio. In other cases those observations would have stopped after just a short time, and the whole story couldn’t have been properly interpreted.

## October 13, 2017

### Sean Carroll - Preposterous Universe

Trying to climb out from underneath a large pile of looming (and missed) deadlines, and in the process I’m hoping to ramp back up the real blogging. In the meantime, here are a couple of videos to tide you over.

First, an appearance a few weeks ago on Joe Rogan’s podcast. Rogan is a professional comedian and mixed-martial arts commentator, but has built a great audience for his wide-ranging podcast series. One of the things that makes him a good interviewer is his sincere delight in the material, as evidenced here by noting repeatedly that his mind had been blown. We talked for over two and a half hours, covering cosmology and quantum mechanics but also some bits about AI and pop culture.

And here’s a more straightforward lecture, this time at King’s College in London. The topic was “Extracting the Universe from the Wave Function,” which I’ve used for a few talks that ended up being pretty different in execution. This one was aimed at undergraduate physics students, some of whom hadn’t even had quantum mechanics. So the first half is a gentle introduction to many-worlds theory and why it’s the best version of quantum mechanics, and the second half tries to explain our recent efforts to emerge space itself out of quantum entanglement.

I was invited to King’s by Eugene Lim, one of my former grad students and now an extremely productive faculty member in his own right. It’s always good to see your kids grow up to do great things!

## October 09, 2017

### Alexey Petrov - Symmetry factor

I wanted to share some ideas about a teaching method I am trying to develop and implement this semester. Please let me know if you’ve heard of someone doing something similar.

This semester I am teaching our undergraduate mechanics class. This is the first time I am teaching it, so I started looking into a possibility to shake things up and maybe apply some new method of teaching. And there are plenty offered: flipped classroom, peer instruction, Just-in-Time teaching, etc. They all look to “move away from the inefficient old model” where the professor is lecturing and students are taking notes. I have things to say about that, but not in this post. It suffices to say that most of those approaches are essentially trying to make students *work* (both with the lecturer and their peers) in class and outside of it. At the same time those methods attempt to “compartmentalize teaching” i.e. make large classes “smaller” by bringing up each individual student’s contribution to class activities (by using “clickers”, small discussion groups, etc). For several reasons those approaches did not fit my goal this semester.

Our Classical Mechanics class is a *gateway class* for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start *molding physicists out of students*: they learn to simplify problems so physics methods can be properly applied (that’s how “a Ford Mustang improperly parked at the top of the icy hill slides down…” turns into “a block slides down the incline…”), learn to always derive the final formula before plugging in numbers, look at the asymptotics of their solutions as a way to see if the solution makes sense, and many other wonderful things.

So with all that I started doing something I’d like to call *non-linear teaching*. The gist of it is as follows. I give a lecture (and don’t get me wrong, I do make my students talk and work: I ask questions, we do “duels” (students argue different sides of a question), etc — all of that can be done efficiently in a class of 20 students). But instead of one homework with 3-4 problems per week I have two types of homework assignments for them: *short homeworks* and *projects*.

Short homework assignments are *single-problem assignments* given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that we discussed previously in class with a small new twist added. For example, in the block-down-the-incline problem discussed in class I ask them to choose coordinate axes in a different way and prove that the result is independent of the choice of the coordinate system. Or ask them to find at which angle one should throw a stone to get the maximal possible range (including air resistance), etc. This way, instead of doing an assignment in the last minute at the end of the week, students have to work out what they just learned in class every day! More importantly, I get to change *how* I teach. Depending on how they did on the previous short homework, I adjust the material (both speed and volume) discussed in class. I also design examples for the future sections in such a way that I can repeat parts of the topic that was hard for the students previously. Hence, instead of a linear propagation of the course, we are moving along something akin to helical motion, returning and spending more time on topics that students find more difficult. That’s why my teaching is “non-linear”.

Project homework assignments are designed to develop understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

Overall, students solve exactly the same number of problems they would in a normal lecture class. Yet, those problems are scheduled in a different way. In my way, students are forced to learn by constantly re-working what was just discussed in a lecture. And for me, I can quickly react (by adjusting lecture material and speed) using constant feedback I get from students in the form of short homeworks. Win-win!

I will do benchmarking at the end of the class by comparing my class performance to aggregate data from previous years. I’ll report on it later. But for now I would be interested to hear your comments!

## October 05, 2017

### Symmetrybreaking - Fermilab/SLAC

Instead of searching for dark matter particles, a new device will search for dark matter waves.

Researchers are testing a prototype “radio” that could let them listen to the tune of mysterious dark matter particles.

Dark matter is an invisible substance thought to be five times more prevalent in the universe than regular matter. According to theory, billions of dark matter particles pass through the Earth each second. We don’t notice them because they interact with regular matter only very weakly, through gravity.

So far, researchers have mostly been looking for dark matter particles. But with the dark matter radio, they want to look for dark matter waves.

Direct detection experiments for dark matter particles use large underground detectors. Researchers hope to see signals from dark matter particles colliding with the detector material. However, this only works if dark matter particles are heavy enough to deposit a detectable amount of energy in the collision.

“If dark matter particles were very light, we might have a better chance of detecting them as waves rather than particles,” says Peter Graham, a theoretical physicist at the Kavli Institute for Particle Astrophysics and Cosmology, a joint institute of Stanford University and the Department of Energy’s SLAC National Accelerator Laboratory. “Our device will take the search in that direction.”

The dark matter radio makes use of a bizarre concept of quantum mechanics known as wave-particle duality: Every particle can also behave like a wave.

Take, for example, the photon: the massless fundamental particle that carries the electromagnetic force. Streams of them make up electromagnetic radiation, or light, which we typically describe as waves—including radio waves.

The dark matter radio will search for dark matter waves associated with two particular dark matter candidates. It could find hidden photons—hypothetical cousins of photons with a small mass. Or it could find axions, which scientists think can be produced out of light and transform back into it in the presence of a magnetic field.

“The search for hidden photons will be completely unexplored territory,” says Saptarshi Chaudhuri, a Stanford graduate student on the project. “As for axions, the dark matter radio will close gaps in the searches of existing experiments.”

### Intercepting dark matter vibes

A regular radio intercepts radio waves with an antenna and converts them into sound. What sound depends on the station. A listener chooses a station by adjusting an electric circuit, in which electricity can oscillate with a certain resonant frequency. If the circuit’s resonant frequency matches the station’s frequency, the radio is tuned in and the listener can hear the broadcast.

The dark matter radio works the same way. At its heart is an electric circuit with an adjustable resonant frequency. If the device were tuned to a frequency that matched the frequency of a dark matter particle wave, the circuit would resonate. Scientists could measure the frequency of the resonance, which would reveal the mass of the dark matter particle.

The idea is to do a frequency sweep by slowly moving through the different frequencies, as if tuning a radio from one end of the dial to the other.

The electric signal from dark matter waves is expected to be very weak. Therefore, Graham has partnered with a team led by another KIPAC researcher, Kent Irwin. Irwin’s group is developing highly sensitive magnetometers known as superconducting quantum interference devices, or SQUIDs, which they’ll pair with extremely low-noise amplifiers to hunt for potential signals.

In its final design, the dark matter radio will search for particles in a mass range of trillionths to millionths of an electronvolt. (One electronvolt is about a billionth of the mass of a proton.) This is somewhat problematic because this range includes kilohertz to gigahertz frequencies—frequencies used for over-the-air broadcasting.
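The link between particle mass and radio frequency is the Planck relation, f = E / h, applied to the particle's rest energy. The short sketch below reads "trillionths to millionths of an electronvolt" literally as 1e-12 to 1e-6 eV, which is an assumption, and the function name is illustrative:

```python
# Planck constant in electronvolt-seconds (CODATA 2018 exact value).
PLANCK_EV_S = 4.135667696e-15


def mass_energy_to_frequency_hz(mass_energy_ev):
    """Frequency of the wave associated with a particle of given rest energy.

    Uses f = E / h, treating the particles as slow enough that their
    energy is essentially their rest energy.
    """
    return mass_energy_ev / PLANCK_EV_S


f_low = mass_energy_to_frequency_hz(1e-12)   # a trillionth of an electronvolt
f_high = mass_energy_to_frequency_hz(1e-6)   # a millionth of an electronvolt

print(f"{f_low:.1f} Hz to {f_high / 1e6:.1f} MHz")
```

Under that reading the sweep runs from a few hundred hertz to a few hundred megahertz, the same broad stretch of the dial that broadcasters use, which is why shielding matters so much.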

“Shielding the radio from unwanted radiation is very important and also quite challenging,” Irwin says. “In fact, we would need a several-yards-thick layer of copper to do so. Fortunately we can achieve the same effect with a thin layer of superconducting metal.”

One advantage of the dark matter radio is that it does not need to be shielded from cosmic rays. Whereas direct detection searches for dark matter particles must operate deep underground to block out particles falling from space, the dark matter radio can operate in a university basement.

The researchers are now testing a small-scale prototype at Stanford that will scan a relatively narrow frequency range. They plan on eventually operating two independent, full-size instruments at Stanford and SLAC.

“This is exciting new science,” says Arran Phipps, a KIPAC postdoc on the project. “It’s great that we get to try out a new detection concept with a device that is relatively low-budget and low-risk.”

The dark matter disc jockeys are taking the first steps now and plan to conduct their dark matter searches over the next few years. Stay tuned for future results.

## October 03, 2017

### Symmetrybreaking - Fermilab/SLAC

Scientists Rainer Weiss, Kip Thorne and Barry Barish won the 2017 Nobel Prize in Physics for their roles in creating the LIGO experiment.

Three scientists who made essential contributions to the LIGO collaboration have been awarded the 2017 Nobel Prize in Physics.

Rainer Weiss will share the prize with Kip Thorne and Barry Barish for their roles in the discovery of gravitational waves, ripples in space-time predicted by Albert Einstein. Weiss and Thorne conceived of LIGO, and Barish is credited with reviving the struggling experiment and making it happen.

“I view this more as a thing that recognizes the work of about 1000 people,” Weiss said during a Q&A after the announcement this morning. “It’s really a dedicated effort that has been going on, I hate to tell you, for as long as 40 years, people trying to make a detection in the early days and then slowly but surely getting the technology together to do it.”

Another founder of LIGO, scientist Ronald Drever, died in March. Nobel Prizes are not awarded posthumously.

According to Einstein’s general theory of relativity, powerful cosmic events release energy in the form of waves traveling through the fabric of existence at the speed of light. LIGO detects these disturbances when they disrupt the symmetry between the passages of identical laser beams traveling identical distances.

The setup for the LIGO experiment looks like a giant L, with each side stretching about 2.5 miles long. Scientists split a laser beam and shine the two halves down the two sides of the L. When each half of the beam reaches the end, it reflects off a mirror and heads back to the place where its journey began.

Normally, the two halves of the beam return at the same time. When there’s a mismatch, scientists know something is going on. Gravitational waves compress space-time in one direction and stretch it in another, giving one half of the beam a shortcut and sending the other on a longer trip. LIGO is sensitive enough to notice a difference between the arms as small as one-thousandth the diameter of an atomic nucleus.
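To put that sensitivity in perspective, it helps to express it as a strain, the fractional change in arm length. This is a rough sketch only: the 2.5-mile arm length comes from the description above, while the nuclear diameter (taken here as that of a single proton, about 1.7e-15 m) is an assumed round number, since nuclei vary in size.

```python
METERS_PER_MILE = 1609.344

ARM_LENGTH_M = 2.5 * METERS_PER_MILE  # each arm of the L, about 4 km
NUCLEUS_DIAMETER_M = 1.7e-15          # assumed: roughly one proton across


def strain(length_change_m, arm_length_m):
    """Dimensionless strain: change in arm length divided by arm length."""
    return length_change_m / arm_length_m


# One-thousandth of the assumed nuclear diameter.
smallest_change_m = NUCLEUS_DIAMETER_M / 1000.0
h = strain(smallest_change_m, ARM_LENGTH_M)

print(f"Arm length: {ARM_LENGTH_M:.0f} m")
print(f"Detectable change: {smallest_change_m:.1e} m, strain about {h:.1e}")
```

With these assumptions the strain is a few parts in ten billion trillion, comparable to measuring the distance to a nearby star to within the width of a human hair.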

Scientists on LIGO and their partner collaboration, called Virgo, reported the first detection of gravitational waves in February 2016. The waves were generated 1.3 billion years ago in the collision of two black holes with 29 and 36 times the mass of the sun. They reached the LIGO experiment as scientists were conducting an engineering test.

“It took us a long time, something like two months, to convince ourselves that we had seen something from outside that was truly a gravitational wave,” Weiss said.

LIGO, which stands for Laser Interferometer Gravitational-Wave Observatory, consists of two of these pieces of equipment, one located in Louisiana and another in Washington state.

The experiment is operated jointly by Weiss’s home institution, MIT, and Barish and Thorne’s home institution, Caltech. The experiment has collaborators from more than 80 institutions from more than 20 countries. A third interferometer, operated by the Virgo collaboration, recently joined LIGO to make the first joint observation of gravitational waves.

## September 28, 2017

### Symmetrybreaking - Fermilab/SLAC

A Fermilab technical specialist recently invented a device that could help alert oncoming trains to large vehicles stuck on the tracks.

Browsing YouTube late at night, Fermilab Technical Specialist Derek Plant stumbled on a series of videos that all begin the same way: a large vehicle (a bus, semi or other low-clearance vehicle) is stuck on a railroad crossing. In the end, the train crashes into the stuck vehicle, destroying it and sometimes even derailing the train. According to the Federal Railroad Administration, hundreds of vehicles meet this fate every year, struck by trains that can take over a mile to stop.

“I was just surprised at the number of these that I found,” Plant says. “For every accident that’s videotaped, there are probably many more.”

Inspired by a workplace safety class that preached a principle of minimizing the impact of accidents, Plant set about looking for solutions to the problem of trains hitting stuck vehicles.

Railroad tracks are elevated for proper drainage, and the humped profile of many crossings can cause a vehicle to bottom out. “Theoretically, we could lower all the crossings so that they’re no longer a hump. But there are 200,000 crossings in the United States,” Plant says. “Railroads and local governments are trying hard to minimize the number of these crossings by creating overpasses, or elevating roadways. That’s cost-prohibitive, and it’s not going to happen soon.”

Other solutions, such as re-engineering the suspension on vehicles likely to get stuck, seemed equally improbable.

After studying how railroad signaling systems work, Plant came up with an idea: to fake the presence of a train. His invention was developed in his spare time using techniques and principles he learned over his almost two decades at Fermilab. It is currently in the patent application process and being prosecuted by Fermilab’s Office of Technology Transfer.

“If you cross over a railroad track and you look down the tracks, you’ll see red or yellow or green lights,” he says. “Trains have traffic signals too.”

These signals are tied to signal blocks—segments of the tracks that range from a mile to several miles in length. When a train is on the tracks, its metal wheels and axle connect both rails, forming an electric circuit through the tracks to trigger the signals. These signals inform other trains not to proceed while one train occupies a block, avoiding pileups.

Plant thought, “What if other vehicles could trigger the same signal in an emergency?” By faking the presence of a train, a vehicle stuck on the tracks could give advance warning for oncoming trains to stop and stall for time. Hence the name of Plant’s invention: the Ghost Train Generator.

To replicate the train’s presence, Plant knew he had to create a very strong electric current between the rails. The most straightforward way to do this is with massive amounts of metal, as a train does. But for the Ghost Train Generator to be useful in a pinch, it needs to be small, portable and easily applied. The answer to achieving these features lies in strong magnets and special wire.

“Put one magnet on one rail and one magnet on the other and the device itself mimics, electrically, what a train would look like to the signaling system,” he says. “In theory, this could be carried in vehicles that are at high risk for getting stuck on a crossing: semis, tour buses and first-response vehicles. Keep it just like you would a fire extinguisher, just behind the seat or in an emergency compartment.”

Once the device is deployed, the train would receive the signal that the tracks were obstructed and stop. Then the driver of the stuck vehicle could call for emergency help using the hotline posted on all crossings.

Plant compares the invention to a seatbelt.

“Is it going to save your life 100 percent of the time? Nope, but smart people wear them,” he says. “It’s designed to prevent a collision when a train is more than two minutes from the crossing.”

And like a seatbelt, part of what makes Plant’s invention so appealing is its simplicity.

“The first thing I thought was that this is a clever invention,” says Aaron Sauers from Fermilab’s technology transfer office, who works with lab staff to develop new technologies for market. “It’s an elegant solution to an existing problem. I thought, ‘This technology could have legs.’”

The organizers of the National Innovation Summit seem to agree. In May, Fermilab received an Innovation Award from TechConnect for the Ghost Train Generator. The invention will also be featured as a showcase technology in the upcoming Defense Innovation Summit in October.

The Ghost Train Generator is currently in the pipeline to receive a patent with help from Fermilab, and its prospects are promising, according to Sauers. It is a nonprovisional patent, which has specific claims and can be licensed. After that, if the generator passes muster and is granted a patent, Plant will receive a portion of the royalties that it generates for Fermilab.

Fermilab encourages a culture of scientific innovation and exploration beyond the field of particle physics, according to Sauers, who noted that Plant’s invention is just one of a number of technology transfer initiatives at the lab.

Plant agrees—Fermilab’s environment helped motivate his efforts to find a solution for railroad crossing accidents.

“It’s just a general problem-solving state of mind,” he says. “That’s the philosophy we have here at the lab.”

*Editor's note: A version of this article was originally published by Fermilab.*

### Symmetrybreaking - Fermilab/SLAC

The national laboratory opened usually inaccessible areas of its campus to thousands of visitors to celebrate 50 years of discovery.

Fermi National Accelerator Laboratory’s yearlong 50th anniversary celebration culminated on Saturday with an Open House that drew thousands of visitors despite the unseasonable heat.

On display were areas of the lab not normally open to guests, including neutrino and muon experiments, a portion of the accelerator complex, lab spaces and magnet and accelerator fabrication and testing areas, to name a few. There were also live links to labs around the world, including CERN, a mountaintop observatory in Chile, and the mile-deep Sanford Underground Research Facility that will house the international neutrino experiment, DUNE.

But it wasn’t all physics. In addition to hands-on demos and a STEM fair, visitors could also learn about Fermilab’s art and history, walk the prairie trails or hang out with the ever-popular bison. In all, some 10,000 visitors got to go behind the scenes at Fermilab, shuttled around on 80 buses and welcomed by 900 Fermilab workers eager to explain their roles at the lab. Below, see a few of the photos captured as Fermilab celebrated 50 years of discovery.

## September 27, 2017

### Matt Strassler - Of Particular Significance

Welcome, VIRGO! Another merger of two big black holes has been detected, this time by LIGO’s two detectors and by VIRGO as well.

Aside from the fact that this means that the VIRGO instrument actually works, which is great news, why is this a big deal? By adding a third gravitational wave detector, built by the VIRGO collaboration, to LIGO’s Washington and Louisiana detectors, the scientists involved in the search for gravitational waves now can determine fairly accurately the direction from which a detected gravitational wave signal is coming. And this allows them to do something new: to tell their astronomer colleagues roughly where to look in the sky, using ordinary telescopes, for some form of electromagnetic waves (perhaps visible light, gamma rays, or radio waves) that might have been produced by whatever created the gravitational waves.

The point is that with three detectors, one can triangulate. The gravitational waves travel for billions of years at the speed of light, and when they pass by, they are detected at both LIGO detectors and at VIRGO. But because it takes light a few hundredths of a second to travel the diameter of the Earth, the waves arrive at slightly different times at the LIGO Washington site, the LIGO Louisiana site, and the VIRGO site in Italy. The precise timing tells the scientists what direction the waves were traveling in, and therefore roughly where they came from. In a similar way, using the fact that sound travels at a known speed, the times that a gunshot is heard at multiple locations can be used by police to determine where the shot was fired.
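To make the timing argument concrete, here is a minimal Python sketch of the geometry. It is only an illustration, not the collaborations' actual analysis: it assumes a perfectly spherical Earth, a plane wave moving at the speed of light, and approximate detector coordinates that I have filled in myself. The function names (`position_km`, `arrival_delay_s`, `max_delay_s`) are made up for this example.

```python
import math

C_KM_PER_S = 299_792.458   # speed of light
EARTH_RADIUS_KM = 6371.0   # assumption: spherical Earth, mean radius

# Approximate (latitude, longitude) in degrees; illustrative values only.
DETECTORS = {
    "LIGO Hanford": (46.455, -119.408),     # Washington
    "LIGO Livingston": (30.563, -90.774),   # Louisiana
    "Virgo": (43.631, 10.504),              # Italy
}


def position_km(lat_deg, lon_deg):
    """Earth-centered Cartesian position of a point on the surface."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (
        EARTH_RADIUS_KM * math.cos(lat) * math.cos(lon),
        EARTH_RADIUS_KM * math.cos(lat) * math.sin(lon),
        EARTH_RADIUS_KM * math.sin(lat),
    )


def baseline_km(site_a, site_b):
    """Vector from site_a to site_b (straight line through the Earth)."""
    a = position_km(*DETECTORS[site_a])
    b = position_km(*DETECTORS[site_b])
    return tuple(bi - ai for ai, bi in zip(a, b))


def max_delay_s(site_a, site_b):
    """Largest possible arrival-time difference between two sites."""
    return math.sqrt(sum(x * x for x in baseline_km(site_a, site_b))) / C_KM_PER_S


def arrival_delay_s(site_a, site_b, toward_source):
    """Arrival time at site_b minus arrival time at site_a.

    toward_source is a unit vector pointing from the Earth to the source.
    The site that lies further along that vector is hit first.
    """
    d = baseline_km(site_a, site_b)
    return -sum(di * ni for di, ni in zip(d, toward_source)) / C_KM_PER_S


if __name__ == "__main__":
    names = list(DETECTORS)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(f"{a} to {b}: at most {1000 * max_delay_s(a, b):.1f} ms")
```

Each measured delay between a pair of detectors restricts the source to a ring on the sky. Two detectors give one ring, which is why the earlier LIGO-only detections left long swaths; a third detector adds two more rings, and their overlap is a much smaller patch.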

You can see the impact in the picture below, which is an image of the sky drawn as a sphere, as if seen from outside the sky looking in. In previous detections of black hole mergers by LIGO’s two detectors, the scientists could only determine a large swath of sky where the observed merger might have occurred; those are the four colored regions that stretch far across the sky. But notice the green splotch at lower left. That’s the region of sky where the black hole merger announced today occurred. The fact that this region is many times smaller than the other four reflects what including VIRGO makes possible. It’s a small enough region that one can search using an appropriate telescope for something that is making visible light, or gamma rays, or radio waves.

While a black hole merger isn’t expected to be observable by other telescopes, and indeed nothing was observed by other telescopes this time, other events that LIGO might detect, such as a merger of two neutron stars, may create an observable effect. We can hope for such exciting news over the next year or two.