Particle Physics Planet


July 18, 2018

Peter Coles - In the Dark

Ongoing Hubble Constant Poll

Here are two interesting plots that I got via Renée Hložek on Twitter from the recent swathe of papers from Planck. The first shows the `tension’ between Planck’s parameter estimates and `direct’ measurements of the Hubble constant (as exemplified by Riess et al. 2018); see my recent post for a discussion of the latter. Planck actually produces joint estimates for a set of half-a-dozen basic parameters from which estimates of others, including the Hubble constant, can be derived. The plot below shows the two-dimensional region that is allowed by Planck if both the Hubble constant (H0) and the matter density parameter (ΩM) are allowed to vary within the limits permitted by various observations. The tightest contours come from Planck, but other cosmological probes provide useful constraints that are looser but consistent; `BAO’ refers to `Baryon Acoustic Oscillations’, and `Pantheon’ is a sample of Type Ia supernovae.

You can see that the Planck measurements (blue) mean that a high value of the Hubble constant requires a low matter density, but the allowed contour does not really overlap with the grey shaded horizontal regions. For those of you who like such things, the discrepancy is about 3.5σ.
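As a back-of-envelope check of that number: treating the two determinations as independent Gaussians, and plugging in representative published central values and errors (my insertion here, quoted approximately rather than read off the plots above), gives roughly the same figure.

```python
# Back-of-envelope estimate of the size of the H0 discrepancy, assuming the two
# determinations are independent and Gaussian. The central values and errors are
# representative published numbers (approximately Planck 2018 and Riess et al.
# 2018), inserted here for illustration; see the papers for the exact figures.
import math

H0_planck, sig_planck = 67.4, 0.5      # km/s/Mpc
H0_local,  sig_local  = 73.48, 1.66    # km/s/Mpc

tension = abs(H0_local - H0_planck) / math.hypot(sig_planck, sig_local)
print(f"tension is about {tension:.1f} sigma")   # roughly 3.5
```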

Another plot you might find interesting is this one:

The solid line shows how the Hubble `constant’ varies with redshift in the standard cosmological model; H0 is the present value of a redshift-dependent parameter H(z) that measures the rate at which the Universe is expanding. You will see that the Hubble parameter is larger at high redshift, but decreases as the expansion of the Universe slows down, until a redshift of around 0.5, after which it increases again, indicating that the expansion of the Universe is accelerating. Direct determinations of the expansion rate at high redshift are difficult, hence the large error bars, but the important feature is the gap between the direct determination at z=0 and what the standard model predicts. If the Riess et al. 2018 measurements are right, the expansion of the Universe seems to have been accelerating more rapidly than the standard model predicts.
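The turnover described above (large at high redshift, a minimum at intermediate redshift, rising again towards z=0) matches the behaviour of the expansion speed, which is proportional to H(z)/(1+z); H(z) itself increases monotonically with redshift in the standard model. Here is a minimal sketch that reproduces the shape, with illustrative parameter values rather than the Planck best fit:

```python
# Sketch of the expansion history in a flat Lambda-CDM model, with illustrative
# parameter values (not the Planck best fit). H(z) itself grows monotonically
# with redshift; the curve with a minimum is the expansion speed, proportional
# to H(z)/(1+z), whose turning point marks the onset of acceleration.
import numpy as np

H0, Om = 70.0, 0.3                 # assumed Hubble constant and matter density
OL = 1.0 - Om                      # flat model

z = np.linspace(0.0, 3.0, 3001)
H = H0 * np.sqrt(Om * (1 + z)**3 + OL)   # Hubble parameter H(z)
speed = H / (1 + z)                      # proportional to the expansion speed

print(f"expansion speed is minimal near z = {z[np.argmin(speed)]:.2f}")
# about 0.67 with these illustrative parameters
```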

So after that little update here’s a little poll I’ve been running for a while on whether people think this apparent discrepancy is serious or not. I’m interested to see whether these latest findings change the voting!

Take Our Poll: http://polldaddy.com/poll/9483425

by telescoper at July 18, 2018 02:55 PM

ZapperZ - Physics and Physicists

Khan Academy's Photoelectric Effect Video Lesson
A lot of people use Khan Academy's video lessons. I know that they are quite popular, and I often get asked about some of the material in the videos, both by my students and in online discussions. Generally, I have no problems with their videos, but I often wonder who exactly designs their content, because I often find subtle issues and problems. It is not unusual for me to find that they are inaccurate about some things, and these are usually not the type of errors that, say, an expert in the subject would make.

I was asked about this photoelectric effect lesson by someone about a month ago. I've seen it before but never paid much attention to it till now. And now I think I should have looked at it more closely, because there are a couple of misleading and inaccurate statements in it.

Here is the video:



First, let's tackle the title here, because it is perpetuating a misconception.

Photoelectric effect | Electronic structure of atoms
First of all, the photoelectric effect doesn't have anything to do with the "structure of atoms". It has, however, something to do with the structure of the solid metal! The work function, for example, is not part of an atom's energy levels. Rather, it is due to the combination of all the atoms of the metal, forming BANDS of energy. Such bands do not occur in individual atoms. This is why metals have a conduction band and atoms do not.

We need to get people to understand that solid state physics is not identical to atomic/molecular physics. When many atoms get together to form a solid, their behavior as a conglomerate is different from their behavior as individual atoms. For many practical purposes, the atoms lose their individuality and instead form collective properties. This is the most important message that you can learn from this.

And now, the content of the video. I guess the video is trying to tackle a very narrow topic, namely how to use Einstein's equation, but it is very sloppy in the language it uses. First of all, if you knew nothing else, you'd get the impression from the video that a photon is an ordinary type of "particle", much like an electron. The illustration of a photon reinforces this erroneous picture. So let's be clear here. A "photon" is not the typical "particle" that we think of. It isn't defined by its "size" or shape. Rather, it is an entity that carries a specific amount of energy and momentum (and angular momentum). That's almost all that we can say without getting into further complications of QED.

But the most serious inaccuracy in the video comes when it tackles the energy needed to liberate an electron from the metal. This energy was labelled E_0, which was then equated to the work function of the metal.

E_0 is equal to the work function of the metal ONLY for the most energetic photoelectrons. It is not the work function for all the other photoelectrons. Photoelectrons are emitted with a range of energies, because they come from conduction electrons that are at the Fermi energy or below it. If they come from the Fermi energy, then they only have to overcome the work function; these correspond to the most energetic photoelectrons. However, if they come from below the Fermi energy, then they have to overcome not only the work function but also the extra binding energy. So the kinetic energy of these photoelectrons is not as high as that of the most energetic ones, and their "E_0" is NOT equal to the work function.

This is why, when we have students do photoelectric effect experiments in General Physics courses, we ask them to find the stopping potential, which is the potential that will stop the most energetic photoelectrons from reaching the anode. Only the information given by these most energetic photoelectrons gives you the work function directly.
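To make the distinction concrete, here is a tiny numerical illustration of the Einstein relation; the numbers are made-up assumptions of mine, not values taken from the video.

```python
# A tiny numerical illustration of the Einstein relation discussed above.
# The numbers are made-up assumptions (not values from the video): a 400 nm
# photon on a metal with a sodium-like work function. The maximum kinetic
# energy (and hence the stopping potential) involves only the work function;
# an electron starting below the Fermi energy comes out with less energy.
h_eV = 4.135667e-15        # Planck constant in eV*s
c = 2.998e8                # speed of light in m/s

wavelength = 400e-9        # m, assumed incident light
work_function = 2.3        # eV, assumed work function of the metal
depth_below_fermi = 0.5    # eV, assumed extra binding for a deeper electron

E_photon = h_eV * c / wavelength                            # about 3.10 eV
KE_max = E_photon - work_function                           # about 0.80 eV, most energetic
KE_deeper = E_photon - work_function - depth_below_fermi    # about 0.30 eV, less energetic

print(f"photon energy        = {E_photon:.2f} eV")
print(f"max kinetic energy   = {KE_max:.2f} eV  -> stopping potential {KE_max:.2f} V")
print(f"KE from deeper state = {KE_deeper:.2f} eV")
```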

Certainly, I don't think that this will affect the viewers' ability to use the Einstein equation, which was probably the main purpose of the video. But there is an opportunity here to avoid misleading the viewers and to make the video tighter and more accurate. It might also save many of us from having to explain things to other people when they try to go deeper into this (especially students of physics). For a video that is viewed by such a wide audience, these are not the kind of inaccuracies that I would expect to slip through.

Zz.

by ZapperZ (noreply@blogger.com) at July 18, 2018 02:37 PM

ZapperZ - Physics and Physicists

Multiverse
In this article, Ethan Siegel valiantly tried to explain, in simple language, what "multiverse" is within the astrophysical/cosmological context:

Inflation doesn't end everywhere at once, but rather in select, disconnected locations at any given time, while the space between those locations continues to inflate. There should be multiple, enormous regions of space where inflation ends and a hot Big Bang begins, but they can never encounter one another, as they're separated by regions of inflating space. Wherever inflation begins, it is all but guaranteed to continue for an eternity, at least in places.

Where inflation ends for us, we get a hot Big Bang. The part of the Universe we observe is just one part of this region where inflation ended, with more unobservable Universe beyond that. But there are countlessly many regions, all disconnected from one another, with the same exact story.

Unfortunately, as is the problem with String theory, none of these ideas has a testable prediction that can push it out of the realm of speculation and into being a true science.

Zz.

by ZapperZ (noreply@blogger.com) at July 18, 2018 12:43 PM

Peter Coles - In the Dark

Ireland And The Roman Empire. Modern Politics Shaping The Ancient Past?

I’m here in Dublin Airport, not far from Drumanagh, the site discussed in the following post. I’m on my way back to Wales for, among other things, tomorrow’s graduation ceremony for students from the School of Physics & Astronomy at Cardiff University.

I thought I’d reblog the post here because it’s very interesting and it follows on from a comment thread relating to my post a few days ago about the current drought in Ireland, which has revealed many previously unknown features of archaeological interest, and the (unrelated but also recent) discovery of a 5500-year-old passage tomb in County Louth.

The site at Drumanagh is not related to either of those new discoveries, but it is fascinating because of the controversy about whether or not it is evidence of a Roman invasion of Ireland in the first century AD. I think the idea that no Romans ever set foot in Ireland during the occupation of Britain is hard to accept given the extensive trading links of the time, but there’s no evidence of a full-scale military invasion or lengthy period of occupation. The only unambiguously Roman finds at Drumanagh are coins and other artefacts which do not really indicate a military presence and there is no evidence there or anywhere else in Ireland of the buildings, roads or other infrastructure that one finds in Roman Britain.

My own opinion is that the Drumanagh site is more likely to have been some sort of trading post than a military fort, and it may even be entirely Celtic in origin. The position and overall character of the site seems more similar to Iron Age promontory forts than Roman military camps. I am, however, by no means an expert.

You can find a description of the Drumanagh site in its historical context here.

AN SIONNACH FIONN

Way back in 1996, the Sunday Times newspaper in Britain ran an enthusiastic if awkwardly-phrased banner headline proclaiming that a “Fort discovery proves Romans invaded Ireland”. The “fort” in question was an archaeological site in north County Dublin known as Drumanagh, situated on a wave-eroded headland near the coastal village of Loughshinny. Nearly 900 metres long and 190 metres wide, the monument consists of a trio of parallel ditches protecting an oblong thumb of land jutting out into the ocean, the seaward sides of the irregular protrusion relying on the waters of the Irish Sea for defence. The location is fairly typical of a large number of Iron Age promontory settlements found in isolated spots throughout the country. However what made the area at Drumanagh of particular interest was the significant number of Roman artefacts found within its fields.

Unfortunately a comprehensive archaeological survey of the site has yet to be published due to questions over property rights and compensatory payments for finds, meaning most discoveries from the location have come through agricultural work or destructive raids by…


by telescoper at July 18, 2018 06:36 AM

The n-Category Cafe

Compositionality: the Editorial Board

An editorial board has now been chosen for the journal Compositionality, and they’re waiting for people to submit papers.

We are happy to announce the founding editorial board of Compositionality, featuring established researchers working across logic, computer science, physics, linguistics, coalgebra, and pure category theory (see the full list below). Our steering board considered many strong applications to our initial open call for editors, and it was not easy narrowing down to the final list, but we think that the quality of this editorial board and the general response bodes well for our growing research community.

In the meantime, we hope you will consider submitting something to our first issue. Look out in the coming weeks for the journal’s official open-for-submissions announcement.

The editorial board of Compositionality:

• Corina Cristea, Alexandru Ioan Cuza University, Romania

• Ross Duncan, University of Strathclyde, UK

• Andrée Ehresmann, University of Picardie Jules Verne, France

• Tobias Fritz, Max Planck Institute, Germany

• Neil Ghani, University of Strathclyde, UK

• Dan Ghica, University of Birmingham, UK

• Jeremy Gibbons, University of Oxford, UK

• Nick Gurski, University of Sheffield, UK

• Helle Hvid Hansen, Delft University of Technology, Netherlands

• Chris Heunen, University of Edinburgh, UK

• Aleks Kissinger, Radboud University, Netherlands

• Joachim Kock, Universitat Autònoma de Barcelona, Spain

• Martha Lewis, University of Amsterdam, Netherlands

• Samuel Mimram, École Polytechnique, France

• Simona Paoli, University of Leicester, UK

• Dusko Pavlovic, University of Hawaii, USA

• Christian Retoré, Université de Montpellier, France

• Mehrnoosh Sadrzadeh, Queen Mary University, UK

• Peter Selinger, Dalhousie University, Canada

• Pawel Sobocinski, University of Southampton, UK

• David Spivak, MIT, USA

• Jamie Vicary, University of Birmingham, UK

• Simon Willerton, University of Sheffield, UK

Best,
Joshua Tan, Brendan Fong, and Nina Otter
Executive editors, Compositionality

by john (baez@math.ucr.edu) at July 18, 2018 05:39 AM

Clifford V. Johnson - Asymptotia

Muskovites Vs Anti-Muskovites…

Saw this split over Elon Musk coming over a year ago. This is a panel from my graphic short story “Resolution” that appears in the 2018 SF anthology Twelve Tomorrows, edited by Wade Roush (There’s even an e-version now if you want fast access!) -cvj

The post Muskovites Vs Anti-Muskovites… appeared first on Asymptotia.

by Clifford at July 18, 2018 02:34 AM

July 17, 2018

Christian P. Robert - xi'an's og

ABC variable selection

Prior to the ISBA 2018 meeting, Yi Liu, Veronika Ročková, and Yuexi Wang arXived a paper on relying on ABC for finding relevant variables, which is a very original approach in that ABC is not as much the object as it is a tool. And which Veronika considered during her Susie Bayarri lecture at ISBA 2018. In other words, it is not about selecting summary variables for running ABC but quite the opposite, selecting variables in a non-linear model through an ABC step. I was going to separate the two selections into algorithmic and statistical selections, but it is more like projections in the observation and covariate spaces. With ABC still providing an appealing approach to approximate the marginal likelihood. Now, one may wonder at the relevance of ABC for variable selection, aka model choice, given our warning call of a few years ago. But the current paper does not require low-dimension summary statistics, hence avoids the difficulty with the “other” Bayes factor.

In the paper, the authors consider a spike-and… forest prior!, where the Bayesian CART selection of active covariates proceeds through a regression tree, with the selected covariates appearing in the tree and the others not appearing. With a sparsity prior on the tree partitions and this new ABC approach to select the subset of active covariates. A specific feature is the splitting of the data: one part is used to learn about the regression function, simulating from this function and comparing with the remainder of the data. The paper further establishes that ABC Bayesian Forests are consistent for variable selection.

“…we observe a curious empirical connection between π(θ|x,ε), obtained with ABC Bayesian Forests  and rescaled variable importances obtained with Random Forests.”

The difference with our ABC-RF model choice paper is that we select summary statistics [for classification] rather than covariates. For instance, in the current paper, simulation of pseudo-data will depend on the selected subset of covariates, meaning simulating a model index, and then generating the pseudo-data, acceptance being a function of the L² distance between data and pseudo-data. And then relying on all ABC simulations to find which variables are in more often than not to derive the median probability model of Barbieri and Berger (2004). Which does not work very well if implemented naïvely. Because of the immense size of the model space, it is quite hard to find pseudo-data close to actual data, resulting in either very high tolerance or very low acceptance. The authors get over this difficulty by a neat device that reminds me of fractional or intrinsic (pseudo-)Bayes factors in that the dataset is split into two parts, one that learns about the posterior given the model index and another one that simulates from this posterior to compare with the left-over data. Bringing simulations closer to the data. I do not remember seeing this trick before in ABC settings, but it is very neat, assuming the small data posterior can be simulated (which may be a fundamental reason for the trick to remain unused!). Note that the split varies at each iteration, which means there is no impact of ordering the observations.
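To fix ideas, here is a heavily simplified Python sketch of the split-and-simulate mechanics described above. It is not the authors' algorithm: the Bayesian CART/forest prior is replaced by a conjugate Gaussian linear model, the hard tolerance is replaced by keeping the closest fraction of simulations, and all names and tuning constants are my own illustrative choices.

```python
# Simplified sketch of ABC variable selection with a fresh data split at each
# iteration: learn the regression on one half, simulate the held-out half, and
# keep the covariate subsets whose simulations land closest to the data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: only the first two covariates matter.
n, p, sigma = 120, 5, 1.0
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0])
y = X @ beta_true + sigma * rng.standard_normal(n)

tau2 = 10.0            # prior variance on the regression coefficients
n_sims = 5000
keep_frac = 0.1        # ABC: keep the closest 10% of proposals

distances, subsets = [], []
for _ in range(n_sims):
    # fresh random split at each iteration, as in the trick described above
    idx = rng.permutation(n)
    learn, held = idx[: n // 2], idx[n // 2 :]

    # sparsity prior: include each covariate independently with probability 1/2
    S = rng.random(p) < 0.5
    if not S.any():
        continue
    X1, X2 = X[learn][:, S], X[held][:, S]

    # conjugate Gaussian posterior for the selected coefficients, given the first half
    prec = X1.T @ X1 / sigma**2 + np.eye(S.sum()) / tau2
    cov = np.linalg.inv(prec)
    mean = cov @ X1.T @ y[learn] / sigma**2
    beta = rng.multivariate_normal(mean, cov)

    # simulate the held-out responses and record the L2 distance to the data
    y2_sim = X2 @ beta + sigma * rng.standard_normal(len(held))
    distances.append(np.linalg.norm(y2_sim - y[held]))
    subsets.append(S)

distances, subsets = np.array(distances), np.array(subsets)
accepted = subsets[distances <= np.quantile(distances, keep_frac)]
inclusion = accepted.mean(axis=0)
print("inclusion frequencies among accepted subsets:", np.round(inclusion, 2))
print("median probability model:", np.flatnonzero(inclusion > 0.5))
```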

by xi'an at July 17, 2018 10:18 PM

John Baez - Azimuth

Compositionality: the Editorial Board

The editors of this journal have an announcement:

We are happy to announce the founding editorial board of Compositionality, featuring established researchers working across logic, computer science, physics, linguistics, coalgebra, and pure category theory (see the full list below). Our steering board considered many strong applications to our initial open call for editors, and it was not easy narrowing down to the final list, but we think that the quality of this editorial board and the general response bodes well for our growing research community.

In the meantime, we hope you will consider submitting something to our first issue. Look out in the coming weeks for the journal’s official open-for-submissions announcement.

The editorial board of Compositionality:

• Corina Cristea, Alexandru Ioan Cuza University, Romania
• Ross Duncan, University of Strathclyde, UK
• Andrée Ehresmann, University of Picardie Jules Verne, France
• Tobias Fritz, Max Planck Institute, Germany
• Neil Ghani, University of Strathclyde, UK
• Dan Ghica, University of Birmingham, UK
• Jeremy Gibbons, University of Oxford, UK
• Nick Gurski, University of Sheffield, UK
• Helle Hvid Hansen, Delft University of Technology, Netherlands
• Chris Heunen, University of Edinburgh, UK
• Aleks Kissinger, Radboud University, Netherlands
• Joachim Kock, Universitat Autònoma de Barcelona, Spain
• Martha Lewis, University of Amsterdam, Netherlands
• Samuel Mimram, École Polytechnique, France
• Simona Paoli, University of Leicester, UK
• Dusko Pavlovic, University of Hawaii, USA
• Christian Retoré, Université de Montpellier, France
• Mehrnoosh Sadrzadeh, Queen Mary University, UK
• Peter Selinger, Dalhousie University, Canada
• Pawel Sobocinski, University of Southampton, UK
• David Spivak, MIT, USA
• Jamie Vicary, University of Birmingham, UK
• Simon Willerton, University of Sheffield, UK

Best,
Josh, Brendan, and Nina
Executive editors, Compositionality

by John Baez at July 17, 2018 04:45 PM

ZapperZ - Physics and Physicists

94 Aluminum Pie Pans On A Van de Graaff
What happens when you put 94 aluminum pie pans on a Van de Graaff? Sometimes you do things just because it is darn fun!



Now let's see if you can offer your own explanation for this silly thing! :) Happy 10th Anniversary on YouTube, Frostbite Theater!

Zz.

by ZapperZ (noreply@blogger.com) at July 17, 2018 02:56 PM

Emily Lakdawalla - The Planetary Society Blog

How India built NavIC, the country's own GPS network
The country's satellite navigation system faced a long and difficult road, but it's finally operational.

July 17, 2018 02:02 PM

Peter Coles - In the Dark

Planck’s Last Papers

Well, they’ve been a little while coming, but just today I heard that the final set of a dozen papers from the European Space Agency’s Planck mission is now available. You can find the latest ones, along with all the others, here.

This final `Legacy’ set of papers is sure to be a vital resource for many years to come and I can hear in my mind’s ear the sound of cosmologists all around the globe scurrying to download them!

I’m not sure when I’ll get time to read these papers, so if anyone finds any interesting nuggets therein please feel free to comment below!

by telescoper at July 17, 2018 02:00 PM

Peter Coles - In the Dark

Georges Lemaître: Google Doodle Poll

 

I noticed this morning that today’s Google Doodle (above) features none other than Georges Lemaître. That reminded me that a while ago I stumbled across a post on the Physics World Blog concerning a radio broadcast about Georges Lemaître.

Here’s a description of said programme:

Few theories could claim to have a more fundamental status than Big Bang Theory. This is now humanity’s best attempt at explaining how we got here: A Theory of Everything. This much is widely known and Big Bang Theory is now one of the most recognisable scientific brands in the world. What’s less well known is that the man who first proposed the theory was not only an accomplished physicist, he was also a Catholic priest. Father Georges Lemaître wore his clerical collar while teaching physics, and not at Oxford, Cambridge or MIT but at the Catholic University of Leuven in Belgium. It was this unassuming Catholic priest in an academic backwater who has changed the way we look at the origins of the universe. His story also challenges the assumption that science and religion are always in conflict. William Crawley introduces us to the “Father” of the Big Bang.

The question is whether the word “Father” in the last sentence should be taken as anything more than a play on the title he’d be given as a Catholic priest?

Lemaître’s work was indeed highly original and it undoubtedly played an important role in the development of the Big Bang theory, especially in Western Europe and in the United States. However, a far stronger claim to the title of progenitor of this theory belongs to Alexander Alexandrovich Friedman, who obtained the cosmological solutions of Einstein’s general theory of relativity, on which the Big Bang model is based, independently of and shortly before Lemaître did. Unfortunately the Russian Friedman died in 1925 and it was many years before his work became widely known in the West. At least in my book, he’s the real “father” of the Big Bang, but I’m well aware that this is the source of a great deal of argument at cosmology conferences (especially when Russian cosmologists are present), which makes it an apt topic for a quick poll:

Take Our Poll: http://polldaddy.com/poll/6297499

P.S. I prefer to spell Friedman with one “n” rather than two. His name in his own language is Алекса́ндр Алекса́ндрович Фри́дман and the spelling “Friedmann” only arose because of later translations into German.

by telescoper at July 17, 2018 08:08 AM

July 16, 2018

Christian P. Robert - xi'an's og

Hamiltonian tails

“We demonstrate HMC’s sensitivity to these parameters by sampling from a bivariate Gaussian with correlation coefficient 0.99. We consider three settings (ε,L) = {(0.16; 40); (0.16; 50); (0.15; 50)}” Ziyu Wang, Shakir Mohamed, and Nando De Freitas. 2013

In an experiment with my PhD student Changye Wu (who wrote all the R code used below), we looked back at a strange feature in a 2013 ICML paper by Wang, Mohamed, and De Freitas, namely a rather poor performance of a Hamiltonian Monte Carlo (leapfrog) algorithm on a two-dimensional strongly correlated Gaussian target, for very specific values of the parameters (ε,L) of the algorithm.

The Gaussian target associated with this sample stands right in the middle of the two clouds, as identified by Wang et al. And the leapfrog integration path for (ε,L)=(0.15,50)

keeps jumping between the two ridges (or tails), with no stop in the middle. Changing (ε,L) ever so slightly to (ε,L)=(0.16,40) does not modify the path very much

but the HMC output is quite different since the cloud then sits right on top of the target

with no clear explanation except for a sort of periodicity in the leapfrog sequence associated with the velocity generated at the start of the code. Looking at the Hamiltonian values for (ε,L)=(0.15,50)

and for (ε,L)=(0.16,40)

does not help, except to point at a sequence located far in the tails of this Hamiltonian, surprisingly varying when it is supposed to be constant. At first, we thought the large value of ε was to blame, but much smaller values still return poor convergence performances, as below for (ε,L)=(0.01,450)
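For readers who want to poke at this themselves: the experiments above were run with Changye Wu's R code, but here is an independent, minimal leapfrog HMC re-implementation in Python for the same correlated Gaussian target, so the sensitivity to (ε,L) can be explored directly. It is only a toy sketch.

```python
# Minimal leapfrog HMC for a bivariate Gaussian target with correlation 0.99,
# with the three (eps, L) settings quoted from Wang et al. above.
import numpy as np

rho = 0.99
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def U(q):            # potential = negative log target (up to a constant)
    return 0.5 * q @ Sigma_inv @ q

def grad_U(q):
    return Sigma_inv @ q

def hmc(eps, L, n_iter=5000, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    samples = np.empty((n_iter, 2))
    for i in range(n_iter):
        p = rng.standard_normal(2)               # resample momentum
        q_new, p_new = q.copy(), p.copy()
        p_new -= 0.5 * eps * grad_U(q_new)       # leapfrog integration
        for _ in range(L - 1):
            q_new += eps * p_new
            p_new -= eps * grad_U(q_new)
        q_new += eps * p_new
        p_new -= 0.5 * eps * grad_U(q_new)
        # Metropolis correction on the Hamiltonian
        dH = (U(q_new) + 0.5 * p_new @ p_new) - (U(q) + 0.5 * p @ p)
        if np.log(rng.random()) < -dH:
            q = q_new
        samples[i] = q
    return samples

for eps, L in [(0.16, 40), (0.16, 50), (0.15, 50)]:
    s = hmc(eps, L)
    print(f"(eps, L) = ({eps}, {L}): sample correlation = {np.corrcoef(s.T)[0, 1]:.3f}")
```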

by xi'an at July 16, 2018 10:18 PM

Peter Coles - In the Dark

Hair pursued by two planets

Joan Miró (1893-1983), painted in 1968. Oil and acrylic on canvas, 195 x 130 cm (Fundació Joan Miró, Barcelona). Original title: Cabell perseguit per dos planetes.

by telescoper at July 16, 2018 02:38 PM

ZapperZ - Physics and Physicists

Neutrinos Come Knocking For Astronomy
I feel as if these are the golden years for astronomy and astrophysics.

First there was the discovery of gravitational waves. Then a major astronomical event occurred, and we were able to detect it using the "old" standard technique via EM radiation, and via the detection of gravitational waves from it. So now astronomy has two different types of "messengers" to tell us about such events.

Well now, make way for a third messenger: the ubiquitous neutrino. Two papers published in Science last week detected neutrinos (along with the accompanying EM radiation) from a "blazar". The neutrino detection part was made predominantly at the IceCube detector located in Antarctica.

Both papers are available as open access here and here. A summary of this discovery can be found at PhysicsWorld (may require free registration).

Zz.

by ZapperZ (noreply@blogger.com) at July 16, 2018 12:35 PM

Emily Lakdawalla - The Planetary Society Blog

Pretty Pictures of the Cosmos: The Cosmic Ocean
Award-winning astrophotographer Adam Block shares some of his most recent images of our amazing and beautiful universe.

July 16, 2018 11:00 AM

Tommaso Dorigo - Scientificblogging

A Beautiful New Spectroscopy Measurement
What is spectroscopy?
(A) the observation of ghosts by infrared visors or other optical devices
(B) the study of excited states of matter through observation of energy emissions

If you answered (A), you are probably using a lousy internet search engine; and btw, you are rather dumb. Ghosts do not exist. 

Otherwise you are welcome to read on. We are, in fact, about to discuss a cutting-edge spectroscopy measurement, performed by the CMS experiment using lots of proton-proton collisions delivered by the CERN Large Hadron Collider (LHC). 


by Tommaso Dorigo at July 16, 2018 09:13 AM

July 15, 2018

Christian P. Robert - xi'an's og

la finale

A very pleasant stroll through central Paris this afternoon, during “la” finale, when France was playing Croatia. Bars were all overflowing onto the pavements and sometimes the streets, each action was echoed throughout town, and we certainly did not miss any goal, even from the heart of the Luxembourg gardens! Which were deserted except for the occasional tourist, just as were the main thoroughfares, except for police cars and emergency vehicles. Since the game ended, horns have been honking almost nonstop, even in the quietest suburbs.

by xi'an at July 15, 2018 10:18 PM

July 14, 2018

Christian P. Robert - xi'an's og

graph of the day & AI4good versus AI4bad

Apart from the above graph from Nature, rendering in a most appalling and meaningless way the uncertainty about the number of active genes in the human genome, I read a couple of articles in this issue of Nature relating to the biases and dangers of societal algorithms. One of which sounded very close to the editorial in the New York Times on which Kristian Lum commented on this blog. With the attached snippet on what is fair and unfair (or not).

The second article was more surprising as it defended the use of algorithms for more democracy. Nothing less. Written by Wendy Tam Cho, professor of political sciences, law, statistics, and mathematics at UIUC, it argued that the software that she develops to construct electoral maps produces fair maps. Which sounds over-rosy imho, as aiming to account for all social, ethnic, income, &tc., groups, i.e., most of the axes that define a human, is meaningless, if only because the structure of these groups is not frozen in time. To state that "computers are impervious to the lure of power" is borderline ridiculous, as computers and algorithms are [so far] driven by humans. This is not to say that gerrymandering should not be fought by technological means, especially and obviously by open source algorithms, as existing proposals (discussed here) demonstrate, but to entertain the notion of a perfectly representative redistricting is not only illusory, but also far from democratic as it shies away from the one person, one vote principle at the basis of democracy. And the paper leaves us in the dark as to who will decide on which group or which characteristic needs to be represented in the votes. Of course, this is the impression obtained by reading a one page editorial in Nature [in an overcrowded and sweltering commuter train] rather than the relevant literature. Nonetheless, I remain puzzled as to why this editorial was ever published. (Speaking of democracy, the issue also contains warning reports about Hungary’s ultra-right government taking over the Hungarian Academy of Sciences.)

by xi'an at July 14, 2018 10:18 PM

July 13, 2018

John Baez - Azimuth

Applied Category Theory Course: Collaborative Design

In my online course we’re reading the fourth chapter of Fong and Spivak’s book Seven Sketches. Chapter 4 is about collaborative design: building big projects from smaller parts. This is based on work by Andrea Censi:

• Andrea Censi, A mathematical theory of co-design.

The main mathematical content of this chapter is the theory of enriched profunctors. We’ll mainly talk about enriched profunctors between categories enriched in monoidal preorders. The picture above shows what one of these looks like!
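If you want something executable to poke at before the lecture list below, here is a toy sketch (my own illustration, not taken from the book) of the simplest case: a feasibility relation, i.e. a profunctor between preorders enriched in the monoidal preorder Bool, encoded as boolean matrices, together with the composition rule. The example data are made up.

```python
# Toy sketch of a feasibility relation (a Bool-enriched profunctor between
# preorders). Phi(x, y) = True is read as "requirement x is feasible given
# resource y"; monotonicity says anything easier than a feasible requirement
# stays feasible with any larger resource. Finite preorders are encoded as
# boolean <= matrices.
import numpy as np

def is_feasibility_relation(leq_X, leq_Y, Phi):
    """Check monotonicity: x' <= x, Phi(x, y), y <= y'  implies  Phi(x', y')."""
    nX, nY = Phi.shape
    for x in range(nX):
        for y in range(nY):
            if not Phi[x, y]:
                continue
            for xp in range(nX):
                for yp in range(nY):
                    if leq_X[xp, x] and leq_Y[y, yp] and not Phi[xp, yp]:
                        return False
    return True

def compose(Phi, Psi):
    """Composite profunctor: (x, z) is related iff there is a y with Phi(x, y) and Psi(y, z)."""
    return (Phi.astype(int) @ Psi.astype(int)) > 0

# Two tiny totally ordered "requirement"/"resource" scales: 0 <= 1 <= 2.
leq = np.array([[i <= j for j in range(3)] for i in range(3)])
# Phi(x, y): requirement x is met whenever the resource level y is at least x.
Phi = np.array([[y >= x for y in range(3)] for x in range(3)])

print(is_feasibility_relation(leq, leq, Phi))   # True
print(compose(Phi, Phi))                        # composing the relation with itself
```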

Here are my lectures so far:

Lecture 55 – Chapter 4: Enriched Profunctors and Collaborative Design
Lecture 56 – Chapter 4: Feasibility Relations
Lecture 57 – Chapter 4: Feasibility Relations
Lecture 58 – Chapter 4: Composing Feasibility Relations
Lecture 59 – Chapter 4: Cost-Enriched Profunctors
Lecture 60 – Chapter 4: Closed Monoidal Preorders
Lecture 61 – Chapter 4: Closed Monoidal Preorders
Lecture 62 – Chapter 4: Constructing Enriched Categories

by John Baez at July 13, 2018 09:17 PM

John Baez - Azimuth

Random Points on a Group

In Random Points on a Sphere (Part 1), we learned an interesting fact. You can take the unit sphere in \mathbb{R}^n, randomly choose two points on it, and compute their distance. This gives a random variable, whose moments you can calculate.

And now the interesting part: when n = 1, 2 or 4, and seemingly in no other cases, all the even moments are integers.

These are the dimensions in which the spheres are groups. We can prove that the even moments are integers because they are differences of dimensions of certain representations of these groups. Rogier Brussee and Allen Knutson pointed out that if we want to broaden our line of investigation, we can look at other groups. So that’s what I’ll do today.

If we take a representation of a compact Lie group G, we get a map from the group into a space of square matrices. Since there is a standard metric on any space of square matrices, this lets us define the distance between two points on the group. This is different than the distance defined using the shortest geodesic in the group: instead, we’re taking a straight-line path in the larger space of matrices.

If we randomly choose two points on the group, we get a random variable, namely the distance between them. We can compute the moments of this random variable, and today I’ll prove that the even moments are all integers.

So, we get a sequence of integers from any representation \rho of any compact Lie group G. So far we’ve only studied groups that are spheres:

• The defining representation of \mathrm{O}(1) \cong S^0 on the real numbers \mathbb{R} gives the powers of 2.

• The defining representation of \mathrm{U}(1) \cong S^1 on the complex numbers \mathbb{C} gives the central binomial coefficients \binom{2n}{n}.

• The defining representation of \mathrm{Sp}(1) \cong S^3 on the quaternions \mathbb{H} gives the Catalan numbers.

It could be fun to work out these sequences for other examples. Our proof that the even moments are integers will give a way to calculate these sequences, not by doing integrals over the group, but by counting certain ‘random walks in the Weyl chamber’ of the group. Unfortunately, we need to count walks in a certain weighted way that makes things a bit tricky for me.
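Before the proof, here is a quick numerical sanity check (my own sketch, not part of the original post) of two of the sequences just listed. For U(1) the squared distance between e^{it} and 1 is 2 - 2 cos t with t uniform; for Sp(1) ≅ S³ the same expression holds for a Haar-random unit quaternion, but t is distributed with density (2/π) sin² t on [0, π], which is the SU(2) case of the Weyl integration formula and is where the walks in the Weyl chamber come in.

```python
# Numerical check: even moments of the distance are central binomial
# coefficients for U(1) and Catalan numbers for Sp(1) = S^3 (quaternionic norm).
import numpy as np
from math import comb, pi

t_u1 = np.linspace(0, 2 * pi, 200001)   # angle for U(1), uniform on [0, 2 pi)
t_s3 = np.linspace(0, pi, 200001)       # "polar" angle on S^3
weyl = (2 / pi) * np.sin(t_s3) ** 2     # Weyl / marginal density on S^3

for m in range(1, 6):
    u1_moment = np.trapz((2 - 2 * np.cos(t_u1)) ** m, t_u1) / (2 * pi)
    s3_moment = np.trapz((2 - 2 * np.cos(t_s3)) ** m * weyl, t_s3)
    central_binomial = comb(2 * m, m)
    catalan = comb(2 * (m + 1), m + 1) // (m + 2)
    print(m, round(u1_moment, 4), central_binomial, round(s3_moment, 4), catalan)
```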

But let’s see why the even moments are integers!

If our group representation is real or quaternionic, we can either turn it into a complex representation or adapt my argument below. So, let’s do the complex case.

Let G be a compact Lie group with a unitary representation \rho on \mathbb{C}^n. This means we have a smooth map

\rho \colon G \to \mathrm{End}(\mathbb{C}^n)

where \mathrm{End}(\mathbb{C}^n) is the algebra of n \times n complex matrices, such that

\rho(1) = 1

\rho(gh) = \rho(g) \rho(h)

and

\rho(g) \rho(g)^\dagger = 1

where A^\dagger is the conjugate transpose of the matrix A.

To define a distance between points on G we’ll give \mathrm{End}(\mathbb{C}^n) its metric

\displaystyle{ d(A,B) = \sqrt{ \sum_{i,j} \left|A_{ij} - B_{ij}\right|^2} }

This clearly makes \mathrm{End}(\mathbb{C}^n) into a 2n^2-dimensional Euclidean space. But a better way to think about this metric is that it comes from the norm

\displaystyle{ \|A\|^2 = \mathrm{tr}(AA^\dagger) = \sum_{i,j} |A_{ij}|^2 }

where \mathrm{tr} is the trace, or sum of the diagonal entries. We have

d(A,B) = \|A - B\|

I want to think about the distance between two randomly chosen points in the group, where ‘randomly chosen’ means with respect to normalized Haar measure: the unique translation-invariant probability Borel measure on the group. But because this measure and also the distance function are translation-invariant, we can equally well think about the distance between the identity 1 and one randomly chosen point g in the group. So let’s work out this distance!

I really mean the distance between \rho(g) and \rho(1), so let’s compute that. Actually its square will be nicer, which is why we only consider even moments. We have

\begin{array}{ccl}  d(\rho(g),\rho(1))^2 &=& \|\rho(g) - \rho(1)\|^2  \\ \\  &=& \|\rho(g) - 1\|^2  \\  \\  &=& \mathrm{tr}\left((\rho(g) - 1)(\rho(g) - 1)^\dagger\right) \\ \\  &=& \mathrm{tr}\left(\rho(g)\rho(g)^\dagger - \rho(g) - \rho(g)^\dagger + 1\right) \\ \\  &=& \mathrm{tr}\left(2 - \rho(g) - \rho(g)^\dagger \right)   \end{array}

Now, any representation \sigma of G has a character

\chi_\sigma \colon G \to \mathbb{C}

defined by

\chi_\sigma(g) = \mathrm{tr}(\sigma(g))

and characters have many nice properties. So, we should rewrite the distance between g and the identity using characters. We have our representation \rho, whose character can be seen lurking in the formula we saw:

d(\rho(g),\rho(1))^2 = \mathrm{tr}\left(2 - \rho(g) - \rho(g)^\dagger \right)

But there’s another representation lurking here, the dual

\rho^\ast \colon G \to \mathrm{End}(\mathbb{C}^n)

given by

\rho^\ast(g)_{ij} = \overline{\rho(g)_{ij}}

This is a fairly lowbrow way of defining the dual representation, good only for unitary representations on \mathbb{C}^n, but it works well for us here, because it lets us instantly see

\mathrm{tr}(\rho(g)^\dagger) = \mathrm{tr}(\rho^\ast(g)) = \chi_{\rho^\ast}(g)

This is useful because it lets us write our distance squared

d(\rho(g),\rho(1))^2 = \mathrm{tr}\left(2 - \rho(g) - \rho(g)^\dagger \right)

in terms of characters:

d(\rho(g),\rho(1))^2 = 2n - \chi_\rho(g) - \chi_{\rho^\ast}(g)

So, the distance squared is an integral linear combination of characters. (The constant function 1 is the character of the 1-dimensional trivial representation.)

And this does the job: it shows that all the even moments of our distance squared function are integers!

Why? Because of these two facts:

1) If you take an integral linear combination of characters, and raise it to a power, you get another integral linear combination of characters.

2) If you take an integral linear combination of characters, and integrate it over G, you get an integer.

I feel like explaining these facts a bit further, because they’re part of a very beautiful branch of math, called character theory, which every mathematician should know. So here’s a quick intro to character theory for beginners. It’s not as elegant as I could make it; it’s not as simple as I could make it: I’ll try to strike a balance here.

There’s an abelian group R(G) consisting of formal differences of isomorphism classes of representations of G, mod the relation

[\rho] + [\sigma] = [\rho \oplus \sigma]

Elements of R(G) are called virtual representations of G. Unlike actual representations we can subtract them. We can also add them, and the above formula relates addition in R(G) to direct sums of representations.

We can also multiply them, by saying

[\rho] [\sigma] = [\rho \otimes \sigma]

and decreeing that multiplication distributes over addition and subtraction. This makes R(G) into a ring, called the representation ring of G.

There’s a map

\chi \colon R(G) \to C(G)

where C(G) is the ring of continuous complex-valued functions on G. This map sends each finite-dimensional representation \rho to its character \chi_\rho. This map is one-to-one because we know a representation up to isomorphism if we know its character. This map is also a ring homomorphism, since

\chi_{\rho \oplus \sigma} = \chi_\rho + \chi_\sigma

and

\chi_{\rho \otimes \sigma} = \chi_\rho \chi_\sigma

These facts are easy to check directly.

We can integrate continuous complex-valued functions on G, so we get a map

\displaystyle{\int} \colon C(G) \to \mathbb{C}

The first non-obvious fact in character theory is that we can compute inner products of characters as follows:

\displaystyle{\int} \overline{\chi_\sigma} \chi_\rho  =   \dim(\mathrm{hom}(\sigma,\rho))

where the expression at right is the dimension of the space of ‘intertwining operators’, or morphisms of representations, between the representation \sigma and the representation \rho.

What matters most for us now is that this inner product is an integer. In particular, if \chi_\rho is the character of any representation,

\displaystyle{\int} \chi_\rho

is an integer because we can take \sigma to be the trivial representation in the previous formula, giving \chi_\sigma = 1.

Thus, the map

R(G) \stackrel{\chi}{\longrightarrow} C(G) \stackrel{\int}{\longrightarrow} \mathbb{C}

actually takes values in \mathbb{Z}.

Now, our distance squared function

2n - \chi_\rho - \chi_{\rho^\ast} \in C(G)

is actually the image under \chi of an element of the representation ring, namely

2n - [\rho] - [\rho^\ast]

So the same is true for any of its powers—and when we integrate any of these powers we get an integer!

This stuff may seem abstract, but if you’re good at tensoring representations of some group, like \mathrm{SU}(3), you should be able to use it to compute the even moments of the distance function on this group more efficiently than using the brute-force direct approach. Instead of complicated integrals we wind up doing combinatorics.

I would like to know what sequence of integers we get for \mathrm{SU}(3). A much easier, less thrilling but still interesting example is \mathrm{SO}(3). This is the 3-dimensional real projective space \mathbb{R}\mathrm{P}^3, which we can think of as embedded in the 9-dimensional space of 3\times 3 real matrices. It’s sort of cool that I could now work out the even moments of the distance function on this space by hand! But I haven’t done it yet.

by John Baez at July 13, 2018 07:43 PM

ZapperZ - Physics and Physicists

The Most Significant Genius
No, not Einstein, or Feynman, or Newton. Fermilab's Don Lincoln celebrates the hugely-important contribution of Emmy Noether.



I have highlighted this genius previously, especially in connection to her insight relating symmetry to conservation laws (read here, here, and here).

Zz.

by ZapperZ (noreply@blogger.com) at July 13, 2018 05:10 PM

Emily Lakdawalla - The Planetary Society Blog

NEA Scout unfurls solar sail for full-scale test
The next time its solar sail is deployed, NEA Scout will be out near the Moon.

July 13, 2018 11:00 AM

Clifford V. Johnson - Asymptotia

Radio Radio Summer Reading!

Friday will see me busy in the Radio world! Two things: (1) On the WNPR Connecticut morning show “Where We Live” they’ll be doing Summer reading recommendations. I’ll be on there live talking about my graphic non-fiction book The Dialogues: Conversations about the Nature of the Universe. Tune in either … Click to continue reading this post

The post Radio Radio Summer Reading! appeared first on Asymptotia.

by Clifford at July 13, 2018 05:23 AM

July 12, 2018

Clifford V. Johnson - Asymptotia

Splashes

In case you’re wondering, after yesterday’s post… Yes I did find some time to do a bit of sketching. Here’s one that did not get finished but was fun for working the rust off… The caption from instagram says: Quick Sunday watercolour pencil dabbling … been a long time. This … Click to continue reading this post

The post Splashes appeared first on Asymptotia.

by Clifford at July 12, 2018 08:32 PM

Matt Strassler - Of Particular Significance

“Seeing” Double: Neutrinos and Photons Observed from the Same Cosmic Source

There has long been a question as to what types of events and processes are responsible for the highest-energy neutrinos coming from space and observed by scientists.  Another question, probably related, is what creates the majority of high-energy cosmic rays — the particles, mostly protons, that are constantly raining down upon the Earth.

As scientists’ ability to detect high-energy neutrinos (particles that are hugely abundant, electrically neutral, very light-weight, and very difficult to observe) and high-energy photons (particles of light, though not necessarily of visible light) has become more powerful and precise, there’s been considerable hope of getting an answer to these questions.  One of the things we’ve been awaiting (and been disappointed a couple of times) is a violent explosion out in the universe that produces both high-energy photons and neutrinos at the same time, at a high enough rate that both types of particles can be observed at the same time coming from the same direction.

In recent years, there has been some indirect evidence that blazars — narrow jets of particles, pointed in our general direction like the barrel of a gun, and created as material swirls near and almost into giant black holes in the centers of very distant galaxies — may be responsible for the high-energy neutrinos.  Strong direct evidence in favor of this hypothesis has just been presented today.   Last year, one of these blazars flared brightly, and the flare created both high-energy neutrinos and high-energy photons that were observed within the same period, coming from the same place in the sky.

I have written about the IceCube neutrino observatory before; it’s a cubic kilometer of ice under the South Pole, instrumented with light detectors, and it’s ideal for observing neutrinos whose motion-energy far exceeds that of the protons in the Large Hadron Collider, where the Higgs particle was discovered.  These neutrinos mostly pass through IceCube undetected, but one in 100,000 hits something, and debris from the collision produces visible light that IceCube’s detectors can record.   IceCube has already made important discoveries, detecting a new class of high-energy neutrinos.

On Sept 22 of last year, one of these very high-energy neutrinos was observed at IceCube. More precisely, a muon created underground by the collision of this neutrino with an atomic nucleus was observed in IceCube.  To create the observed muon, the neutrino must have had a motion-energy tens of thousands of times larger than the motion-energy of each proton at the Large Hadron Collider (LHC).  And the direction of the neutrino’s motion is known too; it’s essentially the same as that of the observed muon.  So IceCube’s scientists knew where, on the sky, this neutrino had come from.

(This doesn’t work for typical cosmic rays; protons, for instance, travel in curved paths because they are deflected by cosmic magnetic fields, so even if you measure their travel direction at their arrival to Earth, you don’t then know where they came from. Neutrinos, being electrically neutral, aren’t affected by magnetic fields and travel in a straight line, just as photons do.)

Very close to that direction is a well-known blazar (TXS-0506), four billion light years away (a good fraction of the distance across the visible universe).

The IceCube scientists immediately reported their neutrino observation to scientists with high-energy photon detectors.  (I’ve also written about some of the detectors used to study the very high-energy photons that we find in the sky: in particular, the Fermi/LAT satellite played a role in this latest discovery.) Fermi/LAT, which continuously monitors the sky, was already detecting high-energy photons coming from the same direction.   Within a few days the Fermi scientists had confirmed that TXS-0506 was indeed flaring at the time — already starting in April 2017 in fact, six times as bright as normal.  With this news from IceCube and Fermi/LAT, many other telescopes (including the MAGIC cosmic ray detector telescopes among others) then followed suit and studied the blazar, learning more about the properties of its flare.

Now, just a single neutrino on its own isn’t entirely convincing; is it possible that this was all just a coincidence?  So the IceCube folks went back to their older data to snoop around.  There they discovered, in their 2014-2015 data, a dramatic flare in neutrinos — more than a dozen neutrinos, seen over 150 days, had come from the same direction in the sky where TXS-0506 is sitting.  (More precisely, nearly 20 from this direction were seen, in a time period where normally there’d just be 6 or 7 by random chance.)  This confirms that this blazar is indeed a source of neutrinos.  And from the energies of the neutrinos in this flare, yet more can be learned about this blazar, and how it makes  high-energy photons and neutrinos at the same time.  Interestingly, so far at least, there’s no strong evidence for this 2014 flare in photons, except perhaps an increase in the number of the highest-energy photons… but not in the total brightness of the source.

The full picture, still emerging, tends to support the idea that the blazar arises from a supermassive black hole, acting as a natural particle accelerator, making a narrow spray of particles, including protons, at extremely high energy.  These protons, millions of times more energetic than those at the Large Hadron Collider, then collide with more ordinary particles that are just wandering around, such as visible-light photons from starlight or infrared photons from the ambient heat of the universe.  The collisions produce particles called pions, made from quarks and anti-quarks and gluons (just as protons are), which in turn decay either to photons or to (among other things) neutrinos.  And it’s those resulting photons and neutrinos which have now been jointly observed.

Since cosmic rays, the mysterious high energy particles from outer space that are constantly raining down on our planet, are mostly protons, this is evidence that many, perhaps most, of the highest energy cosmic rays are created in the natural particle accelerators associated with blazars. Many scientists have suspected that the most extreme cosmic rays are associated with the most active black holes at the centers of galaxies, and now we have evidence and more details in favor of this idea.  It now appears likely that this question will be answerable over time, as more blazar flares are observed and studied.

The announcement of this important discovery was made at the National Science Foundation by Francis Halzen, the IceCube principal investigator, Olga Botner, former IceCube spokesperson, Regina Caputo, the Fermi-LAT analysis coordinator, and Razmik Mirzoyan, MAGIC spokesperson.

The fact that both photons and neutrinos have been observed from the same source is an example of what people are now calling “multi-messenger astronomy”; a previous example was the observation in gravitational waves, and in photons of many different energies, of two merging neutron stars.  Of course, something like this already happened in 1987, when a supernova was seen by eye, and also observed in neutrinos.  But in this case, the neutrinos and photons have energies millions and billions of times larger!

 

by Matt Strassler at July 12, 2018 04:59 PM

John Baez - Azimuth

Random Points on a Sphere (Part 2)

This is the tale of a mathematical adventure. Last time our hardy band of explorers discovered that if you randomly choose two points on the unit sphere in 1-, 2- or 4-dimensional space and look at the probability distribution of their distances, then the even moments of this probability distribution are always integers. I gave a proof using some group representation theory.

On the other hand, with the help of Mathematica, Greg Egan showed that we can work out these moments for a sphere in any dimension by actually doing the bloody integrals.

He looked at the nth moment of the distance for two randomly chosen points in the unit sphere in \mathbb{R}^d, and he got

\displaystyle{ \text{moment}(d,n) = \frac{2^{d+n-2} \Gamma(\frac{d}{2}) \Gamma(\frac{1}{2} (d+n-1))}{\sqrt{\pi} \, \Gamma(d+ \frac{n}{2} - 1)} }

This looks pretty scary, but you can simplify it using the relation between the gamma function and factorials. Remember, for integers we have

\Gamma(n) = (n-1)!

We also need to know \Gamma at half-integers, which we can get knowing

\Gamma(\frac{1}{2}) = \sqrt{\pi}

and

\Gamma(x + 1) =  x \Gamma(x)

Using these we can express moment(d,n) in terms of factorials, but the details depend on whether d and n are even or odd.

I’m going to focus on the case where both the dimension d and the moment number n are even, so let

d = 2e, \; n = 2m

In this case we get

\text{moment}(2e,2m) = \displaystyle{\frac{\binom{2(e+m-1)} {m}}{\binom{e+m-1}{m}} }

Here ‘we’ means that Greg Egan did all the hard work:

From this formula

\text{moment}(2e,2m) = \displaystyle{\frac{\binom{2(e+m-1)} {m}}{\binom{e+m-1}{m}} }

you can show directly that the even moments in 4 dimensions are Catalan numbers:

\text{moment}(4,2m) = C_{m+1}

while in 2 dimensions they are binomial coefficients:

\mathrm{moment}(2,2m) = \displaystyle{ {2m \choose m} }

More precisely, they are ‘central’ binomial cofficients, forming the middle column of Pascal’s triangle:

1, 2, 6, 20, 70, 252,  924, 3432, 12870, 48620, \dots

So, it seems that with some real work one can get vastly more informative results than with my argument using group representation theory. The only thing you don’t get, so far, is an appealing explanation of why the even moments are integral in dimensions 1, 2 and 4.

The computational approach also opens up a huge new realm of questions! For example, are there any dimensions other than 1, 2 and 4 where the even moments are all integral?

I was especially curious about dimension 8, where the octonions live. Remember, 1, 2 and 4 are the dimensions of the associative normed division algebras, but there’s also a nonassociative normed division algebra in dimension 8: the octonions.

The d = 8 row seemed to have a fairly high fraction of integer entries:


[Table of the even moments of the distance between two random points on the unit sphere, by dimension; image omitted.]

I wondered if there were only finitely many entries in the 8th row that weren’t integers. Greg Egan did a calculation and replied:

The d=8 moments don’t seem to become all integers permanently at any point, but the non-integers become increasingly sparse.

He also got evidence suggesting that for any even dimension d, a large fraction of the even moments are integers. After some further conversation he found the nice way to think about this. Recall that

\text{moment}(2e,2m) = \displaystyle{\frac{\binom{2(e+m-1)} {m}}{\binom{e+m-1}{m}} }

If we let

r = e-1

then this moment is just

\text{moment}(2r+2,2m) = \displaystyle{\frac{\binom{2(m+r)}{m}}{\binom{m+r}{m}} }

so the question becomes: when is this an integer?

It’s good to think about this naively a bit. We can cancel out a bunch of stuff in that ratio of binomial coefficents and write it like this:

\displaystyle{ \text{moment}(2r+2,2m) = \frac{(2r+m+1) \cdots (2r+2m)}{(r+1) \cdots (r+m)} }

So when is this an integer? Let’s do the 8th moment in 4 dimensions:

\text{moment}(4,8) = \displaystyle{ \frac{7 \cdot 8 \cdot 9 \cdot 10 }{2 \cdot 3 \cdot 4 \cdot 5} }

This is an integer, namely the Catalan number 42: the Answer to the Ultimate Question of Life, the Universe, and Everything.  But apparently we had to be a bit ‘lucky’ to get an integer. For example, we needed the 10 on top to deal with the 5 on the bottom.

It seems plausible that our chances of getting an integer increase as the moment gets big compared to the dimension. For example, try the 4th moment in dimension 10:

\text{moment}(10,4) = \displaystyle{ \frac{11 \cdot 12}{5 \cdot 6}}

This not an integer, because we’re just not multiplying enough numbers to handle the prime 5 in the denominator. The 6th moment in dimension 10 is also not an integer. But if we try the 8th moment, we get lucky:

\text{moment}(10,8) = \displaystyle{ \frac{13 \cdot 14 \cdot 15 \cdot 16}{5 \cdot 6 \cdot 7 \cdot 8}}

This is an integer! We’ve got enough in the numerator to handle everything in the denominator.
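If you want to explore the pattern for other dimensions, here is a short exact-arithmetic check (my own sketch) using the formula above with r = e − 1:

```python
# Compute moment(2r+2, 2m) = C(2(m+r), m) / C(m+r, m) exactly and flag which
# entries are whole numbers. Dimension d = 2r + 2, so r = 1 is the quaternionic
# case d = 4 and r = 3 is the octonionic case d = 8.
from fractions import Fraction
from math import comb

def moment(r, m):
    return Fraction(comb(2 * (m + r), m), comb(m + r, m))

for r in [0, 1, 3, 4]:                      # d = 2, 4, 8, 10
    row = [moment(r, m) for m in range(1, 13)]
    marks = ["int" if x.denominator == 1 else "not" for x in row]
    print(f"d = {2 * r + 2}:", " ".join(marks))
```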

Greg posted a question about this on MathOverflow:

• Greg Egan, When does doubling the size of a set multiply the number of subsets by an integer?, 9 July 2018.

He got a very nice answer from a mysterious figure named Lucia, who pointed out relevant results from this interesting paper:

• Carl Pomerance, Divisors of the middle binomial coefficient, American Mathematical Monthly 122 (2015), 636–644.

Using these, Lucia proved a result that implies the following:

Theorem. If we fix a sphere of some even dimension, and look at the even moments of the probability distribution of distances between randomly chosen points on that sphere, from the 2nd moment to the (2m)th, the fraction of these that are integers approaches 1 as m → ∞.

On the other hand, Lucia also believes Pomerance’s techniques can be used to prove a result that would imply this:

Conjecture. If we fix a sphere of some even dimension > 4, and consider the even moments of the probability distribution of distances between randomly chosen points on that sphere, infinitely many of these are not integers.

In summary: we’re seeing a more or less typical rabbit-hole in mathematics. We started by trying to understand how noncommutative quaternions are on average. We figured that out, but we got sidetracked by thinking about how far points on a sphere are on average. We started calculating, we got interested in moments of the probability distribution of distances, we noticed that the Catalan numbers show up, and we got pulled into some representation theory and number theory!

I wouldn’t say our results are earth-shaking, but we definitely had fun and learned a thing or two. One thing at least is clear. In pure math, at least, it pays to follow the ideas wherever they lead. Math isn’t really divided into different branches—it’s all connected!

Afterword

Oh, and one more thing. Remember how this quest started with John D. Cook numerically computing the average of |xy - yx| over unit quaternions? Well, he went on and numerically computed the average of |(xy)z - x(yz)| over unit octonions!

• John D. Cook, How close is octonion multiplication to being associative?, 9 July 2018.

He showed the average is about 1.095, and he created this histogram:

Later, Greg Egan computed the exact value! It’s

\displaystyle{ \frac{147456}{42875 \pi} \approx 1.0947335878 \dots }

On Twitter, Christopher D. Long, whose handle is @octonion, pointed out the hidden beauty of this answer—it equals

\displaystyle{ \frac{2^{14}3^2}{5^3 7^3 \pi}    }

Nice! Here’s how Greg did this calculation:

• Greg Egan, The average associator, 12 July 2018.

Details

If you want more details on the proof of this:

Theorem. If we fix a sphere of some even dimension, and look at the even moments of the probability distribution of distances between randomly chosen points on that sphere, from the 2nd moment to the (2m)th, the fraction of these that are integers approaches 1 as m → ∞.

you should read Greg Egan's question on MathOverflow, Lucia's reply, and Pomerance's paper. Here is Greg's question:

For natural numbers m, r, consider the ratio of the number of subsets of size m taken from a set of size 2(m+r) to the number of subsets of the same size taken from a set of size m+r:

\displaystyle{ R(m,r)=\frac{\binom{2(m+r)}{m}}{\binom{m+r}{m}} }

For r=0 we have the central binomial coefficients, which of course are all integers:

\displaystyle{ R(m,0)=\binom{2m}{m} }

For r=1 we have the Catalan numbers, which again are integers:

\displaystyle{ R(m,1)=\frac{\binom{2(m+1)}{m}}{m+1}=\frac{(2(m+1))!}{m!(m+2)!(m+1)} = \frac{(2(m+1))!}{(m+2)!(m+1)!}=C_{m+1}}

However, for any fixed r\ge 2, while R(m,r) seems to be mostly integral, it is not exclusively so. For example, with m ranging from 0 to 20000, the number of times R(m,r) is an integer for r= 2,3,4,5 are 19583, 19485, 18566, and 18312 respectively.

I am seeking general criteria for R(m,r) to be an integer.

Edited to add:

We can write:

\displaystyle{ R(m,r) = \prod_{k=1}^m{\frac{m+2r+k}{r+k}} }

So the denominator is the product of m consecutive numbers r+1, \ldots, m+r, while the numerator is the product of m consecutive numbers m+2r+1,\ldots,2m+2r. So there is a gap of r between the last of the numbers in the denominator and the first of the numbers in the numerator.

Lucia replied:

Put n=m+r, and then we can write R(m,r) more conveniently as

\displaystyle{ R(m,r) = \frac{(2n)!}{m! (n+r)!} \frac{m! r!}{n!} = \frac{\binom{2n}{n} }{\binom{n+r}{r}}. }

So the question essentially becomes one about which numbers n+k for k=1, \ldots, r divide the middle binomial coefficient \binom{2n}{n}. Obviously when k=1, n+1 always divides the middle binomial coefficient, but what about other values of k? This is treated in a lovely Monthly article of Pomerance:

• Carl Pomerance, Divisors of the middle binomial coefficient, American Mathematical Monthly 122 (2015), 636–644.

Pomerance shows that for any k \ge 2 there are infinitely many integers with n+k not dividing \binom{2n}{n}, but the set of integers n for which n+k does divide \binom{2n}{n} has density 1. So for any fixed r, for a density 1 set of values of n one has that (n+1), \ldots, (n+k) all divide \binom{2n}{n}, which means that their lcm must divide \binom{2n}{n}. But one can check without too much difficulty that the lcm of n+1, \ldots, n+k is a multiple of \binom{n+k}{k}, and so for fixed r one deduces that R(m,r) is an integer for a set of values m with density 1. (Actually, Pomerance mentions explicitly in (5) of his paper that (n+1)(n+2)\cdots (n+k) divides \binom{2n}{n} for a set of full density.)

I haven’t quite shown that R(m,r) is not an integer infinitely often for r\ge 2, but I think this can be deduced from Pomerance’s paper (by modifying his Theorem 1).

I highly recommend Pomerance’s paper—you don’t need to care much about which integers divide

\displaystyle{ \binom{2n}{n} }

to find it interesting, because it’s full of clever ideas and nice observations.
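
If you want to check the frequency counts quoted in Greg's question, here is a direct (and admittedly slow, since it uses exact big-integer binomial coefficients) Python sketch, added here for illustration:

from math import comb

def R_is_integer(m, r):
    # Is R(m, r) = binom(2(m+r), m) / binom(m+r, m) an integer?
    return comb(2 * (m + r), m) % comb(m + r, m) == 0

# Reproduce the counts for m = 0, ..., 20000; expect 19583, 19485, 18566, 18312.
# This takes a few minutes because the binomial coefficients get enormous.
for r in (2, 3, 4, 5):
    print(r, sum(R_is_integer(m, r) for m in range(20001)))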

by John Baez at July 12, 2018 12:00 PM

Emily Lakdawalla - The Planetary Society Blog

Generation Zero of JPL Planetary Rovers
The Jet Propulsion Laboratory has a fabled history of planetary rovers. But how do you start such a program?

July 12, 2018 11:00 AM

Clifford V. Johnson - Asymptotia

Retreated

Sorry I've been quiet on the blog for a few weeks. An unusually long gap, I think (although those of you following on instagram, twitter, Facebook and so forth have not noticed a gap). I've been hiding out at the Aspen Center for Physics for a while.

You've probably read things I've written about it here many times in past years, but if not, here's a movie that I produced/directed/designed/etc about it some time back. (You can use the search bar upper right to find earlier posts mentioning Aspen, or click here.)

Anyway, I arrived and pretty much immediately got stuck into an interesting project, as I had an idea that I just had to pursue. I filled up a whole notebook with computations and mumblings about ideas, and eventually a narrative (and a nice set of results) has emerged. So I've been putting those into some shape. I hope to tell you about it all soon. You'll be happy to know it involves black holes, entropy, thermodynamics, and quantum information [...] Click to continue reading this post

The post Retreated appeared first on Asymptotia.

by Clifford at July 12, 2018 04:44 AM

July 11, 2018

Emily Lakdawalla - The Planetary Society Blog

New goodies from asteroid Ryugu!
Two new global views of Ryugu from Hayabusa2, plus a 3-D animation.

July 11, 2018 04:21 PM

July 10, 2018

John Baez - Azimuth

Random Points on a Sphere (Part 1)

John D. Cook, Greg Egan, Dan Piponi and I had a fun mathematical adventure on Twitter. It started when John Cook wrote a program to compute the probability distribution of distances |xy - yx| where x and y were two randomly chosen unit quaternions:

• John D. Cook, How far is xy from yx on average for quaternions?, 5 July 2018.

Three things to note before we move on:

• Click the pictures to see the source and get more information—I made none of them!

• We’ll be ‘randomly choosing’ lots of points on spheres of various dimensions. Whenever we do this, I mean that they’re chosen independently, and uniformly with respect to the unique rotation-invariant Borel measure that’s a probability measure on the sphere. In other words: nothing sneaky, just the most obvious symmetrical thing!

• We’ll be talking about lots of distances between points on the unit sphere in n dimensions. Whenever we do this, I mean the Euclidean distance in \mathbb{R}^n, not the length of the shortest path on the sphere connecting them.

Okay:

If you look at the histogram above, you’ll see the length |xy - yx| is between 0 and 2. That’s good, since xy and yx are on the unit sphere in 4 dimensions. More interestingly, the mean looks bigger than 1. John Cook estimated it at 1.13.

Greg Egan went ahead and found that the mean is exactly

\displaystyle{\frac{32}{9 \pi}} \approx 1.13176848421 \dots

He did this by working out a formula for the probability distribution:

All this is great, but it made me wonder how surprised I should be. What’s the average distance between two points on the unit sphere in 4 dimensions, anyway?

Greg Egan worked this out too:



So, the mean distance |x - y| for two randomly chosen unit quaternions is

\displaystyle{\frac{64}{15 \pi}} \approx 1.35812218105\dots

The mean of |xy - yx| is smaller than this. In retrospect this makes sense, since I know what quaternionic commutators are like: for example the points x = \pm 1 at the ‘north and south poles’ of the unit sphere commute with everybody. However, we can now say the mean of |xy - yx| is exactly

\displaystyle{\frac{32}{9\pi} } \cdot  \frac{15 \pi}{64} = \frac{5}{6}

times the mean of |x - y|, and there’s no way I could have guessed that.
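
If you'd like to reproduce these numbers yourself, here's a quick Monte Carlo sketch in Python (the sample size is arbitrary); it uses the standard trick that normalising a 4-dimensional Gaussian vector gives a uniformly distributed unit quaternion:

import numpy as np

rng = np.random.default_rng(0)

def random_unit_quaternions(n):
    # Uniform points on the unit sphere in 4 dimensions.
    v = rng.normal(size=(n, 4))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def quat_mul(a, b):
    # Hamilton product of quaternions stored as (w, x, y, z) rows.
    w1, x1, y1, z1 = a.T
    w2, x2, y2, z2 = b.T
    return np.stack([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2], axis=1)

n = 1_000_000
x, y = random_unit_quaternions(n), random_unit_quaternions(n)

comm = np.linalg.norm(quat_mul(x, y) - quat_mul(y, x), axis=1).mean()
dist = np.linalg.norm(x - y, axis=1).mean()
print(comm, 32 / (9 * np.pi))    # both about 1.1318
print(dist, 64 / (15 * np.pi))   # both about 1.3581
print(comm / dist)               # about 5/6 = 0.8333...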

While trying to get a better intuition for this, I realized that as you go to higher and higher dimensions, with you standing at the north pole of the unit sphere, the chance that a randomly chosen other point is quite near the equator gets higher and higher! That’s how high dimensions work. So, the mean value of |x - y| should get closer and closer to \sqrt{2}. And indeed, Greg showed that this is true:

The graphs here show the probability distributions of distances for randomly chosen pairs of points on spheres of various dimensions. As the dimension increases, the probability distribution gets more sharply peaked, and the mean gets closer to \sqrt{2}.

Greg wrote:

Here’s the general formula for the distribution, with plots for n=2,…,10. The mean distance does tend to √2, and the mean of the squared distance is always exactly 2, so the variance tends to zero.
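
(By the way, here's a quick way to see why the mean squared distance is exactly 2 in every dimension: for unit vectors x and y we have

|x - y|^2 = |x|^2 + |y|^2 - 2 \, x \cdot y = 2 - 2 \, x \cdot y

and the mean of x \cdot y is zero by symmetry, since replacing y by -y leaves the distribution unchanged. So the second moment is exactly 2, just as Greg says.)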

But now comes the surprising part.

Dan Piponi looked at the probability distribution of distances s = |x - y| in the 4-dimensional case:

P(s) = \displaystyle{\frac{s^2\sqrt{4 - s^2}}{\pi} }

and somehow noticed that its moments

\int_0^2 P(s)\, s^{n} \, ds

when n is even, are the Catalan numbers!
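
You can check this numerically in a few lines of Python (a sketch using plain numerical quadrature, nothing clever): the 2m-th moment of this distribution comes out as the Catalan number C_{m+1}, which matches the 8th moment in 4 dimensions being 42.

from math import comb, pi, sqrt
from scipy.integrate import quad

def P(s):
    # Dan's density for the distance between two random points on the 3-sphere.
    return s**2 * sqrt(4 - s**2) / pi

def catalan(k):
    return comb(2 * k, k) // (k + 1)

for m in range(6):
    value, _ = quad(lambda s: P(s) * s**(2 * m), 0, 2)
    print(2 * m, round(value, 6), catalan(m + 1))   # 1, 2, 5, 14, 42, 132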

Now if you don’t know about moments of probability distributions you should go read about those, because they’re about a thousand times more important than anything you’ll learn here.

And if you don’t know about Catalan numbers, you should go read about those, because they’re about a thousand times more fun than anything you’ll learn here.

So, I’ll assume you know about those. How did Dan Piponi notice that the Catalan numbers

C_0 = 1, C_1 = 1, C_2 = 2, C_3 = 5, C_4 = 14, C_5 = 42, \dots

were the even moments of this probability distribution? Maybe it’s because he recently managed to get ahold of Richard Stanley’s book on Amazon for just $11 instead of its normal price of $77.

(I don’t know how that happened. Some people write 7’s that look like 1’s, but….)

Anyway, you’ll notice that this strange phenomenon is all about points on the unit sphere in 4 dimensions. It doesn’t seem to involve quaternions anymore! So I asked if something similar happens in other dimensions, maybe giving us other interesting sequences of integers.

Greg Egan figured it out, and got some striking results:

Here d is the dimension of the Euclidean space containing our unit sphere, and Egan is tabulating the nth moment of the probability distribution of distances between two randomly chosen points on that sphere. The gnarly formula on top is a general expression for this moment in terms of the gamma function.

The obvious interesting feature of this table is that only for the d = 2 and d = 4 rows are all the entries integers.

But Dan made another great observation: Greg left out the rather trivial d = 1 row, and all the entries of this row would be integers too! Even better, d = 1, 2, and 4 are the dimensions of the associative normed division algebras: the real numbers, the complex numbers and the quaternions!

This made me eager to find a proof that all the even moments of the probability distribution of distances between points on the unit sphere in \mathbb{R}^d are integers when \mathbb{R}^d is an associative normed division algebra.

The first step is to notice the significance of even moments.

First, we don’t need to choose both points on the sphere randomly: we can fix one and let the other vary. So, we can think of the distance

D(x) = |(x_1, \dots, x_d) - (1, \dots, 0)| = \sqrt{(x_1 - 1)^2 + x_2^2 + \cdots + x_d^2}

as a function on the sphere, or more generally a function of x \in \mathbb{R}^d. And when we do this we instantly notice that the square root is rather obnoxious, but all the even powers of the function D are polynomials on \mathbb{R}^d.

Then, we notice that restricting polynomials from Euclidean space to the sphere is how we get spherical harmonics, so this problem is connected to spherical harmonics and ‘harmonic analysis’. The nth moment of the probability distribution of distances between points on the unit sphere in \mathbb{R}^d is

\int_{S^{d-1}} D^n

where we are integrating with respect to the rotation-invariant probability measure on the sphere. We can rewrite this as an inner product in L^2(S^{d-1}), namely

\langle D^n , 1 \rangle

where 1 is the constant function equal to 1 on the whole sphere.

We’re looking at the even moments, so let n = 2m. Now, why should

\langle D^{2m} , 1 \rangle

be an integer when d = 1, 2 and 4? Well, these are the cases where the sphere S^{d-1} is a group! For d = 1,

S^0 \cong \mathbb{Z}/2

is the multiplicative group of unit real numbers, \{\pm 1\}. For d = 2,

S^1 \cong \mathrm{U}(1)

is the multiplicative group of unit complex numbers. And for d = 4,

S^3 \cong \mathrm{SU}(2)

is the multiplicative group of unit quaternions.

These are compact Lie groups, and L^2 of a compact Lie group is very nice. Any finite-dimensional representation \rho of a compact Lie group G gives a function \chi_\rho \in L^2(G) called its character, given by

\chi_\rho(g) = \mathrm{tr}(\rho(g))

And it’s well-known that for two representations \rho and \sigma, the inner product

\langle \chi_\rho, \chi_\sigma \rangle

is an integer! In fact it’s a natural number: just the dimension of the space of intertwining operators from \rho to \sigma. So, we should try to prove that

\langle D^{2m} , 1 \rangle

is an integer this way. The function 1 is the character of the trivial 1-dimensional representation, so we’re okay there. What about D^{2m}?

Well, there’s a way to take the mth tensor power \rho^{\otimes m} of a representation \rho: you just tensor the representation with itself m times. And then you can easily show

\chi_{\rho^{\otimes m}} = (\chi_\rho)^m

So, if we can show D^2 is the character of a representation, we’re done: D^{2m} will be one as well, and the inner product

\langle D^{2m}, 1 \rangle

will be an integer! Great plan!

Unfortunately, D^2 is not the character of a representation.

Unless \rho is the completely silly 0-dimensional representation we have

\chi_\rho(1) = \mathrm{tr}(\rho(1)) = \dim(\rho) > 0

where 1 is the identity element of G. But suppose we let D(g) be the distance of g from the identity element—the natural choice of ‘north pole’ when we make our sphere into a group. Then we have

D(1)^2 = 0

So D^2 can’t be a character. (It’s definitely not the character of the completely silly 0-dimensional representation: that’s zero.)

But there’s a well-known workaround. We can work with virtual representations, which are formal differences of representations, like this:

\delta = \rho - \sigma

The character of a virtual representation is defined in the obvious way

\chi_\delta = \chi_\rho - \chi_\sigma

Since the inner product of characters of two representations is a natural number, the inner product of characters of two virtual representations will be an integer. And we’ll be completely satisfied if we prove that

\langle D^{2m}, 1 \rangle

is an integer, since it’s obviously ≥ 0.

So, we just need to show that D^{2m} is the character of a virtual representation. This will easily follow if we can show D^2 itself is the character of a virtual representation: you can tensor virtual representations, and then their characters multiply.

So, let’s do it! I’ll just do the quaternionic case. I’m doing it right now, thinking out loud here. I figure I should start with a really easy representation, take its character, compare that to our function D^2, and then fix it by subtracting something.

Let \rho be the spin-1/2 representation of \mathrm{SU}(2), which just sends every matrix in \mathrm{SU}(2) to itself. Every matrix in \mathrm{SU}(2) is conjugate to one of the form

g = \left(\begin{array}{cc} \exp(i\theta) & 0 \\ 0 & \exp(-i\theta) \end{array}\right)

so we can just look at those, and we have

\chi_\rho(g) = \mathrm{tr}(\rho(g)) = \mathrm{tr}(g) = 2 \cos \theta

On the other hand, we can think of g as a unit quaternion, and then

g = \cos \theta + i \sin \theta

where now i stands for the quaternion of that name! So, its distance from 1 is

D(g) = |\cos \theta + i \sin \theta - 1|

and if we square this we get

D(g)^2 = (1 - \cos \theta)^2 + \sin^2 \theta = 2 - 2 \cos \theta

So, we’re pretty close:

D(g)^2 = 2 - \chi_{\rho}

In particular, this means D^2 is the character of the virtual representation

(1 \oplus 1) - \rho

where 1 is the 1d trivial rep and \rho is the spin-1/2 rep.

So we’re done!

At least we're done showing that the even moments of the distance between two randomly chosen points on the 3-sphere are integers. The 1-sphere and 0-sphere cases are similar.
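
If you want a numerical sanity check of this group-theoretic argument: for \mathrm{SU}(2) the function D^{2m} only depends on the angle \theta, and Haar measure pushes forward to (2/\pi)\sin^2\theta \, d\theta on [0,\pi] (the Weyl integration formula), so the even moments become one-dimensional integrals. A quick sketch, for illustration:

from math import comb, cos, pi, sin
from scipy.integrate import quad

def even_moment(m):
    # <D^{2m}, 1> over SU(2): integrate (2 - 2 cos t)^m against the
    # pushforward (2/pi) sin^2 t dt of Haar measure.
    value, _ = quad(lambda t: (2 - 2 * cos(t))**m * (2 / pi) * sin(t)**2, 0, pi)
    return value

for m in range(6):
    catalan = comb(2 * (m + 1), m + 1) // (m + 2)   # C_{m+1}
    print(m, round(even_moment(m), 6), catalan)     # integers: 1, 2, 5, 14, 42, 132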

But of course there's another approach! We can just calculate the darn moments and see what we get. This leads to deeper puzzles, which we have not completely solved. But I'll talk about these next time, in Part 2.

by John Baez at July 10, 2018 10:06 PM

CERN Bulletin

Summer is coming, enjoy our offers for the aquatic parks: Walibi & Aquaparc!

Summer is coming, enjoy our offers for the aquatic parks: Walibi & Aquaparc!

Walibi:

Tickets "Zone terrestre": 25 € instead of 31 €.

Access to Aqualibi: 5 € instead of 8 € on presentation of your Staff Association member ticket.

Free for children under 100 cm, with limited access to the attractions.

Free car park.

*  *  *  *  *  *

Aquaparc:

Full day ticket:

  • Children: 33 CHF instead of 39 CHF
  • Adults: 33 CHF instead of 49 CHF

Free for children under 5.

July 10, 2018 11:07 AM

CERN Bulletin

Interfon

Cooperative open to international civil servants. We welcome you to discover the advantages and discounts negotiated with our suppliers either on our website www.interfon.fr or at our information office located at CERN, on the ground floor of bldg. 504, open Monday through Friday from 12.30 to 15.30.

July 10, 2018 11:07 AM

CERN Bulletin

Pétanque club: results of the Challenge Carteret 2018

Twenty-six players were present on Thursday 5 July 2018 to compete in the Challenge named after our late friend Claude Carteret, which, given the weather, was held at the boulodrome in Saint Genis Pouilly.

Our regulars at the scoring table, father and son Claude and David Jouve, after three sometimes very close games declared our club president Claude Cerruti the winner with three games won, ahead on goal average of Jean-Claude Frot, back among us and as skilful as ever.

Third place went to David Jouve, who combined the duties of player and referee.

The leading woman was Gabrielle Cerrutin, likewise a determined and dedicated player.

The evening ended with a fine meal prepared by Sylvie Jouve and her daughter Jennifer, whom we thank most warmly.

See you at the next competition, the Challenge Patrick Durand, which will take place on Thursday 26 July 2018.

July 10, 2018 10:07 AM

July 09, 2018

CERN Bulletin

Reducing waste in the workplace

Paper, cardboard, PET, aluminium cans, glass, Nespresso capsules, wood and worksite waste: in 2016, CERN produced no less than 5700 tonnes of waste, about 50% of which was recycled. How can we improve on this?

Many measures are already in place at CERN to limit waste and encourage recycling. Several articles have been published to raise awareness among users, CERN staff and visitors on ways to limit our waste: https://home.cern/fr/cern-people/updates/2018/05/much-less-plastic-thats-fantastic

Did you know?

NOVAE restaurants offer a 10 cent discount at the cash register for people who use their own cups/mugs.

Focusing on the ubiquitous and well-loved coffee break, note how much waste can be generated, from the sugar packets to the coffee pods, plastic cutlery and especially disposable plastic or paper cups. In the same way our shopping outings are accompanied by reusable shopping bags, why not bring your own mug or cup for your morning or afternoon coffee? Also think about the packaging of the products you consume (coffee, sugar, biscuits...): favouring larger quantities over single items is often cheaper and, above all, generates less waste.

The Staff Association encourages these initiatives and would like to hear your ideas and environmental concerns. Feel free to contact us by email: staff association@cern.ch or speak directly with your delegates.

July 09, 2018 04:07 PM

CERN Bulletin

Questions about your employment and working conditions at CERN? Contact your nearest staff association representative!

One of the Staff Association's Inform-Action Commission's responsibilities is facilitating direct communication between members of the personnel and the Association.

With the aim of finding an efficient means to identify staff association representatives, the commission worked closely with the SMB department and, using the GIS portal, set up a platform for you to look up your representative and their physical location on the CERN site.

How to find and contact your representatives?

Your delegates are located all over CERN, on the Meyrin and Prevessin sites. Today, by going to the SA website (http://cern.ch/go/7hNM) you can easily locate your nearest delegate.

In one click, various information is provided, such as e-mail address and telephone number, as well as group and department. Additional information is also available by clicking on the "more information" option.

Feel free to meet them!

July 09, 2018 04:07 PM

The n-Category Cafe

Beyond Classical Bayesian Networks

guest post by Pablo Andres-Martinez and Sophie Raynor

In the final installment of the Applied Category Theory seminar, we discussed the 2014 paper “Theory-independent limits on correlations from generalized Bayesian networks” by Henson, Lal and Pusey.

In this post, we’ll give a short introduction to Bayesian networks, explain why quantum mechanics means that one may want to generalise them, and present the main results of the paper. That’s a lot to cover, and there won’t be a huge amount of category theory, but we hope to give the reader some intuition about the issues involved, and another example of monoidal categories used in causal theory.

Introduction

Bayesian networks are a graphical modelling tool used to show how random variables interact. A Bayesian network consists of a pair (G,P) of a directed acyclic graph (DAG) G together with a joint probability distribution P on its nodes, satisfying the Markov condition. Intuitively, the graph describes a flow of information.

The Markov condition says that the system doesn't have memory. That is, the distribution on a given node Y is only dependent on the distributions on the nodes X for which there is an edge X \rightarrow Y. Consider the following chain of binary events. In spring, the pollen in the air may cause someone to have an allergic reaction that may make them sneeze.

[Figure: chain graph Spring → Allergic reaction → Sneezing]

In this case the Markov condition says that given that you know that someone is having an allergic reaction, whether or not it is spring is not going to influence your belief about the likelihood of them sneezing. Which seems sensible.

Bayesian networks are useful

  • as an inference tool, thanks to belief propagation algorithms,

  • and because, given a Bayesian network (G,P), we can describe d-separation properties on G which enable us to discover conditional independences in P.

It is this second point that we’ll be interested in here.

Before getting into the details of the paper, let’s try to motivate this discussion by explaining its title: “Theory-independent limits on correlations from generalized Bayesian networks" and giving a little more background to the problem it aims to solve.

Crudely put, the paper aims to generalise a method that assumes classical mechanics to one that holds in quantum and more general theories.

Classical mechanics rests on two intuitively reasonable and desirable assumptions, together called local causality,

  • Causality:

    Causality is usually treated as a physical primitive. Simply put, it is the principle that there is a (partial) ordering of events in spacetime. In order to have information flow from event A to event B, A must be in the past of B.

    Physicists often define causality in terms of a discarding principle: If we ignore the outcome of a physical process, it doesn’t matter what process has occurred. Or, put another way, the outcome of a physical process doesn’t change the initial conditions.

  • Locality:

    Locality is the assumption that, at any given instant, the values of any particle's properties are independent of those of any other particle. Intuitively, it says that particles are individual entities that can be understood in isolation from any other particle.

    Physicists usually picture particles as having a private list of numbers determining their properties. The principle of locality would be violated if any of the entries of such a list were a function whose domain is another particle’s property values.

In 1935 Einstein, Podolsky and Rosen showed that quantum mechanics (which was a recently born theory) predicted that a pair of particles could be prepared so that applying an action on one of them would instantaneously affect the other, no matter how distant in space they were, thus contradicting local causality. This seemed so unreasonable that the authors presented it as evidence that quantum mechanics was wrong.

But Einstein was wrong. In 1964, John S. Bell laid the groundwork for an experimental test that would demonstrate that Einstein's "spooky action at a distance" (Einstein's own words), now known as entanglement, was indeed real. Bell's experiment has been replicated countless times and has plenty of variations. This video gives a detailed explanation of one of these experiments, for a non-physicist audience.

But then, if acting on a particle has an instantaneous effect on a distant point in space, one of the two principles above is violated: on one hand, if we acted on both particles at the same time, each action being a distinct event, both would be affecting each other's result, so it would not be possible to decide on an ordering; causality would be broken. The other option would be to reject locality: a property's value may be given by a function, so the resulting value may instantaneously change when the distant 'domain' particle is altered. In that case, the particles' information was never separated in space, as they were never truly isolated, so causality is preserved.

Since causality is integral to our understanding of the world and forms the basis of scientific reasoning, the standard interpretation of quantum mechanics is to accept non-locality.

The definition of Bayesian networks implies a discarding principle and hence there is a formal sense in which they are causal (even if, as we shall see, the correlations they model do not always reflect the temporal order). Under this interpretation, the causal theory Bayesian networks describe is classical. Precisely, they can only model probability distributions that satisfy local causality. Hence, in particular, they are not sufficient to model all physical correlations.

The goal of the paper is to develop a framework that generalises Bayesian networks and d-separation results, so that we can still use graph properties to reason about conditional dependence under any given causal theory, be it classical, quantum, or even more general. In particular, this theory will be able to handle all physically observed correlations, and all theoretically postulated correlations.

Though category theory is not mentioned explicitly, the authors achieve their goal by using the categorical framework of operational probabilistic theories (OPTs).

Bayesian networks and d-separation

Consider the situation in which we have three Boolean random variables. Alice is either sneezing or she is not, she either has a fever or she does not, and she may or may not have flu.

Now, flu can cause both sneezing and fever, that is

P(sneezing \ | \ flu) \neq P(sneezing) \ \text{ and likewise } \ P(fever \ | \ flu) \neq P(fever)

so we could represent this graphically as

[Figure: fork graph Flu → Sneezing, Flu → Fever]

Moreover, intuitively we wouldn’t expect there to be any other edges in the above graph. Sneezing and fever, though correlated - each is more likely if Alice has flu - are not direct causes of each other. That is,

P(sneezing \ | \ fever) \neq P(sneezing) \ \text{ but } \ P(sneezing \ | \ fever, \ flu) = P(sneezing \ | \ flu).

Bayesian networks

Let G be a directed acyclic graph, or DAG. (Here a directed graph is a presheaf on \bullet \rightrightarrows \bullet.)

The set Pa(Y) of parents of a node Y of G contains those nodes X of G such that there is a directed edge X \to Y.

So, in the example above Pa(flu) = \emptyset while Pa(fever) = Pa(sneezing) = \{ flu \}.

To each node X of a directed graph G, we may associate a random variable, also denoted X. If V is the set of nodes of G and (x_X)_{X \in V} is a choice of value x_X for each node X, such that y is the chosen value for Y, then pa(y) will denote the Pa(Y)-tuple of values (x_X)_{X \in Pa(Y)}.

To define Bayesian networks, and establish the notation, let’s revise some probability basics.

Let P(x,y \ | \ z) mean P(X = x \text{ and } Y = y \ | \ Z = z), the probability that X has the value x and Y has the value y, given that Z has the value z. Recall that this is given by

P(x,y \ | \ z) = \frac{ P(x,y,z) }{P(z)}.

The chain rule says that, given a value x of X and sets of values \Omega, \Lambda of other random variables,

P(x, \Omega \ | \ \Lambda) = P(x \ | \ \Lambda) P(\Omega \ | \ x, \Lambda).

Random variables X and Y are said to be conditionally independent given Z, written X \perp\!\!\!\!\!\!\!\perp Y \ | \ Z, if for all values x of X, y of Y and z of Z

P(x,y \ | \ z) = P(x \ | \ z) P(y \ | \ z).

By the chain rule this is equivalent to

P(x \ | \ y,z) = P(x \ | \ z), \ \forall x, y, z.

More generally, we may replace X, Y and Z with sets of random variables. So, in the special case that Z is empty, X and Y are independent if and only if P(x, y) = P(x)P(y) for all x, y.

Markov condition

A joint probability distribution P on the nodes of a DAG G is said to satisfy the Markov condition if for any set of random variables \{X_i\}_{i = 1}^n on the nodes of G, with choice of values \{x_i\}_{i = 1}^n,

P(x_1, \dots, x_n) = \prod_{i = 1}^n P(x_i \ | \ pa(x_i)).

So, for the flu, fever and sneezing example above, a distribution <semantics>P<annotation encoding="application/x-tex">P</annotation></semantics> satisfies the Markov condition if

P(flu, \ fever, \ sneezing) = P(fever \ | \ flu) P(sneezing \ | \ flu) P(flu).

A Bayesian network is defined as a pair (G,P) of a DAG G and a joint probability distribution P on the nodes of G that satisfies the Markov condition with respect to G. This means that each node in a Bayesian network is conditionally independent, given its parents, of its non-descendants.

In particular, given a Bayesian network (G,P) such that there is a directed edge X \to Y, the Markov condition implies that

\sum_{y} P(x,y) = \sum_y P(x) P(y \ | \ x) = P(x) \sum_y P(y \ | \ x) = P(x)

which may be interpreted as a discard condition. (The ordering is reflected by the fact that we can't derive P(y) from \sum_{x} P(x,y) = \sum_x P(x) P(y \ | \ x).)
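
To make the Markov factorisation concrete, here is a tiny Python sketch (the numbers in the conditional probability tables are made up) that builds the joint distribution for the flu/fever/sneezing fork and checks the conditional independence it encodes:

from itertools import product

# Made-up conditional probability tables for the fork Flu -> Fever, Flu -> Sneezing.
P_flu    = {True: 0.1, False: 0.9}
P_fever  = {True: {True: 0.8, False: 0.2}, False: {True: 0.05, False: 0.95}}  # P(fever | flu)
P_sneeze = {True: {True: 0.7, False: 0.3}, False: {True: 0.2, False: 0.8}}    # P(sneezing | flu)

# Joint distribution from the Markov factorisation
# P(flu, fever, sneezing) = P(fever | flu) P(sneezing | flu) P(flu).
joint = {(f, v, s): P_fever[f][v] * P_sneeze[f][s] * P_flu[f]
         for f, v, s in product([True, False], repeat=3)}

def prob(flu=None, fever=None, sneeze=None):
    # Marginal probability of a partial assignment.
    return sum(p for (f, v, s), p in joint.items()
               if (flu is None or f == flu)
               and (fever is None or v == fever)
               and (sneeze is None or s == sneeze))

# Sneezing is conditionally independent of fever given flu...
print(prob(sneeze=True, fever=True, flu=True) / prob(fever=True, flu=True))   # 0.7
print(prob(sneeze=True, flu=True) / prob(flu=True))                           # 0.7
# ...but sneezing and fever are still correlated unconditionally.
print(prob(sneeze=True, fever=True), prob(sneeze=True) * prob(fever=True))    # 0.065 vs 0.03125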

Let’s consider some simple examples.

Fork

In the example of flu, sneezing and fever above, the graph has a fork shape. For a probability distribution P to satisfy the Markov condition for this graph we must have

P(x, y, z) = P(x \ | \ z) P(y \ | \ z) P(z), \ \forall x, y, z.

However, P(x,y) \neq P(x) P(y).

In other words, X \perp\!\!\!\!\!\!\!\perp Y \ | \ Z, though X and Y are not independent. This makes sense: we wouldn't expect sneezing and fever to be uncorrelated, but given that we know whether or not Alice has flu, telling us that she has a fever isn't going to tell us anything about her sneezing.

Collider

Reversing the arrows in the fork graph above gives a collider as in the following example.

[Figure: collider graph, two independent causes with Allergic reaction as their common effect]

Clearly whether or not Alice has allergies other than hayfever is independent of what season it is. So we'd expect a distribution on this graph to satisfy X \perp\!\!\!\!\!\!\!\perp Y \ | \ \emptyset. However, if we know that Alice is having an allergic reaction, and it happens to be spring, we will likely assume that she has some allergy, i.e. X and Y are not conditionally independent given Z.

Indeed, the Markov condition and chain rule for this graph give us X \perp\!\!\!\!\!\!\!\perp Y \ | \ \emptyset:

P(x, y, z) = P(x)P(y) P(z \ | \ x, y) = P(z \ | \ x, y) P(x \ | \ y) P(y), \ \forall x, y, z.

from which we cannot derive P(x \ | \ z) P(y \ | \ z) = P(x,y \ | \ z). (However, it could still be true for some particular choice of probability distribution.)
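
A concrete toy distribution shows this 'explaining away' effect (the numbers are invented for illustration, not taken from the paper): let X and Y be independent fair coins and let Z record whether at least one of them came up heads. Then X and Y are independent, but they become dependent once Z is known:

from itertools import product

# Collider X -> Z <- Y with X, Y independent fair coins and Z = X or Y.
joint = {(x, y, x or y): 0.25 for x, y in product([True, False], repeat=2)}

def prob(pred):
    return sum(p for outcome, p in joint.items() if pred(*outcome))

# Unconditionally, X and Y are independent:
print(prob(lambda x, y, z: x and y),
      prob(lambda x, y, z: x) * prob(lambda x, y, z: y))    # 0.25 and 0.25

# Conditioned on Z = True, they are not:
pz = prob(lambda x, y, z: z)
print(prob(lambda x, y, z: x and y and z) / pz)                                     # 1/3
print((prob(lambda x, y, z: x and z) / pz) * (prob(lambda x, y, z: y and z) / pz))  # 4/9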

Chain

Finally, let us return to the chain of correlations presented in the introduction.

Clearly the probabilities that it is spring and that Alice is sneezing are not independent, and indeed, we cannot derive P(x, y) = P(x) P(y). However, observe that, by the chain rule, a Markov distribution on the chain graph must satisfy X \perp\!\!\!\!\!\!\!\perp Y \ | \ Z. If we know Alice is having an allergic reaction that is not hayfever, whether or not she is sneezing is not going to affect our guess as to what season it is.

Crucially, in this case, knowing the season is also not going to affect whether we think Alice is sneezing. By definition, conditional independence of X and Y given Z is symmetric in X and Y. In other words, a joint distribution P on the variables X, Y, Z satisfies the Markov condition with respect to the chain graph

X \longrightarrow Z \longrightarrow Y

if and only if P satisfies the Markov condition on

Y \longrightarrow Z \longrightarrow X.

d-separation

The above observations can be generalised to statements about conditional independences in any Bayesian network. That is, if (G,P) is a Bayesian network then the structure of G is enough to derive all the conditional independences in P that are implied by the graph G (in reality there may be more that have not been included in the network!).

Given a DAG G and a set of vertices U of G, let m(U) denote the union of U with all the vertices v of G such that there is a directed edge from U to v. The set W(U) will denote the non-inclusive future of U, that is, the set of vertices v of G for which there is no directed (possibly trivial) path from v to U.

For a graph G, let X, Y, Z now denote disjoint subsets of the vertices of G (and their corresponding random variables). Set W := W(X \cup Y \cup Z).

Then X and Y are said to be d-separated by Z, written X \perp Y \ | \ Z, if there is a partition \{U, V, W, Z\} of the nodes of G such that

  • X \subseteq U and Y \subseteq V, and

  • m(U) \cap m(V) \subseteq W, in other words U and V have no direct influence on each other.

(This is lemma 19 in the paper.)
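
For the curious, here is a brute-force Python sketch of this partition criterion (written for illustration, not taken from the paper), tested on the chain and collider graphs from the examples above:

from itertools import product

def children(edges, u):
    return {v for (a, v) in edges if a == u}

def m(edges, U):
    # U together with all direct children of nodes in U.
    return set(U).union(*(children(edges, u) for u in U)) if U else set()

def can_reach(edges, v, targets):
    # Is there a directed (possibly trivial) path from v into `targets`?
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        if u in targets:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(children(edges, u))
    return False

def d_separated(nodes, edges, X, Y, Z):
    # X ⊥ Y | Z via the partition criterion above, by brute force over partitions.
    X, Y, Z = set(X), set(Y), set(Z)
    W = {v for v in nodes if not can_reach(edges, v, X | Y | Z)}   # the 'non-inclusive future'
    rest = sorted(set(nodes) - X - Y - Z - W)
    for choice in product([True, False], repeat=len(rest)):
        U = X | {v for v, in_U in zip(rest, choice) if in_U}
        V = Y | (set(rest) - U)
        if m(edges, U) & m(edges, V) <= W:
            return True
    return False

nodes = ["X", "Y", "Z"]
chain    = {("X", "Z"), ("Z", "Y")}   # X -> Z -> Y
collider = {("X", "Z"), ("Y", "Z")}   # X -> Z <- Y
print(d_separated(nodes, chain,    {"X"}, {"Y"}, {"Z"}))   # True
print(d_separated(nodes, chain,    {"X"}, {"Y"}, set()))   # False
print(d_separated(nodes, collider, {"X"}, {"Y"}, set()))   # True
print(d_separated(nodes, collider, {"X"}, {"Y"}, {"Z"}))   # False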

Now d-separation is really useful since it tells us everything there is to know about the conditional dependences on Bayesian networks with underlying graph G. Indeed,

Theorem 5

  • Soundness of d-separation (Verma and Pearl, 1988): If P is a Markov distribution with respect to a graph G then for all disjoint subsets X, Y, Z of nodes of G, X \perp Y \ | \ Z implies that X \perp\!\!\!\!\!\!\!\perp Y \ | \ Z.

  • Completeness of d-separation (Meek, 1995): If X \perp\!\!\!\!\!\!\!\perp Y \ | \ Z for all P Markov with respect to G, then X \perp Y \ | \ Z.

We can combine the previous examples of fork, collider and chain graphs to get the following

[Figure: combined graph built from the fork, collider and chain examples]

A priori, Allergic reaction is conditionally independent of Fever. Indeed, we have the partition

[Figure: a partition of the nodes witnessing the d-separation]

which clearly satisfies d-separation. However, if Sneezing is known then W = \emptyset, so Allergic reaction and Fever are not independent. Indeed, if we use the same sets U and V as before, then m(U) \cap m(V) = \{ Sneezing \}, so the condition for d-separation fails; and it does for any possible choice of U and V. Interestingly, if Flu is also known, we again obtain conditional independence between Allergic reaction and Fever, as shown below.

[Figure: the partition after conditioning on Flu as well]

Before describing the limitations of this setup and why we may want to generalise it, it is worth observing that Theorem 5 is genuinely useful computationally. Theorem 5 says that given a Bayesian network (G,P), the structure of G gives us a recipe to factor P, thereby greatly increasing computation efficiency for Bayesian inference.

Latent variables, hidden variables, and unobservables

In the context of Bayesian networks, there are two reasons that we may wish to add variables to a probabilistic model, even if we are not entirely sure what the variables signify or how they are distributed. The first reason is statistical and the second is physical.

Consider the example of flu, fever and sneezing discussed earlier. Although our analysis told us Fever \perp\!\!\!\!\!\!\!\perp Sneezing \ | \ Flu, if we conduct an experiment we are likely to find:

P(fever \ | \ sneezing, \ flu) \neq P(fever \ | \ flu).

The problem is that the graph does not model reality properly, only a simplification of it. After all, there are a whole bunch of things that can cause sneezing and flu. We just don't know what they all are or how to measure them. So, to make the network work, we may add a hypothetical latent variable that bunches together all the unknown joint causes, and equip it with a distribution that makes the whole network Bayesian, so that we are still able to perform inference methods like belief propagation.

[Figure: the network with a latent variable added as a joint cause]

On the other hand, we may want to add variables to a Bayesian network if we have evidence that doing so will provide a better model of reality.

For example, consider the network with just two connected nodes

[Figure: two connected nodes, Road wet and Grass wet]

Every distribution on this graph is Markov, and we would expect there to be a correlation between a road being wet and the grass next to it being wet as well, but most people would claim that there’s something missing from the picture. After all, rain could be a ‘common cause’ of the road and the grass being wet. So, it makes sense to add a third variable.

But maybe we can’t observe whether it has rained or not, only whether the grass and/or road are wet. Nonetheless, the correlation we observe suggests that they have a common cause. To deal with such cases, we could make the third variable hidden. We may not know what information is included in a hidden variable, nor its probability distribution.

All that matters is that the hidden variable helps to explain the observed correlations.

[Figure: a hidden common cause added for Road wet and Grass wet]

So, latent variables are a statistical tool that ensure the Markov condition holds. Hence they are inherently classical, and can, in theory, be known. But the universe is not classical, so, even if we lump whatever we want into as many classical hidden variables as we want and put them wherever we need, in some cases, there will still be empirically observed correlations that do not satisfy the Markov condition.

Most famously, Bell's experiment shows that it is possible to have distinct variables A and B that exhibit correlations that cannot be explained by any classical hidden variable, since classical variables are restricted by the principle of locality.

In other words, though A \perp B \ | \ \Lambda,

P(a \ | \ b, \lambda) \neq P(a \ | \ \lambda).

Implicitly, this means that a classical \Lambda is not enough. If we want P(a \ | \ b, \lambda) \neq P(a \ | \ \lambda) to hold, \Lambda must be a non-local (non-classical) variable. Quantum mechanics implies that we can't possibly empirically find the value of a non-local variable (for reasons similar to Heisenberg's uncertainty principle), so non-classical variables are often called unobservables. In particular, it is irrelevant to ask whether A \perp\!\!\!\!\!\!\!\perp B \ | \ \Lambda, as we would need to know the value of \Lambda in order to condition over it.

Indeed, this is the key idea behind what follows. We declare certain variables to be unobservable and then insist that conditional (in)dependence only makes sense between observable variables conditioned over observable variables.

Generalising classical causality

The correlations observed in the Bell experiment can be explained by quantum mechanics. But thought experiments such as the one described here suggest that theoretically, correlations may exist that violate even quantum causality.

So, given that graphical models and d-separation provide such a powerful tool for causal reasoning in the classical context, how can we generalise the Markov condition and Theorem 5 to quantum, and even more general causal theories? And, if we have a theory-independent Markov condition, are there d-separation results that don’t correspond to any given causal theory?

Clearly the first step in answering these questions is to fix a definition of a causal theory.

Operational probabilistic theories

An operational theory is a symmetric monoidal category (\mathsf{C}, \otimes, I) whose objects are known as systems or resources. Morphisms are finite sets f = \{\mathcal{C}_i\}_{i \in I} called tests, whose elements are called outcomes. Tests with a single element are called deterministic, and for each system A \in ob(\mathsf{C}), the identity id_A \in \mathsf{C}(A,A) is a deterministic test.

In this discussion, we'll identify tests \{\mathcal{C}_i\}_i, \{\mathcal{D}_j\}_j in \mathsf{C} if we may always replace one with the other without affecting the distributions in \mathsf{C}(I, I).

Given \{\mathcal{C}_i\}_i \in \mathsf{C}(B, C) and \{\mathcal{D}_j\}_j \in \mathsf{C}(A, B), their composition f \circ g is given by

\{\mathcal{C}_i \circ \mathcal{D}_j\}_{i,j} \in \mathsf{C}(A, C).

First apply \mathcal{D} with output B, then apply \mathcal{C} with outcome C.

The monoidal composition \(\{ \mathcal{C}_i \otimes \mathcal{D}_j \}_{i, j} \in \mathsf{C}(A \otimes C, B \otimes D)\) corresponds to applying \(\{\mathcal{C}_i\}_i \in \mathsf{C}(A,B)\) and \(\{ \mathcal{D}_j \}_j \in \mathsf{C}(C,D)\) separately on \(A\) and \(C\).
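
As a toy illustration of these two compositions (my own sketch, not from the paper), one can model a test in the classical probabilistic theory discussed below as a list of non-negative matrices, one per outcome, and implement sequential and monoidal composition directly; the function names are invented for the example.

```python
import numpy as np

# A test A -> B in the classical theory is a list of non-negative matrices,
# one (dim B x dim A) matrix per outcome.

def compose_seq(C, D):
    """Sequential composition {C_i o D_j}_{i,j} of tests D: A -> B and C: B -> C'."""
    return [Ci @ Dj for Ci in C for Dj in D]

def compose_par(C, D):
    """Monoidal composition {C_i (x) D_j}_{i,j}, applying C and D side by side."""
    return [np.kron(Ci, Dj) for Ci in C for Dj in D]

# Example: a fair-coin preparation I -> 2 followed by a two-outcome measurement 2 -> I.
prepare = [np.array([[0.5], [0.5]])]             # single outcome: a probability vector
measure = [np.array([[1.0, 0.0]]),               # outcome "0"
           np.array([[0.0, 1.0]])]               # outcome "1"

coin_flip = compose_seq(measure, prepare)        # a test I -> I, i.e. a distribution
print([p.item() for p in coin_flip])             # -> [0.5, 0.5]
```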

An operational probabilistic theory or OPT is an operational theory such that every test \(I \to I\) is a probability distribution.

A morphism \(\{ \mathcal{C}_i \}_i \in \mathsf{C}(A, I)\) is called an effect on \(A\). An OPT \(\mathsf{C}\) is called causal or a causal theory if, for each system \(A \in ob(\mathsf{C})\), there is a unique deterministic effect \(\top_A \in \mathsf{C}(A, I)\) which we call the discard of \(A\).

In particular, for a causal OPT \(\mathsf{C}\), uniqueness of the discard implies that, for all systems \(A, B \in ob(\mathsf{C})\),

\[\top_A \otimes \top_B = \top_{A \otimes B},\] and, given any deterministic test \(\mathcal{C} \in \mathsf{C}(A, B)\),

\[\top_B \circ \mathcal{C} = \top_A.\]

The existence of a discard map allows a definition of causal morphisms in a causal theory. For example, as we saw in January when we discussed Kissinger and Uijlen’s paper, a test \(\{ \mathcal{C}_i \}_i \in \mathsf{C}(A, B)\) is causal if

\[\top_B \circ \{ \mathcal{C}_i \}_i = \top_A \in \mathsf{C}(A, I).\]

In other words, for a causal test, discarding the outcome is the same as not performing the test. Intuitively it is not obvious why such morphisms should be called causal. But this definition enables the formulation of a non-signalling condition that spells out when cause-effect correlations are excluded; in particular, it implies the impossibility of time travel.

Examples

The category \(Mat(\mathbb{R}_+)\), whose objects are natural numbers and whose hom-set \(Mat(\mathbb{R}_+)(m,n)\) is the set of \(n \times m\) matrices with entries in \(\mathbb{R}_+\), has the structure of a causal OPT. The causal morphisms in \(Mat(\mathbb{R}_+)\) are the stochastic maps (the matrices whose columns sum to 1). This category describes classical probability theory.
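
Here is a tiny numerical check of the discard structure in \(Mat(\mathbb{R}_+)\) (my own sketch): the deterministic effect on an \(n\)-dimensional classical system is the all-ones row vector, and \(\top_B \circ \mathcal{C} = \top_A\) holds exactly when the columns of \(\mathcal{C}\) sum to 1.

```python
import numpy as np

def discard(dim):
    """The unique deterministic effect on a dim-dimensional classical system:
    the all-ones row vector, i.e. 'sum over all outcomes'."""
    return np.ones((1, dim))

def is_causal(C):
    """C: A -> B is causal iff discard(B) @ C == discard(A),
    i.e. iff C is a stochastic matrix (columns summing to 1)."""
    dim_b, dim_a = C.shape
    return np.allclose(discard(dim_b) @ C, discard(dim_a))

stochastic     = np.array([[0.9, 0.2],
                           [0.1, 0.8]])   # both columns sum to 1
sub_stochastic = np.array([[0.9, 0.2],
                           [0.0, 0.8]])   # first column sums to 0.9

print(is_causal(stochastic))       # True
print(is_causal(sub_stochastic))   # False: it 'loses' probability
```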

The category \(\mathsf{CPM}\) of sets of linear operators on Hilbert spaces and completely positive maps between them is an OPT and describes quantum theory. The causal morphisms are the trace-preserving completely positive maps.

Finally, Boxworld is the theory in which any correlation between two variables can be explained by some resource of the theory in their past.

Generalised Bayesian networks

So, we’re finally ready to give the main construction and results of the paper. As mentioned before, to get a generalised d-separation result, the idea is that we will distinguish observable and unobservable variables, and simply insist that conditional independence is only defined relative to observable variables.

To this end, a generalised DAG or GDAG is a DAG \(G\) together with a partition of the nodes of \(G\) into two subsets called observed and unobserved. We’ll represent observed nodes by triangles, and unobserved nodes by circles. An edge out of an (un)observed node will be called (un)observed and represented by a (solid) dashed arrow.

In order to get a generalisation of Theorem 5, we still need to come up with a sensible generalisation of the Markov property which will essentially say that at an observed node that has only observed parents, the distribution must be Markov. However, if an observed node has an unobserved parent, the latter’s whole history is needed to describe the distribution.

To state this precisely, we will associate a causal theory \((\mathsf{C}, \otimes, I)\) to a GDAG \(G\) via an assignment of systems to edges of \(G\) and tests to nodes of \(G\), such that the observed edges of \(G\) will ‘carry’ only the outcomes of classical tests (so will say something about conditional probability) whereas unobserved edges will carry only the output system.

Precisely, such an assignment \(P\) satisfies the generalised Markov condition (GMC) and is called a generalised Markov distribution if

  • Each unobserved edge corresponds to a distinct system in the theory.

  • If we can’t observe what is happening at a node, we can’t condition over it: To each unobserved node and each value of its observed parents, we assign a deterministic test from the system defined by the product of its incoming (unobserved) edges to the system defined by the product of its outgoing (unobserved) edges.

  • Each observed node \(X\) is an observation test, i.e. a morphism in \(\mathsf{C}(A, I)\) for the system \(A \in ob(\mathsf{C})\) corresponding to the product of the systems assigned to the unobserved input edges of \(X\). Since \(\mathsf{C}\) is a causal theory, this says that \(X\) is assigned a classical random variable, also denoted \(X\), and that if \(Y\) is an observed node, and has observed parent \(X\), the distribution at \(Y\) is conditionally dependent on the distribution at \(X\) (see here for details).

  • It therefore follows that each observed edge is assigned the trivial system \(I\).

  • The joint probability distribution on the observed nodes of \(G\) is given by the morphism in \(\mathsf{C}(I, I)\) that results from these assignments.

A generalised Bayesian network consists of a GDAG \(G\) together with a generalised Markov distribution \(P\) on \(G\).

Example

Consider the following GDAG

[Figure: the example GDAG]

Let’s build its OPT morphism as indicated by the generalised Markov condition.

The observed node \(X\) has no incoming edges so it corresponds to a \(\mathsf{C}(I, I)\) morphism, and thus we assign a probability distribution to it.

The unobserved node \(A\) depends on \(X\), and has no unobserved inputs, so we assign a deterministic test \(A(x): I \to A\) for each value \(x\) of \(X\).

[Figure: the assignments to X and A so far]

The observed node \(Y\) has one incoming unobserved edge and no incoming observed edges so we assign to it a test \(Y: A \to I\) such that, for each value \(x\) of \(X\), \(Y \circ A(x)\) is a probability distribution.
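
To see what these assignments look like concretely, here is a hypothetical classical (\(Mat(\mathbb{R}_+)\)) instance of the fragment built so far: a distribution for \(X\), a deterministic preparation \(A(x)\) of the unobserved system for each \(x\), and an observation test at \(Y\). The numbers are made up; the point is only how the pieces compose.

```python
import numpy as np

# Distribution assigned to the observed root node X (values x = 0, 1).
P_X = np.array([0.3, 0.7])

# For each value x of X, a deterministic test A(x): I -> A preparing the
# unobserved (here 2-dimensional classical) system A.
A_of_x = {0: np.array([[0.9], [0.1]]),
          1: np.array([[0.4], [0.6]])}

# Observation test at Y: one effect (row vector) per outcome y = 0, 1.
Y_effects = {0: np.array([[1.0, 0.2]]),
             1: np.array([[0.0, 0.8]])}

# For each x, Y o A(x) is a distribution over y, so the joint distribution on
# the observed nodes X and Y is P(x, y) = P(x) * (Y_y o A(x)).
P_XY = {(x, y): float(P_X[x] * (Y_effects[y] @ A_of_x[x]).item())
        for x in (0, 1) for y in (0, 1)}

print(P_XY)
print(sum(P_XY.values()))   # sums to 1.0, as it should
```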

Building up the rest of the picture gives an OPT diagram of the form

[Figure: the completed OPT diagram for the example GDAG]

which is a \(\mathsf{C}(I, I)\) morphism that defines the joint probability distribution \(P(x,y,z,w)\). We now have all the ingredients to state Theorem 22, the generalised d-separation theorem. This is the analogue of Theorem 5 for generalised Markov distributions.

Theorem 22

Given a GDAG \(G\) and subsets \(X, Y, Z\) of observed nodes

  • if a probability distribution \(P\) is generalised Markov relative to \(G\) then \(X \perp Y \mid Z \Rightarrow X \perp\!\!\!\perp Y \mid Z\).

  • If \(X \perp\!\!\!\perp Y \mid Z\) holds for all generalised Markov probability distributions on \(G\), then \(X \perp Y \mid Z\).

Note in particular that there is no change in the definition of d-separation: d-separation of a GDAG \(G\) is simply d-separation with respect to its underlying DAG. There is also no change in the definition of conditional independence. Now, however, we restrict to statements of conditional independence with respect to observed nodes only. This enables the generalised soundness and completeness statements of the theorem.

The proof of soundness uses uniqueness of discarding, and completeness follows since generalised Markov is a stronger condition on a distribution than classically Markov.

Classical distributions on GDAGs

Theorem 22 is all well and good. But does it really generalise the classical case? That is, can we recover Theorem 5 for all classical Bayesian networks from Theorem 22?

As a first step, Proposition 17 states that if all the nodes of a generalised Bayesian network are observed, then it is a classical Bayesian network. In fact, this follows pretty immediately from the definitions.

Moreover, it is easily checked that, given a classical Bayesian network, even if it has hidden or latent variables, it can still be expressed directly as a generalised Bayesian network with no unobserved nodes.

In fact, Theorem 22 generalises Theorem 5 in a stricter sense. That is, the generalised Bayesian network setup together with classical causality adds nothing extra to the theory of classical Bayesian networks. If a generalised Markov distribution is classical (so that hidden and latent variables may be represented by unobserved nodes), it can be viewed as a classical Bayesian network. More precisely, Lemma 18 says that, given any generalised Bayesian network \((G,P)\) with underlying DAG \(G'\) and distribution \(P \in \mathcal{C}\), we can construct a classical Bayesian network \((G', P')\) such that \(P'\) agrees with \(P\) on the observed nodes.

It is worth voicing a note of caution. The authors themselves mention in the conclusion that the construction based on GDAGs with two types of nodes is not entirely satisfactory. The problem is that, although the setup and results presented here do give a generalisation of Theorem 5, they do not, as such, provide a way of generalising Bayesian networks as they are used for probabilistic inference to non-classical settings. For example, belief propagation works through observed nodes, but there is no apparent way of generalising it to unobserved nodes.

Theory independence

More generally, given a GDAG \(G\), we can look at the set of distributions on \(G\) that are generalised Markov with respect to a given causal theory. Of particular importance are the following.

  • The set \(\mathcal{C}\) of generalised Markov distributions in \(Mat(\mathbb{R}_+)\) on \(G\).

  • The set \(\mathcal{Q}\) of generalised Markov distributions in \(\mathsf{CPM}\) on \(G\).

  • The set \(\mathcal{G}\) of all generalised Markov distributions on \(G\). (This is the set of generalised Markov distributions in Boxworld.)

Moreover, we can distinguish another class of distributions on \(G\): the set \(\mathcal{I}\) of distributions on the observed nodes that satisfy all the conditional independences between observed nodes implied by d-separation in the graph, whether or not they are generalised Markov with respect to any theory. Theorem 22 implies, in particular, that \(\mathcal{G} \subseteq \mathcal{I}\).

And so, since \(Mat(\mathbb{R}_+)\) embeds into \(\mathsf{CPM}\), we have \(\mathcal{C} \subseteq \mathcal{Q} \subseteq \mathcal{G} \subseteq \mathcal{I}\).

This means that one can ask for which graphs (some or all of) these inclusions are strict, and the last part of the paper explores these questions. In the original paper, a sufficient condition is given for graphs to satisfy \(\mathcal{C} \neq \mathcal{I}\), i.e. for these graphs it is guaranteed that the causal structure admits correlations that are non-local. Moreover, the authors show that their condition is necessary for small enough graphs.

Another interesting result is that there exist graphs for which \(\mathcal{G} \neq \mathcal{I}\). This means that using a theory of resources, whatever theory it may be, to explain correlations imposes constraints that are stronger than those imposed by the relations themselves.

What next?

This setup represents one direction for using category theory to generalise Bayesian networks. In our group work at the ACT workshop, we considered another generalisation of Bayesian networks, this time staying within the classical realm. Namely, building on the work of Bonchi, Gadducci, Kissinger, Sobocinski, and Zanasi, we gave a functorial Markov condition on directed graphs admitting cycles. Hopefully we’ll present this work here soon.

by john (baez@math.ucr.edu) at July 09, 2018 03:58 PM

Lubos Motl - string vacua and pheno

Spin correlations at ATLAS: tops deviate by 3.2 or 3.7 sigma
After some time, we saw an LHC preprint with an intriguing deviation from the Standard Model predictions. It appeared in the preprint
Measurements of top-quark pair spin correlations in the \(e\mu\) channel at \(\sqrt s = 13\TeV\) using \(pp\) collisions in the ATLAS detector
You should also see a 27-page-long presentation by Reinhild Peters.




To make the story short, the measured correlation between the top quark spins – in events with one \(e^\pm\) and one oppositely charged \(\mu^\mp\) at the end and a top quark pair in the middle – exceeds the theoretical prediction by 3.7 standard deviations if you pretend that the theoretical prediction is exact, or 3.2 sigma if you choose some sensible nonzero error margin for the theoretical prediction.

The probability that a deviation of this size appears by chance is comparable to 1 in 1,000.
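
If you want to check the rough size of these numbers yourself, a one-sided Gaussian tail estimate is enough (this ignores any look-elsewhere effect):

```python
from scipy.stats import norm

# One-sided Gaussian tail probability for the quoted deviations.
for n_sigma in (3.2, 3.7):
    p = norm.sf(n_sigma)   # survival function: P(Z > n_sigma)
    print(f"{n_sigma} sigma -> p = {p:.1e} (about 1 in {1/p:,.0f})")
```

This gives roughly \(7\times 10^{-4}\) for 3.2 sigma (the "1 in 1,000" level quoted above) and roughly \(1\times 10^{-4}\) for 3.7 sigma.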




It may be a fluke – after all, ATLAS and CMS have measured a thousand similar numbers, so one of them may show a deviation that looks like a "one in one thousand" case. As always, there's some possibility that the top quarks' spin correlation is enhanced by some physics beyond the Standard Model. It could be many things; I have no idea what the default explanation should be. If the top quarks sometimes came from some new spinless or spin-one intermediate particles, you could move the spin correlation up or down, respectively.



The LHC (Les Horribles Cernettes) girls have sung a song about the spins of quarks. You are invited to listen to the song, measure the correlations yourself, and determine whether the deviation from the Standard Model is exciting enough.

by Luboš Motl (noreply@blogger.com) at July 09, 2018 08:27 AM

July 08, 2018

Marco Frasca - The Gauge Connection

ICHEP 2018

The great high-energy physics conference ICHEP 2018 is over and, as usual, I spend some words about it. The big collaborations at CERN presented their latest results. I think the most relevant of these is the evidence (\(3\sigma\)) that the Standard Model is at odds with the measurement of the spin correlation between top-antitop quark pairs. More is given in the ATLAS communication. As expected, increasing precision proves to be rewarding.

About the Higgs particle, after the important announcement of the existence of the ttH process, both ATLAS and CMS are further improving their precision. For the signal strength they give the following results. For ATLAS (see here)

\[\mu=1.13\pm 0.05({\rm stat.})\pm 0.05({\rm exp.})^{+0.05}_{-0.04}({\rm sig.\ th.})\pm 0.03({\rm bkg.\ th.})\]

and CMS (see here)

\[\mu=1.17\pm 0.06({\rm stat.})^{+0.06}_{-0.05}({\rm sig.\ th.})\pm 0.06({\rm other\ syst.}).\]

The news is that the errors have shrunk and the two results agree. They show a small tension, 13% and 17% respectively, but the overall result is consistent with the Standard Model.
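
A quick back-of-the-envelope check of how big these tensions are (my own estimate: I symmetrise the asymmetric errors, add the components in quadrature, and measure the distance of \(\mu\) from 1):

```python
import math

def tension(mu, errors):
    """Deviation of the measured signal strength from mu = 1, in units of the
    total uncertainty (error components combined in quadrature)."""
    total = math.sqrt(sum(e * e for e in errors))
    return (mu - 1.0) / total

# Symmetrised error components as quoted above.
print(f"ATLAS: {tension(1.13, [0.05, 0.05, 0.045, 0.03]):.1f} sigma")
print(f"CMS:   {tension(1.17, [0.06, 0.055, 0.06]):.1f} sigma")
```

This gives roughly 1.5 sigma for ATLAS and 1.7 sigma for CMS, which is what "consistent with the Standard Model" means here in practice.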

When the different contributions are unpacked into the respective contributions due to different processes, CMS claims some tension in the WW decay that should be kept under scrutiny in the future (see here). They presented the results from \(35.9\,{\rm fb}^{-1}\) of data and so, for the moment, there is no significant improvement with respect to the Moriond conference this year. The situation is rather better for the ZZ decay, where no tension appears and the agreement with the Standard Model is there in all its glory (see here). Things are quite different, but not too much, for ATLAS, as in this case they observe some tensions, but these are all below \(2\sigma\) (see here). For the WW decay, ATLAS does not see anything above \(1\sigma\) (see here).

So, although there is something to keep under attention as the data increase (the dataset will reach \(100\,{\rm fb}^{-1}\) this year), the Standard Model is in good health with respect to the Higgs sector, even if there is a lot to be answered yet and precision measurements are the main tool. The correlation in the tt pair is absolutely promising and we should hope this will be confirmed as a discovery.

 

by mfrasca at July 08, 2018 10:58 AM

July 04, 2018

The n-Category Cafe

Symposium on Compositional Structures

There’s a new conference series, whose acronym is pronounced “psycho”. It’s part of the new trend toward the study of “compositionality” in many branches of thought, often but not always using category theory:

  • First Symposium on Compositional Structures (SYCO1), School of Computer Science, University of Birmingham, 20-21 September, 2018. Organized by Ross Duncan, Chris Heunen, Aleks Kissinger, Samuel Mimram, Simona Paoli, Mehrnoosh Sadrzadeh, Pawel Sobocinski and Jamie Vicary.

The Symposium on Compositional Structures is a new interdisciplinary series of meetings aiming to support the growing community of researchers interested in the phenomenon of compositionality, from both applied and abstract perspectives, and in particular where category theory serves as a unifying common language. We welcome submissions from researchers across computer science, mathematics, physics, philosophy, and beyond, with the aim of fostering friendly discussion, disseminating new ideas, and spreading knowledge between fields. Submission is encouraged for both mature research and work in progress, and by both established academics and junior researchers, including students.

More details below! Our very own David Corfield is one of the invited speakers.


Submission is easy, with no format requirements or page restrictions. The meeting does not have proceedings, so work can be submitted even if it has been submitted or published elsewhere.

While no list of topics could be exhaustive, SYCO welcomes submissions with a compositional focus related to any of the following areas, in particular from the perspective of category theory:

  • logical methods in computer science, including classical and quantum programming, type theory, concurrency, natural language processing and machine learning;
  • graphical calculi, including string diagrams, Petri nets and reaction networks;
  • languages and frameworks, including process algebras, proof nets, type theory and game semantics;
  • abstract algebra and pure category theory, including monoidal category theory, higher category theory, operads, polygraphs, and relationships to homotopy theory;
  • quantum algebra, including quantum computation and representation theory;
  • tools and techniques, including rewriting, formal proofs and proof assistants, and game theory;
  • industrial applications, including case studies and real-world problem descriptions.

This new series aims to bring together the communities behind many previous successful events which have taken place over the last decade, including “Categories, Logic and Physics”, “Categories, Logic and Physics (Scotland)”, “Higher-Dimensional Rewriting and Applications”, “String Diagrams in Computation, Logic and Physics”, “Applied Category Theory”, “Simons Workshop on Compositionality”, and the “Peripatetic Seminar in Sheaves and Logic”.

The steering committee hopes that SYCO will become a regular fixture in the academic calendar, running regularly throughout the year, and becoming over time a recognized venue for presentation and discussion of results in an informal and friendly atmosphere. To help create this community, in the event that more good-quality submissions are received than can be accommodated in the timetable, we may choose to defer some submissions to a future meeting, rather than reject them. This would be done based on submission order, giving an incentive for early submission, and avoiding any need to make difficult choices between strong submissions. Deferred submissions would be accepted for presentation at any future SYCO meeting without the need for peer review. This will allow us to ensure that speakers have enough time to present their ideas, without creating an unnecessarily competitive atmosphere. Meetings would be held sufficiently frequently to avoid a backlog of deferred papers.

Invited Speakers

  • David Corfield, Department of Philosophy, University of Kent: “The ubiquity of modal type theory”.

  • Jules Hedges, Department of Computer Science, University of Oxford: “Compositional game theory”

Important Dates

All times are anywhere-on-earth.

  • Submission deadline: Sunday 5 August 2018
  • Author notification: Monday 13 August 2018
  • Travel support application deadline: Monday 20 August 2018
  • Symposium dates: Thursday 20 September and Friday 21 September 2018

Submissions

Submission is by EasyChair, via the following link:

Submissions should present research results in sufficient detail to allow them to be properly considered by members of the programme committee, who will assess papers with regards to significance, clarity, correctness, and scope. We encourage the submission of work in progress, as well as mature results. There are no proceedings, so work can be submitted even if it has been previously published, or has been submitted for consideration elsewhere. There is no specific formatting requirement, and no page limit, although for long submissions authors should understand that reviewers may not be able to read the entire document in detail.

Funding

Some funding is available to cover travel and subsistence costs, with a priority for PhD students and junior researchers. To apply for this funding, please contact the local organizer Jamie Vicary at j.o.vicary@bham.ac.uk by the deadline given above, with a short statement of your travel costs and funding required.

Programme Committee

The symposium is managed by the following people, who also serve as the programme committee.

  • Ross Duncan, University of Strathclyde
  • Chris Heunen, University of Edinburgh
  • Aleks Kissinger, Radboud University Nijmegen
  • Samuel Mimram, École Polytechnique
  • Simona Paoli, University of Leicester
  • Mehrnoosh Sadrzadeh, Queen Mary, University of London
  • Pawel Sobocinski, University of Southampton
  • Jamie Vicary, University of Birmingham and University of Oxford (local organizer)

by john (baez@math.ucr.edu) at July 04, 2018 05:57 PM

Tommaso Dorigo - Scientificblogging

Chasing The Higgs Self Coupling: New CMS Results
Happy Birthday Higgs boson! The discovery of the last fundamental particle of the Standard Model was announced exactly 6 years ago at CERN (well, plus one day, since I decided to postpone to July 5 the publication of this post...).

In the Standard Model, the theory of fundamental interactions among elementary particles which enshrines our current understanding of the subnuclear world, particles that constitute matter are fermionic: they have a half-integer value of a quantity we call spin; and particles that mediate interactions between those fermions, keeping them together and governing their behaviour, are bosonic: they have an integer value of spin.


by Tommaso Dorigo at July 04, 2018 12:57 PM

July 03, 2018

Lubos Motl - string vacua and pheno

David Gross: make America great again
The first string theory's formula, the Veneziano amplitude, was introduced to physics in 1968, i.e. half a century ago.

In that year of 1968, Czechoslovakia tried its "socialism with a human face", and the experiment was terminated by the Warsaw Pact tanks in August (next month, we will "celebrate" that). Meanwhile, the youth in the West tried a seemingly similar revolution. Only in recent years, I was forced to realize that what started in the West was really going in the opposite direction than the 1968 Prague Spring.



At any rate, the annual conference, Strings 2018, didn't forget about Veneziano's far-reaching playing with the Euler Beta function. Veneziano was present. The 2-hour-long panel discussion at the end of the conference is arguably the most layman-friendly video produced by the conference. One frustrating fact is that the video only has 1500 views as of this moment. No journalists were interested in the conference.




Some 500 string theorists (perhaps 1/2 of the world's currently employed professional string theorists) gathered in the tropical Okinawa, Japan. I feel absolutely confident that among gatherings of 500-1,000 people, the annual string conferences have by far the highest average IQ of the participants. No Bilderberg, Davos, freemason meetings, and even the parties of the old Nobel prize winners could compete. If the U.S. invaded and bombed Okinawa again, like in April 1945, and all the people who are there were killed, the set of the world's people with an IQ above 160 would be detectably decimated.




You may watch the panel discussion which recalls some topics and/or controversies. You may also try to read Tetragraviton's earlier sketch of the conference.

Although David Gross pretends that he doesn't love Donald Trump too much, it's still true that the great minds think alike. Before 1:54:40, he lists the next four places to host the annual conferences as Brussels, Capetown, Vacuum, and Vienna, where Vacuum represents a pause in 2021. It would be a shame to have a hole.

Who can fill the hole? Gross shows the organizers and in the late 20th century, the U.S. were much more important, relatively speaking. After 2000, the conferences spread out of the U.S. OK, he gradually converges to the – not quite a priori obvious – punch line: Make America great again. A larger number of conferences should be held in the U.S. again. Gross also tells Cambridge, MA and Palo Alto, CA to feel no pressure. ;-)

America is still the world's most attractive place for investment. It's the most likely country to create places with a huge concentration of high brainpower – like in the Silicon Valley. In Europe and other places, we often place limits and we moderate things. We like to vote for The Party of Moderate Progress Within the Bounds of the Law founded by Jaroslav Hašek, the author of The Good Soldier Švejk. But in America, there are no limits. Or you can take it to the limit. Or you can get to the Moon – in fact, literally.

This is how many of us have understood America's WOW factor. Maybe in the computer technologies, it's still true. But I think that since 2000 or so, America started to lose this WOW factor. In fact, I think that the U.S. became the main source of the political correctness and related toxic diseases that are gradually poisoning and devouring the Western civilization.

Some of the "outsourcing" of the string conferences was purely due to the political correctness – those second-class, third-class, and other places that aren't quite as good as America shouldn't think that they're worse than America. Well, almost all of them still are. But such games can never remain just games. I think that some of the outsourcing is real. America has lost much of its motivation to lead the world.

Donald Trump obviously isn't the man who should be expected to revitalize the string theory research. But helpfully enough, David Gross accepted the role of the Donald in string theory. Well, he's had this role for some 33 years, I think. You know, folks in America should realize that they should still be the bosses of the world because things won't work too well without them.

Already since the early 21st century, I grew very disillusioned with America. I noticed that some of the garbage that almost defines the European Union exists in the U.S. as well – and in some cases, America harbors more hardcore versions of the low-quality humans and their pathetic excuses to keep the world as a network of muddy sycophants connected to a stagnant bureaucratic structure.

For example, when I translated Brian Greene's first bestseller to Czech, it was a great success in Czechia and the feedback was almost entirely enthusiastic. In some sense, I brought a piece of America to Czechia. There was one exception who wrote some tirade against string theory, against Brian Greene, against myself, and several other related entities. I hadn't been quite familiar with that shocking šithead before – but the exposure has taught me a very speedy lesson. This "Prof" Jiří Chýla – at that time, the boss of Particle Physics at the Czech Academy of Sciences who found his job appropriate for spending days and trying to harm a Rutgers grad student and his book with a 30-page-long rant (which he didn't, thank God) – was the best example of the communist era crap – the frogs sitting on all the springs – that keeps the Czech institutions uncompetitive in theoretical physics.

He is a symbol of the culture of old men who haven't produced any ideas that anyone in the world would give a damn about – at least no such idea since a 30-citation paper they wrote as postdocs 40 years ago, but that one wasn't important, either. But they want to mask this uselessness and pretend that they're doing pretty much the same thing as the best people in the world so they are hiring younger people who just lick their aßes, much like Mr Chýla licks the aßes of the jerks in the European Union all the time (who pay him the money because it's so wonderful to steal lots of money from the European taxpayers and pour it on useless parasites such as Mr Chýla).

Shortly after 2000, I was learning about some younger people – people of my generation – who found Mr Chýla's behavior and character OK and I just lost my emotional attachment to them, too. I just can't understand how someone may be so incredibly morally fudged up. Everyone who defends the likes of Mr Chýla is just scum.

For years, I had thought that the influence of fudged up individuals of Mr Chýla's type on the institutions is an artifact of my nation's not being so good as other nations. Our DNA is perhaps not so good and the communism has screwed our social structure and morality, too. But I no longer think that there's something very special about Czechia here. It was an unreasonable and unfair "masochist racism". I think that the likes of Peter W*it, Sabine Hossenfelder, and many others are pretty much analogous pieces of crap as Mr Chýla – and they and their pathetic excuses for their own inadequacy got comparably influential in the U.S. and Western Europe. All this filth just like to spread ludicrous propaganda that they're on par with the actual physicists – much like communism was producing propaganda that we were better off than the capitalist world – even though they must know that all these things are ludicrous lies. They're forming alternative structures that try to conquer the environment.

In the panel discussion, lots of questions were asked. My understanding is that all the questions were posed by actual registered participants of the conference. Nevertheless, the "plurality" (Gross' word) was about the falsifiability, gloom, and all this garbage. (Other repeated questions dealt with the existence of de Sitter solutions and other "more normal" or "professional" topics that have existed at previous conferences, too.) Most of the authors of such questions were probably some young participants. But where is the mankind going? It's not trivial to fly to Okinawa (even the U.S. troops in April 1945 would agree) and it's not trivial to do the other things needed to host a participant. Does it make any sense to fly to these islands if you have such serious doubts about the very justification of the field?

I don't really believe that the Millennial generation will advance any things that are actually hard enough, like string theory. Shiraz Minwalla said that the field was healthy, diverse, and people allow the evidence to take them wherever it goes. They shouldn't listen to anybody else, that's how things should be. It sounds nice. I think that there's still some individual stubbornness in Shiraz's attitude and I think that a big part of my generation is close to it.

But the individual stubbornness is exactly what the Millennials are completely lacking – so Shiraz's bullish description talks about something that will go extinct with the older currently alive generations. I've met some great exceptions but their percentage – even among the folks who should be intellectual elites of the Millennials in one way or another – is just insanely tiny. Almost all the Millennials want to be obedient, behave as members of a herd of stupid sheep, and say how smart and original they are by being stupid sheep in the herd. They want to join a club of one million holders of the Bitcoin who just buy the Bitcoin for their parents' money (and they think that being in a bunch of millions of people who do an exercise turns you into an "elite" of a new kind, wow) and say that this is how they change the world. Or they want to parrot pathetic lies about multiculturalism, environmentalism, and several other prominent delusions for the stupid masses.

This attitude isn't compatible with serious research in cutting-edge theoretical physics. It's incompatible with lots of other things that are needed for some true progress of the mankind, science, and technology.



Another topic: Don Garbutt recorded a nice track called "String Theory" with a 2012 animation showing the scales from the Planck scale to the observable Universe. There are lots of funny things of many sizes – e.g. Italy and Pluto are neighbors somewhere in the middle.

by Luboš Motl (noreply@blogger.com) at July 03, 2018 06:18 PM

June 28, 2018

Lubos Motl - string vacua and pheno

James Wells' anti-naturalness quackery
Sabine Hossenfelder celebrates a preprint titled
Naturalness, Extra-Empirical Theory Assessments, and the Implications of Skepticism
and rightfully so because its author, James Wells, could literally shake her hand right away and join her personal movement of crackpots. Wells' paper isn't just wrong – it's incredibly stupid. Thankfully, he only sent it to the "History and Philosophy of Physics" subarchive although it was cross-listed to the professional subarchives. (Maybe the arXiv moderators should be thanked for correctly classifying this paper as social sciences, pseudosciences, and humanities.)




OK, Wells (like Hossenfelder) wants to eliminate naturalness – and any "extra-empirical quality" – from science. Do you really think it's possible? Not really. Let us discuss the abstract carefully.
Naturalness is an extra-empirical quality that aims to assess plausibility of a theory.
It's a proposed definition or classification and it's fair enough.




Now,
Finetuning measures are one way to quantify the task.
Very well. Everyone knows that. The following sentence says:
However, knowing statistical distributions on parameters appears necessary for rigor.
Yup. If you want to precisely (it's a better word than "rigorously") calculate a fine-tuning measure or another quantity telling you how much a theory is fine-tuned, you need statistical distributions on the space of possible theories and on their parameter spaces.
Such meta-theories are not known yet.
Strictly speaking, it may be true because there's no precise or rigorous prescription to calculate the probability of some values of parameters or the probability of one theory consistent with observations or another.

However, what Wells completely misses is that some prescription to compare two theories – an imprecise and non-rigorous one – has to be used anyway, otherwise the scientific method as a whole would be impossible. Without this type of imprecise, non-rigorous thinking, we couldn't say whether evolution or creationism is a better theory of the origin of species. We wouldn't be able to say anything.

Again, I must quote Feynman's monologue about the flying saucers. All the statements that science produces are of the form that one statement is more likely and another one is less likely etc. All such probabilities always depend on the priors, not only on the evidence. It's unavoidable. If you ban sentences that "flying saucers are unlikely" (because you find the dependence on the prior probabilities "unscientific"), and Feynman's antagonist wanted to ban them, then you are banning science as a whole.

So it's not true that such meta-theories are not known yet. They are known, they are imprecise and not rigorous, but they are absolutely essential for science and successful, too.
A critical discussion of these issues is presented, including their possible resolutions in fixed points.
He includes a technical discussion of fixed points (scaling-invariant field theories) but claims that all "extra-empirical reasoning" is unacceptable in their context, too.
Skepticism of naturalness's utility remains credible, as is skepticism to any extra-empirical theory assessment (SEETA) that claims to identify "more correct" theories that are equally empirically adequate.
This skepticism is as credible as creationism and all other wrong approaches to science – in fact, this skepticism is a key part of them. Otherwise, it's great that he invented a new acronym. Brain-dead journalists will surely boast about their ability to copy and hype this new meaningless acronym.

A crucial proposal for "a new kind of science" appears here:
Specifically to naturalness, SEETA implies that one must accept all concordant theory points as a priori equally plausible, with the practical implication that a theory can never have its plausibility status diminished by even a "massive reduction" of its viable parameter space as long as a single theory point still survives.
Wow. You know, saying that all theories are "equally possible" means that they have the same probability, namely \(p\). But a problem is that they're mutually exclusive and there are infinitely many of them. It follows that
\[
\sum_{i=1}^\infty p \leq 1.
\]
Their total probability is at most equal to one. I chose the \(\leq\) sign to emphasize that we're only summing over the known theories and there may be additional ones that have a chance to be correct. But the left hand side above is equal to \(p\cdot \infty\) and the only allowed value of \(p\) that obeys the inequality above is \(p=0\). If all theories in an infinite list were equally plausible, then all of them would be strictly ruled out, too!

In reality, the theories are also parameterized by continuous parameters so the sum above should be replaced or supplemented with an integral. With an integral, the statement that they are "equally plausible" becomes ill-defined because, as Wells admitted, he doesn't have any measure. He wants to use the absence of a canonical measure as a "weapon against others" but overlooks that it's a weapon against his own claims, too.

If he forbids you to use any measure, then his statement that two points (or regions) at a continuous parameter space are "equally plausible" becomes nonsensical.
A second implication of SEETA suggests that only falsifiable theories allow their plausibility status to change, but only after discovery or after null experiments with total theory coverage.
Excellent. If this rule is interpreted literally, you really can't eliminate creationism or any wrong theory. During those seven days in which He had to create all the species, He could have used tools with a sufficient number of parameters so that He created the correct DNA of all the species we need. If you can't exclude "all creationist models", every single one of them, then you can't really say that creationism is very unlikely, Wells (just like Feynman's antagonist) tells us.

Many of us say that evolution is a far better theory of the origin of species than creationism. Why? Because the fine-tuning that creationism needs to agree with the observed details is massive. And when the required fine-tuning is massive, it just doesn't really matter what's the "precise" or "rigorous" way to quantify it. Any sensible way to quantify it will still conclude that it is massive. Now, the word "sensible" in the previous sentence also fails to be defined precisely. But at some moment, you have to stop with these complaints, otherwise you just can't get anywhere in science.

That's why Wells' claim that you should completely abandon naturalness and "extra-empirical criteria" just because they're not perfectly precise is so unbelievably idiotic. You could try to apply his fundamentalist attitude in any other context. Child porn cannot be precisely defined, either. Does it mean that we can't ban it? Well, Justice Potter Stewart defined porn by saying that "I know it when I see it".

That's really the point in the discussion of naturalness, too. There may be some marginal cases in which the absence of a precise definition or quantification will make it impossible to reliably decide whether something is porn or whether something is natural. But in a huge fraction of the cases that are relevant for law enforcement officials and for physicists, the quantities labeling the "amount of porn" or the "naturalness" end up being so far from the "disputable lines" that the imprecision won't matter at all. In so many cases, we will say: "This is porn." We will say it even without a rigorous definition of "porn". And in the same way, we will say that a creationist model explaining some DNA sequences is "unnatural" even though we don't really have a canonical, unique, universal, ultimate, precise definition of "naturalness of a hypothesis about the origin of species", either!

So when some theories are really heavily unnatural, we simply see it. And we need this judgment, despite its lack of rigor and precision, to do science. We have always needed it. We couldn't decide even about the basic questions if we banned this "extra-empirical" reasoning. Everyone who questions the need for this imprecise or "extra-empirical" reasoning is absolutely deluded.

Sometimes the implausibility of a theory – like creationism – is understood informally, intuitively, and qualitatively. Sometimes, especially in fundamental physics, we need a bit more quantitative treatment. This treatment is not rigorous or precise but it's more quantitative than the arguments we need to criticize creationism. So we assume that the distributions are some natural uniform distributions mostly spanning the values of dimensionless parameters that are of order one. The detailed choice doesn't really matter when something is really unnatural! In most cases, we have pretty good arguments to say that the choice of a uniform distribution for \(g\) or \(g^2\) or \(1/g^2\) is more natural than the other two etc.
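
To make the measure-dependence point slightly more quantitative, here is a toy Monte Carlo (entirely my own illustration, with made-up ranges): estimate how often a dimensionless combination of two order-one couplings lands in a narrow window, under three different 'reasonable' priors. The numbers move around by factors of order one, but a one-in-a-thousand coincidence stays a roughly one-in-a-thousand coincidence under all of them.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
epsilon = 1e-3   # how finely the two couplings must coincide to count as 'tuned'

# Three 'reasonable' priors for an order-one coupling g in [0.1, 3].
priors = {
    "uniform in g":     lambda: rng.uniform(0.1, 3.0, N),
    "uniform in g^2":   lambda: np.sqrt(rng.uniform(0.01, 9.0, N)),
    "uniform in log g": lambda: np.exp(rng.uniform(np.log(0.1), np.log(3.0), N)),
}

for name, draw in priors.items():
    g1, g2 = draw(), draw()                       # two independent couplings
    tuned = np.mean(np.abs(g1 - g2) / np.maximum(g1, g2) < epsilon)
    print(f"{name:17s}: P(relative splitting < {epsilon}) ~ {tuned:.1e}")
```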

The last sentence of the abstract is very cute, too:
And a third implication of SEETA is that theory preference then becomes not about what theory is more correct but what theory is practically more advantageous, such as fewer parameters, easier to calculate, or has new experimental signatures to pursue.
The only problem is that a genuine scientist, pretty much by definition, looks for the more correct or more likely theories. He wants to answer the questions such as whether creationism or evolution is a better theory of the origin of species, whether a proton is composed of smaller particles, whether there is a Higgs boson, and millions of other things.

So the correctness or probability of different possibilities simply has to be compared, otherwise you're not doing science at all. You're not producing any scientific results, any laws, nothing. By promoting SEETA, Wells pretends to be "more scientific" but in reality, he wants to throw the key baby of the scientific method out with the bath water.

A real scientist is working to find the truth about Nature – which means the (more) correct and (more) likely theories that explain our observations. If he's looking for a theory that is "practically more advantageous, such as fewer parameters, easier to calculate, or has new experimental signatures to pursue", then he is simply not a scientist in the proper sense. He's a utilitarian of a sort.

Theory A may be simpler to calculate with than theory B. But that doesn't mean that it's more correct or more likely to be true.

At the beginning of the abstract, Wells declared his goal to liberate science from all the distributions and "extra-empirical" judgments. But in the last sentence, he contradicts himself and basically admits it's impossible. So he also has some "extra-empirical" criteria, after all. The only difference is that his criteria aren't designed to look for more likely theories. He's looking for more convenient (and similar adjectives that are not equivalent to the truth) ideas.

There is an overlap between his criteria and the criteria of physicists who are actually looking for the truth about Nature. For example, both seem to prefer "theories with a small number of parameters". But Wells only picks this criterion because of some convenience. In proper physics, we may actually justify why we start with a theory with a fewer parameters. Why is it so? Because theories with a larger number of parameters are either
  1. much less likely than the theory with a few parameters because most of the "new parameter space" spoils the predictions – because additional parameters have to be adjusted and it's unlikely that it's done right, or
  2. the theory with fewer parameters may be considered as a "subset" of more complex theories, so if you study the simpler theory in this sense, you're not wasting your time – most of your work may be recycled once you deal with the possible more complex theories (the whole paradigm of effective field theories is a broad subcategory of this phenomenon)
These arguments aren't "rigorously proven" to be correct but if we didn't use any "extra-empirical" guides at all, we just couldn't possibly make a single decision in science, ever, because an arbitrarily wrong theory may always be modified, engineered, and tuned to be formally compatible with the data.

His list of the "preferred extra-empirical" criteria includes
simplicity, testability, falsifiability, naturalness, calculability, and diversity.
None of them actually tries to be equivalent to the validity of a theory, the probability that it's correct, which is why those aren't really scientific criteria. But in this list, the last entry, "diversity", must have shocked many readers just like it has shocked me. What kind of diversity? Does he want to prefer papers written by black or female or transsexual authors? ;-)
Or, a scientist may wish to widen her vision of observable consequences of concordant theories in order to cast a wider experimental net, which would lead her to pursue diverse theories over simple theories.
Well, the choice of pronouns indicates that he really wants to prefer theories by female authors, even if he never makes this statement explicitly. Well, I am sure you still hope that he doesn't actually talk about the identity politics. Another sentence says the following about diversity:
No theory of theory preference will be given here, except to say that “diversity” has a strong claim to a quality for preference.
It's rather hard to figure out what he means by the "diversity of a single theory". We usually understand "diversity" as a property of whole sets or groups (e.g. groups of people), not the individual elements or members. But a few sentences later, we read:
A few examples out of many in the literature that have the quality of diversity at least going for it are clockwork theories [19, 20] and theories of superlight dark matter (see, e.g., [21, 22]). These theories lead to new experiments, or new experimental analyses, that may not have been performed otherwise.
He just picks some – not really terribly motivated – theories, clockwork theories and superlight dark matter, and wants to prefer them because they have a "quality of diversity". The last sentence explains that by "diversity", he means that the theory "leads" to new experiments or new analyses.

It's just nutty. Theories never "lead" to experiments. Experimenters may decide to build an experiment but it's their practical decision that doesn't follow in any logical way from a theory. An experimenter needs some creativity, practical skills including some intuition for the economy of some efforts, knowledge of the established theory as well as proposed hypotheses to go beyond them, and good luck to successfully decide which things are interesting to be tested or measured and how he can find something interesting or new.

There's no "straightforward" way to derive these experimenters' decisions from any theory by itself. There's surely no "rigorous" way to do so – but you see the double standards. Other people's criteria have to be "rigorous", otherwise they need to be thrown away. But his criteria may be totally non-rigorous. What the fudge?

So if an experimenter is inspired by some theory, and the experiment may only be justified by a clockwork theory or a theory of superlight dark matter, good for him. But the experimenter isn't guaranteed to find the damn new effect. And if the new effect is only predicted by some very special theory, or one theory among hundreds, then – sorry to say – it probably makes it less likely, not more likely, that the experiment will lead to some interesting results. Such a dependence of the new effect on some very special theory is clearly an argument (not an indisputable one, but still an argument) against the experiment if the experimenter is rational.

Wells clearly wants to invalidate the self-evidently rational reasoning above. How does he invalidate it? If a theory C predicts something that no other theory predicts, this theory will be declared "more important" because it passes a test from "diversity". Holy crap. Even if he talks about some technical features of theories, their predictions, the logic of his reasoning is almost isomorphic to affirmative action, reverse racism, and reverse sexism, indeed. For all purposes, clockwork theories are transsexual Muslims and the superlight dark matter is a female vegan who loves steaks. And that's why he wants to make them more widespread. But from a rational viewpoint, what he calls "diversity" should be viewed as a negative trait, at least a negative recommendation for an experimenter.

His "extra preferences" are absolutely irrational from the viewpoint of the search for the truth, and due to their similarity to toxic left-wing identity politics, every decent physicist must immediately vomit when he hears about Wells' proposals for "new criteria". If you fail to vomit, you are probably not a good physicist.

Sorry but as long as science remains science, it is looking for the truth, i.e. for theories that are more likely to be true or compatible with a body of observations. And this is always evaluated by meritocratic criteria using justifiable probability distributions. Because the final theory of everything isn't known yet, these probability distributions and criteria aren't totally precise and rigorously defined. But they're parts of the required theorist's toolkit, they're being tested by the experiments as well, and their current form as believed by the best theorists is good enough – and good scientists also spend some time trying to improve and refine them. At any rate, they're vastly better than the pseudoscientific and borderline political new criteria proposed by Wells that have nothing whatever to do with the chances of the theories to be true.

And that's the memo.

by Luboš Motl (noreply@blogger.com) at June 28, 2018 03:27 PM

The n-Category Cafe

Elmendorf's Theorem

I want to tell you about Elmendorf’s theorem on equivariant homotopy theory. This theorem played a key role in a recent preprint I wrote with Hisham Sati and Urs Schreiber:

We figured out how to apply this theorem in mathematical physics. But Elmendorf’s theorem by itself is a gem of homotopy theory and deserves to be better known. Here’s what it says, roughly: given any \(G\)-space \(X\), the equivariant homotopy type of \(X\) is determined by the ordinary homotopy types of the fixed point subspaces \(X^H\), where \(H\) runs over all subgroups of \(G\). I don’t know how to intuitively motivate this fact; I would like to know, and if any of you have ideas, please comment. Below the fold, I will spell out the precise theorem, and show you how it gives us a way to define a \(G\)-equivariant version of any homotopy theory.

We know that in ordinary homotopy theory, there are two kinds of spaces we can study. We can study CW-complexes up to homotopy equivalence, or we can study topological spaces up to weak homotopy equivalence. Weak homotopy equivalence is morally the right kind of equivalence, but Whitehead’s theorem tells us that for the nicer kind of space, the CW-complex, weak homotopy equivalence is the same as strong homotopy equivalence. Moreover, the CW-approximation theorem says that any space is weak homotopy equivalent to a CW-complex. So, they’re really two ways of studying the same thing. One is more flexible, the other more concrete.

NB. In this post, I’ll use the adjective “strong” to contrast homotopy equivalence with weak homotopy equivalence. People usually call strong homotopy equivalence just homotopy equivalence.

Now let \(G\) be a compact Lie group. For \(G\)-spaces, we can also define both strong and weak homotopy equivalence. The strong homotopy equivalence is the obvious thing: you have two equivariant maps \(f \colon X \to Y\) and \(g \colon Y \to X\) that are inverse to each other up to equivariant homotopies \(\eta \colon f g \Rightarrow 1_Y\) and \(\eta' \colon g f \Rightarrow 1_X\). This lets us consider \(G\)-spaces up to homotopy equivalence. But as for spaces, the morally correct notion of equivalence is weak homotopy equivalence, and this is much stranger: a \(G\)-equivariant map \(f \colon X \to Y\) is an equivariant weak homotopy equivalence if it restricts to an ordinary weak homotopy equivalence between the fixed point spaces, \(f \colon X^H \to Y^H\), for all closed subgroups \(H \subseteq G\).

Why on earth should these two notions of equivalence be so different? The equivariant Whitehead theorem justifies this, though again I don’t have a good intuitive explanation for why it should be true. To state this theorem, first I have to tell you what a \(G\)-CW-complex is. We can construct them much as we do ordinary CW-complexes, except they are built from cells of the form:

\[ D^n \times G/H \]

where \(D^n\) is the \(n\)-disk with the trivial \(G\) action, and \(G/H\) is a coset space of \(G\) with the left \(G\) action. These cells are then glued together by \(G\)-equivariant attaching maps, just like an ordinary CW-complex. The result is a \(G\)-CW-complex. The equivariant Whitehead theorem, due to Bredon, then says that for any pair of \(G\)-CW-complexes, they are weak homotopy equivalent if and only if they are strong homotopy equivalent.

This suggests the key insight behind Elmendorf’s theorem: that we can study \(G\)-spaces simply by looking at \(X^H\) for all closed subgroups \(H \subseteq G\). But this operation, of taking a subgroup \(H\) to a space \(X^H\), actually defines a functor:

\[ X \colon Orb_G^{op} \to Spaces . \]

Here, the domain of this contravariant functor is the orbit category \(Orb_G\). This is the category with:

  • objects the coset spaces \(G/H\), for each closed subgroup \(H \subseteq G\).
  • morphisms the \(G\)-equivariant maps.

This is called the orbit category thanks to the elementary fact that any orbit in any \(G\)-space is of the form \(G/H\), for a closed subgroup \(H\) the stabilizer of some point in the orbit.
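
As a concrete sanity check (my example, not taken from the post), take \(G = \mathbb{Z}/2\). The orbit category then has just two objects, the free orbit \(\mathbb{Z}/2/e\) and the fixed orbit \(\mathbb{Z}/2/(\mathbb{Z}/2)\), with

\[ \mathrm{Hom}(\mathbb{Z}/2/e,\ \mathbb{Z}/2/e) \cong \mathbb{Z}/2, \qquad \mathrm{Hom}(\mathbb{Z}/2/e,\ \mathbb{Z}/2/(\mathbb{Z}/2)) \cong \ast, \]

no maps from the fixed orbit to the free one, and only the identity on the fixed orbit. So a presheaf of spaces on \(Orb_{\mathbb{Z}/2}\) amounts to a space \(A\) with a \(\mathbb{Z}/2\)-action, a space \(B\), and a map \(B \to A\) landing in the fixed points of that action; the presheaf coming from an honest \(\mathbb{Z}/2\)-space \(X\) is the case \(A = X\), \(B = X^{\mathbb{Z}/2}\) with the inclusion, which is exactly what the embedding described below packages up.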

Since the functor associated to \(X\) is contravariant, it is a presheaf on the orbit category \(Orb_G\), valued in the category of spaces, \(Spaces\). The assignment taking a \(G\)-space \(X\) to the presheaf with value \(X^H\) on the orbit space \(G/H\) defines an embedding:

\[ y \colon G Spaces \to PSh(Orb_G, Spaces) \]

from the category \(G Spaces\) of \(G\)-spaces into the category of all presheaves on \(Orb_G\). This is a souped up version of the Yoneda embedding: \(Orb_G\) is a subcategory of \(G Spaces\), and the embedding above is just Yoneda when restricted to this subcategory.

It turns out this embedding doesn’t change the homotopy theory at all, as long as we choose the correct weak equivalences on the right hand side: we choose them to be the levelwise weak equivalences. That is, two presheaves \(X\) and \(Y\) are weak equivalent if there is a natural transformation \(f \colon X \Rightarrow Y\) whose components \(f^H \colon X^H \to Y^H\) are ordinary weak equivalences of spaces. With this choice of weak equivalences, the homotopy theory of presheaves on \(Orb_G\) is the same as that of \(G Spaces\). That’s Elmendorf’s theorem:

Theorem (Elmendorf). There is an equivalence of homotopy theories \[ G Spaces \simeq PSh(Orb_G, Spaces) . \] In the direction \(G Spaces \to PSh(Orb_G, Spaces)\), this equivalence is simply the embedding \(y\).

You can read more about Elmendorf’s theorem in the original paper:

A much more modern treatment is in Andrew Blumberg’s lectures on equivariant homotopy theory. The theorem is so foundational to the topic that it first appears in Section 1.2 of these notes, and Section 1.3 is devoted to it:

Let us step back and appreciate what this theorem has bought us. Besides being a really nice reformulation from a categorical point of view, it gives us a paradigm for constructing equivariant homotopy theories more generally. That is, if we have a homotopy theory in the guise of a category \(\mathcal{C}\) with weak equivalences, then you might go ahead and define the equivariant homotopy theory of \(\mathcal{C}\) to be: \[ G \mathcal{C} = PSh(Orb_G, \mathcal{C}) \] where the weak equivalences are the levelwise weak equivalences, as in Elmendorf.

For instance, if \(\mathcal{C}\) is a model of rational homotopy theory \(Spaces_{\mathbb{Q}}\), then \(G\)-equivariant rational homotopy ought to be: \[ PSh(Orb_G, Spaces_{\mathbb{Q}}) . \] This is precisely what one finds in the literature, at least in the case when \(G\) is a finite group:

This paper actually came before Elmendorf’s - perhaps it served as inspiration!

Or, if you want to get more adventurous, you can define “rational super homotopy theory”, a supersymmetric version of rational homotopy theory, modeled by some category with weak equivalences called \(SuperSpace_{\mathbb{Q}}\). Then the \(G\)-equivariant rational super homotopy theory ought to be: \[ G SuperSpace_{\mathbb{Q}} = PSh(Orb_G, SuperSpace_{\mathbb{Q}}) . \] This is the homotopy theory where the work in our preprint takes place! We use Elmendorf’s theorem to get our hands on what physicists call “black branes”. These turn out to be the fixed point subspaces \(X^H\), for \(X\) a particular rational superspace equipped with an action.

To close, let me ask if you or anyone you know has a nice conceptual explanation for Elmendorf’s theorem, or at the very least for the equivariant Whitehead theorem:

Question. What is an intuitive reason that equivariant homotopy types are captured by the homotopy types of their fixed point subspaces?

by huerta (john.huerta@gmail.com) at June 28, 2018 05:05 AM

June 27, 2018

Lubos Motl - string vacua and pheno

Vafa, quintessence vs Gross, Silverstein
It has been one year since Strings 2017 ended in the Israeli capital (yes, I mean Jerusalem, that's where Czechia has the honorary consulate) and Strings 2018 in Okinawa (list of titles) is here.

The Japanese organizers have tried an original reform of the YouTube activity. They post whole days as single, unified videos.



Most of you want to delay your dinner by 3 hours and 46 minutes – so you should watch the video above for the food to taste better.




Cumrun Vafa, whom I know much better than the other 20 speakers in the video, starts to speak at 31:44 and his topic is a paper released on Monday,
De Sitter Space and the Swampland (Obied, Ooguri, Spodyneiko, Vafa)
Recall that Vafa's Swampland is a giant parameter space of effective field theories that cannot be realized within a consistent theory of quantum gravity i.e. within string/M-theory. Only a tiny island inside this Swampland, namely the stringy Landscape, is compatible with quantum gravity. String/M-theory makes lots of very strong predictions – namely that we don't live in the Swampland. We have to live in the special hospitable Landscape.

Along with his friend Donald Trump, Cumrun Vafa decided to drain the Swampland. ;-)




OK, these comments became ludicrous too early so let us be more serious for a while. In the paper I mentioned, Vafa and pals have proposed a new inequality obeyed by the potential energy \(V\) in every consistent theory of quantum gravity,

\[ V \leq \frac{|\nabla V|}{c}, \]

which excludes too high positive cosmological constants and de Sitter spaces in particular. The gradient is calculated on the field space, with the metric given by the kinetic terms of the scalar fields. Use 4D Planck units (Einstein frame) if others aren't good enough.
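
To spell out the norm that appears in the inequality (my notation; the paper may use different conventions), one would write

\[ |\nabla V|^2 \;=\; G^{ij}(\phi)\,\partial_i V\,\partial_j V , \]

where \(G_{ij}(\phi)\) is the field-space metric read off from the scalar kinetic terms \(\tfrac12 G_{ij}(\phi)\,\partial_\mu\phi^i\,\partial^\mu\phi^j\), and \(c\) is a positive constant of order one in 4D Planck units.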

It's a fun proposal – a potential cousin of the Weak Gravity Conjecture, but a younger and less justified one (at least so far). Note that a weaker condition such as \(V_0\leq 0\), constraining only the value at the minimum, would be hard to justify, and singling out the minimum \(V_0\) would be unnatural. So in a sense, Vafa and co-authors have found and supported a stronger version of that inequality that places an upper bound on the potential energy density at each point of the configuration space. Neat.
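
A toy way to see what the inequality does away from stabilized vacua (an illustrative sketch with made-up numbers, not anything from the paper): for a single-field exponential potential \(V = V_0 e^{-\lambda\phi}\), the ratio \(|V'|/V\) equals \(\lambda\) everywhere, so the bound holds at every point of field space if and only if the potential rolls steeply enough, \(\lambda \geq c\).

    import numpy as np

    # Toy check of the conjectured bound |dV/dphi| >= c * V for an exponential
    # quintessence potential V(phi) = V0 * exp(-lam * phi), in Planck units.
    # V0, lam and c are illustrative placeholders, not values from the paper.
    V0, lam, c = 1e-122, 0.8, 0.5

    phi = np.linspace(0.0, 10.0, 5)
    V = V0 * np.exp(-lam * phi)
    dV = -lam * V                    # analytic derivative dV/dphi

    ratio = np.abs(dV) / V           # equals lam for every value of phi
    print(ratio)                     # -> [0.8 0.8 0.8 0.8 0.8]
    print("bound satisfied everywhere:", bool(np.all(ratio >= c)))

At a stabilized vacuum, by contrast, the derivative vanishes and the bound can only be obeyed if \(V \leq 0\), which is the statement discussed in the next paragraph.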

For a while after I saw the paper, I was worried about the infinite tower of massive stringy fields which could make the gradient arbitrarily high – because the masses are high. But then I understood my mistake: at a stabilized vacuum, every field sits at a minimum, so the gradient appearing in the inequality of Vafa et al. is strictly zero there. That's why they can make the bold statement that the cosmological constant in stable vacua has to be zero (Minkowski) or negative (AdS).

I think it's a cool inequality and people should investigate the arguments for and against and the consequences. Such a simple inequality could indeed summarize the absence of persuasive constructions of metastable de Sitter spaces within string/M-theory.

Now, I am confident that Cumrun Vafa is one of the staunchest believers in string/M-theory in the world – I might have a hard time competing with him even when it comes to the strength of the belief. So doesn't he think that this "no-go theorem" for the positive cosmological constant excludes string theory, given that a positive cosmological constant has been observed?

A good question, indeed. Well, it turns out that Cumrun Vafa has become a quintessence believer. So the cosmological constant isn't really constant, some scalar field is rolling, and its relative constancy only emulates the cosmological constant. As David Gross reminds everybody around 0:58:00 in the video above, quintessence has a widely believed problem: it seems to predict time-dependent fundamental constants such as the fine-structure constant (the conditions are changing with time).



David Gross came to Okinawa from California.

These predicted variations in time don't have a very good reason to be small. Because experiments exist that prove that the fine-structure constant was the same within an impressive accuracy a few billion years ago, quintessence seems to be basically excluded.

Cumrun believes that the thing can work in some way but I haven't quite absorbed his new belief – it's still rather non-standard for me.

(Along with Steinhardt and two other authors, Vafa posted an even more recent paper against inflation that tries to suggest that even inflation is in tension with the Swampland rules of quantum gravity. Cumrun has discussed this paper in his talk, too. He would probably replace inflation with the string gas cosmology or something like that – well, it's possible to research it but I think that the string gas cosmology is even less likely to be true than quintessence.)

Around 1:01:00, Eva Silverstein joins David Gross in the criticisms. Both of them have offered some lore that somewhat contradicts Vafa's picture, to put it mildly. Let me say the following: I have gone through similar thoughts, have been exposed to similar arguments, and I also tend to think that the lore is more likely than not.

But something is still wrong with the overall picture of dark energy and other things in string theory so I think it's right to be open-minded. By constantly repeating the lore – and David Gross does it rather often – one may force the quantum Zeno effect on the string researchers. They won't have the opportunity to make a jump that is almost certainly needed.

Cumrun Vafa knows quite something and when he becomes a quintessence believer, I think it's useful to be interested in the mental processes that have led him to this transformation. Gross repeats some lore but there is really no rock-solid argument against quintessence. In fact, I think that some people could sensibly argue that swampland-like inequalities are actually more solid consequences of a theoretical framework – and you may view them as consequences of either quantum gravity or string/M-theory (which are ultimately equivalent but the phrases "sound" different) – than Gore's lore [I meant Gross' lore but I kept the typo because it looked funny] that disfavors quintessence models.

I have had similar feelings when David Gross was rather heavily attacking (repeated TRF guest blogger) Gordon Kane. You know, I have shared Gross' observation that in his M-theory phenomenology, Kane had to make numerous additional assumptions, some of which were inspired by rather detailed empirical observations (about possible ways to extend the Standard Model to a theory that also agrees with some cosmological criteria).

So while I have never believed that Kane has derived the necessity of his kind of models from the first principles – and indeed, it seems that his prediction of a gluino below \(2\TeV\) has been invalidated – there was something that I disliked about Gross' criticism. It just sounded to me as if Gross wanted to discourage people from thinking about specific scenarios, specific classes of models with some extra assumptions that just happen to look natural.

Well, I would even put it in this way: Gross apparently wanted everybody to treat the whole "landscape" as a set of equal elements and avoid the focusing on any particular elements or subgroups of the vacua because that would be a discrimination. In December, Gross dreamed about the early death of Donald Trump and a month later, he rather brutally attacked mainstream conservative Indians.



But those things are fine, no one cares what David Gross thinks about politics – everyone more or less correctly assumes he's just another cloned leftist in the Academia. However, I feel that a similar constant imposition of his lore and group think is something he does in physics, too. And it isn't right. People like Kane must have the freedom to study and propose realistic M-theory compactifications; and people like Vafa must have the freedom to investigate quintessence within string/M-theory. People must combine and recombine assumptions, pick privileged classes of vacua that look more promising to them given these assumptions and the observations, and so on. If those choices are discrimination, then it is a basic moral duty of a physicist to discriminate at basically all times!

I don't understand his reasoning but I find it (less likely than 50% but) conceivable that Cumrun has some reasons to think that the time variation of the constants might be compatible with their quintessence picture. On the other hand, Vafa admits that they don't solve the cosmological constant problem – why the "apparent" current vacuum energy density is so small.

But the inequality they propose – especially if you assume that it wants to be near-saturated – eliminates the "double fine-tuning" that you would need in generic quintessence models, those that were studied before their Swampland findings. In regular quintessence, \(V\approx 10^{-122}\) and \(|\nabla V|\approx 10^{-122}\) are two independent fine-tunings. With the near-saturated Swampland inequality, these two fine-tunings reduce to basically one, they are not independent. So you could say that with the Swampland findings, if established, quintessence is as natural as the cosmological constant (one fine-tuning by 122 orders of magnitude). Vafa has made additional comparisons of naturalness within the standard or their axiomatic system and theirs seems to win.

Eva Silverstein has joined the polemics with her pet topic that I have been aware of since 2005: the affirmative action for supercritical strings. Supercritical string vacua must be treated on par with the critical string theory's compactifications; there can't be any discrimination. Please: Not again! (You know me as someone who passionately argues about an extremely diverse spectrum of topics. But I think that I have actually never argued as tensely about string theory with a string theorist as I did with Silverstein in 2005.)

The discussion at Strings 2018 made it clear that she has tried to convince Cumrun about that supercritical affirmative action and Cumrun has rejected it in a very similar way – and maybe for similar reasons – as I did. Sorry, Eva, but supercritical string theory simply isn't as likely or convincing a picture of the real world presented by string theory as the critical string theory is.

The prediction of the critical dimension, e.g. \(D=10\) for the weakly coupled superstring, is one of the first heroic predictions of string theory that obviously go beyond the predictive power of quantum field theories. Some supercritical string world sheet CFTs may be defined but it is much less clear whether these theories may be completed to non-perturbative consistent theories.

In the case of critical superstring theory, S-dualities etc. make it extremely likely that the theory is fully consistent at any finite coupling (because it seems OK at zero and infinity). But in the supercritical case, there are no known S-dualities like that and lots of other arguments in favor of non-perturbative consistency that work in critical string theory simply can't be applied to supercritical string theory.

Moreover, the terms proportional to \((D-10)\) appear in the beta-function for the dilaton, a scalar field that plays a preferred role at weak coupling but that should become a generic scalar field at a stronger coupling. The beta-function for the dilaton dictates the Euler-Lagrange equation of motion that one would derive from varying the dilaton in the effective action.
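
Schematically (with coefficients suppressed, since conventions differ between textbooks), the point is that the dilaton beta function contains a constant piece proportional to the dimension mismatch,

\[ \beta^\Phi \;\sim\; (D-10) \;+\; \alpha'\,\big(\text{terms built from } R,\ \nabla\Phi,\ H_{\mu\nu\rho},\ \dots\big), \]

and demanding \(\beta^\Phi = 0\) is what plays the role of the dilaton's equation of motion – this is the \((D-10)c + t = 0\) equation used in the next paragraph.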

I actually find it likely that there exists a swampland-style inequality similar to the one that Cumrun just discussed that says that the other terms \(t\) in the equation \((D-10)c + t = 0\) cannot be large enough to actually beat the "wrong dimension term" for \(D\neq 10\). As far as I know, all these questions are rather difficult and convincing justifications for one answer or another just don't exist. In fact, I can imagine that this statement excluding the supercritical string theory's stabilized vacua directly follows exactly from the inequality that Vafa et al. propose: they say that the gradient terms, and \(t\) contains those, are always too small to beat the constant terms \(V\) – and \((D-10)c\) might simply be a constant term that needs to be beaten but is too large.

We don't have a proof in either direction. But that's exactly why Cumrun's open-minded approach "I am not saying you must be wrong, Eva, but what I say might also be right" is exactly the right one.

Eva's claim that all the supercritical vacua, perhaps those in \(D=2018\), are as established and as consistent as the \(D=10\) or \(D=11\) vacua is just plain silly. This claim utterly disagrees with the composition of the string theory literature where most of the "good properties" depend on the critical dimension (and/or on supersymmetry, and supersymmetry is also possible in the critical spacetime dimension only). Silverstein's claim about the "equality" of critical and supercritical vacua is just some unjustified ideology at this point. Maybe the research will change towards the "equality between critical and supercritical vacua" in the future but it's a pure speculation; the critical string theory is much more established today and some future advances may also eliminate the supercritical string theory altogether.

Moreover, such a full legitimization of the supercritical vacua would probably lead to a much more hopeless proliferation of the "landscape of possibilities" than the regular landscape of the critical string/M-theory. The very dimension \(D\) would be unlimited and the complexity and diversity of possible compactifications would dramatically increase with \(D\), too. Maybe mathematics or Nature may make the search for the right vacuum much more difficult than we thought (and it's been hard enough for some two decades). But this is just a possibility, not an established fact. It's totally sensible to do research dependent on the working hypothesis that supercritical string theory is a curiosity in perturbative string theory that may be given one page of a textbook – but otherwise it's worthless, unusable, inconsistent rubbish in the string model building!

Cumrun clearly agrees with this working hypothesis of mine while Eva – without real evidence – is trying to make this assumption politically incorrect. She would love to impose a duty on everyone to spend the same time with supercritical string theory as with critical string theory. That's just a counterproductive pressure that should be ignored by Cumrun and others because the outcomes of such an extra rule would almost certainly be tragic.

by Luboš Motl (noreply@blogger.com) at June 27, 2018 06:28 PM

June 25, 2018

Sean Carroll - Preposterous Universe

On Civility


White House Press Secretary Sarah Sanders went to have dinner at a local restaurant the other day. The owner, who is adamantly opposed to the policies of the Trump administration, politely asked her to leave, and she did. Now (who says human behavior is hard to predict?) an intense discussion has broken out concerning the role of civility in public discourse and our daily life. The Washington Post editorial board, in particular, called for public officials to be allowed to eat in peace, and people have responded in volume.

I don’t have a tweet-length response to this, as I think the issue is more complex than people want to make it out to be. I am pretty far out to one extreme when it comes to the importance of engaging constructively with people with whom we disagree. We live in a liberal democracy, and we should value the importance of getting along even in the face of fundamentally different values, much less specific political stances. Not everyone is worth talking to, but I prefer to err on the side of trying to listen to and speak with as wide a spectrum of people as I can. Hell, maybe I am even wrong and could learn something.

On the other hand, there is a limit. At some point, people become so odious and morally reprehensible that they are just monsters, not respected opponents. It’s important to keep in our list of available actions the ability to simply oppose those who are irredeemably dangerous/evil/wrong. You don’t have to let Hitler eat in your restaurant.

This raises two issues that are not so easy to adjudicate. First, where do we draw the line? What are the criteria by which we can judge someone to have crossed over from “disagreed with” to “shunned”? I honestly don’t know. I tend to err on the side of not shunning people (in public spaces) until it becomes absolutely necessary, but I’m willing to have my mind changed about this. I also think the worry that this particular administration exhibits authoritarian tendencies that could lead to a catastrophe is not a completely silly one, and is at least worth considering seriously.

More importantly, if the argument is “moral monsters should just be shunned, not reasoned with or dealt with constructively,” we have to be prepared to be shunned ourselves by those who think that we’re moral monsters (and those people are out there).  There are those who think, for what they take to be good moral reasons, that abortion and homosexuality are unforgivable sins. If we think it’s okay for restaurant owners who oppose Trump to refuse service to members of his administration, we have to allow staunch opponents of e.g. abortion rights to refuse service to politicians or judges who protect those rights.

The issue becomes especially tricky when the category of “people who are considered to be morally reprehensible” coincides with an entire class of humans who have long been discriminated against, e.g. gays or transgender people. In my view it is bigoted and wrong to discriminate against those groups, but there exist people who find it a moral imperative to do so. A sensible distinction can probably be made between groups that we as a society have decided are worthy of protection and equal treatment regardless of an individual’s moral code, so it’s at least consistent to allow restaurant owners to refuse to serve specific people they think are moral monsters because of some policy they advocate, while still requiring that they serve members of groups whose behaviors they find objectionable.

The only alternative, as I see it, is to give up on the values of liberal toleration, and to simply declare that our personal moral views are unquestionably the right ones, and everyone should be judged by them. That sounds wrong, although we do in fact enshrine certain moral judgments in our legal codes (murder is bad) while leaving others up to individual conscience (whether you want to eat meat is up to you). But it’s probably best to keep that moral core that we codify into law as minimal and widely-agreed-upon as possible, if we want to live in a diverse society.

This would all be simpler if we didn’t have an administration in power that actively works to demonize immigrants and non-straight-white-Americans more generally. Tolerating the intolerant is one of the hardest tasks in a democracy.

 

 

by Sean Carroll at June 25, 2018 06:00 PM

June 24, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

7th Robert Boyle Summer School

This weekend saw the 7th Robert Boyle Summer School, an annual 3-day science festival in Lismore, Co. Waterford in Ireland. It’s one of my favourite conferences – a select number of talks on the history and philosophy of science, aimed at curious academics and the public alike, with lots of time for questions and discussion after each presentation.


The Irish-born scientist and aristocrat Robert Boyle   


Lismore Castle in Co. Waterford, the birthplace of Robert Boyle

Born in Lismore into a wealthy landowning family, Robert Boyle became one of the most important figures in the Scientific Revolution. A contemporary of Isaac Newton and Robert Hooke, he is recognized the world over for his scientific discoveries, his role in the rise of the Royal Society and his influence in promoting the new ‘experimental philosophy’ in science.

This year, the theme of the conference was ‘What do we know – and how do we know it?’. There were many interesting talks, such as Boyle’s Theory of Knowledge by Dr William Eaton, Associate Professor of Early Modern Philosophy at Georgia Southern University; The How, Who & What of Scientific Discovery by Paul Strathern, author of a great many books on scientists and philosophers such as the well-known Philosophers in 90 Minutes series; Scientific Enquiry and Brain State – Understanding the Nature of Knowledge by Professor William T. O’Connor, Head of Teaching and Research in Physiology at the University of Limerick Graduate Entry Medical School; and The Promise and Peril of Big Data by Timandra Harkness, well-known media presenter, comedian and writer. For physicists, there was a welcome opportunity to hear the well-known American philosopher of physics Robert P. Crease present the talk Science Denial: will any knowledge do? The full programme for the conference can be found here.

All in all, a hugely enjoyable summer school, culminating in a garden party in the grounds of Lismore Castle, Boyle’s ancestral home. My own contribution was to provide the music for the garden party – a flute, violin and cello trio, playing the music of Boyle’s contemporaries, from Johann Sebastian Bach to Turlough O’Carolan. In my view, the latter was a baroque composer of great importance whose music should be much better known outside Ireland.


Images from the garden party in the grounds of Lismore Castle

by cormac at June 24, 2018 08:19 PM

June 23, 2018

Clifford V. Johnson - Asymptotia

Google Talk!

I think that I forgot to post this link when it came out some time ago. I gave a talk at Google when I passed though London last Spring. There was a great Q & A session too - the Google employees were really interested and asked great questions. I talked in some detail about the book (The Dialogues), why I made it, how I made it, and what I was trying to do with the whole project. For a field that is supposed to be quite innovative (and usually is), I think that, although there are many really great non-fiction science books by Theoretical Physicists, we offer a rather narrow range of books to the general public, and I'm trying to broaden the spectrum with The Dialogues. In the months since the book has come out, people have been responding really positively to the book, so that's very encouraging (and thank you!). It's notable that it is a wide range of people, from habitual science book readers to people who say they've never picked up a science book before... That's a really great sign!

Here's the talk on YouTube:

Direct link here. Embed below: [...]

The post Google Talk! appeared first on Asymptotia.

by Clifford at June 23, 2018 10:51 PM

Lubos Motl - string vacua and pheno

Slow bottom-up HEP research is neither intellectually challenging, nor justified by the null LHC data
Ben Allanach has been a well-known supersymmetry researcher in Cambridge, England whose name has appeared a dozen times on this blog and he wrote a guest blog on ambulance chasing.



Because of his seemingly bullish personality, I was surprised by an essay he wrote for Aeon.Co a few days ago,
Going nowhere fast: has the quest for top-down unification of physics stalled?
The most nontrivial statement in the essay is
Now I’ve all but dropped it [SUSY at the LHC] as a research topic.
He wants to do things that are more bottom-up such as the bottom mesons (a different bottom, the Academia is full of bottoms). I find this description bizarre because SUSY at the LHC is a good example of bottom-up physics in my eyes – and the bottom mesons seem really, really boring.




Allanach wrote that other colleagues have left SUSY-like research before him, everyone has his own calibration of when he should give up, and Allanach gave up now. One theoretical reason he quotes is that SUSY probably doesn't solve the naturalness problem – and aside from the absence of superpartners at the LHC, it also seems that SUSY is incapable of solving other hierarchy problems such as the cosmological constant problem. So if SUSY doesn't solve that one, why should it be expected to explain the lightness of the Higgs?




So he attributes all the null data – and the disappointment – to the top-down, "reductive" thinking, the thinking whose current flagship is string theory. He wants to pursue the bottom mesons and perhaps a few other "humble" topics like that. I think that I have compressed his essay by several orders of magnitude and nothing substantial is missing.

OK, his attribution is 100% irrational and the rest of his ideas are half-right, half-wrong. Where should I start?

In April 2007, I quantified dozens of (my subjective) probabilities of statements beyond the established level of particle physics. The probabilities go from 0.000001% to 99.9999% – and the items are more likely to be found near 0% or 100% because there are still many things I find "almost certain". But there's one item that was sitting exactly at 50%:
50% - Supersymmetry will be found at the LHC
Many bullish particle physicists were surely boasting a much higher degree of certainty. And I surely wanted the probability to be higher. But that would quantify my wishful thinking. The post above captured what I really believed about the discovery of SUSY at the LHC and that was associated with 50%, a maximum uncertainty.

By the way, with the knowledge of the absence of any SUSY at the LHC so far, and with some ideas about the future of the LHC, I would quantify the probability of a SUSY discovery at the LHC (High-Luminosity LHC is allowed for that discovery) to be 25% now.

String theory in no way implies that SUSY was obliged to be discovered at the LHC. Such a claim about a low-energy experiment doesn't follow from the equations of string theory, from anything that is "characteristically stringy" i.e. connected with conformal field theory of two-dimensional world sheets (more or less directly). Someone might envision a non-stringy argument – a slightly rational one or a mostly irrational one – and attribute it to string theory because it sounds better when your ideas are linked to string theory. But that's deceitful. Various ideas how naturalness should be applied to effective field theories have nothing to do with string theory per se – on the contrary, string theory is very likely to heavily revolutionize the rules how naturalness should be applied, and it's already doing so.

So Allanach's statement that the null LHC data mean something bad for string theory and similar top-down thinking etc. is just absolutely wrong.

A correct proposition is Allanach's thesis that for a person who believes in naturalness and is interested in supersymmetry because in combination with naturalness, it seems to predict accessible superpartners at the colliders, the absence of such superpartners reduces the probability that this package of ideas is correct – and people who have pursued this bunch of ideas are likely to gradually give up at some points.

It's correct but mostly irrelevant for me – the main reason why I am confident that supersymmetry is realized in Nature (at some scale, possibly one that is inaccessible in practice) is that it seems to be a part of the realistic string vacua. This is an actual example of the top-down thinking because I am actually starting near the Planck scale. Allanach has presented no top-down argumentation – all his argumentation is bottom-up. Any reasoning based on the naturalness of parameters in effective field theories is unavoidable bottom-up reasoning.

Mostly wrong is his statement that the null LHC data reduce the probability of supersymmetry. But this statement is justifiable to the extent to which the existence of supersymmetry is tied to the naturalness – the extent to which the superpartners are "required" to be light. If you connect SUSY with the ideas implying that the superpartners must be light, its probability goes down. But more general SUSY models either don't assume the lightness at all, or have various additional – never fully explored – tricks that allow the superpartners to be much heavier or less visible, while still addressing naturalness equally satisfactorily. So in this broader realm, the probability of SUSY hasn't dropped (at least not much) even if you incorporate the naturalness thinking.

You know, the SUSY GUT is still equally compatible with the experiments as the Standard Model up to the GUT scale. The null LHC data say that some parameters in SUSY GUT have to be fine-tuned more than previously thought – but the Standard Model still has to be fine-tuned even more than that. So as long as you choose any consistent rules for the evaluation of the theories, the ratio of probabilities of a "SUSY framework" over "non-SUSY framework" remained the same or slightly increased. The absence of evidence isn't the evidence of absence.

I think he's also presenting pure speculation as a fact when he says that SUSY has nothing to do with the right explanation of the smallness of the cosmological constant. I think it's still reasonably motivated to assume that some argument based on a SUSY starting point (including some SUSY non-renormalization theorems) and small corrections following from SUSY breaking is a promising sketch of an explanation why the cosmological constant is small. We don't know the right explanation with any certainty. So the answer to this question is "we don't know" rather than "SUSY can't do it".

But again, the most far-reaching incorrect idea of Allanach's is his idea that the "surprisingly null LHC data", relatively to an average researcher, should strengthen the bottom-up thinking relatively to the top-down thinking. His conclusion is completely upside down!

The very point of the bottom-up thinking was to expect new physics "really" around the corner – something that I have always criticized (partly because it is always partly driven by the desire to get prizes soon if one is lucky – and that's an ethically problematic driver in science, I think; the impartial passion for the truth should be the motivation). An assumption that was always made by all bottom-up phenomenologists in recent decades was that there can't be any big deserts – wide intervals on the energy log scale where nothing new happens. Well, the null LHC data surely do weaken these theses, don't they? Deserts are possible (yes, that's why I posted the particular image at the top of the blog post, along with a supersymmetric man or superman for short) which also invalidates the claim that by adding small energy gains, you're guaranteed to see new interesting things.

So I think it's obvious that the right way to adjust one's research focus in the light of the null LHC data is to make the research more theoretical, more top-down – and less bound to immediate wishful thinking about the experiment, to be less bottom-up in this sense! SUSY people posting to hep-ph may want to join the Nima Arkani-Hamed-style subfield of amplitudes and amplituhedrons (which still has SUSY almost everywhere because it seems very useful or unavoidable for technical reasons now, SUSY is easier than non-SUSY, for sure) or something else that is posted to hep-th or that is in between hep-ph and hep-th. Allanach's conclusion is precisely wrong.

You know, the bottom-up thinking expects something interesting (although, perhaps, a bit modest) around the corner. That is what I would also call incrementalism. But given this understanding of "incrementalism" (which is basically the same as "bottom-up", indeed), I am shocked by Allanach's statement
This doesn’t mean we need to give up on the unification paradigm. It just means that incrementalism is to be preferred to absolutism
Holy cow. It's exactly the other way around! It's incrementalism that has failed. The addition of new light particles to the Standard Model, to turn it to the MSSM or something else – so that the additions are being linked to the ongoing experiment – that's both incrementalism and it's what has failed in the recent decade because nothing beyond the Higgs was seen.

So a particle physics thinker simply has to look beyond incrementalism. She has to be interested in absolutism at least a little bit, if you wish. She must be ready for big deserts – a somewhat big desert was just seen. And she must "zoom out", if I borrow a verb from the Bitcoin hodling kids who want to train their eyes and other people's eyes to overlook the 70% drop of the Bitcoin price since December ;-). (For the hodlers, the word "she" would be even more comical than for particle physicists!)

But in particle physics, you really need to zoom out because the research of the small interval of energies around the LHC energy scale wasn't fruitful! Allanach also wrote:
But none of our top-down efforts seem to be yielding fruit.
This is complete nonsense – Allanach is writing this nonsense as a layman who has been away for decades or for his previous life so far. The top-down research in string theory has yielded amazing fruits. In recent 10 years as well as 20 years as well as 30 years, it has yielded many more fruits and much more valuable fruits than what the bottom-up research yielded. Allanach is probably completely unfamiliar with all of this – but this ignorance doesn't change anything about the fact that the quote above places him in the category of crackpots.

Ben, you should learn at least some basics about what has been learned from the top-down approach – about dualities, new transitions, new types of vacua, new realization of well-known low-energy physical concepts within a stringy realization, integrable structures in QFTs, new auxiliary spaces, solution to the information loss paradox, links between entanglement and wormholes, and many others. Unlike the papers presenting possible explanations for the \(750\GeV\) diphoton excess, those aren't going away!

There have been various positive and negative expectations about new physics at the LHC. Things would have been more fun if there had been new physics by now. People may feel vindicated or frustrated because their wishes came true or didn't come true. Their love towards the field or its subfields have changed and they may adjust their career plans and other things. But at the end, scientists should think rationally and produce justifiable statements about the natural world, including questions that aren't quite settled yet. I think that most of Allanach's thinking is just plain irrational and the conclusions are upside down. And he's still one of the reasonable people.

Also, Allanach seems to be willing to switch to things like "chasing hopes surrounding B-mesons, \(g-2\) anomalies, sterile neutrinos", and so on. Well, it seems rather likely to me that all these emerging anomalies result from errors in the experiments. But even if they're not errors in the experiment, I don't see much value in theorists' preemptive bottom-up thinking about these matters. If the experiments force us to add a new neutrino species, great. But immediately, it will be just a straightforward experimental fact. The theory explaining the data, if such an anomaly (or the other ones) is confirmed, will be a straightforward ugly expansion of the Standard Model that will be almost directly extracted from the reliable experiment.

My point is that the experimenters could almost do it themselves – they're the crucial players in this particular enterprise – and Allanach wants himself and lots of colleagues to be hired as theoretical assistants to these experimenters. But these experimenters simply don't need too many assistants, especially not very expensive ones.

Why should a theorist spend much time by doing these things in advance? What is the point of it? If such new and surprising anomalies are found by the experiments, the experimenters represent a big fraction of the big discovery. The only big role for a theorist is to actually find an explanation why this new addition to the Standard Model is sensible or could have been expected – if the theorist finds some top-down explanation! A theorist may find out that the existence of some new particle species follows from some principle that looks sensible or unifying at the GUT scale or a string scale; it's a top-down contribution. Without such a contribution, there's almost no useful role for a theorist here. A theorist may preemptively analyze the consequences of 10 possible outcomes of a B-meson experiment. But isn't it better to simply wait for the outcome and make a simple analysis of the actual one outcome afterwards? The bottom-up analyses of possible outcomes just aren't too interesting for anybody.

More generally, I would find detailed research on B-mesons and the aforementioned anomalies utterly boring and insufficiently intellectually stimulating. I have always been bored by these papers – equivalent to some homework exercises in a QFT course – and it's close to the truth if I say that I have never read a "paper like that" in its entirety. I think that if most high-energy physicists abandon the big picture and the big ambitions, the field will rightfully cease to attract mankind's best minds and it will be in the process of dying.

If most of the people in the field were looking at some dirty structure of B-mesons, the field would become comparable to climatology or another inferior scientific discipline which is messy, likely to remain imprecise for decades or forever, and connected with no really deep mathematics (because deep mathematics has little to say about messy, complex patterns with huge error margins). B-mesons are bound states similar to atoms or molecules – except that atoms and molecules have far more precisely measurable and predictable spectra. So if I had to do some of these things, I would choose atomic or molecular physics or quantum chemistry instead of the B-meson engineering! Like nuclear physics, subnuclear physics really isn't intellectually deeper than the atomic and molecular physics of the 1930s.

Fundamental physics is the emperor of sciences and the ambitious goals are a necessary condition underlying that fact. The experimental data should help the fundamental physicists to adjust their ideas what the ambitious goals should look like – but the experimental data should never be used as evidence against the ambitious goals in general! Experimental data really cannot ever justify the suppression of ambitions such as the search for a theory of everything. Everyone who claims that they can is being demagogic or irrational.

And that's the memo.

by Luboš Motl (noreply@blogger.com) at June 23, 2018 08:27 AM

June 22, 2018

Jester - Resonaances

Both g-2 anomalies
Two months ago an experiment in Berkeley announced a new ultra-precise measurement of the fine structure constant α using interferometry techniques. This wasn't much noticed because the paper is not on arXiv, and moreover this kind of research is filed under metrology, which is easily confused with meteorology. So it's worth commenting on why precision measurements of α could be interesting for particle physics. What the Berkeley group really did was to measure the mass of the cesium-133 atom, achieving the relative accuracy of 4*10^-10, that is 0.4 parts per billion (ppb). With that result in hand, α can be determined after a cavalier rewriting of the high-school formula for the Rydberg constant:
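The displayed formula did not survive the feed here; the textbook relation being alluded to is presumably

\[ R_\infty = \frac{\alpha^2 m_e c}{2h} \qquad\Longrightarrow\qquad \alpha^2 = \frac{2 R_\infty}{c}\,\frac{m_{\mathrm{Cs}}}{m_e}\,\frac{h}{m_{\mathrm{Cs}}}, \]

so that the recoil measurement (which effectively determines \(h/m_{\mathrm{Cs}}\)), combined with the Rydberg constant and the electron-to-cesium mass ratio, pins down \(\alpha\).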
Everybody knows the first 3 digits of the Rydberg constant, Ry≈13.6 eV, but actually it is experimentally known with the fantastic accuracy of 0.006 ppb, and the electron-to-atom mass ratio has also been determined precisely. Thus the measurement of the cesium mass can be translated into a 0.2 ppb measurement of the fine structure constant: 1/α=137.035999046(27).

You may think that this kind of result could appeal only to a Pythonesque chartered accountant. But you would be wrong. First of all, the new result excludes  α = 1/137 at 1 million sigma, dealing a mortal blow to the field of epistemological numerology. Perhaps more importantly, the result is relevant for testing the Standard Model. One place where precise knowledge of α is essential is in calculation of the magnetic moment of the electron. Recall that the g-factor is defined as the proportionality constant between the magnetic moment and the angular momentum. For the electron we have
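(The displayed equation is missing here; what it presumably showed is the standard expansion

\[ g_e = 2\,(1 + a_e), \qquad a_e = \frac{\alpha}{2\pi} + \dots, \]

with \(a_e\) the anomalous magnetic moment and \(\alpha/2\pi\) Schwinger's one-loop term, so that "the dots" referred to below are the higher-order QED, hadronic and electroweak contributions.)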
Experimentally, g_e is one of the most precisely determined quantities in physics, with the most recent measurement quoting a_e = 0.00115965218073(28), that is 0.0001 ppb accuracy on g_e, or 0.2 ppb accuracy on a_e. In the Standard Model, g_e is calculable as a function of α and other parameters. In the classical approximation g_e = 2, while the one-loop correction proportional to the first power of α was already known in prehistoric times thanks to Schwinger. The dots above summarize decades of subsequent calculations, which now include O(α^5) terms, that is 5-loop QED contributions! Thanks to these heroic efforts (depicted in the film For a Few Diagrams More - a sequel to Kurosawa's Seven Samurai), the main theoretical uncertainty for the Standard Model prediction of g_e is due to the experimental error on the value of α. The Berkeley measurement allows one to reduce the relative theoretical error on a_e down to 0.2 ppb: a_e = 0.00115965218161(23), which matches in magnitude the experimental error and improves by a factor of 3 the previous prediction based on the α measurement with rubidium atoms.
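As a quick numerical illustration of the sizes involved (a rough sketch using only the values quoted above; the full Standard Model prediction of course includes the multi-loop, hadronic and electroweak pieces):

    import math

    alpha = 1 / 137.035999046        # Berkeley 2018 value quoted above
    a_e_measured = 0.00115965218073  # measured electron anomaly quoted above

    a_e_schwinger = alpha / (2 * math.pi)   # one-loop (Schwinger) term only

    print(f"alpha/(2*pi) = {a_e_schwinger:.12f}")
    print(f"measured a_e = {a_e_measured:.12f}")
    print(f"difference   = {a_e_measured - a_e_schwinger:.2e}")  # roughly -1.8e-6: the 2-loop-and-beyond part

The point of the precision game is that the last couple of digits of the prediction move when \(\alpha\) moves, which is why a better \(\alpha\) directly sharpens the test.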

At the spiritual level, the comparison between the theory and experiment provides an impressive validation of quantum field theory techniques up to the 13th significant digit - an unimaginable theoretical accuracy in other branches of science. More practically, it also provides a powerful test of the Standard Model. New particles coupled to the electron may contribute to the same loop diagrams from which g_e is calculated, and could shift the observed value of a_e away from the Standard Model predictions. In many models, corrections to the electron and muon magnetic moments are correlated. The latter famously deviates from the Standard Model prediction by 3.5 to 4 sigma, depending on who counts the uncertainties. Actually, if you bother to eye carefully the experimental and theoretical values of a_e beyond the 10th significant digit you can see that they are also discrepant, this time at the 2.5 sigma level. So now we have two g-2 anomalies! In a picture, the situation can be summarized as follows:

If you're a member of the Holy Church of Five Sigma you can almost preach an unambiguous discovery of physics beyond the Standard Model. However, for most of us this is not the case yet. First, there is still some debate about the theoretical uncertainties entering the muon g-2 prediction. Second, while it is quite easy to fit each of the two anomalies separately, there seems to be no appealing model that fits both of them at the same time. Take for example the very popular toy model with a new massive spin-1 Z' boson (aka the dark photon) kinetically mixed with the ordinary photon. In this case Z' has, much like the ordinary photon, vector-like and universal couplings to electrons and muons. But this leads to a positive contribution to g-2, and it does not fit the ae measurement well, which favors a new negative contribution. In fact, the ae measurement provides the most stringent constraint in part of the parameter space of the dark photon model. Conversely, a Z' boson with purely axial couplings to matter does not fit the data either, as it gives a negative contribution to g-2, thus making the muon g-2 anomaly worse. What might work is a hybrid model with a light Z' boson having lepton-flavor violating interactions: a vector coupling to muons and a somewhat smaller axial coupling to electrons. But constructing a consistent and realistic model along these lines is a challenge because of other experimental constraints (e.g. from the lack of observation of μ→eγ decays). Some food for thought can be found in this paper, but I'm not sure if a sensible model exists at the moment. If you know one you are welcome to drop a comment here or a paper on arXiv.

More excitement on this front is in store. The muon g-2 experiment at Fermilab should soon deliver its first results, which may confirm or disprove the muon anomaly. Further progress with the electron g-2 and fine-structure constant measurements is also expected in the near future. The biggest worry is that, if the accuracy improves by another two orders of magnitude, we will need to calculate six-loop QED corrections...

by Mad Hatter (noreply@blogger.com) at June 22, 2018 11:04 PM

June 18, 2018

The n-Category Cafe

∞-Atomic Geometric Morphisms

Today’s installment in the ongoing project to sketch the $\infty$-elephant: atomic geometric morphisms.

Chapter C3 of Sketches of an Elephant studies various classes of geometric morphisms between toposes. Pretty much all of this chapter has been categorified, except for section C3.5 about atomic geometric morphisms. To briefly summarize the picture:

  • Sections C3.1 (open geometric morphisms) and C3.3 (locally connected geometric morphisms) are steps $n=-1$ and $n=0$ on an infinite ladder of locally $n$-connected geometric morphisms, for $-1 \le n \le \infty$. A geometric morphism between $(n+1,1)$-toposes is locally $n$-connected if its inverse image functor is locally cartesian closed and has a left adjoint. More generally, a geometric morphism between $(m,1)$-toposes is locally $n$-connected, for $n \lt m$, if it is “locally” locally $n$-connected on $n$-truncated maps.

  • Sections C3.2 (proper geometric morphisms) and C3.4 (tidy geometric morphisms) are likewise steps $n=-1$ and $n=0$ on an infinite ladder of $n$-proper geometric morphisms.

  • Section C3.6 (local geometric morphisms) is also step $n=0$ on an infinite ladder: a geometric morphism between $(n+1,1)$-toposes is $n$-local if its direct image functor has an indexed right adjoint. Cohesive toposes, which have attracted a lot of attention around here, are both locally $\infty$-connected and $\infty$-local. (Curiously, the $n=-1$ case of locality doesn’t seem to be mentioned in the 1-Elephant; has anyone seen it before?)

So what about C3.5? An atomic geometric morphism between elementary 1-toposes is usually defined as one whose inverse image functor is logical. This is an intriguing prospect to categorify, because it appears to mix the “elementary” and “Grothendieck” aspects of topos theory: geometric morphisms are arguably the natural morphisms between Grothendieck toposes, while logical functors are more natural for the elementary sort (where “natural” means “preserves all the structure in the definition”). So now that we’re starting to see some progress on elementary higher toposes (my post last year has now been followed by a preprint by Rasekh), we might hope to be able to make some progress on it.

Unfortunately, the definitions of elementary $(\infty,1)$-topos currently under consideration have a problem when it comes to defining logical functors. A logical functor between 1-toposes can be defined as a cartesian closed functor that preserves the subobject classifier, i.e. $F(\Omega) \cong \Omega$. The higher analogue of the subobject classifier is an object classifier — but note the switch from definite to indefinite article! For Russellian size reasons, we can’t expect to have one object classifier that classifies all objects, only a tower of “universes” each of which classifies some subcollection of “small” objects.

What does it mean for a functor to “preserve” the tower of object classifiers? If an $(\infty,1)$-topos came equipped with a specified tower of object classifiers (indexed by $\mathbb{N}$, say, or maybe by the ordinal numbers), then we could ask a logical functor to preserve them one by one. This would probably be the relevant kind of “logical functor” when discussing categorical semantics of homotopy type theory: since type theory does have a specified tower of universe types $U_0 : U_1 : U_2 : \cdots$, the initiality conjecture for HoTT should probably say that the syntactic category is an elementary $(\infty,1)$-topos that’s initial among logical functors of this sort.

However, Grothendieck $(\infty,1)$-topoi don’t really come equipped with such a tower. And even if they did, preserving it level by level doesn’t seem like the right sort of “logical functor” to use in defining atomic geometric morphisms; there’s no reason to expect such a functor to “preserve size” exactly.

What do we want of a logical functor? Well, glancing through some of the theorems about logical functors in the 1-Elephant, one result that stands out to me is the following: if $F:\mathbf{S}\to \mathbf{E}$ is a logical functor with a left adjoint $L$, then $L$ induces isomorphisms of subobject lattices $Sub_{\mathbf{E}}(A) \cong Sub_{\mathbf{S}}(L A)$. This is easy to prove using the adjointness $L\dashv F$ and the fact that $F$ preserves the subobject classifier:

$$ Sub_{\mathbf{E}}(A) \cong \mathbf{E}(A,\Omega_{\mathbf{E}}) \cong \mathbf{E}(A,F \Omega_{\mathbf{S}}) \cong \mathbf{S}(L A,\Omega_{\mathbf{S}}) \cong Sub_{\mathbf{S}}(L A). $$

What would be the analogue for $(\infty,1)$-topoi? Well, if we imagine hypothetically that we had a classifier $U$ for all objects, then the same argument would show that $L$ induces an equivalence between entire slice categories $\mathbf{E}/A \simeq \mathbf{S}/L A$. (Actually, I’m glossing over something here: the direct arguments with $\Omega$ and $U$ show only an equivalence between sets of subobjects and cores of slice categories. The rest comes from the fact that $F$ preserves local cartesian closure as well as the (sub)object classifier, so that we can enhance $\Omega$ to an internal poset and $U$ to an internal full subcategory and both of these are preserved by $F$ as well.)
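Explicitly, with such a hypothetical classifier $U$ preserved by $F$ (so $F(U_{\mathbf{S}}) \simeq U_{\mathbf{E}}$), the chain of equivalences, at the level of cores as per the caveat above, would read

$$ core(\mathbf{E}/A) \simeq \mathbf{E}(A, U_{\mathbf{E}}) \simeq \mathbf{E}(A, F U_{\mathbf{S}}) \simeq \mathbf{S}(L A, U_{\mathbf{S}}) \simeq core(\mathbf{S}/L A). $$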

In fact, the converse is true too: reversing the above argument shows that $F$ preserves $\Omega$ if and only if $L$ induces isomorphisms of subobject lattices, and similarly $F$ preserves $U$ if and only if $L$ induces equivalences of slice categories. The latter condition, however, is something that can be said without reference to the nonexistent $U$. So if we have a functor $F:\mathbf{S}\to \mathbf{E}$ between $(\infty,1)$-toposes that has a left adjoint $L$, then I think it’s reasonable to define $F$ to be logical if it is locally cartesian closed and $L$ induces equivalences $\mathbf{E}/A \simeq \mathbf{S}/L A$.

Furthermore, a logical functor between 1-toposes has a left adjoint if and only if it has a right adjoint. (This follows from the monadicity of the powerset functor $P : \mathbf{E}^{op} \to \mathbf{E}$ for 1-toposes, which we don’t have an analogue of (yet) in the $\infty$-case.) In particular, if the inverse image functor in a geometric morphism is logical, then it automatically has a left adjoint, so that the above characterization of logical-ness applies. And since a logical functor is locally cartesian closed, this geometric morphism is automatically locally connected as well. This suggests the following:

Definition: A geometric morphism $p:\mathbf{E}\to \mathbf{S}$ between $(\infty,1)$-topoi is $\infty$-atomic if

  1. It is locally $\infty$-connected, i.e. $p^\ast$ is locally cartesian closed and has a left adjoint $p_!$, and
  2. $p_!$ induces equivalences of slice categories $\mathbf{E}/A \simeq \mathbf{S}/p_! A$ for all $A\in \mathbf{E}$.

This seems natural to me, but it’s very strong! In particular, taking $A=1$ we get an equivalence $\mathbf{E}\simeq \mathbf{E}/1 \simeq \mathbf{S}/p_! 1$, so that $\mathbf{E}$ is equivalent to a slice category of $\mathbf{S}$. In other words, $\infty$-atomic geometric morphisms coincide with local homeomorphisms!

Is that really reasonable? Actually, I think it is. Consider the simplest example of an atomic geometric morphism of 1-topoi that is not a local homeomorphism: $[G,Set] \to Set$ for a group $G$. The corresponding geometric morphism of $(\infty,1)$-topoi $[G,\infty Gpd] \to \infty Gpd$ is a local homeomorphism! Specifically, we have $[G,\infty Gpd] \simeq \infty Gpd / B G$. So in a sense, the difference between atomic and locally-homeomorphic vanishes in the limit $n\to \infty$.

To be sure, there are other atomic geometric morphisms of 1-topoi that do not extend to local homeomorphisms of $(\infty,1)$-topoi, such as $Cont(G) \to Set$ for a topological group $G$. But it seems reasonable to me to regard these as “1-atomic morphisms that are not $\infty$-atomic” — a thing which we should certainly expect to exist, just as there are locally 0-connected morphisms that are not locally $\infty$-connected, and 0-proper morphisms that are not $\infty$-proper.

We can also “see” how the difference gets “pushed off to $\infty$” to vanish, in terms of sites of definition. In C3.5.8 of the 1-Elephant it is shown that every atomic Grothendieck topos has a site of definition in which (among other properties) all morphisms are effective epimorphisms. If we trace through the proof, we see that this effective-epi condition comes about as the “dual” class to the monomorphisms that the left adjoint of a logical functor induces an equivalence on. Since an $(n+1,1)$-topos has classifiers for $n$-truncated objects, we would expect an atomic one to have a site of definition in which all morphisms belong to the dual class of the $n$-truncated morphisms, i.e. the $n$-connected morphisms. So as $n\to \infty$, we get stronger and stronger conditions on the morphisms in our site, until in the limit we have a classifier for all morphisms, and the morphisms in our site are all required to be equivalences. In other words, the site is itself an $\infty$-groupoid, and thus the topos of (pre)sheaves on it is a slice of $\infty Gpd$.

However, it could be that I’m missing something and this is not the best categorification of atomic geometric morphisms. Any thoughts from readers?

by shulman (viritrilbia@gmail.com) at June 18, 2018 05:06 AM

June 16, 2018

Tommaso Dorigo - Scientificblogging

On The Residual Brightness Of Eclipsed Jovian Moons
While preparing for another evening of observation of Jupiter's atmosphere with my faithful 16" dobsonian scope, I found out that the satellite Io will disappear behind the Jovian shadow tonight. This is a quite common phenomenon and not a very spectacular one, but still quite interesting to look forward to during a visual observation - the moon takes some time to fully disappear, so it is fun to follow the event.
This however got me thinking. A fully eclipsed jovian moon should still be able to reflect back some light picked up from the other, still-lit satellites - so it should not, after all, appear completely dark. Can a calculation be made of the effect? Of course - and it's not that difficult.
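A crude order-of-magnitude version of such an estimate (not the calculation behind the link below; round numbers for the albedo, radius and orbital separation are assumed, and phase-angle geometry is ignored) goes like this:

    # Back-of-envelope: sunlight reflected by Europa onto an eclipsed Io,
    # compared to direct sunlight on Io. Rough numbers, geometry ignored.
    SOLAR_CONSTANT_1AU = 1361.0    # W/m^2
    JUPITER_DIST_AU    = 5.2       # Sun-Jupiter distance
    ALBEDO_EUROPA      = 0.67      # assumed geometric albedo
    R_EUROPA_KM        = 1.56e3    # Europa's radius
    D_IO_EUROPA_KM     = 2.5e5     # Io-Europa separation at closest approach (assumed)

    sun_at_jupiter = SOLAR_CONSTANT_1AU / JUPITER_DIST_AU**2          # ~50 W/m^2
    # Europa reflects a fraction 'albedo' of the sunlight it intercepts;
    # treat the reflected light as diluted by (R/d)^2 at Io's distance.
    reflected_at_io = sun_at_jupiter * ALBEDO_EUROPA * (R_EUROPA_KM / D_IO_EUROPA_KM)**2

    print(f"direct sunlight at Io       ~ {sun_at_jupiter:.1f} W/m^2")
    print(f"Europa-shine on eclipsed Io ~ {reflected_at_io:.1e} W/m^2")
    print(f"ratio ~ {reflected_at_io / sun_at_jupiter:.1e}")   # a few times 1e-5, i.e. ~11 magnitudes fainter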

read more

by Tommaso Dorigo at June 16, 2018 04:47 PM

June 12, 2018

Axel Maas - Looking Inside the Standard Model

How to test an idea
As you may have guessed from reading through the blog, our work is centered around a change of paradigm: that there is a very intriguing structure of the Higgs and the W/Z bosons, and that what we observe in the experiments is actually more complicated than what we usually assume. These particles are not just essentially point-like objects.

This is a very bold claim, as it touches upon very basic things in the standard model of particle physics, and on the interpretation of experiments. However, it is at the same time a necessary consequence if one takes the underlying, more formal theoretical foundation seriously. The reason that there is not a huge clash is that the standard model is very special. Because of this, both pictures give almost the same prediction for experiments. This can also be understood quantitatively, and it is what I have written a review about. It can be imagined in this way:

Thus, the actual particle which we observe and call the Higgs is actually a complicated object made from two Higgs particles. However, one of those is so much eclipsed by the other that it looks like just a single one, plus a very tiny correction.

So far, this does not seem to be something it is necessary to worry about.

However, there are many good reasons to believe that the standard model is not the end of particle physics. There are many, many blogs out there which explain the reasons for this much better than I do. However, our research provides hints that what works so nicely in the standard model may work much less well in some extensions of the standard model - that there the composite nature makes huge differences for experiments. This was what came out of our numerical simulations. Of course, these are not perfect. And, after all, unfortunately we did not yet discover anything beyond the standard model in experiments, so we cannot test our ideas against actual experiments, which would be the best thing to do. And without experimental support such an enormous shift in paradigm seems a bit far-fetched, even if our numerical simulations, which are far from perfect, support the idea. Formal ideas supported by numerical simulations are just not as convincing as experimental confirmation.

So, is this hopeless? Do we have to wait for new physics to make its appearance?

Well, not yet. In the figure above, there was 'something'. So, the ideas also make a statement that even within the standard model there should be a difference. The only question is, what is really the value of a 'little bit'? So far, experiments did not show any deviations from the usual picture, so the 'little bit' needs indeed to be rather small. But we have a calculation prescription for this 'little bit' in the standard model. So, at the very least, what we can do is to calculate this 'little bit' in the standard model. We should then see if the value of the 'little bit' may already be so large that the basic idea is ruled out, because we are in conflict with experiment. If this is the case, this would raise a lot of questions about the basic theory, but well, experiment rules. And thus we would need to go back to the drawing board and get a better understanding of the theory.

Or we get something which is in agreement with current experiment, because it is smaller than the current experimental precision. But then we can make a statement about how much better the experimental precision needs to become to see the difference. Hopefully the answer will not be so demanding that it becomes impossible within the next couple of decades. But this we will see at the end of the calculation. And then we can decide whether we will get an experimental test.

Doing the calculations is actually not so simple. On the one hand, they are technically challenging, even though our method for them is rather well under control. It will also not yield perfect results, but hopefully good enough ones. Also, how simple the calculations are depends strongly on the type of experiment. We did a first few steps, though for a type of experiment not (yet) available, but hopefully available in about twenty years. There we saw that not only the type of experiment but also the type of measurement matters. For some measurements the effect will be much smaller than for others. But we are not yet able to predict this before doing the calculation. For that, we still need a much better understanding of the underlying mathematics, which we will hopefully gain by doing more of these calculations. This is a project I am currently pursuing with a number of master students for various measurements and at various levels. Hopefully, in the end we get a clear set of predictions. And then we can ask our colleagues at the experiments to please check these predictions. So, stay tuned.

By the way: this is the standard cycle for testing new ideas and theories. Have an idea. Check that it fits with all existing experiments. And yes, these may be very, very many. If your idea passes this test: great! There is actually a chance that it can be right. If not, you have to understand why it does not fit. If it can be fixed, fix it, and start again. Or have a new idea. And, at any rate, if it cannot be fixed, have a new idea. When you have an idea which works with everything we know, use it to make a prediction where you get a difference from our current theories. By this you provide an experimental test, which can decide whether your idea is the better one. If yes: great! You have just rewritten our understanding of nature. If not: well, go back to fix it or have a new idea. Of course, it is best if we already have an experiment which does not fit with our current theories. But of those we are, at this stage, a little short - though that may change again. If your theory has no predictions which can be tested experimentally in any foreseeable future, well, how to deal with that is a good question, and there is not yet a consensus on how to proceed.

by Axel Maas (noreply@blogger.com) at June 12, 2018 10:49 AM

June 10, 2018

Tommaso Dorigo - Scientificblogging

Modeling Issues Or New Physics ? Surprises From Top Quark Kinematics Study
Simulation, noun:
1. Imitation or enactment
2. The act or process of pretending; feigning.
3. An assumption or imitation of a particular appearance or form; counterfeit; sham.

Well, high-energy physics is all about simulations. 

We have a theoretical model that predicts the outcome of the very energetic particle collisions we create in the core of our giant detectors, but we only have approximate descriptions of the inputs to the theoretical model, so we need simulations. 

read more

by Tommaso Dorigo at June 10, 2018 11:18 AM

June 09, 2018

Jester - Resonaances

Dark Matter goes sub-GeV
It must have been great to be a particle physicist in the 1990s. Everything was simple and clear then. They knew that, at the most fundamental level, nature was described by one of the five superstring theories which, at low energies, reduced to the Minimal Supersymmetric Standard Model. Dark matter also had a firm place in this narrative, being identified with the lightest neutralino of the MSSM. This simple-minded picture strongly influenced the experimental program of dark matter detection, which was almost entirely focused on the so-called WIMPs in the 1 GeV - 1 TeV mass range. Most of the detectors, including the current leaders XENON and LUX, are blind to sub-GeV dark matter, as slow and light incoming particles are unable to transfer a detectable amount of energy to the target nuclei.
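A minimal kinematics sketch shows the problem (standard elastic-scattering formula; illustrative masses and a typical galactic velocity v ~ 10^-3 c are assumed):

    # Maximum energy transfer in elastic scattering of a dark matter particle
    # (mass m_dm, velocity v) off a target at rest (mass m_t):
    #   E_max = 2 * mu^2 * v^2 / m_t,   mu = reduced mass.
    def e_max_keV(m_dm_GeV, m_t_GeV, v_over_c=1e-3):
        mu = m_dm_GeV * m_t_GeV / (m_dm_GeV + m_t_GeV)
        return 2 * mu**2 * v_over_c**2 / m_t_GeV * 1e6   # GeV -> keV

    m_xe_nucleus = 122.0      # xenon nucleus, roughly 131 * 0.93 GeV
    m_electron   = 511e-6     # GeV

    for m_dm in (100.0, 1.0, 0.1):   # GeV
        print(f"{m_dm:5.1f} GeV DM: "
              f"Xe recoil {e_max_keV(m_dm, m_xe_nucleus):8.2e} keV, "
              f"e- recoil {e_max_keV(m_dm, m_electron):8.2e} keV")
    # A 100 GeV WIMP can deposit tens of keV on a xenon nucleus; a 0.1 GeV
    # particle only a fraction of an eV, far below the nuclear-recoil
    # threshold, while an electron target can still pick up ~eV energies.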

Sometimes progress consists in realizing that you know nothing, Jon Snow. The lack of new physics at the LHC invalidates most of the historical motivations for WIMPs. Theoretically, the mass of the dark matter particle could be anywhere between 10^-30 GeV and 10^19 GeV. There are myriads of models positioned anywhere in that range, and it's hard to argue with a straight face that any particular one is favored. We now know that we don't know what dark matter is, and that we had better search in many places. If anything, the small-scale problems of the 𝞚CDM cosmological model can be interpreted as a hint against the boring WIMPs and in favor of light dark matter. For example, if it turns out that dark matter has significant (nuclear-size) self-interactions, that can only be realized with sub-GeV particles.
                       
It takes some time for experiment to catch up with theory, but the process is already well in motion. There is some fascinating progress on the front of ultra-light axion dark matter, which deserves a separate post. Here I want to highlight the ongoing  developments in direct detection of dark matter particles with masses between MeV and GeV. Until recently, the only available constraint in that regime was obtained by recasting data from the XENON10 experiment - the grandfather of the currently operating XENON1T.  In XENON detectors there are two ingredients of the signal generated when a target nucleus is struck:  ionization electrons and scintillation photons. WIMP searches require both to discriminate signal from background. But MeV dark matter interacting with electrons could eject electrons from xenon atoms without producing scintillation. In the standard analysis, such events would be discarded as background. However,  this paper showed that, recycling the available XENON10 data on ionization-only events, one can exclude dark matter in the 100 MeV ballpark with the cross section for scattering on electrons larger than ~0.01 picobarn (10^-38 cm^2). This already has non-trivial consequences for concrete models; for example, a part of the parameter space of milli-charged dark matter is currently best constrained by XENON10.   

It is remarkable that so much useful information can be extracted by basically misusing data collected for another purpose (earlier this year the DarkSide-50 collaboration recast their own data in the same manner, excluding another chunk of the parameter space). Nevertheless, dedicated experiments will soon be taking over. Recently, two collaborations published first results from their prototype detectors: one is SENSEI, which uses 0.1 gram of silicon CCDs, and the other is SuperCDMS, which uses 1 gram of silicon semiconductor. Both are sensitive to eV energy depositions, thanks to which they can extend the search to lower dark matter masses, and set novel limits in the virgin territory between 0.5 and 5 MeV. A compilation of the existing direct detection limits is shown in the plot. As you can see, above 5 MeV the tiny prototypes cannot yet beat the XENON10 recast. But that will certainly change as soon as full-blown detectors are constructed, after which the XENON10 sensitivity should be improved by several orders of magnitude.
     
Should we be restless waiting for these results? Well, for any single experiment the chance of finding nothing is immensely larger than that of finding something. Nevertheless, the technical progress and the widening scope of searches offer some hope that the dark matter puzzle may be solved soon.

by Mad Hatter (noreply@blogger.com) at June 09, 2018 05:39 PM

June 08, 2018

Jester - Resonaances

Massive Gravity, or You Only Live Twice
Proving Einstein wrong is the ultimate ambition of every crackpot and physicist alike. In particular, Einstein's theory of gravitation -  the general relativity -  has been a victim of constant harassment. That is to say, it is trivial to modify gravity at large energies (short distances), for example by embedding it in string theory, but it is notoriously difficult to change its long distance behavior. At the same time, motivations to keep trying go beyond intellectual gymnastics. For example, the accelerated expansion of the universe may be a manifestation of modified gravity (rather than of a small cosmological constant).   

In Einstein's general relativity, gravitational interactions are mediated by a massless spin-2 particle - the so-called graviton. This is what gives it its hallmark properties: the long range and the universality. One obvious way to screw with Einstein is to add mass to the graviton, as entertained already in 1939 by Fierz and Pauli. The Particle Data Group quotes the constraint m ≤ 6*10^−32 eV, so we are talking about a De Broglie wavelength comparable to the size of the observable universe. Yet even that teeny mass may cause massive troubles. In 1970 the Fierz-Pauli theory was killed by the van Dam-Veltman-Zakharov (vDVZ) discontinuity. The problem stems from the fact that a massive spin-2 particle has 5 polarization states (0,±1,±2) unlike a massless one which has only two (±2). It turns out that the polarization-0 state couples to matter with similar strength to the usual polarization ±2 modes, even in the limit where the mass goes to zero, and thus mediates an additional force which differs from the usual gravity. One finds that, in massive gravity, light bending would be 25% smaller, in conflict with the very precise observations of stars' deflection around the Sun. vDV concluded that "the graviton has rigorously zero mass". Dead for the first time...

The second coming was heralded soon after by Vainshtein, who noticed that the troublesome polarization-0 mode can be shut off in the proximity of stars and planets. This can happen in the presence of graviton self-interactions of a certain type. Technically, what happens is that the polarization-0 mode develops a background value around massive sources which, through the derivative self-interactions, renormalizes its kinetic term and effectively diminishes its interaction strength with matter. See here for a nice review and more technical details. Thanks to the Vainshtein mechanism, the usual predictions of general relativity are recovered around large massive sources, which is exactly where we can best measure gravitational effects. The possible self-interactions leading to a healthy theory without ghosts have been classified, and go under the name of the dRGT massive gravity.

There is however one inevitable consequence of the Vainshtein mechanism. The graviton self-interaction strength grows with energy, and at some point becomes inconsistent with the unitarity limits that every quantum theory should obey. This means that massive gravity is necessarily an effective theory with a limited validity range and has to be replaced by a more fundamental theory at some cutoff scale 𝞚. This is of course nothing new for gravity: the usual Einstein gravity is also an effective theory valid at most up to the Planck scale MPl~10^19 GeV.  But for massive gravity the cutoff depends on the graviton mass and is much smaller for realistic theories. At best,
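the cutoff is the scale known as Λ₃ (quoting the standard dRGT result),

$$ \Lambda_{\rm max} = \left(m^2 M_{\rm Pl}\right)^{1/3} \sim 10^{-12}\ {\rm eV} \qquad {\rm for}\ m \sim 10^{-32}\ {\rm eV}, $$

which corresponds to a length scale $\hbar c/\Lambda_{\rm max}$ of a couple of hundred kilometers.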
So the massive gravity theory in its usual form cannot be used at distance scales shorter than ~300 km. For particle physicists that would be a disaster, but for cosmologists this is fine, as one can still predict the behavior of galaxies, stars, and planets. While the theory certainly cannot be used to describe the results of table top experiments,  it is relevant for the  movement of celestial bodies in the Solar System. Indeed, lunar laser ranging experiments or precision studies of Jupiter's orbit are interesting probes of the graviton mass.

Now comes the latest twist in the story. Some time ago this paper showed that not everything is allowed in effective theories. Assuming the full theory is unitary, causal and local implies non-trivial constraints on the possible interactions in the low-energy effective theory. These techniques are suitable to constrain, via dispersion relations, derivative interactions of the kind required by the Vainshtein mechanism. Applying them to the dRGT gravity one finds that it is inconsistent to assume the theory is valid all the way up to 𝞚max. Instead, it must be replaced by a more fundamental theory already at a much lower cutoff scale, parameterized as 𝞚 = g*^1/3 𝞚max (the parameter g* is interpreted as the coupling strength of the more fundamental theory). The allowed parameter space in the g*-m plane is shown in this plot:

Massive gravity must live in the lower left corner, outside the gray area excluded theoretically and where the graviton mass satisfies the experimental upper limit m~10^−32 eV. This implies g* ≼ 10^-10, and thus the validity range of the theory is some 3 orders of magnitude lower than 𝞚max. In other words, massive gravity is not a consistent effective theory at distance scales below ~1 million km, and thus cannot be used to describe the motion of falling apples, GPS satellites or even the Moon. In this sense, it's not much of a competition to, say, Newton. Dead for the second time.

Is this the end of the story? For the third coming we would need a more general theory with additional light particles beyond the massive graviton, which is consistent theoretically in a larger energy range, realizes the Vainshtein mechanism, and is in agreement with the current experimental observations. This is hard but not impossible to imagine. Whatever the outcome, what I like in this story is the role of theory in driving the progress, which is rarely seen these days. In the process, we have understood a lot of interesting physics whose relevance goes well beyond one specific theory. So the trip was certainly worth it, even if we find ourselves back at the departure point.

by Mad Hatter (noreply@blogger.com) at June 08, 2018 08:35 AM

June 07, 2018

Jester - Resonaances

Can MiniBooNE be right?
The experimental situation in neutrino physics is confusing. On one hand, a host of neutrino experiments has established a consistent picture where the neutrino mass eigenstates are mixtures of the 3 Standard Model neutrino flavors νe, νμ, ντ. The measured mass differences between the eigenstates are Δm12^2 ≈ 7.5*10^-5 eV^2 and Δm13^2 ≈ 2.5*10^-3 eV^2, suggesting that all Standard Model neutrinos have masses below 0.1 eV. That is well in line with cosmological observations which find that the radiation budget of the early universe is consistent with the existence of exactly 3 neutrinos with the sum of the masses less than 0.2 eV. On the other hand, several rogue experiments refuse to conform to the standard 3-flavor picture. The most severe anomaly is the appearance of electron neutrinos in a muon neutrino beam observed by the LSND and MiniBooNE experiments.


This story begins in the previous century with the LSND experiment in Los Alamos, which claimed to observe νμ→νe antineutrino oscillations with 3.8σ significance. This result was considered controversial from the very beginning due to limitations of the experimental set-up. Moreover, it was inconsistent with the standard 3-flavor picture which, given the masses and mixing angles measured by other experiments, predicted that νμ→νe oscillation should be unobservable in short-baseline (L ≼ km) experiments. The MiniBooNE experiment in Fermilab was conceived to conclusively prove or disprove the LSND anomaly. To this end, a beam of mostly muon neutrinos or antineutrinos with energies E~1 GeV is sent to a detector at the distance L~500 meters away. In general, neutrinos can change their flavor with the probability oscillating as P ~ sin^2(Δm^2 L/4E). If the LSND excess is really due to neutrino oscillations, one expects to observe electron neutrino appearance in the MiniBooNE detector given that L/E is similar in the two experiments. Originally, MiniBooNE was hoping to see a smoking gun in the form of an electron neutrino excess oscillating as a function of L/E, that is peaking at intermediate energies and then decreasing towards lower energies (possibly with several wiggles). That didn't happen. Instead, MiniBooNE finds an excess increasing towards low energies with a similar shape as the backgrounds. Thus the confusion lingers on: the LSND anomaly has neither been killed nor robustly confirmed.

In spite of these doubts, the LSND and MiniBooNE anomalies continue to arouse interest. This is understandable: as the results do not fit the 3-flavor framework, if confirmed they would prove the existence of new physics beyond the Standard Model. The simplest fix would be to introduce a sterile neutrino νs with the mass in the eV ballpark, in which case MiniBooNE would be observing the νμ→νs→νe oscillation chain. With the recent MiniBooNE update the evidence for the electron neutrino appearance increased to 4.8σ, which has stirred some commotion on Twitter and in the blogosphere. However, I find the excitement a bit misplaced. The anomaly is not really new: similar results showing a 3.8σ excess of νe-like events were already published in 2012. The increase of the significance is hardly relevant: at this point we know anyway that the excess is not a statistical fluke, while a systematic effect due to underestimated backgrounds would also lead to a growing anomaly. If anything, there are now fewer reasons than in 2012 to believe in the sterile neutrino origin of the MiniBooNE anomaly, as I will argue in the following.

What has changed since 2012? First, there are new constraints on νe appearance from the OPERA experiment (yes, this OPERA), which did not see any excess νe in the CERN-to-Gran-Sasso νμ beam. This excludes a large chunk of the relevant parameter space corresponding to large mixing angles between the active and sterile neutrinos. From this point of view, the MiniBooNE update actually adds more stress on the sterile neutrino interpretation by slightly shifting the preferred region towards larger mixing angles...  Nevertheless, a not-too-horrible fit to all appearance experiments can still be achieved in the region with Δm^2~0.5 eV^2 and the mixing angle sin^2(2θ) of order 0.01.

Next, the cosmological constraints have become more stringent. The CMB observations by the Planck satellite do not leave room for an additional neutrino species in the early universe. But for the parameters preferred by LSND and MiniBooNE, the sterile neutrino would be abundantly produced in the hot primordial plasma, thus violating the Planck constraints. To avoid it, theorists need to deploy a battery of  tricks (for example, large sterile-neutrino self-interactions), which makes realistic models rather baroque.

But the killer punch is delivered by disappearance analyses. Benjamin Franklin famously said that only two things in this world were certain: death and probability conservation. Thus whenever an electron neutrino appears in a νμ beam, a muon neutrino must disappear. However, the latter process is severely constrained by long-baseline neutrino experiments, and recently the limits have been further strengthened thanks to the MINOS and IceCube collaborations. A recent combination of the existing disappearance results is available in this paper.  In the 3+1 flavor scheme, the probability of a muon neutrino transforming into an electron  one in a short-baseline experiment is
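$$ P_{\nu_\mu \to \nu_e} \simeq 4\,|U_{e4}|^2 |U_{\mu 4}|^2 \,\sin^2\!\left(\frac{\Delta m_{41}^2 L}{4E}\right) $$

(the standard 3+1 short-baseline approximation, keeping only the large splitting Δm41²),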
where U is the 4x4 neutrino mixing matrix. The Uμ4 matrix element also controls the νμ survival probability
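$$ P_{\nu_\mu \to \nu_\mu} \simeq 1 - 4\,|U_{\mu 4}|^2 \left(1 - |U_{\mu 4}|^2\right) \sin^2\!\left(\frac{\Delta m_{41}^2 L}{4E}\right) $$

(again in the same short-baseline approximation).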
The νμ disappearance data from MINOS and IceCube imply |Uμ4|≼0.1, while |Ue4|≼0.25 from solar neutrino observations. All in all, the disappearance results imply that the effective mixing angle sin^2(2θ) controlling the νμ→νs→νe oscillation must be much smaller than the 0.01 required to fit the MiniBooNE anomaly. The disagreement between the appearance and disappearance data had already existed before, but was actually made worse by the MiniBooNE update.
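Concretely, plugging in the quoted bounds,

$$ \sin^2(2\theta_{\rm eff}) = 4\,|U_{e4}|^2 |U_{\mu 4}|^2 \lesssim 4 \times (0.25)^2 \times (0.1)^2 = 2.5\times 10^{-3}, $$

roughly a factor of 4 below the ~0.01 needed for MiniBooNE.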
So the hypothesis of a 4th sterile neutrino does not stand up to scrutiny as an explanation of the MiniBooNE anomaly. It does not mean that there is no other possible explanation (more sterile neutrinos? non-standard interactions? neutrino decays?). However, any realistic model will have to delve deep into the crazy side in order to satisfy the constraints from other neutrino experiments, flavor physics, and cosmology. Fortunately, the current confusing situation should not last forever. The MiniBooNE photon background from π0 decays may be clarified by the ongoing MicroBooNE experiment. On the timescale of a few years the controversy should be closed by the SBN program in Fermilab, which will add one near and one far detector to the MicroBooNE beamline. Until then... years of painful experience have taught us to assign a high prior to the Standard Model hypothesis. Currently, by far the most plausible explanation of the existing data is an experimental error on the part of the MiniBooNE collaboration.

by Mad Hatter (noreply@blogger.com) at June 07, 2018 01:20 PM

June 01, 2018

Jester - Resonaances

WIMPs after XENON1T
After today's update from the XENON1T experiment, the situation on the front of direct detection of WIMP dark matter is as follows

A WIMP can be loosely defined as a dark matter particle with mass in the 1 GeV - 10 TeV range and significant interactions with ordinary matter. Historically, WIMP searches have stimulated enormous interest because this type of dark matter can be easily realized in models with low-scale supersymmetry. Now that we are older and wiser, many physicists would rather put their money on other realizations, such as axions, MeV dark matter, or primordial black holes. Nevertheless, WIMPs remain a viable possibility that should be further explored.
 
To detect WIMPs heavier than a few GeV, currently the most successful strategy is to use huge detectors filled with xenon atoms, hoping one of them is hit by a passing dark matter particle. XENON1T beats the competition from the LUX and Panda-X experiments because it has a bigger tank. Technologically speaking, we have come a long way in the last 30 years. XENON1T is now sensitive to 40 GeV WIMPs interacting with nucleons with the cross section of 40 yoctobarn (1 yb = 10^-12 pb = 10^-48 cm^2). This is 6 orders of magnitude better than what the first direct detection experiment in the Homestake mine could achieve back in the 80s. Compared to last year, the limit is better by a factor of two at the most sensitive mass point. At high mass the improvement is somewhat smaller than expected due to a small excess of events observed by XENON1T, which is probably just a 1 sigma upward fluctuation of the background.

What we are learning about WIMPs is how they can (or cannot) interact with us. Of course, at this point in the game we don't see qualitative progress, but rather incremental quantitative improvements. One possible scenario is that WIMPs experience one of the Standard Model forces, such as the weak or the Higgs force. The former option is strongly constrained by now. If WIMPs interacted in the same way as our neutrinos do, that is by exchanging a Z boson, they would have been found already in the Homestake experiment. XENON1T is probing models where the dark matter coupling to the Z boson is suppressed by a factor cχ ~ 10^-3 - 10^-4 compared to that of an active neutrino. On the other hand, dark matter could be participating in weak interactions only by exchanging W bosons, which can happen for example when it is a part of an SU(2) triplet. In the plot you can see that XENON1T is approaching but not yet excluding this interesting possibility. As for models using the Higgs force, XENON1T is probing the (subjectively) most natural parameter space where WIMPs couple with order one strength to the Higgs field.

And the arms race continues. The search in XENON1T will go on until the end of this year, although at this point a discovery is extremely unlikely. Further progress is expected on a timescale of a few years thanks to the next generation xenon detectors XENONnT and LUX-ZEPLIN, which should achieve yoctobarn sensitivity. DARWIN may be the ultimate experiment along these lines, in the sense that there is no prefix smaller than yocto: it will reach the irreducible background from atmospheric neutrinos, after which new detection techniques will be needed. For dark matter mass closer to 1 GeV, several orders of magnitude of pristine parameter space will be covered by the SuperCDMS experiment. Until then we are kept in suspense. Is dark matter made of WIMPs? And if yes, does it stick above the neutrino sea?

by Mad Hatter (noreply@blogger.com) at June 01, 2018 05:30 PM

Tommaso Dorigo - Scientificblogging

MiniBoone Confirms Neutrino Anomaly
Neutrinos, the most mysterious and fascinating of all elementary particles, continue to puzzle physicists. 20 years after the experimental verification of a long-debated effect whereby the three neutrino species can "oscillate", changing their nature by turning one into the other as they propagate in vacuum and in matter, the jury is still out to decide what really is the matter with them. And a new result by the MiniBoone collaboration is stirring waters once more.

read more

by Tommaso Dorigo at June 01, 2018 12:49 PM

May 26, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

A festschrift at UCC

One of my favourite academic traditions is the festschrift, a conference convened to honour the contribution of a senior academic. In a sense, it’s academia’s version of an Oscar for lifetime achievement, as scholars from all around the world gather to pay tribute to their former mentor, colleague or collaborator.

Festschrifts tend to be very stimulating meetings, as the diverging careers of former students and colleagues typically make for a diverse set of talks. At the same time, there is usually a unifying theme based around the specialism of the professor being honoured.

And so it was at NIALLFEST this week, as many of the great and the good from the world of Einstein’s relativity gathered at University College Cork to pay tribute to Professor Niall O’Murchadha, a theoretical physicist in UCC’s Department of Physics noted internationally for seminal contributions to general relativity. Some measure of Niall’s influence can be seen from the number of well-known theorists at the conference, including major figures such as Bob Wald, Bill Unruh, Edward Malec and Kip Thorne (the latter was recently awarded the Nobel Prize in Physics for his contribution to the detection of gravitational waves). The conference website can be found here and the programme is here.

University College Cork: probably the nicest college campus in Ireland

As expected, we were treated to a series of high-level talks on diverse topics, from black hole collapse to analysis of high-energy jets from active galactic nuclei, from the initial value problem in relativity to the search for dark matter (slides for my own talk can be found here). To pick one highlight, Kip Thorne’s reminiscences of the forty-year search for gravitational waves made for a fascinating presentation, from his description of early designs of the LIGO interferometer to the challenge of getting funding for early prototypes – not to mention his prescient prediction that the most likely chance of success was the detection of a signal from the merger of two black holes.

All in all, a very stimulating conference. Most entertaining of all were the speakers’ recollections of Niall’s working methods and his interaction with students and colleagues over the years. Like a great piano teacher of old, one great professor leaves a legacy of critical thinkers dispersed around the world, and their students in turn inspire the next generation!

 

by cormac at May 26, 2018 12:16 AM

May 21, 2018

Andrew Jaffe - Leaves on the Line

Leon Lucy, R.I.P.

I have the unfortunate duty of using this blog to announce the death a couple of weeks ago of Professor Leon B Lucy, who had been a Visiting Professor working here at Imperial College from 1998.

Leon got his PhD in the early 1960s at the University of Manchester, and after postdoctoral positions in Europe and the US, worked at Columbia University and the European Southern Observatory over the years, before coming to Imperial. He made significant contributions to the study of the evolution of stars, understanding in particular how they lose mass over the course of their evolution, and how very close binary stars interact and evolve inside their common envelope of hot gas.

Perhaps most importantly, early in his career Leon realised how useful computers could be in astrophysics. He made two major methodological contributions to astrophysical simulations. First, he realised that by simulating randomised trajectories of single particles, he could take into account more physical processes that occur inside stars. This is now called “Monte Carlo Radiative Transfer” (scientists often use the term “Monte Carlo” — after the European gambling capital — for techniques using random numbers). He also invented the technique now called smoothed-particle hydrodynamics which models gases and fluids as aggregates of pseudo-particles, now applied to models of stars, galaxies, and the large scale structure of the Universe, as well as many uses outside of astrophysics.

Leon’s other major numerical contributions comprise advanced techniques for interpreting the complicated astronomical data we get from our telescopes. In this realm, he was most famous for developing the methods, now known as Lucy-Richardson deconvolution, that were used for correcting the distorted images from the Hubble Space Telescope, before NASA was able to send a team of astronauts to install correcting lenses in the early 1990s.

For all of this work Leon was awarded the Gold Medal of the Royal Astronomical Society in 2000. Since then, Leon kept working on data analysis and stellar astrophysics — even during his illness, he asked me to help organise the submission and editing of what turned out to be his final papers, on extracting information on binary-star orbits and (a subject dear to my heart) the statistics of testing scientific models.

Until the end of last year, Leon was a regular presence here at Imperial, always ready to contribute an occasionally curmudgeonly but always insightful comment on the science (and sociology) of nearly any topic in astrophysics. We hope that we will be able to appropriately memorialise his life and work here at Imperial and elsewhere. He is survived by his wife and daughter. He will be missed.

by Andrew at May 21, 2018 09:27 AM

May 14, 2018

Sean Carroll - Preposterous Universe

Intro to Cosmology Videos

In completely separate video news, here are videos of lectures I gave at CERN several years ago: “Cosmology for Particle Physicists” (May 2005). These are slightly technical — at the very least they presume you know calculus and basic physics — but are still basically accurate despite their age.

  1. Introduction to Cosmology
  2. Dark Matter
  3. Dark Energy
  4. Thermodynamics and the Early Universe
  5. Inflation and Beyond

Update: I originally linked these from YouTube, but apparently they were swiped from this page at CERN, and have been taken down from YouTube. So now I’m linking directly to the CERN copies. Thanks to commenters Bill Schempp and Matt Wright.

by Sean Carroll at May 14, 2018 07:09 PM

May 10, 2018

Sean Carroll - Preposterous Universe

User-Friendly Naturalism Videos

Some of you might be familiar with the Moving Naturalism Forward workshop I organized way back in 2012. For two and a half days, an interdisciplinary group of naturalists (in the sense of “not believing in the supernatural”) sat around to hash out the following basic question: “So we don’t believe in God, what next?” How do we describe reality, how can we be moral, what are free will and consciousness, those kinds of things. Participants included Jerry Coyne, Richard Dawkins, Terrence Deacon, Simon DeDeo, Daniel Dennett, Owen Flanagan, Rebecca Newberger Goldstein, Janna Levin, Massimo Pigliucci, David Poeppel, Nicholas Pritzker, Alex Rosenberg, Don Ross, and Steven Weinberg.

Happily we recorded all of the sessions to video, and put them on YouTube. Unhappily, those were just unedited proceedings of each session — so ten videos, at least an hour and a half each, full of gems but without any very clear way to find them if you weren’t patient enough to sift through the entire thing.

No more! Thanks to the heroic efforts of Gia Mora, the proceedings have been edited down to a number of much more accessible and content-centered highlights. There are over 80 videos (!), with a median length of maybe 5 minutes, though they range up to about 20 minutes and down to less than one. Each video centers on a particular idea, theme, or point of discussion, so you can dive right into whatever particular issues you may be interested in. Here, for example, is a conversation on “Mattering and Secular Communities,” featuring Rebecca Goldstein, Dan Dennett, and Owen Flanagan.

The videos can be seen on the workshop web page, or on my YouTube channel. They’re divided into categories:

A lot of good stuff in there. Enjoy!

by Sean Carroll at May 10, 2018 02:48 PM

March 29, 2018

Robert Helling - atdotde

Machine Learning for Physics?!?
Today was the last day of a nice workshop here at the Arnold Sommerfeld Center organised by Thomas Grimm and Sven Krippendorf on the use of Big Data and Machine Learning in string theory. While the former (at this workshop mainly in the form of developments following Kreuzer/Skarke and taking it further for F-theory constructions, orbifolds and the like) appears to be quite advanced as of today, the latter is still in its very early days. At best.

I got the impression that, for many physicists who have not yet spent much time with this, deep learning, and deep neural networks in particular, are expected to be some kind of silver bullet that can answer all kinds of questions humans have not been able to answer despite considerable effort. I think this hope is at best premature. Looking at the (admittedly impressive) examples where it does work (playing Go, classifying images, speech recognition, event filtering at the LHC), these seem to be problems where humans have at least a rough idea of how to solve them (when it is not something humans do every day anyway, like understanding text), and roughly how one would code them, but which are too messy or vague to be treated by a traditional program.

So, during some of the less entertaining talks, I sat down and thought about problems where I would expect neural networks to perform badly. If the approach fails even in simple cases that are fully under control, one should perhaps curb the expectations for the more complex cases one would love to have answered. In the context of the workshop, that means guessing topological (discrete) data that depends very discontinuously on the model parameters. A simple toy version is a 2-torus wrapped by two 1-branes, where the computer is supposed to compute the number of matter generations arising from open strings at the intersections: given the two branes (in terms of their slopes with respect to the cycles of the torus), how often do they intersect? These numbers depend sensitively on the slopes (as real numbers), since for rational slopes [latex]p/q[/latex] and [latex]m/n[/latex] the intersection number is the absolute value of [latex]pn-qm[/latex]. My guess is that this is almost impossible for a neural network to get right, let alone the much more complicated variants of this simple problem.
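To make the target function concrete, here is a minimal sketch in Python (my own illustration, not from the post or the talks); the function name and the example slopes are simply illustrative choices:

    def intersection_number(p, q, m, n):
        """Intersection number of two 1-branes with rational slopes p/q and m/n on a 2-torus."""
        return abs(p * n - q * m)

    # Nearby slopes give wildly different answers, which is exactly what makes
    # this function so hostile to a smooth interpolator like a neural network:
    print(intersection_number(1, 2, 1, 3))     # slopes 1/2 and 1/3   ->  1
    print(intersection_number(99, 200, 1, 3))  # slopes 99/200 (~1/2) and 1/3  ->  97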

Related, but with the possibility for nicer pictures, is the following: can a neural network learn the shape of the Mandelbrot set? Let me remind those of you who cannot remember the 80ies anymore: for a complex number c you recursively apply the function
[latex]f_c(z)= z^2 +c[/latex]
starting from 0 and ask if this stays bounded (a quick check shows that once you are outside [latex]|z| < 2[/latex] you cannot avoid running off to infinity). You color the point c in the complex plane according to the number of times you have to apply f_c to 0 before leaving this circle. I did this for complex numbers x+iy in a small rectangle near x = -0.74.
I have written a small Mathematica program to compute this image. Built into Mathematica is also a neural network: you can feed training data to the function Predict[]; for me these were 1,000,000 points in this rectangle together with the number of steps it takes to leave the 2-ball. Then Mathematica thinks for about 24 hours and spits out a predictor function, which you can then plot as well:


There is some similarity but clearly it has no idea about the fractal nature of the Mandelbrot set. If you really believe in magic powers of neural networks, you might even hope that once it learned the function for this rectangle one could extrapolate to outside this rectangle. Well, at least in this case, this hope is not justified: The neural network thinks the correct continuation looks like this:
Ehm. No.

All this of course comes with the caveat that I am no expert on neural networks and I did not attempt to tune the result in any way; I only used the neural network function built into Mathematica. Maybe, with a bit of coding and TensorFlow, one can do much better. But on the other hand, this is a simple two-dimensional problem. At least for traditional approaches, this should be much simpler than the much higher-dimensional problems the physicists are really interested in.
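For anyone who wants to try the "bit of coding" route outside Mathematica, here is a rough, untuned Python sketch of the same kind of experiment (my own, with scikit-learn's MLPRegressor standing in for Mathematica's Predict[]; the rectangle bounds and the sample size are illustrative guesses, since the exact values are not fully recoverable from the post):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def escape_time(c, max_iter=50):
        """Iterations of z -> z**2 + c (starting from 0) before |z| exceeds 2."""
        z = 0j
        for n in range(max_iter):
            z = z * z + c
            if abs(z) > 2.0:
                return n
        return max_iter  # treated as "did not escape"

    # Training data: random points in a small rectangle of the complex plane.
    rng = np.random.default_rng(0)
    x = rng.uniform(-0.74, -0.70, 200_000)   # illustrative bounds
    y = rng.uniform(0.20, 0.24, 200_000)
    X = np.column_stack([x, y])
    t = np.array([escape_time(complex(a, b)) for a, b in X])

    # A generic, untuned feed-forward network, in the spirit of just calling Predict[].
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300)
    model.fit(X, t)

    # Compare prediction and truth on a fresh point.
    print(model.predict([[-0.72, 0.22]]), escape_time(complex(-0.72, 0.22)))

One would expect the same qualitative outcome as in the post: the network captures the smooth trend but not the fractal boundary.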

by Robert Helling (noreply@blogger.com) at March 29, 2018 07:35 PM

Axel Maas - Looking Inside the Standard Model

Asking questions leads to a change of mind
In this entry, I would like to digress a bit from my usual discussion of our physics research subject. Rather, I would like to talk a bit about how I do this kind of research. There is a twofold motivation for me to do this.

One is that I am currently teaching, together with somebody from the philosophy department, a course on the philosophy of science in physics. It came as a surprise to me that one thing the philosophy students are interested in is how I think: what are the objects, or subjects, and how do I connect them when doing research, or even when I just think about a physics theory. The other is the review I have recently written. The two topics may seem unrelated at first, but there is a deep connection. It is less about what I have written in the review than about what led me up to this point. That requires some historical digression into my own research.

In the very beginning, I started out doing research on the strong interactions. One of the features of the strong interactions is that the supposedly elementary particles, quarks and gluons, are never seen separately, but only in combinations, as hadrons. This phenomenon is called confinement. It is always somehow presented as a mystery, and as such it is interesting. Thus, one question in my early research was how to understand this phenomenon.

Doing that, I came across an interesting result from the 1970ies. It appears that an effect which at first sight seems completely unrelated is in fact intimately related to confinement, at least in some theories. This is the Brout-Englert-Higgs effect. However, we seem to observe the particles responsible for and affected by the Higgs effect. And indeed, at that time I was still thinking that the particles affected by the Brout-Englert-Higgs effect, especially the Higgs and the W and Z bosons, are just ordinary, observable particles. When one reads my first paper on the Higgs from this time, this is quite obvious. But then there was that result from the 1970ies. It stated that, on a very formal level, there should be no difference between confinement and the Brout-Englert-Higgs effect, in a very definite way.

The implications of that statement seriously sparked my interest. But I thought it would mainly help me to understand confinement, as it was still very much ingrained in me that confinement is a particular feature of the strong interactions. The mathematical connection I just took as a curiosity. And so I started to do extensive numerical simulations of the situation.

But while trying to do so, things that did not add up started to accumulate. This is probably most evident in a conference proceeding where I tried to make sense of something which, with hindsight, could never be interpreted in the way I did there. I still tried to press the result into the scheme of thinking that the Higgs and the W/Z are physical particles which we observe in experiment, as this is the standard lore. But the data would not fit this picture, and the more and better data I gathered, the more conflicted the results became. At some point, it was clear that something was amiss.

At that point, I had two options: either keep the concepts of confinement and the Brout-Englert-Higgs effect as they have been since the 1960ies, or take the data seriously and accept that these conceptions were wrong. It probably says something about my difficulties that it took me more than a year to come to terms with the results. In the end, the decisive point was that, as a theoretician, I needed to take my theory seriously, no matter the results. There is no way around it. And if the theory gave a prediction which did not fit my view of the experiments, then necessarily either my view or the theory had to be incorrect. The latter seemed more improbable, as the theory fits experiment very well. So, finally, I found an explanation which was consistent. This explanation accepted the curious mathematical statement from the 1970ies that confinement and the Brout-Englert-Higgs effect are qualitatively the same, though not quantitatively. And thus the conclusion was that what we observe are not really the Higgs and the W/Z bosons, but rather some interesting composite objects, just like hadrons, which due to a quirk of the theory behave almost as if they were the elementary particles.

This was still a very challenging thought for me. After all, it was quite contradictory to the usual notions. Thus, it came as a great relief when, during a trip a couple of months later, someone pointed me to a few papers from the early 1980ies, almost forgotten by most, which gave the same answer for a completely different reason. Together with my own observation, this made everything click, and things started to fit together - the 1970ies curiosity, the standard notions, my data. I published that in mid-2012, even though it still lacked some of the more systematic work. But it still required a shift in my thinking, from agreement to real understanding. That came in the years to follow.

The important click was to recognize that confinement and the Brout-Englert-Higgs effect are, just as pointed out mathematically in the 1970ies, really just two faces of the same underlying phenomenon. On a very abstract level, essentially all the particles which make up the standard model are really just a means to an end. What we observe are objects which are described by them, but which are not the particles themselves. They emerge, just like hadrons emerge in the strong interaction, but with very different technical details. This is actually very deeply connected with the concept of gauge symmetry, but that becomes technical quickly. Of course, since this is fundamentally different from the usual picture, it required confirmation. So we went ahead, made predictions which could distinguish between the standard way of thinking and this one, and tested them. And it came out as we predicted. So it seems we are on the right track. All the details - the if, how, and why, and all the technicalities and math - you can find in the review.

To come full circle to the starting point: what happened in my mind during this decade is that the way I think about the physical theory I am trying to describe, the standard model, changed. In the beginning I was thinking in terms of particles and their interactions. Now, very much motivated by gauge symmetry and, not incidentally, by its deeper conceptual challenges, I think differently. I no longer think of the elementary particles as entities in themselves, but rather as auxiliary building blocks of the quantities that are actually experimentally accessible. The standard 'small-ball' analogy has gone away entirely, and in its place there formed, well, it is hard to say - a new class of entities which does not necessarily have any analogy. Perhaps at a later time I will come across a better way to phrase it. Right now, it is more math than words.

This also transformed the way I think about the original problem, confinement. I am curious where this, and all the rest, will lead. For now, the next step will be to move on from simulations and see whether we can find some way to test this in actual experiments. We have some ideas, but in the end it may be that present experiments are not sensitive enough. Stay tuned.

by Axel Maas (noreply@blogger.com) at March 29, 2018 01:09 PM

March 28, 2018

Marco Frasca - The Gauge Connection

Paper with a proof of confinement has been accepted

Recently, I wrote a paper together with Masud Chaichian (see here) containing a mathematical proof of confinement in a non-Abelian gauge theory based on the Kugo-Ojima criterion. This paper underwent an extended review by several colleagues well before its submission. One of them was Taichiro Kugo, one of the discoverers of the confinement criterion, who helped a lot to improve the paper and clarify some points. Then, after a review round of about two months, the paper was accepted by Physics Letters B, one of the most important journals in particle physics.

This paper contains the exact beta function of a Yang-Mills theory. It confirms that confinement arises from the combination of the running coupling and the propagator. This idea has been around in several papers in recent years; it emerged as soon as people realized, after extended studies on the lattice, that the propagator by itself is not enough to guarantee confinement.

It is interesting to point out that confinement is rooted in BRST invariance and asymptotic freedom. The Kugo-Ojima confinement criterion makes it possible to close the argument in a rigorous way, yielding the exact beta function of the theory.

by mfrasca at March 28, 2018 09:34 AM

March 20, 2018

Marco Frasca - The Gauge Connection

Good news from Moriond

Some days ago, the Rencontres de Moriond 2018 ended, with CERN presenting a wealth of results, including some about the Higgs particle. The direction that the two great experiments, ATLAS and CMS, have taken is that of improving the measurements of the Standard Model, as no evidence of possible new particles has been seen so far. The studies of the properties of the Higgs particle have also been refined as promised, and the news is really striking.

In a communication to the public (see here), CERN acknowledges, for the first time, a significant discrepancy between the CMS data and the Standard Model for the signal strengths in the Higgs decay channels. They claim a 17% difference. This is what I have advocated for some years and have published in reputable journals. I will discuss this below. For now, I would just like to show you the CMS results in the figure below.

ATLAS, for its part, sees a significant discrepancy in the ZZ channel (2σ) and 1σ compatibility in the WW channel. Here are their results.

On the left the WW channel is shown and on the right are the combined γγ and ZZ channels.

The reason for the discrepancy is, as I have shown in some papers (see here, here and here), the improper use of perturbation theory to evaluate the Higgs sector. The true propagator of the theory is a sum of Yukawa-like propagators with a harmonic-oscillator spectrum. I solved this sector of the Standard Model exactly. So, when the full propagator is taken into account, the discrepancy goes in the direction of an increased signal strength. Is it worth a try?

This means that this is not physics beyond the Standard Model but rather the Standard Model in its full glory, teaching us something new about quantum field theory. Now we are eager to see the improved data to come with the new run of the LHC starting now. At the summer conferences we will have reasons to be excited.

by mfrasca at March 20, 2018 09:17 AM

March 17, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Remembering Stephen Hawking

Like many physicists, I woke to some sad news early last Wednesday morning, and to a phoneful of requests from journalists for a soundbite. In fact, although I bumped into Stephen at various conferences, I only had one significant meeting with him – he was intrigued by my research group’s discovery that Einstein once attempted a steady-state model of the universe. It was a slightly scary but very funny meeting during which his famous sense of humour was fully at play.


Yours truly talking steady-state cosmology with Stephen Hawking

I recalled the incident in a radio interview with RTE Radio 1 on Wednesday. As I say in the piece, the first words that appeared on Stephen’s screen were “I knew..” My heart sank as I assumed he was about to say “I knew about that manuscript“. But when I had recovered sufficiently to look again, what Stephen was actually saying was “I knew ..your father”. Phew! You can find the podcast here.


Hawking in conversation with my late father (LHS) and with Ernest Walton (RHS)

RTE TV had a very nice obituary on the Six One News; I have a cameo appearance a few minutes into the piece here.

In my view, few could question Hawking’s brilliant contributions to physics, or his outstanding contribution to the public awareness of science. His legacy also includes the presence of many brilliant young physicists at the University of Cambridge today. However, as I point out in a letter in today’s Irish Times, had Hawking lived in Ireland, he probably would have found it very difficult to acquire government funding for his work. Indeed, he would have found that research into the workings of the universe does not qualify as one of the “strategic research areas” identified by our national funding body, Science Foundation Ireland. I suspect the letter will provoke an angry response from certain quarters, but it is tragically true.

Update

The above notwithstanding, it’s important not to overstate the importance of one scientist. Indeed, today’s Sunday Times contains a good example of the dangers of science history being written by journalists. Discussing Stephen’s 1974 work on black holes, Bryan Appleyard states: “The paper in effect launched the next four decades of cutting edge physics. Odd flowers with odd names bloomed in the garden of cosmic speculation – branes, worldsheets, supersymmetry …. and, strangest of all, the colossal tree of string theory”.

What? String theory, supersymmetry and brane theory are all modern theories of particle physics (the study of the world of the very small). While these theories were used to some extent by Stephen in his research in cosmology (the study of the very large), it is ludicrous to suggest that they were launched by his work.


by cormac at March 17, 2018 08:27 PM

March 16, 2018

Sean Carroll - Preposterous Universe

Stephen Hawking’s Scientific Legacy

Stephen Hawking died Wednesday morning, age 76. Plenty of memories and tributes have been written, including these by me:

I can also point to my Story Collider story from a few years ago, about how I turned down a job offer from Hawking, and eventually took lessons from his way of dealing with the world.

Of course Hawking has been mentioned on this blog many times.

When I started writing the above pieces (mostly yesterday, in a bit of a rush), I stumbled across this article I had written several years ago about Hawking’s scientific legacy. It was solicited by a magazine at a time when Hawking was very ill and people thought he would die relatively quickly — it wasn’t the only time people thought that, only to be proven wrong. I’m pretty sure the article was never printed, and I never got paid for it; so here it is!

(If you’re interested in a much better description of Hawking’s scientific legacy by someone who should know, see this article in The Guardian by Roger Penrose.)

Stephen Hawking’s Scientific Legacy

Stephen Hawking is the rare scientist who is also a celebrity and cultural phenomenon. But he is also the rare cultural phenomenon whose celebrity is entirely deserved. His contributions can be characterized very simply: Hawking contributed more to our understanding of gravity than any physicist since Albert Einstein.

“Gravity” is an important word here. For much of Hawking’s career, theoretical physicists as a community were more interested in particle physics and the other forces of nature — electromagnetism and the strong and weak nuclear forces. “Classical” gravity (ignoring the complications of quantum mechanics) had been figured out by Einstein in his theory of general relativity, and “quantum” gravity (creating a quantum version of general relativity) seemed too hard. By applying his prodigious intellect to the most well-known force of nature, Hawking was able to come up with several results that took the wider community completely by surprise.

By acclamation, Hawking’s most important result is the realization that black holes are not completely black — they give off radiation, just like ordinary objects. Before that famous paper, he proved important theorems about black holes and singularities, and afterward studied the universe as a whole. In each phase of his career, his contributions were central.

The Classical Period

While working on his Ph.D. thesis in Cambridge in the mid-1960’s, Hawking became interested in the question of the origin and ultimate fate of the universe. The right tool for investigating this problem is general relativity, Einstein’s theory of space, time, and gravity. According to general relativity, what we perceive as “gravity” is a reflection of the curvature of spacetime. By understanding how that curvature is created by matter and energy, we can predict how the universe evolves. This may be thought of as Hawking’s “classical” period, to contrast classical general relativity with his later investigations in quantum field theory and quantum gravity.

Around the same time, Roger Penrose at Oxford had proven a remarkable result: that according to general relativity, under very broad circumstances, space and time would crash in on themselves to form a singularity. If gravity is the curvature of spacetime, a singularity is a moment in time when that curvature becomes infinitely big. This theorem showed that singularities weren’t just curiosities; they are an important feature of general relativity.

Penrose’s result applied to black holes — regions of spacetime where the gravitational field is so strong that even light cannot escape. Inside a black hole, the singularity lurks in the future. Hawking took Penrose’s idea and turned it around, aiming at the past of our universe. He showed that, under similarly general circumstances, space must have come into existence at a singularity: the Big Bang. Modern cosmologists talk (confusingly) about both the Big Bang “model,” which is the very successful theory that describes the evolution of an expanding universe over billions of years, and also the Big Bang “singularity,” which we still don’t claim to understand.

Hawking then turned his own attention to black holes. Another interesting result by Penrose had shown that it’s possible to extract energy from a rotating black hole, essentially by bleeding off its spin until it’s no longer rotating. Hawking was able to demonstrate that, although you can extract energy, the area of the event horizon surrounding the black hole will always increase in any physical process. This “area theorem” was both important in its own right, and also evocative of a completely separate area of physics: thermodynamics, the study of heat.

Thermodynamics obeys a set of famous laws. For example, the first law tells us that energy is conserved, while the second law tells us that entropy — a measure of the disorderliness of the universe — never decreases for an isolated system. Working with James Bardeen and Brandon Carter, Hawking proposed a set of laws for “black hole mechanics,” in close analogy with thermodynamics. Just as in thermodynamics, the first law of black hole mechanics ensures that energy is conserved. The second law is Hawking’s area theorem, that the area of the event horizon never decreases. In other words, the area of the event horizon of a black hole is very analogous to the entropy of a thermodynamic system — they both tend to increase over time.
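For readers who like to see the dictionary spelled out (this is standard textbook material, not something quoted from the essay): in units where [latex]\hbar = c = k_B = 1[/latex], the first law of black hole mechanics for a rotating hole reads

[latex]dM = \frac{\kappa}{8\pi G}\,dA + \Omega_H\, dJ,[/latex]

which maps onto the thermodynamic first law [latex]dE = T\,dS + \text{(work terms)}[/latex] under the identifications [latex]T = \kappa/2\pi[/latex] and [latex]S = A/4G[/latex], where [latex]\kappa[/latex] is the surface gravity of the horizon. At this stage of the story the identification was only an analogy; Hawking's later calculation, described below, is what made it literal.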

Black Hole Evaporation

Hawking and his collaborators were justly proud of the laws of black hole mechanics, but they viewed them as simply a formal analogy, not a literal connection between gravity and thermodynamics. In 1972, a graduate student at Princeton University named Jacob Bekenstein suggested that there was more to it than that. Bekenstein, on the basis of some ingenious thought experiments, suggested that the behavior of black holes isn’t simply like thermodynamics, it actually is thermodynamics. In particular, black holes have entropy.

Like many bold ideas, this one was met with resistance from experts — and at this point, Stephen Hawking was the world’s expert on black holes. Hawking was certainly skeptical, and for good reason. If black hole mechanics is really just a form of thermodynamics, that means black holes have a temperature. And objects that have a temperature emit radiation — the famous “black body radiation” that played a central role in the development of quantum mechanics. So if Bekenstein were right, it would seemingly imply that black holes weren’t really black (although Bekenstein himself didn’t quite go that far).

To address this problem seriously, you need to look beyond general relativity itself, since Einstein’s theory is purely “classical” — it doesn’t incorporate the insights of quantum mechanics. Hawking knew that Russian physicists Alexander Starobinsky and Yakov Zel’dovich had investigated quantum effects in the vicinity of black holes, and had predicted a phenomenon called “superradiance.” Just as Penrose had shown that you could extract energy from a spinning black hole, Starobinsky and Zel’dovich showed that rotating black holes could emit radiation spontaneously via quantum mechanics. Hawking himself was not an expert in the techniques of quantum field theory, which at the time were the province of particle physicists rather than general relativists. But he was a quick study, and threw himself into the difficult task of understanding the quantum aspects of black holes, so that he could find Bekenstein’s mistake.

Instead, he surprised himself, and in the process turned theoretical physics on its head. What Hawking eventually discovered was that Bekenstein was right — black holes do have entropy — and that the extraordinary implications of this idea were actually true — black holes are not completely black. These days we refer to the “Bekenstein-Hawking entropy” of black holes, which emit “Hawking radiation” at their “Hawking temperature.”

There is a nice hand-waving way of understanding Hawking radiation. Quantum mechanics says (among other things) that you can’t pin a system down to a definite classical state; there is always some intrinsic uncertainty in what you will see when you look at it. This is even true for empty space itself — when you look closely enough, what you thought was empty space is really alive with “virtual particles,” constantly popping in and out of existence. Hawking showed that, in the vicinity of a black hole, a pair of virtual particles can be split apart, one falling into the hole and the other escaping as radiation. Amazingly, the infalling particle has a negative energy as measured by an observer outside. The result is that the radiation gradually takes mass away from the black hole — it evaporates.
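To attach rough numbers to "it evaporates" (a back-of-the-envelope sketch of my own, using the standard formulas rather than anything from the essay):

    import math

    hbar, c, G, kB = 1.054571817e-34, 2.99792458e8, 6.67430e-11, 1.380649e-23  # SI units
    M_sun = 1.989e30  # kg

    def hawking_temperature(M):
        """Hawking temperature T = hbar c^3 / (8 pi G M kB), in kelvin."""
        return hbar * c**3 / (8 * math.pi * G * M * kB)

    def evaporation_time(M):
        """Evaporation time t = 5120 pi G^2 M^3 / (hbar c^4), in seconds (pure Hawking emission)."""
        return 5120 * math.pi * G**2 * M**3 / (hbar * c**4)

    print(hawking_temperature(M_sun))         # ~6e-8 K: far colder than the microwave background
    print(evaporation_time(M_sun) / 3.156e7)  # ~2e67 years

A solar-mass black hole is therefore colder than the cosmic microwave background and would take vastly longer than the age of the universe to evaporate; the puzzle Hawking uncovered is one of principle, not of practical astrophysics.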

Hawking’s result had obvious and profound implications for how we think about black holes. Instead of being a cosmic dead end, where matter and energy disappear forever, they are dynamical objects that will eventually evaporate completely. But more importantly for theoretical physics, this discovery raised a question to which we still don’t know the answer: when matter falls into a black hole, and then the black hole radiates away, where does the information go?

If you take an encyclopedia and toss it into a fire, you might think the information contained inside is lost forever. But according to the laws of quantum mechanics, it isn’t really lost at all; if you were able to capture every bit of light and ash that emerged from the fire, in principle you could exactly reconstruct everything that went into it, even the print on the book pages. But black holes, if Hawking’s result is taken at face value, seem to destroy information, at least from the perspective of the outside world. This conundrum is the “black hole information loss puzzle,” and has been nagging at physicists for decades.

In recent years, progress in understanding quantum gravity (at a purely thought-experiment level) has convinced more people that the information really is preserved. In 1997 Hawking made a bet with American physicists Kip Thorne and John Preskill; Hawking and Thorne said that information was destroyed, Preskill said that somehow it was preserved. In 2004 Hawking conceded his end of the bet, admitting that black holes don’t destroy information. However, Thorne has not conceded for his part, and Preskill himself thinks the concession was premature. Black hole radiation and entropy continue to be central guiding principles in our search for a better understanding of quantum gravity.

Quantum Cosmology

Hawking’s work on black hole radiation relied on a mixture of quantum and classical ideas. In his model, the black hole itself was treated classically, according to the rules of general relativity; meanwhile, the virtual particles near the black hole were treated using the rules of quantum mechanics. The ultimate goal of many theoretical physicists is to construct a true theory of quantum gravity, in which spacetime itself would be part of the quantum system.

If there is one place where quantum mechanics and gravity both play a central role, it’s at the origin of the universe itself. And it’s to this question, unsurprisingly, that Hawking devoted the latter part of his career. In doing so, he established the agenda for physicists’ ambitious project of understanding where our universe came from.

In quantum mechanics, a system doesn’t have a position or velocity; its state is described by a “wave function,” which tells us the probability that we would measure a particular position or velocity if we were to observe the system. In 1983, Hawking and James Hartle published a paper entitled simply “Wave Function of the Universe.” They proposed a simple procedure from which — in principle! — the state of the entire universe could be calculated. We don’t know whether the Hartle-Hawking wave function is actually the correct description of the universe. Indeed, because we don’t actually have a full theory of quantum gravity, we don’t even know whether their procedure is sensible. But their paper showed that we could talk about the very beginning of the universe in a scientific way.

Studying the origin of the universe offers the prospect of connecting quantum gravity to observable features of the universe. Cosmologists believe that tiny variations in the density of matter from very early times gradually grew into the distribution of stars and galaxies we observe today. A complete theory of the origin of the universe might be able to predict these variations, and carrying out this program is a major occupation of physicists today. Hawking made a number of contributions to this program, both from his wave function of the universe and in the context of the “inflationary universe” model proposed by Alan Guth.

Simply talking about the origin of the universe is a provocative step. It raises the prospect that science might be able to provide a complete and self-contained description of reality — a prospect that stretches beyond science, into the realms of philosophy and theology. Hawking, always provocative, never shied away from these implications. He was fond of recalling a cosmology conference hosted by the Vatican, at which Pope John Paul II allegedly told the assembled scientists not to inquire into the origin of the universe, “because that was the moment of creation and therefore the work of God.” Admonitions of this sort didn’t slow Hawking down; he lived his life in a tireless pursuit of the most fundamental questions science could tackle.


by Sean Carroll at March 16, 2018 11:23 PM

Ben Still - Neutrino Blog

Particle Physics Brick by Brick
It has been a very long time since I last posted and I apologise for that. I have been working the LEGO analogy, as described in the pentaquark series and elsewhere, into a book. The book is called Particle Physics Brick by Brick and the aim is to stretch the LEGO analogy to breaking point while covering as much of the standard model of particle physics as possible. I have had enormous fun writing it and I hope that you will enjoy it as much if you choose to buy it.

It has been available in the UK since September 2017 and you can buy it from Foyles / Waterstones / Blackwell's / AmazonUK where it is receiving ★★★★★ reviews

It is released in the US this Wednesday 21st March 2018 and you can buy it from all good book stores and Amazon.com 

I just wanted to share a few reviews of the book as well because it makes me happy!

Spend a few hours perusing these pages and you'll be in a much better frame of mind to understand your place in the cosmos... The astronomically large objects of the universe are no easier to grasp than the atomically small particles of matter. That's where Ben Still comes in, carrying a box of Legos. A British physicist with a knack for explaining abstract concepts... He starts by matching the weird properties and interactions described by the Standard Model of particle physics with the perfectly ordinary blocks of a collection of Legos. Quarks and leptons, gluons and charms are assigned to various colors and combinations of plastic bricks. Once you've got that system in mind, hang on: Still races off to illustrate the Big Bang, the birth of stars, electromagnetism and all matter of fantastical-sounding phenomenon, like mesons and beta decay. "Given enough plastic bricks, the rules in this book and enough time," Still concludes, "one might imagine that a plastic Universe could be built by us, brick by brick." Remember that the next time you accidentally step on one barefoot.--Ron Charles, The Washington Post

Complex topics explained simply An excellent book. I am Head of Physics at a school and have just ordered 60 copies of this for our L6th students for summer reading before studying the topic on particle physics early next year. Highly recommended. - Ben ★★★★★ AmazonUK

It's beautifully illustrated and very eloquently explains the fundamentals of particle ...
This is a gem of a pop science book. It's beautifully illustrated and very eloquently explains the fundamentals of particle physics without hitting you over the head with quantum field theory and Lagrangian dynamics. The author has done an exceptional job. This is a must have for all students and academics of both physics and applied maths! - Jamie ★★★★★ AmazonUK

by Ben (noreply@blogger.com) at March 16, 2018 09:32 PM

March 02, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Snowbound academics are better academics

Like most people in Ireland, I am working at home today. We got quite a dump of snow in the last two days, and there is no question of going anywhere until the roads clear. Worse, our college closed quite abruptly and I was caught on the hop – there are a lot of things (flash drives, books and papers) sitting smugly in my office that I need for my usual research.


The college on Monday evening

That said, I must admit I’m finding it all quite refreshing. For the first time in years, I have time to read interesting things in my daily email; all those postings from academic listings that I never seem to get time to read normally. I’m enjoying it so much, I wonder how much stuff I miss the rest of the time.


The view from my window as I write this

This morning, I thoroughly enjoyed a paper by Nicholas Campion on the representation of astronomy and cosmology in the works of William Shakespeare. I’ve often wondered about this as Shakespeare lived long enough to know of Galileo’s ground-breaking astronomical observations. However, anyone expecting coded references to new ideas about the universe in Shakespeare’s sonnets and plays will be disappointed; apparently he mainly sticks to classical ideas, with a few vague references to the changing order.

I’m also reading about early attempts to measure the parallax of light from a comet, especially by the great Danish astronomer Tycho Brahe. This paper comes courtesy of the History of Astronomy Discussion Group listings, a really useful resource for anyone interested in the history of astronomy.

While I’m reading all this, I’m also trying to keep abreast of a thoroughly modern debate taking place worldwide, concerning the veracity of an exciting new result in cosmology on the formation of the first stars. It seems a group studying the cosmic microwave background think they have found evidence of a signal representing the absorption of radiation from the first stars. This is exciting enough if correct, but the dramatic part is that the signal is much larger than expected, and one explanation is that this effect may be due to the presence of Dark Matter.

If true, the result would be a major step in our understanding of the formation of stars,  plus a major step in the demonstration of the existence of Dark Matter. However, it’s early days – there are many possible sources of a spurious signal and signals that are larger than expected have a poor history in modern physics! There is a nice article on this in The Guardian, and you can see some of the debate on Peter Coles’s blog In the Dark.  Right or wrong, it’s a good example of how scientific discovery works – if the team can show they have taken all possible spurious results into account, and if other groups find the same result, skepticism will soon be converted into excited acceptance.

All in all, a great day so far. My only concern is that this is the way academia should be – with our day-to-day commitments in teaching and research, it’s easy to forget there is a larger academic world out there.

Update

Of course, the best part is the walk into the village when it finally stops chucking it down. Can’t believe my local pub is open!


Dunmore East in the snow today


by cormac at March 02, 2018 01:44 PM

March 01, 2018

Sean Carroll - Preposterous Universe

Dark Matter and the Earliest Stars

So here’s something intriguing: an observational signature from the very first stars in the universe, which formed about 180 million years after the Big Bang (a little over one percent of the current age of the universe). This is exciting all by itself, and well worthy of our attention; getting data about the earliest generation of stars is notoriously difficult, and any morsel of information we can scrounge up is very helpful in putting together a picture of how the universe evolved from a relatively smooth plasma to the lumpy riot of stars and galaxies we see today. (Pop-level writeups at The Guardian and Science News, plus a helpful Twitter thread from Emma Chapman.)

But the intrigue gets kicked up a notch by an additional feature of the new results: the data imply that the cosmic gas surrounding these early stars is quite a bit cooler than we expected. What’s more, there’s a provocative explanation for why this might be the case: the gas might be cooled by interacting with dark matter. That’s quite a bit more speculative, of course, but sensible enough (and grounded in data) that it’s worth taking the possibility seriously.

[Update: skepticism has already been raised about the result. See this comment by Tim Brandt below.]

Illustration: NR Fuller, National Science Foundation

Let’s think about the stars first. We’re not seeing them directly; what we’re actually looking at is the cosmic microwave background (CMB) radiation, from about 380,000 years after the Big Bang. That radiation passes through the cosmic gas spread throughout the universe, occasionally getting absorbed. But when stars first start shining, they can very gently excite the gas around them (the 21cm hyperfine transition, for you experts), which in turn can affect the wavelength of radiation that gets absorbed. This shows up as a tiny distortion in the spectrum of the CMB itself. It’s that distortion which has now been observed, and the exact wavelength at which the distortion appears lets us work out the time at which those earliest stars began to shine.
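Concretely, the conversion from wavelength to epoch is just a redshift calculation. Here is a two-line sketch (mine, not from the post; the 78 MHz figure is the approximate centre of the absorption dip reported by the EDGES team):

    f_rest = 1420.405751768  # MHz, rest-frame frequency of the hydrogen 21cm line
    f_obs = 78.0             # MHz, approximate centre of the reported absorption dip

    z = f_rest / f_obs - 1
    print(z)  # ~17.2, corresponding to the quoted ~180 million years after the Big Bang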

Two cool things about this. First, it’s a tour de force bit of observational cosmology by Judd Bowman and collaborators. Not that collecting the data is hard by modern standards (observing the CMB is something we’re good at), but that the researchers were able to account for all of the different ways such a distortion could be produced other than by the first stars. (Contamination by such “foregrounds” is a notoriously tricky problem in CMB observations…) Second, the experiment itself is totally charming. EDGES (Experiment to Detect Global EoR [Epoch of Reionization] Signature) is a small-table-sized gizmo surrounded by a metal mesh, plopped down in a desert in Western Australia. Three cheers for small science!

But we all knew that the first stars had to be somewhen, it was just a matter of when. The surprise is that the spectral distortion is larger than expected (at 3.8 sigma), a sign that the cosmic gas surrounding the stars is colder than expected (and can therefore absorb more radiation). Why would that be the case? It’s not easy to come up with explanations — there are plenty of ways to heat up gas, but it’s not easy to cool it down.

One bold hypothesis is put forward by Rennan Barkana in a companion paper. One way to cool down gas is to have it interact with something even colder. So maybe — cold dark matter? Barkana runs the numbers, given what we know about the density of dark matter, and finds that we could get the requisite amount of cooling with a relatively light dark-matter particle — less than five times the mass of the proton, well less than expected in typical models of Weakly Interacting Massive Particles. But not completely crazy. And not really constrained by current detection limits from underground experiments, which are generally sensitive to higher masses.

The tricky part is figuring out how the dark matter could interact with the ordinary matter to cool it down. Barkana doesn’t propose any specific model, but looks at interactions that depend sharply on the relative velocity of the particles, as v^{-4}. You might get that, for example, if there was an extremely light (perhaps massless) boson mediating the interaction between dark and ordinary matter. There are already tight limits on such things, but not enough to completely squelch the idea.
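For orientation (my gloss, not anything in Barkana's paper): a Coulomb-like interaction mediated by a very light or massless boson gives exactly this scaling, since the Rutherford-type cross section in natural units behaves as

[latex]\frac{d\sigma}{d\Omega} \sim \frac{\alpha_{\rm eff}^{2}}{\mu^{2}\,v^{4}\,\sin^{4}(\theta/2)} \;\propto\; v^{-4},[/latex]

where [latex]\mu[/latex] is the reduced mass of the dark-ordinary pair and [latex]\alpha_{\rm eff}[/latex] a hypothetical coupling strength. That is why the existing limits on very light mediators are the relevant constraints here.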

This is all extraordinarily speculative, but worth keeping an eye on. It will be full employment for particle-physics model-builders, who will be tasked with coming up with full theories that predict the right relic abundance of dark matter, have the right velocity-dependent force between dark and ordinary matter, and are compatible with all other known experimental constraints. It’s worth doing, as currently all of our information about dark matter comes from its gravitational interactions, not its interactions directly with ordinary matter. Any tiny hint of that is worth taking very seriously.

But of course it might all go away. More work will be necessary to verify the observations, and to work out the possible theoretical implications. Such is life at the cutting edge of science!

by Sean Carroll at March 01, 2018 12:00 AM

February 25, 2018