
Tony Phillips' Take on Math in the Media: A monthly survey of math news

# This month's topics:

## The mathematics of pasta

Kenneth Chang contributed "Pasta Graduates from Alphabet Soup to Advanced Geometry" to the January 9, 2012 New York Times. Illustrated with Mathematica-generated images of specimens of cappelletti, quadrefiore, mafaldine, fagottini, radiatori and funghini (selected from the 103 forms listed on food-info.net), Chang's piece refers to three sources. One is the Pasta Visualization I and II pages from Sander Huisman's blog (remarkable gemelli). The next is a book devoted to the subject: "Pasta by Design" by Marco Guarnieri and George L. Legendre (Thames & Hudson, London, 2011), which "classifies 92 types of pasta, organizing them into an evolutionlike family tree. For each, the book provides a mathematical equation, a mouthwatering picture and a paragraph of suggestions, like sauces to eat it with." The third is a "pop quiz" administered to students in Kate Okikiolu's Winter 2006 Vector Calculus course at U. C. San Diego by Chris Tiee, one of her TAs. Tiee gave the students images of fusilli, penne rigate, conchiglie rigate, cavatappi and farfalle to match with five sets of parametric equations.

The five varieties of pasta used by Chris Tiee in his 2006 Vector Calculus pop quiz. The varieties were to be matched with parametric equations like $$\left ( \begin{array}{c}x\\y\\z \end{array}\right ) = \left ( \begin{array}{c}t+\frac{1}{10}\sin(10s)\\ \frac{2s}{3}(1.2-\frac{1}{1+t^2})\\ \frac{\sin(\pi s)}{2\pi s} \end{array}\right ), ~~-\pi\leq s\leq \pi,~~ -3\leq t\leq 3.$$
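To get a feel for the surface these equations describe, one can evaluate them on a small grid of $(s,t)$ values. The following is a minimal sketch in plain Python; the function names `pasta_point` and `sample_surface` are hypothetical, and the only subtlety is the $z$ coordinate, which has the form $0/0$ at $s=0$ and is replaced there by its limit $1/2$.

```python
import math


def pasta_point(s, t):
    """Evaluate the parametric surface quoted above at parameters (s, t)."""
    x = t + 0.1 * math.sin(10 * s)
    y = (2 * s / 3) * (1.2 - 1 / (1 + t * t))
    # sin(pi*s) / (2*pi*s) is 0/0 at s = 0; its limit there is 1/2.
    z = 0.5 if s == 0 else math.sin(math.pi * s) / (2 * math.pi * s)
    return (x, y, z)


def sample_surface(n_s=5, n_t=5):
    """Sample the surface on an n_s-by-n_t grid over the stated ranges
    -pi <= s <= pi and -3 <= t <= 3 (both n_s and n_t must be at least 2)."""
    points = []
    for i in range(n_s):
        s = -math.pi + 2 * math.pi * i / (n_s - 1)
        for j in range(n_t):
            t = -3 + 6 * j / (n_t - 1)
            points.append(pasta_point(s, t))
    return points
```

The factor $1.2-\frac{1}{1+t^2}$ in the $y$ coordinate drops to $0.2$ at $t=0$, so the sheet is narrowest across its middle; feeding the sampled points to any 3D plotting tool makes this visible.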

## Correlation from information theory

"Detecting Novel Associations in Large Data Sets" appeared in Science for December 16, 2011; authors: a 9-person team led by David Reshef (MIT) and Yakir Reshef. The abstract sets out the main point: "Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships, the maximal information coefficient (MIC)." The MIC realizes the goal of a statistic that can "capture a wide range of interesting associations, not limited to specific function types (such as linear, exponential, or periodic) or even to all functional relationships." As the authors remark, "many important relationships ... are not well modeled by a function." What is the MIC? "Intuitively, MIC is based on the idea that if a relationship exists between two variables, then a grid can be drawn on the scatterplot of the two variables that partitions the data to encapsulate that relationship." More specifically, given a scatterplot, "we explore all grids up to a maximal grid resolution, dependent on the sample size, computing for every pair of integers $(x,y)$ the largest possible mutual information achievable by any $x$-by-$y$ grid applied to the data." MIC is the maximum of these numbers (after some normalization) over all possible grid sizes. [The crucial ingredient here is mutual information. Given an $x$-by-$y$ grid $G$ with boxes numbered $(i,j)$, $1\leq i \leq x,$ $1\leq j\leq y$, one compares for each pair $(i,j)$ the probability $p(i,j)$ that a point of the scatterplot will lie in box $(i,j)$ to the product $p(i,.)p(.,j)$ of the probabilities that a point lies in the $i$th row and the $j$th column. (Clearly, if there is no association between the two coordinates of the scatterplot, these two numbers will be equal). The mutual information contained in $G$ is then $$\sum_{i,j} p(i,j) \log \frac{p(i,j)}{p(i,.)p(.,j)},$$ following Shannon.] 
The authors compare MIC to other measures of correlation (they report that it works better) and give several applications to real data sets using numbers from the World Health Organization.
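The mutual-information sum above is easy to compute once the scatterplot points have been counted box by box. Here is a minimal sketch, assuming the counts are already available as a list of rows; the function name `grid_mutual_information` is hypothetical, natural logarithms are used, and empty boxes are skipped because they contribute nothing to the sum. This handles one fixed grid only: the search over grids and the normalization that turn it into MIC are not shown.

```python
import math


def grid_mutual_information(counts):
    """Mutual information (in nats) of one fixed grid.

    counts[i][j] is the number of scatterplot points falling in box (i, j).
    """
    total = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    # Marginal probabilities p(i,.) and p(.,j).
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(n_cols)]

    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c == 0:
                continue  # an empty box contributes 0 to the sum
            p = c / total
            mi += p * math.log(p / (row_p[i] * col_p[j]))
    return mi
```

For a 2-by-2 grid with the same count in every box, each $p(i,j)$ equals $p(i,.)p(.,j)$ and the result is 0; with all points on the diagonal, as in `[[5, 0], [0, 5]]`, the result is $\log 2$.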

## The hit potential equation

The British and European press, including wired.co.uk (Mark Brown, December 19), were quick to pick up the University of Bristol's press release, "Can science predict a hit song?" issued December 17, 2011. According to Brown, "Machine learning engineers from the University of Bristol think they might have the master equation to predicting the popularity of a song. The so-called Hit Potential Equation looks a little something like this: $$\mbox{Score} = (w_1 \times f_1) + (w_2 \times f_2) + (w_3 \times f_3) + (w_4 \times f_4), \mbox{etc.}$$ Simple, right? The "w"s are "weights," [and the "f"s are] musical features like tempo, time signature, song duration, loudness and how energetic it is. By using a machine-learning algorithm, the team could mine official UK top 40 singles charts over the past 50 years to see how important these 23 features are to producing a hit song."

Brown quotes Tijl De Bie, senior lecturer at Bristol: "musical tastes evolve, which means our 'hit potential equation' needs to evolve as well." ScoreAHit.com, the project's website, presents an animation of the fluctuations in the calculated weights, charted from 1962 until the present: a quantitative picture of exactly how UK pop music taste has been changing.
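The score in the quoted equation is simply a weighted sum of feature values. A minimal sketch follows; the function name `hit_score` and the numbers in the usage note are hypothetical illustrations, not the Bristol team's features or fitted weights.

```python
def hit_score(weights, features):
    """Weighted sum (w_1 * f_1) + (w_2 * f_2) + ... of feature values."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))
```

With made-up weights `[0.5, -0.25, 1.0]` and feature values `[2.0, 4.0, 1.0]`, the score is $1.0 - 1.0 + 1.0 = 1.0$. Letting the weights change from year to year, as De Bie describes, would amount to calling the same function with a different `weights` list for each period.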

## Numerosity in neural networks

"Neural network gets an idea of number without counting" was posted by Celeste Biever to NewScientist (Tech) on January 20, 2012. "An artificial brain has taught itself to estimate the number of objects in an image without actually counting them, emulating abilities displayed by some animals including lions and fish, as well as humans. Because the model was not preprogrammed with numerical capabilities, the feat suggests that this skill emerges due to general learning processes rather than number-specific mechanisms." Biever, picking up an article in Nature Neuroscience by Ivilin Stoianov and Marco Zorzi (Università di Padova), reports: "The skill in question is known as approximate number sense. A simple test of ANS involves looking at two groups of dots on a page and intuitively knowing which has more dots, even though you have not counted them. Fish use ANS to pick the larger, and therefore safer, shoal to swim in." Stoianov and Zorzi "used a computerised neural network that responds to images and generates new 'fantasy' ones based on rules that it deduces from the original images." Their findings "suggested that the network had learned to estimate the number of objects in each image as part of its rules for generating images."

"The finding may also help us to understand dyscalculia - where people find it nearly impossible to acquire basic number and arithmetic skills - and enhance robotics and computer vision."

## Splendeur et misère of mathematicians

The Parisian daily Libération published (online, January 21, 2012) an interview with David Bessis ("poet, author and mathematician"). The interviewer, Anne Diatkine, sets the stage by remarking on the increased public awareness of mathematics--as a profession--in France today. She mentions the exhibition "Mathematics--a Beautiful Elsewhere" running at the Fondation Cartier through February (see Sarah Moroz's travel blog in the online New York Times, January 24) and a recent book, "Fors Intérieurs" by Isabelle Boccon-Gibod (in-depth accounts of the interior lives of eight mathematicians) as part of an effort to make "the ultimate hermetism accessible or even attractive." The Bessis interview is far-ranging but dwells in particular on what mathematicians experience but don't often recount: the pain of "understanding nothing, and often at the heart of one's own work," the anguish of "being too stupid to understand." "Doing mathematical research requires a colossal effort to tame one's fear, to manipulate oneself and to absorb elements which were initially incomprehensible." On the other hand he evokes the "giddy exhilaration of understanding," which he calls the main reason for his love of mathematics. "To be at ease in a world which one moment before was inaccessible--that is extraordinary." [My translations -TP].

Tony Phillips
Stony Brook University
tony at math.sunysb.edu