Research Vignette: Optimization Against Adversarial Uncertainty

by James R. Lee, University of Washington

The rise of machine learning in recent decades has generated a renewed interest in online decision-making, where algorithms are equipped only with partial information about the world, and must take actions in the face of uncertainty about the future. There are various ways to analyze how much the presence of uncertainty degrades optimization. Two of the most general models (competitive analysis and multi-arm bandits) are “probability free” (aka “worst-case,” or “adversarial”), in the sense that one can measure the performance of algorithms without making distributional assumptions on the input.

Competitive analysis

One such model goes by the name of competitive analysis, and has been studied in theoretical CS since the 1970s. Imagine, for instance, that one maintains a binary search tree (BST) with key-value pairs at the nodes. At every point in time, a key is requested, and the corresponding value is looked up via binary search. Along the search path, the algorithm is allowed to modify the tree using standard tree rotation operations.

The cost of such an algorithm is the average number of binary search steps per key-lookup over a long sequence of requests. To minimize the cost, it is beneficial for the algorithm to dynamically rebalance the tree so that frequently-requested keys are near the root. Sleator and Tarjan had this model in mind when they invented the Splay Tree data structure. Their (still unproven) Dynamic Optimality Conjecture asserts that, for every sequence of key requests, the number of operations used by the Splay Tree is within a constant factor of the number of operations used by any binary search tree, even an offline algorithm that is allowed to see the entire sequence of key requests in advance. (This is a very strong requirement! We make no assumption on the structure of the request sequence, and have to compare our performance to an algorithm that sees the entire request sequence up front.)

In the language of competitive analysis, the conjecture states that the Splay Tree algorithm is “competitive.” In general, an \(\alpha\)-competitive online algorithm is one whose total cost on every input sequence is within an \(\alpha\) factor of the cost incurred by the optimal offline algorithm that sees the whole input sequence in advance.

Multi-arm bandits

Another compelling model arising in online learning is the multi-armed bandit problem. Here, an agent has a set \(\mathcal{A}\) of feasible actions. At time \(t=1,2,3,\ldots\), the agent plays an action \(a_t \in \mathcal{A}\), and only then the cost of that action is revealed. Imagine, for instance, choosing an advertisement \(a_t \in \mathcal{A}\) to present to a user, and then learning afterward the probability it led to a sale. The goal of an algorithm in this model is to achieve a total cost (often called the loss) that is not too much worse than the best fixed action in hindsight. The gap between the two is called the regret.

CONTINUE READING

Research Vignette: Ramsey Graphs and the Error of Explicit 2-Source Extractors

by Amnon Ta-Shma, Tel Aviv University

About a century ago, Ramsey proved that in any graph of size N, there exists a monochromatic subset of size ½ log(N), i.e., it is either a clique or an independent set.1 In 1947, Erdős proved there exist graphs with no monochromatic set of size 2log(N).2 Erdős’ proof is one of the first applications of the probabilistic method. It is a simple, straightforward counting argument, and like many other counting arguments, it shows almost any graph is good without giving any clue as to any specific such graph. Erdős offered $100 for finding an explicit K-Ramsey graph (a graph where all sets of size K are not monochromatic) for K=O(log N). The best explicit construction until a few years ago was log(K)= for some constant <1.3

All of this dramatically changed last year. Cohen,4 and independently Chattopadhyay and Zuckerman,5 constructed K-Ramsey graphs with log K=poly loglog(N). Remarkably, Chattopadhyay and Zuckerman explicitly construct the stronger object of a two-source extractor. A function  is a (K,ϵ) 2-source extractor if for every two cardinality K subsets , the distribution E(A,B) (obtained by picking uniformly  and outputting E(a,b)) is ε close to uniform in the variational distance. Roughly speaking, the Ramsey graph problem is to find a (K, ε) 2-source extractor with any error ε= ε(K) smaller than 1, even allowing error that is exponentially close to 1. In contrast, The CZ construction gives a 2-source extractor with constant error. A 2-source extractor is a basic object of fundamental importance, and the previous best 2-source construction was Bourgain’s extractor, requiring log(K) = for some small constant α>0. The CZ construction is an exponential improvement over that, requiring only log(K)=polyloglog(N).

How does the CZ construction work? Roughly speaking, the first step in the CZ construction is to encode a sample from one source (say ) with a t non-malleable extractor. I do not want to formally define non-malleability here, but this essentially means that the bits of the encoded string are “almost” t-wise independent, in the sense that except for few “bad” bits, when we look at t bits, they are close to being uniform. The next step in the CZ construction is to use the sample from the second source () to sample a substring of the encoding of a. The sampling is done using an extractor and uses the known relationship between extractors and samplers.6 Finally, a deterministic function, conceptually similar to the Majority function, is applied on the bits of the substring.

The CZ construction achieves log K=polyloglog(N), where the non-explicit argument of Erdős shows log K=loglog(N)+1 is sufficient. The first bottleneck in the CZ construction, pointed out by Cohen and Schulman,7 is the use of extractors as samplers in the construction. In a recent work, Ben-Aroya, Doron and I showed how to solve this problem using samplers with multiplicative error.8 Furthermore, Dodis et al. showed such samplers are related to low entropy-gap condensers, and Yevgeniy gave a series of talks on such condensers and their applications, mainly in cryptography, during the Spring 2017 Simons Institute program on Pseudorandomness.9 With this, the currently best explicit construction has log K=loglog(N) polylogloglog(N). The extra polylogloglog(N) factor is because current explicit t-non-malleable constructions, even for a constant t, have a suboptimal dependence on ε.

Yet, in my opinion, the most pressing bottleneck is of a completely different nature. The CZ result efficiently constructs a two-source (K,ε) extractor for small values of K, but a large error ε. Specifically, the algorithm computing E(a,b) has running time poly(1/ε), and if we take explicitness to mean running time polynomial in the input length log(N), we can only hope for error which is 1/polylog(N). In contrast, a straightforward probabilistic method argument shows we can hope for an ε which is polynomially small in K. The currently best low-error constructions are Bourgain’s 2-source extractor requiring log and Raz’ extractor which allows one source to be (almost) arbitrarily weak, but requires the other source to have min-entropy rate above half (the entropy rate is the min-entropy divided by the length of the string). At the Simons Institute program on Pseudorandomness, we were wondering whether the CZ approach that allows both sources to be weak can be somehow modified to give a low-error construction?

CONTINUE READING

Research Vignette: Promise and Limitations of Generative Adversarial Nets (GANs)

by Sanjeev Arora, Princeton University and Institute for Advanced Study

If we are asked to close our eyes and describe an imaginary beach scene, we can usually do so in great detail. Can a machine learn to do something analogous, namely, generate realistic and novel images different from those it has seen before? One expects that some day, machines will be creative and be able to generate even new essays, songs, etc., but for this article, let’s discuss only images. In machine learning, this goal of generating novel images has been formalized as follows. We hypothesize that realistic images are drawn from a probability distribution on the (vast) space of all possible images. Humans appear to be able to imagine novel samples from this vast distribution after having seen a reasonably small number of examples that arose in their personal experience. We would like machines to do the same: sample from the distribution of all realistic images. While this framing of the problem appears anticlimactic – reducing creativity/imagination to the more prosaic act of learning to generate samples from a vast distribution – it is nevertheless a powerful framework that already presents difficult computational hurdles. Many past approaches for solving this distribution learning problem used explicit statistical models, and tended to end in failure (or at best, tepid success) because real life data is just too complicated to capture using simple models.

This article is about Generative Adversarial Nets (GANs), a proposal by Goodfellow et al. in 20141 to solve this task by harnessing the power of large-scale deep learning (sometimes also called neural net training). Specifically, in the last 5-6 years, deep learning has become very successful at teaching machines to recognize familiar objects in images, in the sense that they can give labels to scenes such as street, tree, person, bicyclist, parked cars with precision approaching or exceeding that of humans. Perhaps this ability can lead naturally to a solution of the sampling problem?

CONTINUE READING

Research Vignette: Setting Posted Prices Under Uncertainty

by Amos Fiat, Tel Aviv University

Selfish Behaviour and Uncertainty
The author had the undeserved great fortune to be invited to two semesters at the Simons Institute for the Theory of Computing. The author had a truly wondrous time and is immensely grateful for the this fantastic opportunity. The opportunity to hear and interact with the amazing people attending was priceless. Thus, when asked to write a short Research Vignette,1 the author felt that it would be churlish to object strenuously. One can but hope that this is not yet another example of an error in judgment.2

Aspects of uncertainty have been studied in many disciplines, including philosophy, psychology, physics, the life sciences, economics and computer science, among others. In this Vignette, we address some aspects and models of uncertainty in statistics, computer science, and economics.

Temporal uncertainty occurs where the future or certain aspects of the future are unknown. Decisions made today (e.g., agreeing to write this article) may have unforseen consequences tomorrow (when the disastrous NYT critique is published). Optimal stopping theory,3 prophet inequality settings,4 secretary settings,5 and competitive analysis of online algorithms6 all deal with aspects of and models for temporal uncertainty. This vast body of work includes both Bayesian and worst-case models.

Another element of uncertainty arises when selfish agents interact. Clearly, it is useful to know how much a customer is willing to pay before trying to sell her something. Given that the techniques used by Torquemada are often condemned,7 one needs to be more subtle when considering the impact of private information.

One approach is to consider rational behavior and study resulting equilibria.8 Pricing equilibria (competitive equilibria) have been studied in many settings.9 Social choice theory10 and mechanism design11 seek to engineer algorithms so as to achieve certain desired goals. A wide variety of equilibrium notions have been defined, including dominant strategy, Bayes-Nash, and many others.

The computer science outlook is to quantify things – in this case, how good an outcome arises in equilibria when compared to optimal outcomes,12 and what mechanisms can be implemented in poly time.13

The setting of interest
Consider a model of online decision-making mechanisms in which selfish agents arrive sequentially, and the mechanism decides upon an outcome and payment for each arriving agent, where payments are used to align the incentives of the agent with that of the mechanism. There is temporal uncertainty because the preferences of the agents (their types) are unknown. The agent may have an opportunity to specify her type, but the decisions made by the mechanism might not be in the best interest of the agent. This may result in the agent strategically misrepresenting her preferences so as to achieve a better outcome for herself. A mechanism is truthful if it is always in the best interest of the agent to report her type truthfully.

For some time I have been obsessed with a specific class of truthful online mechanisms that take the form of dynamic posted prices.14 The assumption here is that the future is entirely unknown, and can be determined adversarially with no guarantees on future behavior. Dynamic pricing schemes are truthful online mechanisms that post prices for every possible outcome, before the next agent arrives. Then, the agent chooses the preferred outcome — minimizing the cost for the outcome plus the price tag associated with the outcome.

Online problems are either minimization problems (e.g., minimize the sum of costs) or maximization problems (e.g., maximize the number of agents served). Although optimal solutions to maximization/minimization objectives can be cast as the other, the competitive ratio is quite different in the two settings. Online algorithms have been devised with both maximization and minimization objectives. The technique of “classify and randomly select” often gives simple randomized algorithms for maximization objectives which also naturally translate into truthful mechanisms. In contrast, minimization objectives (e.g., k-server and makespan) require entirely different techniques. Converting online algorithms into mechanisms without performance degradation opens up an entire new class of problems for which incentive compatible mechanism design is applicable.

We note there are many other problems where posted prices play a major role, within widely differing models, information states, and various stochastic assumptions.15

A dynamic pricing scheme is inherently truthful, since prices are determined irrespective of the type of the next agent. Posted price mechanisms have many additional advantages over arbitrary truthful online mechanisms. In particular, such mechanisms are simple: agents need not trust or understand the logic underlying the truthful mechanism, agents are not required to reveal their type, and there is no need to verify that the agents follow the decision made by the truthful online mechanism.

A posted price mechanism is a truthful online algorithm, and as such, can perform no better than the best online algorithm. The main goal of this approach is to study the performance of dynamic posted price mechanisms (quantified by the competitive ratio measure) and compare them with the performance of the best online algorithm. One may think of this problem as analogous to one of the central questions in algorithmic mechanism design in offline settings: compare the performance of the best truthful mechanism (quantified by the approximation ratio measure) with the performance of the best non-truthful algorithm.

In a paper at EC 2017, Feldman, Roytman and I presented constant competitive dynamic pricing schemes for makespan minimization in job scheduling. Events represent jobs; the job type contains the job’s processing times on various machines. Agents seek to complete their job as soon as possible, and therefore prefer to be assigned to a machine whose load (including the new job) is minimized.16 One can consider so-called related machines (where machines have speeds, and jobs have some quantity of work to be done) and unrelated machines (where the time associated with a job arbitrarily depends on the machine). Previous online algorithms for the problem17 are not truthful — in that a job may misrepresent its size so as to get a preferential assignment to a machine.

An online truthful mechanism for this setting determines an allocation and payment for each arriving agent upon arrival. That is, upon the arrival of a job, based on the job’s processing times, the mechanism assigns the job to some machine and determines the payment the agent should make. The cost of an agent is the sum of the machine’s load (including her own processing time) and the payment. Each agent seeks to minimize her cost.

A dynamic posted price mechanism for this setting sets prices on each machine, before the next agent arrives (prices may change over time). The next agent to arrive seeks to minimize her cost, i.e., the load on the chosen machine (including her own load) plus the posted price on the machine. The agent breaks ties arbitrarily.

An example of dynamic pricing: Makespan minimization, related machines
We now motivate how dynamic pricing is useful, via a small example. We do so by comparing the schedule produced without any pricing to the schedule produced via a dynamic pricing scheme. Schedules obtained without pricing are equivalent to schedules produced by the greedy algorithm that assigns each job j to a machine i that minimizes ℓi(j − 1) + (the load on machine i prior to the arrival of job j, plus the additional time required for job j itself).

Figure 1: Related machines: the optimal makespan, the greedy algorithm, and dynamic pricing. The algorithm for setting dynamic prices is missing.


In our example (given as Figure 1) there are m = 3 machines, with speeds s1 = , s2 = (1 + ϵ), and s3 = 1 + 2ϵ; and n = 3 jobs, with sizes p1 = (1 + ϵ), p2 = , and p3 = 1 + 2ϵ. The left, middle, and right columns show the optimal assignment, the greedy assignment, and the assignment obtained by our dynamic pricing scheme, respectively. In the middle and right columns, the arrival order is from bottom to top.

Optimal makespan: The optimal makespan is L = 1, achieved by assigning job 1 to machine 2, job 2 to machine 1, and job 3 to machine 3.

Greedy: The greedy algorithm assigns job 1 to machine 3, since the machine i that minimizes is the fastest machine (initially, all loads are 0). Job 2 is also assigned to machine 3, since ℓ3(1) + = < = < (for sufficiently small ϵ > 0). Lastly, job 3 is also assigned to machine 3, since ℓ3(2) + = < = < . Hence, the greedy algorithm assigns all jobs to machine 3, resulting in a makespan of ≈ 2.

Pricing scheme: Our dynamic pricing scheme sets prices before the arrival of each job, which are independent of the type of the incoming job. We omit the explanation as to how to set these prices.18 The important aspect is that these prices are set before knowing what the next job will be. The performance guarantee must hold irrespective of the future. Let cij be the cost of job j on machine i; this is the sum of the completion time and the additional posted price currently on the machine.

Job 1 chooses machine 1, since c11 = + π11 = 1 + ϵ < 1 + π21 = + π21 = c21 < + 1 + π21 + π31 = c31.

Prior to the arrival of job 2, the dynamic pricing scheme sets new prices (see Figure 1), and job 2 chooses machine 2, since c22 = + π22 < + 1 + ϵ + = ℓ1(1) + = c12 < + 2 ≈ + π32 = c32.

Finally, prior to the arrival of job 3, the dynamic pricing scheme sets yet new prices and  job 3 chooses machine 3, since c33 = + π33 , while c13 = ℓ1(2) + = 1 + ϵ + 2p3 ≈ 3 and c23 = ℓ2(2) + + π23 = + + π23 ≈ 3.

Since machine 1 has the highest load, the schedule produced by our dynamic pricing scheme achieves a makespan of ℓ1(3) = 1 + ϵ. This example can be extended to show that greedy can be as bad as Ω(log m)-competitive, while in contrast our dynamic pricing scheme is O(1)-competitive.

During a talk in the Fall 2017 Simons Institute program on Algorithms and Uncertainty, I proposed that one study dynamic pricing schemes for flow time; and indeed, one definite outcome of this semester is a paper by Im, Moseley, Pruhs and Stein that comes up with a constant competitive dynamic pricing scheme to minimize the maximum flow time (the maximum time an agent is in the system). This improves on the previous online algorithm for the problem in terms of the competitive ratio, and has the added advantage of being truthful and simple.

Online algorithms, online mechanisms, and dynamic pricing
One critical question is the following: where do the boundaries lie between online algorithms, online mechanisms (not dynamic pricing) and dynamic pricing? In our EC paper, we show that there exist problems – makespan minimization on unrelated machines, where the pair (job, machine) determines the processing time – where good online mechanisms exist but there are no good dyanmic pricing schemes.

The connection between online mechanisms and dynamic pricing schemes is not entirely trivial.19 Dynamic pricing schemes are obviously truthful mechanisms, but converting an arbitrary online mechanism into a dynamic pricing scheme may not be possible.

We can show that online mechanisms with certain performance guarantees that also have properties a to c below can be converted into dynamic pricing schemes with the same guarantee: a) the online mechanism must be prompt (where the payments are determined immediately); b) ties in agent utilities do not arise, or can be resolved arbitrarily without harming performance guarantees — this is not an issue in the mechanism-design setting because the mechanism may decide how to break ties, but it is an issue with pricing schemes; and c) the mechanism does not require “voluntary revelation” of agent types – in pricing schemes, the only information available to the dynamic pricing scheme is what the previous agents actually chose; pricing schemes are not direct revelation schemes.

Afterword
It has been a pleasure discussing these and other problems with researchers visiting the Simons Institute, and at my home site, Tel Aviv University. I am grateful for discussions with and gracious instruction from Nikhil Bansal, Avrim Blum, Alon Eden, Michal Feldman, Anupam Gupta, Ilia Gorlick, Haim Kaplan, Anna Karlin, Dick Karp, Thomas Kesselheim, Bobby Kleinberg, Elias Koutsoupias, Stefano Leonardi, Christos Papadimitriou, Kirk Pruhs, Tim Roughgarden, and Matt Weinberg. In various combinations, we have tried to attack many other online problems via dynamic pricing. Many of these are wide open; some (minor) progress has been made on others. Many an idea arose in the numerous talks and discussions at the Simons Institute.

Notes

1Merriam-Webster online dictionary: 2a) a short descriptive literary sketch; 2b) a brief incident or scene (as in a play or movie).

2According to Aristole (Poetics, 4rd century BCE), an ideal tragedy is caused by error in judgment by the protagonist, and not by unavoidable error.

3Wald (1945), Arrow, Blackwell and Girshick (1948); Snell (1952).

4Cayley (1875); Krengel, Sucheston and Garling (1977).

5Kepler (1613); Martin Gardner (1960); Gibert and Mosteler (1966).

6Sleator and Tarjan (1985).

7Recent political developments in North America notwithstanding.

8von Neumann (1928).

9Cournot (1838); Walras (1874); Arrow and Debreu (1954).

10Condorcet (1785); Arrow (1951); Gibbard and Satterthwaite (1973,1975).

11Hurwicz, Maskin and Myerson (2007).

12Koutsoupias and Papadimitriou (1999).

13Nisan and Ronen (1999).

14Fiat, Mansour and Nadav (2008) — Packet routing; Cohen-Addad, Eden, Fiat and Jeż (2015) — Task systems, k-server, and metrical matching; Feldman, Fiat and Roytman (2017) — Makespan minimization; Im, Moseley, Pruhs and Stein (2017), the latter two discussed herein.

15E.g., Myerson’s virtual values, prophet inequalities, secretary problems, combinatorial markets – Feldman, Gavin and Lucier (2013, 2015 and 2016); Envy-free pricings – Guruswami, Hartline, Karlin, Kempe, Kenyon and McSherry (2005); and many others.

16In this interpretation, “load” is the time required by the server to deal with all current jobs in the server queue, and jobs are processed in a first-in, first-out manner, i.e., jobs enter a server queue.

17Awerbuch, Azar, Fiat, Plotkin and Waarts (1993).

18EC17 and on the archive.

19Many thanks to Moshe Babaioff, Liad Blumrosen, Yannai A. Gonczarowski and Noam Nisan for discussions helpful in clarifying this point.  

Research Vignette: Entangled Solutions to Logic Puzzles

by Albert Atserias, Universitat Politècnica de Catalunya

The Fall 2016 Simons Institute program on Logical Structures and Computation focused on four themes: a) finite and algorithmic model theory; b) logic and probability; c) logic and quantum mechanics; and d) logic and databases. In this Research Vignette, I want to highlight one of the outcomes of this fruitful program in the form of an emerging direction of research that, quite surprisingly, touches on all four themes at once.

Two-player refereed games
A game is played by Alice and Bob, who confront a verifier. After an initial gathering to agree on a common strategy, Alice and Bob will not be allowed to communicate during the game. The game starts with the verifier privately choosing a pair of queries (p,q) ∈ P × Q according to some probability distribution π known to everyone; the verifier sends p to Alice and q to Bob. Alice and Bob reply with a pair of answers (r,s) ∈ R × S according to their previously agreed strategy; Alice replies r and Bob replies s. Upon receipt of the answers, the verifier accepts or rejects depending on whether a predicate V (p,q,r,s) also known to everyone holds or not. Alice’s and Bob’s goal is to maximize the probability of winning, i.e., making the verifier accept. Let this maximum be called the value of the game.

How could we model the value of this game? One perfectly natural approach would be to model it as a constraint-optimization problem. The possible queries are the variables, the possible answers are the values for these variables, and the verifier’s predicate V (p,q,r,s) defines the constraints, weighted by π(p,q). The optimal strategy would then be an assignment of values to variables that maximizes the total weight of satisfied constraints.

However, this model does not seem to capture all the different ways Alice and Bob could interact in their initial gathering. For example, Alice and Bob could have agreed to use the outcomes of a number of coin-flips, which they could have recorded in their respective memory sticks, to determine their answers during the game. As long as they don’t communicate during the game, this form of interaction is allowed! A better model for this situation would then seem to be that Alice and Bob choose functions of the form A : P × T R and B : Q × T S, where T denotes the set of potential coin-flip results, which has an associated probability distribution σ. Their goal would now be to maximize the average weight:

It turns out that in this particular case, there is no difference in the optimal value of the two models; but one could easily imagine that if we missed this one, we could well be missing other ways that Alice and Bob could interact in defining their strategy. And indeed, we are missing some such forms of interaction.

Suppose that Alice and Bob, when they design their strategy, agree to prepare a pair of elementary particles in a suitable entangled state, in the way they were taught was possible in their quantum mechanics class. When they separate, Alice takes one of the particles with her and Bob takes the other. When they play, Alice and Bob perform certain measurements on their respective particles depending on the query each one gets, and then they use the outcomes to determine their answers. Notice that Alice and Bob still stay separated from each other; they just perform local measurements on their local particles! The question is: could the fact that the particles were entangled in the past give Alice and Bob a strategy that has greater probability of winning?

Quantum vs. hidden-variable theories
The peculiar properties of quantum entanglement have fascinated physicists, mathematicians, logicians, and philosophers since the 1920’s. It is well-known that such giants as Einstein were initially reluctant to accept the uncertainty principle of quantum mechanics in its full strength. Others, such as von Neumann, were profoundly inspired by the topic to develop their deepest work. In more recent times, theoretical computer scientists have brought forth a computational and information-theoretic perspective that, hopefully, sheds some more light. The origins of this approach go back to the famous 1964 article of J.S. Bell, “On the Einstein-Podolsky-Rosen Paradox.”

Following Bell, let’s consider a Gedankenexperiment. Before the experiment starts, a large number N of pairs of particles of so-called spin are independently prepared in an entangled state, the singlet state. The experiment starts when Alice and Bob travel to distant planets with one particle each from each pair. Once in place, they measure the spins of their particles along different axes. Alice uses axis a and Bob uses axis b, where a and b are 3-dimensional unit vectors chosen by the verifier. As they do their measurements, they collect outcomes r1,,rN ∈{+1,−1} for Alice, and s1,,sN ∈{+1,−1} for Bob, which they send back to the verifier. At this point Bell asked: what are we to expect for the value of i=1Ns iri? In other words, if the verifier counts the number of times that ri = si and subtracts the number of times that ri ≠ si, what are we to expect?

Bell computed what quantum theory predicts for these questions, and he concluded that one is to expect that i=1Nr isi approaches − cos θ as N grows, where θ ∈ [0] is the angle between the axis vectors a and b. Put differently, if we ask the verifier to accept any single pair of data (ri,si) only if ri = si, one is to expect that the verifier would accept with probability approaching cos θ. The point of Bell’s study was to contrast this calculation with what would be expected from a local hidden-variable theory of the type Einstein and his co-authors had hoped would explain the uncertainty in the experiment. In such a model, Bell argued, the statistics of i=1Nr isi would have to take the form of an average ∑ tT σ(t)i=1Nr i(t)si(t), and he went on to prove that in such a case it would have to approach the quantity −1 + θ. In other words, a verifier that accepts only if ri = si would have to accept with probability approaching . The discrepancy between and the earlier cos θ is particularly noticeable at θ ≈ 133.56 where the ratio is minimized and is as small as 0.878567.

The analysis in Bell’s Theorem can be turned into a game in which entangled strategies give higher probability of winning than unentangled ones. The question arises, however, of whether this difference can be witnessed only statistically, i.e., quantitatively, or if it manifests itself in more absolute terms, i.e., qualitatively. In other words, could there be a game in which Alice and Bob can make the verifier accept with certainty if they use entanglement but not if they don’t? Such questions had also been considered by physicists, including Bell himself, with the conclusion that, also at this level, entangled strategies are superior to unentangled ones. This type of refinement of the sense in which entanglement is more powerful, which led Mermin to coin Kell-Kochen-Specker type theorems in his beautiful paper, “Hidden variables and the two theorems of John Bell,” ties better to the problems of interest to logicians, and will take our attention for the rest of this Vignette.

Quantum relaxations of constraint satisfaction problems
Take an instance of the constraint satisfaction problem with Boolean variables X1,,Xn ranging over {+1,−1}, and constraints given by, say, polynomial equations of the form P(Xi1,,Xik) = 0. There is a well-known game that models this problem: the verifier chooses both a constraint and a variable that appears in this constraint, sends the constraint to Alice and the variable to Bob, and expects an assignment that satisfies the constraint from Alice, and an assignment that agrees with Alice’s on the pointed variable from Bob. It is easy to see that the solutions to the constraint-satisfaction instance give rise to (unentangled) strategies that win with certainty in the game, and vice-versa. What about entangled strategies? What do they correspond to at the level of the constraint-satisfaction instance? In their ICALP 2014 paper, theoretical computer scientists Richard Cleve and Rajat Mittal addressed this question and, to answer it, introduced the following quantum relaxation of the constraint-satisfaction problem:

Let the variables X1,,Xn range over Hermitian d-dimensional matrices in ℂd×d for some arbitrary but unspecified dimension d ≥ 1. Impose the conditions that Xi2 = I, where I is the identity matrix, and that the matrix product on Xi and Xj commutes whenever Xi and Xj appear together in at least one of the given polynomial equations. Note that the case d = 1 is no relaxation at all since the 1-dimensional matrices are just scalars, whose product obviously commutes, and the equations Xi2 = I translate to X i ∈{+1,−1}. Cleve and Mittal proved that entangled winning strategies in the game give rise to solutions to this relaxed version of the problem, and vice-versa.

Now we can ask the type of questions that Bell asked. Could there be a system of polynomial equations that is quantum-satisfiable in the sense above but not classically satisfiable? Some well-known such examples come from the work of the physicists in their proofs of the Bell-Kochen-Specker Theorems; the best known is the Mermin-Peres Magic Square made of six polynomial equations (the three rows and the three columns in the square below) over nine variables:

Seen as a logic puzzle of the kind that one used to find in newspapers, this appears to have no solution: all variables appear exactly twice so the left-hand sides of the equations multiplied together give 1, while the right-hand sides give −1. In contrast, the system is quantum-satisfiable by means of a 4-dimensional solution made of suitable Kronecker products of the classical Pauli matrices (Fig 3. in Mermin’s paper — see Notes for further reading, below).

Emerging directions
In his series of lectures in the Logic and Computation Boot Camp that took place at the begining of the program, Samson Abramsky gave an introduction to the topic of Bell-type results from a logician’s viewpoint. A recurrent theme within the lectures was the extent to which local constraints on a system, such as the outcomes of physical measurements, may or may not influence its global properties. Put this generally, this is indeed a common theme of most subareas of logic in computer science. It appeared in Panangaden’s lectures on the analysis of probabilistic systems while constructing continuous probability spaces that meet the local constraints that arise from conditioning. It also appeared in Kolaitis’ lectures on database theory while discussing how the local structure of a conjunctive query can influence the size of its output and the computational complexity of computing it. And it appeared in Dawar’s course on finite and algorithmic model theory as the fundamental question of whether the local structure of a finite model could determine it up to isomorphism. I will close this Vignette with a discussion on a research question that was raised at the program, which links this last topic with Bell-Kochen-Specker-type theorems, and which was successfully answered in a team-effort during the program.

The work of Cleve and Mittal referred to above is the continuation of a recent trend that takes classical combinatorial problems and relaxes them into quantum analogues: quantum chromatic numbers, quantum homomorphisms, quantum Lovász theta, etc. Let us take quantum graph isomorphism. We are given two graphs G and H. The verifier chooses two vertices from either graph and sends them to Alice and Bob, one vertex each. If Alice gets a vertex u from G, she must reply with a vertex v from H, and vice-versa. If Bob gets a vertex u′ from G, he must reply with a vertex v′ from H, and vice-versa. The verifier will accept if and only if the two pointed vertices from G satisfy the same equalities and edge relationships as the two pointed vertices from H; i.e., he will accept precisely when u = u′ if and only if v = v′, and u Gu′ if and only if v Hv′. It is not hard to see that Alice and Bob have an unentangled strategy that wins this game with certainty if and only if G and H are isomorphic. Are there pairs of graphs on which Alice and Bob win with entanglement but not without?

The graph isomorphism problem is of course of fundamental importance in theoretical computer science. For one thing, it is one of the few problems in the complexity class NP that is not known to be polynomial-time solvable or NP-complete. Thanks to the recent breakthough of Babai, we now know that the problem is solvable in quasipolynomial time, and thanks to the earlier work in probabilistic proof systems, we already had strong evidence that the problem cannot be NP-complete. These two facts could explain why it has always been so hard to find hard-to-solve instances of the graph isomorphism problem. To put it graphically, if you give me a reasonably smart heuristic to solve the graph isomorphism problem, chances are that I will have a hard time finding an example on which your heuristic fails. For this very same reason, finding two graphs on which Alice and Bob can win the isomorphism game with entanglement but not without does not look like a very easy question.

It turns out that similar questions had been asked previously in descriptive complexity theory in the 1980’s. The fundamental problem of that area, which has its origins in the theory of relational databases, is whether there is a logic that is able to express all and only the isomorphism-invariant graph properties that can be decided in polynomial time. In the 1980’s some candidate logics were put forward, until one was found to resist all attempts at a counterexample. This was the logic now called Fixed-Point Logic with Counting (FPC). Eventually, in a breakthrough that became known as the CFI-construction, J. Y. Cai, M. Fürer and N. Immerman found a counterexample: they proved that the isomorphism problem for graphs with color-classes of bounded size, which is a special case that is solvable in polynomial time, cannot be expressed in FPC.

Besides its original purpose, the CFI-construction became also the canonical “hard-to-solve” instance for combinatorial heuristics for the graph isomorphism problem. For one thing, it was the first construction to show that there exist pairs of non-isomorphic graphs for which the classical vertex-refinement heuristic, when extended to tuple-refinement in the straightforward way, would need to use tuples of unbounded length before it detects that the graphs are not isomorphic. Could the same construction be used to solve our problem of finding non-isomorphic graphs that are nonetheless quantum isomorphic? One difficulty in going this way is that nobody knows of a systematic method to decide if two given graphs are quantum isomorphic; as far as we know, the problem could be even undecidable. For this reason, it would have been more natural to start by constructing two graphs that are quantum isomorphic by design, and only then show that they are not classically isomorphic. But CFI was already there, and we had to give it a try. So we tried it, and… it worked!

The key to finding the solution was the previously noted fact that, in the context of the logical and universal-algebraic approaches to the Feder-Vardi Dichotomy Conjecture for constraint satisfaction problems, the CFI-construction can be interpreted as a system of parity equations in which each variable appears in exactly two equations. Coincidentally (or not), these are precisely the type of systems of equations that underly the Mermin-Peres Magic Square and similar constructions. From there, a few calculations sufficed to realize that Mermin’s solution to the magic square could be reused, and the problem was solved.

Notes for further reading
Bell’s paper appears in Physics 1, 1964: 195-200. The ratio   in our exposition of Bell’s analysis is exactly the same as Goemans-Williamson’s approximation ratio for MAX-CUT and is related to Grothendieck’s constant; this is hardly a coincidence. See for example the paper by Regev and Vidick, “Quantum XOR Games,” ACM TOCT, 7(4), 2015, n. 15, and references therein. For the Bell-Kochen-Specker Theorem and the magic square see Mermin’s Rev. Mod. Phy. 65(3), 1993. Cleve and Mittal’s paper appeared in the Proceedings of ICALP 2014 and also QIP 2014 and arXiv:1209.2729 [quant-ph]. The Cai-Fürer-Immerman paper appears in Combinatorica 12(4), 1992: 389-410, with an earlier version in FOCS 1989. The new results on quantum graph isomorphism appear in Atserias, Mančinska, Roberson, Šámal, Severini, and Varvitsiotis, “Quantum and non-signalling graph isomorphisms,” arXiv:1611.09837 [quant-ph], with a shorter version in ICALP 2017. Two other recent papers that report work along these lines are Atserias, Kolaitis and Severini, “Generalized Satisfiability Problems via Operator Assignments,” arXiv:1704.01736 [cs.LO]; and Abramsky, Dawar and Wang, “The pebbling comonad in finite model theory,” arXiv:1704.05124 [cs.LO]. This last will appear in shorter form in LICS 2017.  

Research Vignette: Network Biology Meets Cancer Genomics

by Teresa Przytycka and Roded Sharan

The Spring 2016 Simons Institute program on Algorithmic Challenges in Genomics focused on three main research areas: Computational Cancer Biology, Regulatory Genomics and Epigenomics, and Network Biology. The different tracks allowed us, the long-term visitors, to better understand the relationships among the different areas and come up with ideas on how they can be integrated to gain further insights into biology. Here, we would like to tell the story of one such integration attempt that was initiated during the Simons program, where ideas from network biology led to new formulations and novel findings in cancer genomics.

A fundamental concept in cancer genomics is that of a disease module that contains functionally related genes that jointly drive the disease. The discovery of such gene sets can reveal the molecular mechanisms that underlie the disease and suggest potential therapeutic targets. It is generally assumed that cancer progresses through stages, starting from a precancerous lesion caused, for example, by DNA damage. If such damage cannot be properly handled by the cell repair mechanisms, for example when such mechanisms are compromised by mutations, an early stage of cancer emerges. In this state, the organism still relies on cellular mechanisms such as apoptosis, cell cycle arrest, or senescence, to guard against uncontrolled cell growth and proliferation. However, once those barriers are compromised, say by further mutations, the disease progresses. While the progression seems sequential, steered by driver mutations that provide a growth advantage to the evolving tumor cells, the actual process is more complex and not fully understood. To shed light on the progression process, we need not only to be able to identify disease modules, but also to understand the relationships among them.

A common approach to detecting cancer modules is to look for sets of genes that are mutually exclusive in the mutations they harbor, across a cohort of patients. The underlying assumption is that the module can be disrupted by a mutation in any of its member genes, thereby driving the disease progression, and additional mutations do not provide an advantage to the disease. As driver mutations are selected for during cancer evolution, the module’s genes often exhibit mutual exclusivity mutation patterns across different patients. Other inter-gene relations such as physical interactions, co-expression, functional similarity, and more, can inform the discovery of cancer modules. Thanks to public projects, such as The Cancer Genome Atlas, researchers can now access genomic data of thousands of patients across tens of cancer types. Such valuable data sources enable the systematic search of disease modules using different computational and statistical approaches [1].

While several research projects have tried to infer disease modules from mutation data, most have focused on mutual exclusivity relations. Only a few projects have considered co-occurrence relations, the integration of other types of data, or the modeling of inter-module relations. These more complex scenarios, however, are very common in network biology, where multiple relations can be modeled by multiple edge types in a gene network. The discovery of interesting edge-patterns can then be cast as a clustering problem where one aims to maximize within-cluster associations as well as between-cluster relations. It thus seemed natural to try and apply such general clustering formulations in the cancer genomics domain.

Multiple relations can exist within and between genes of true cancer modules. We already mentioned mutual exclusivity within modules (Figure, Panel A), but it can also be present between modules, representing different cancer subtypes or different cancer progression pathways (Figure, Panel B). Similarly, mutations can co-occur within modules, representing some shared mutagenic process, or between modules, representing a synergistic relation, where two alternative pathways need to be disrupted for the disease to occur (Figure, Panel C). Finally, known gene interactions or functional associations are typically within modules, representing their shared role in some molecular pathway driving the disease.

To capture all these different relations, Phuong Dao, Yoo-Ah Kim, Damian Wojtowicz, Sanna Madan and I modeled them within a general clustering framework that scores a given collection of modules by their within- and between-edges. The resulting clustering problem is reminiscent of cluster editing, also known as correlation clustering, where one wishes to minimally edit the edge set of a given graph so that it is transformed into a collection of disjoint cliques. While the problem is known to be NP-hard [2], it admits a practical and exact solution via integer linear programming [3]. Motivated by the success of the latter approach, we formulated the within-between module detection problem using integer linear programming. By applying a commercial solver (Cplex), we could solve the problem to optimality on current data sets, consisting of hundreds of mutated genes and thousands of relations across hundreds of patients, in less than an hour. The integrated network view of these relations made it possible to systematically test the contribution of the different types of relations, both within and between modules, to module discovery.

In an application to real data from The Cancer Genome Atlas project, we were able to form comprehensive maps of the modules underlying different cancers and gain novel insights about the roles of different genes in driving the disease. As an example, when maximizing the mutual exclusivity of mutations within modules and the co-occurrence of mutations between modules, we uncovered pairs of modules that work in a synergistic fashion, where the disruption of both is required for the disease to occur (Figure, Panel C). Overall, our general strategy that accounts for a number of complementary relations proved to be a valuable tool in dissecting the mutational landscape of cancer to infer molecular mechanisms that drive the disease.

References

  1. Y-A Kim, D-Y Cho, and TM Przytycka (2016). “Understanding Genotype-Phenotype Effects in Cancer via Network Approaches.” PLoS Comput Biol 12: e1004747.
  2. R Shamir, R Sharan, and D Tsur (2004). “Cluster graph modification problems.” Discrete Appl Math 144: 173–182.
  3. S Böcker S, S Briesemeister, and GW Klau (2009). “Exact Algorithms for Cluster Editing: Evaluation and Experiments.” Algorithmica 60: 316–334.

Research Vignette: Phase Transitions in Random Constraint Satisfaction Problems

by Allan Sly and Nike Sun

Random constraint satisfaction problems

Suppose we are given a finite graph. Does it admit a proper three-coloring? This is one example of a constraint satisfaction problem (CSP). Generally speaking, a CSP is specified by a list of variables, subject to a set of constraints. A prototypical computational question is to decide, in a given CSP instance, whether there exists any variable assignment satisfying all the constraints. If yes, then one may further ask how many solutions are there, or what is the optimal value of some objective function over the set of solutions.

On general inputs, these questions are not believed to be computationally tractable — many CSPs are NP-complete. For the problem of deciding whether a given graph is three-colorable, the best known algorithms require more than 2 time on the “worst” graphs of size n. In view of this, it became natural to develop notions of tractability in “average” or “typical” cases (Levin 1986). One of the standard ways to approach this has been via ensembles of random CSPs. For example, say that we sample a large random graph from some natural distribution — might it be the case that “in most cases” it is easy to decide whether the graph is three-colorable?

Satisfiability transitions in random CSPs

For all the commonly studied random CSP ensembles, the (average-case) complexity remains an open question. A possible reason for this is that even before going into computational considerations, there are far more basic properties of random CSPs that have not been resolved. For example, it seems very difficult to estimate the probability that a random CSP is satisfiable. In many examples, the random CSP can be naturally parametrized by a “constraint level” α — commonly, α is the ratio of the number of constraints to the number of variables. Numerical experiments (early 1990s) indicated that some of the most canonical random CSPs have a sharp satisfiability transition — a critical threshold 0 < αsat < ∞ such that the random CSP is satisfiable with high probability for α < αsat, and unsatisfiable with high probability for α > αsat.

This transition was most studied for the random k-SAT problem, defined as follows. Start with a set of boolean variables x1,…,xn. A literal is a variable xi, or its negation ¬xi. For i ≥ 1, the random clause ai is the disjunction of k literals chosen independently and uniformly from the set of literals {x1x1,…,xnxn}; the clauses are chosen mutually independently. A random k-SAT formula at density α is defined by the conjunction of the first M clauses where M is a Poisson() random variable: Φ = a1 ∧ … ∧ aM. Its solution space is the set S = S(Φ) of variable assignments x ∈ {T,F}n which make the formula Φ satisfied, meaning Φ(x) = T. Let p = pk(α,n) be the probability that the random formula Φ is satisfiable — equivalently, p is the probability for S to be nonempty. The satisfiability threshold conjecture is that for each k ≥ 2, there is a critical αsat (depending on k but not on n) such that

This was proved for k = 2 (with αsat = 1) in a few independent works around 1992 (Chvátal–Reed, de la Vega, Goerdt), but remained open for k ≥ 3. It is known (Friedgut 1999) that for each k ≥ 3 there exists a sharp threshold sequence αsat(n) — the conjecture holds if the sequence converges.

CONTINUE READING

Research Vignette: Simplicity, Optimality and Efficiency Tradeoffs in Economics

by Constantinos Daskalakis

In the beauty pageant of algorithms, contestants are judged based on their efficiency in the use of computational resources. Equally important is the quality of the solutions they compute, so algorithm design is typically thought to target an optimality-versus-efficiency tradeoff.

A feature of algorithms that is hard to quantify and trade against efficiency and optimality is their simplicity. In turn, this feature is quite important in several practical situations. A domain where simplicity creeps up as an important consideration, as we will explain next, is mechanism design. This subfield of economics and game theory takes an engineering approach to the design of incentive structures, called mechanisms. The goal of a mechanism is to motivate a group of strategic agents — who are interested in optimizing their own utilities — to choose actions that ultimately effectuate the designer’s objectives. For example, when designing a road network, a planner may want to minimize traffic by providing the right incentives (via tolls, traffic lights, and other traffic regulations) to individual drivers (who only care to minimize their own driving times) to choose routes that benefit the overall network traffic. The practical applications of mechanism design are abundant, ranging from economics to politics, and include the design of markets, auctions and online advertising platforms, the design of voting rules, the design of road and information networks, etc.

Everything else being equal, it is always desirable to adopt simple mechanisms. Complex mechanisms may lead to frustration for participants, who, unable to understand how to optimize their actions, may behave erratically and unpredictably, ultimately destroying system performance. At the same time, mechanism design often depends on data from market analysis, the process of scouting information about the strategic preferences of the agents that will participate in the designed mechanism. For instance, when designing auctions, it is useful to have information about how much the bidders value the items that are being sold. Whatever information it is that we have, we would not want to run the risk of overfitting it, so opting for simple mechanisms is often a good idea to achieve good generalization.

Ultimately, mechanism design targets a simplicity, optimality and computational efficiency tradeoff. The first workshop of Simons Institute’s Fall 2015 program on Economics and Computation was devoted to the study of this tradeoff in the context of mechanism design, and beyond. There was a series of interesting talks, by computer scientists and economists, capturing different angles on the tradeoff. In the course of the semester, several groups of program participants worked in this space, including Cai, Devanur and Weinberg, who provided a principled approach based on duality theory for designing simple, multi-item auctions that extract a good fraction of the optimal revenue [1]. This is an important task, as revenue-optimal mechanisms are known to be extremely complex [2,3].

In this Research Vignette, I outline one of my own projects in this space, which I find represents an exciting direction for future investigation. It is joint work with Vasilis Syrgkanis, who was a speaker in the aforementioned workshop, and it pertains to the well-studied problem of combinatorial auctions. This problem involves a seller withindivisible items, which she wishes to sell to a group ofbuyers. Every buyer is characterized by a valuation functionmapping each bundleof items to the buyer’s valuefor this bundle. The seller’s goal is to find a partitionof the items together with pricesso as to maximize the total welfare resulting from allocating bundle to each buyer and charging him. This allocation would induce total buyer utility and seller revenue , so the total welfare would simply be

CONTINUE READING

Research Vignette: Strengthening SETH Lower Bounds

by Amir Abboud

A central goal of complexity theory is to understand and prove lower bounds for the time complexity of fundamental problems. One of the most important computational problems is Edit Distance, the problem of computing the minimum number of edit operations (insertions, deletions, and substitutions) required to transform one sequence into another. A classical dynamic programming algorithm that is taught in basic algorithms courses solves Edit Distance on sequences of length n in O(n2) time. Generalizations of Edit Distance are of fundamental importance in computational biology and genomics, where n is typically a few billions, and this quadratic runtime is prohibitive. A faster, e.g. linear-time, algorithm would have far reaching consequences: the paper introducing BLAST, a heuristic algorithm for a generalization of Edit Distance, that runs in linear time but is not guaranteed to return an optimal solution, has been cited more than fifty thousand times. Despite decades of attempts, no upper bound below the O(n2/log2n) bound of Masek and Paterson is known for Edit Distance.

This is a situation where lower bounds are highly desirable, but unfortunately, the current state of the art in complexity is far from providing a lower bound that is close to quadratic for any natural problem in NP, let alone Edit Distance. Therefore, researchers have turned their attention to conditional lower bounds, and a recent celebrated result by Backurs and Indyk shows that Edit Distance cannot be solved in truly subquadratic O(n2−ε) time, for some ε>0, unless the Strong Exponential Time Hypothesis (SETH) is false (this is among the few STOC papers that made it to the Boston Globe news website). SETH is a pessimistic version of PNP, which essentially states that there is no ε>0 such that SAT on CNF formulas on n variables and O(n) clauses can be solved in O((2−ε)n) time.

Other interesting recent results show that under SETH, the current algorithms for many central problems in diverse areas of computer science are optimal, up to no(1) factors. These areas include pattern matching, graph algorithms, computational geometry, data mining, and economics, and the list is growing by the day. These conditional lower bounds are a part of a more general line of work in which one bases the hardness of important problems in P on well-known conjectures about the exact complexity of other famous problems. Other conjectures concern the exact complexity of 3-SUM (given n integers, do three of them sum to zero?) and All-Pairs-Shortest-Path (compute all pairwise distances in a weighted graph). This line of work was the focus of the third workshop in the Fall 2015 Simons Institute Program on Fine-Grained Complexity and Algorithm Design, entitled Computational Complexity of Low-Polynomial Time Problems. It seems that we are rapidly approaching an exciting state of affairs in which a tight lower bound for many problems of practical significance can be derived from a small set of such conjectures.

One of the major goals of this research is to strengthen the foundations on which these results are based. We do not have strong evidence supporting these conjectures, and perhaps the main reason to believe them is the fact that no one knows how to refute them. For example, SETH was introduced by Impagliazzo and Paturi as a plausible explanation for the lack of (2−ε)n algorithms for CNF-SAT, despite its being one of the most extensively studied problems in computer science. Can we find more evidence for these conjectures? Can we show surprising consequences of refuting them? Alternatively, can we replace them with more believable ones?

In the rest of this Vignette I will discuss a recent paper with Thomas Dueholm Hansen, Virginia Vassilevska Williams, and Ryan Williams, in which we show that many of the SETH lower bounds can be based on an assumption that is much more believable than SETH.

CONTINUE READING

Research Vignette: Mathematical Software Obfuscation

By Amit Sahai

From revolutionaries and spies to ordinary citizens living under repressive regimes, for centuries people have had the need to hide operational secrets. By an operational secret, we mean a secret that must be kept safe, where the secret is also used to perform ordinary tasks. For example, someone may want to keep the amount of cash she has on-hand secret, but she also needs to refer to this information when making everyday purchases. Even secrets like the location of hidden refugees under one’s protection could be an operational secret, since the holder of the secret would want to plan her ordinary movements to avoid the secret location that she is protecting. Through many clever means, people throughout history managed to protect such secrets. Essential to such protection was the refuge of the human mind, the ultimate sanctuary where secrets can be pondered and processed without fear of loss.

But what if this most basic assumption proves false? What if an adversary can read someone’s mind while she is thinking about the secret she needs to protect? Indeed, it is hard to imagine how someone can protect secrets when the inner dialogue of her mind can betray her. Fortunately, this scenario remains science fiction when applied to humanity. However, if we replace humans by computer programs, this situation is all too common. Computer programs, with inbuilt databases of sensitive information, are routinely captured by adversarial entities. These adversaries can reverse-engineer these programs and see every detail of how the programs “think” about their secrets as they perform their calculations. Furthermore, adversaries can even modify the programs to alter their behavior, in order to coax secrets out of the original computer code. Because computer programs with inbuilt operational secrets are so useful, cryptographic researchers have pondered this challenge as far back as the classic work of Diffie and Hellman in 1976. For decades, however, our ability to suitably “obfuscate” programs in order to defend against these kinds of attacks was based on unsatisfying approaches that failed to provide security against even moderately skilled adversaries.

This changed in 2013, due to the work of Garg, Gentry, Halevi, Raykova, Sahai and Waters, which gave the first sound mathematical approach to this problem. This work enabling a mathematical approach to securing software has been hailed as, “a watershed moment for cryptography,” by the Simons Foundation’s Quanta magazine, and its ramifications were a major theme of the Simons Institute program on Cryptography in Summer 2015. To illustrate why this recent advance in obfuscation has caused such a stir in the cryptographic community, let us consider an analogy with the ancient problem of sending encrypted messages.

CONTINUE READING