Enhancing Security via Deliberate Unpredictability of Solutions in Optimisation

Abstract

The main aim of decision support systems is to find solutions that satisfy user requirements. Often, this leads to predictability of those solutions, in the sense that having the input data and the model, an adversary or enemy can predict to a great extent the solution produced by your decision support system. Such predictability can be undesirable, for example, in military or security timetabling, or applications that require anonymity. In this paper, we discuss the notion of solution predictability and introduce potential mechanisms to intentionally avoid it.



Daniel Karapetyan · Andrew J. Parkes

Received: 1 Mar 2020 / Accepted: 23 May 2020


Keywords

Unpredictability · Decision Support · Diversity of Solutions · Perfect Matching Problem

A search algorithm, even a non-deterministic one, is likely to be biased towards some solutions, and hence anyone knowing the input data and the algorithm might be able to predict much of the solution. This can be an issue if the solution has to be kept secret. One of many examples of this is the scheduling of tasks in cloud computing. In this problem, computational tasks are assigned to various servers and time slots. While respecting the constraints and efficiency considerations, one may want to keep the schedule as unpredictable as possible to reduce the chances of a potential intruder guessing the server and/or time slot assigned to a specific task.

D. Karapetyan
School of Computer Science, University of Nottingham
E-mail: [email protected]

A. J. Parkes
School of Computer Science, University of Nottingham
E-mail: [email protected]

[Figure 1: solution space plotted against a decision variable, showing clusters A and B]

Fig. 1: Example of solution space with all the feasible solutions grouped into two clusters A and B. Cluster A contains many more solutions than cluster B; as a result, uniform sampling from the entire set of feasible solutions will be biased towards cluster A, hence the value of the ‘decision variable’ will be ‘predictable’.

Solution unpredictability can be understood in many ways. In our example, one may be interested in predicting the exact server and time slot for a task, or may be interested in predicting only the server, or even an approximate location of the server (i.e., the specific data centre). Our aim here is not to give a generic formulation of the problem, but to point out potentially interesting extensions of classic decision support, to provide some relevant results, and so to encourage further discussion of the topic.

In particular, we discuss diversity issues in the context of the assignment problem, or specifically of variations of Perfect Matching Problems in bipartite graphs, which are closely related to task assignment in cloud computing. We want the locations/times of different tasks to be unpredictable (to make hacking harder) and so need diverse assignments to select from.

A straightforward approach to achieve unpredictability of solutions is to randomly sample the set of feasible solutions. (A standard technique for this is “rapidly mixing Markov chains”; e.g. see [1,4].) However, to enhance security, the sampling should not necessarily be uniform. To illustrate this, we refer to Figure 1, in which the feasible solutions form clusters, i.e. subsets of similar solutions (e.g. see [3]). In the example given in Figure 1, all the solutions are grouped into two clusters: A and B. Solutions within each cluster share similar values of the ‘decision variable’. Cluster A contains more solutions, and hence a solution selected with uniform sampling is likely to be from cluster A. This will make the value of the decision variable predictable; with high probability, it will correspond to the first cluster.

To address this issue, we may want to pre-select a subset of diverse feasible solutions and then pick one of them randomly. For example, such a subset may be obtained by selecting 10 feasible solutions such that the total Hamming distance between them is maximised. (In a loose sense, we are doing the exact opposite of the work on minimal perturbations, such as [2], which aimed to find nearby solutions, and instead are looking for “maximal perturbations”.)

This approach is likely to generate interesting optimisation challenges. Doing it directly by first enumerating all solutions is generally impractical, even for problems where finding a solution is easy. Indeed, just counting all the feasible solutions is generally …
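The cluster bias and the pre-selection idea described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy instance, the function names (`hamming`, `greedy_diverse_subset`, `share_with_value`) and the greedy farthest-point heuristic are all assumptions made for illustration, and the heuristic does not guarantee a maximum total Hamming distance.

```python
import random
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length solutions differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_diverse_subset(solutions, k):
    """Greedily pick k solutions with a large total pairwise Hamming distance.

    Heuristic only: the result is not guaranteed to maximise the total distance.
    """
    solutions = list(solutions)
    if k >= len(solutions) or k < 2:
        return solutions[:max(k, 0)]
    # Seed with the most distant pair, then repeatedly add the solution that
    # is farthest (in total) from everything chosen so far.
    chosen = list(max(combinations(solutions, 2), key=lambda p: hamming(*p)))
    remaining = [s for s in solutions if s not in chosen]
    while len(chosen) < k:
        best = max(remaining, key=lambda s: sum(hamming(s, c) for c in chosen))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def share_with_value(solutions, index, value):
    """Fraction of solutions whose variable at `index` equals `value`."""
    return sum(s[index] == value for s in solutions) / len(solutions)


# Hypothetical toy instance mirroring Fig. 1: variable 0 is the 'decision variable'.
cluster_a = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0),
             (0, 1, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1)]
cluster_b = [(1, 1, 1, 1), (1, 1, 1, 0)]
feasible = cluster_a + cluster_b

diverse = greedy_diverse_subset(feasible, 2)
selected = random.choice(diverse)  # the solution actually deployed

print(share_with_value(feasible, 0, 0))  # uniform sampling: 0.75
print(share_with_value(diverse, 0, 0))   # diverse pair: 0.5
```

In this toy instance an adversary guessing that the decision variable equals 0 is right three times out of four under uniform sampling, but only half of the time when the deployed solution is drawn from the pre-selected diverse pair. Note that the sketch enumerates all feasible solutions, which is only viable for very small instances.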

Firstly, a quick reminder of the base problem:

NAME: Perfect-Matching

INSTANCE: A bipartite graph G = (U, V, E) over vertices (U, V) of sizes (n, n) and with edges E.

SOLUTION: A vertex-disjoint subset M ⊆ E of n edges, or equivalently, a subset of edges that covers every node and whose edges are disjoint (do not share any nodes).

Finding one solution (perfect matching) is well known to be possible in polynomial time (e.g. using the Hungarian method). However, this does not mean that all questions about perfect matchings are necessarily easy. For example, counting the number of solutions is #P-complete, and so is not expected to be possible in polynomial time (unless P = NP).

A problem that naturally occurs when studying diverse solutions to the matching problem is to find a pair of well-separated perfect matchings. We will define separation as the number of edges that are not shared by the two matchings. The corresponding decision problem is as follows.

DEFINITION: Maximum-Separated-Perfect-Matchings

INSTANCE:
– Bipartite (unweighted) graph G on (n, n) nodes;
– Integer 0 ≤ d ≤ n.

QUESTION: Do there exist perfect matchings M₁ and M₂ such that M₁ and M₂ differ on at least d assignments?

We do not know the complexity of this problem and leave it as an open question. Below, however, we show that several related problems are solvable in polynomial time.

Firstly, suppose that we are given one perfect matching and, to promote diversity, we want to find another one that is “as different as possible”. The corresponding decision problem is as follows.

DEFINITION: Distant-Perfect-Matching

INSTANCE:
– Bipartite (unweighted) graph G on (n, n) nodes;
– Perfect matching M₁;
– Integer 0 ≤ d ≤ n.

QUESTION: Does there exist another perfect matching M₂ such that M₁ and M₂ differ on at least d assignments? In other words, does there exist a matching M₂ such that |M₁ ∩ M₂| ≤ n − d?

The maximum distance, d = n, is easy because it means that no edges can be shared. Hence, we can solve this case by removing all the edges in M₁ from E, and then looking for a perfect matching in this reduced graph.

The problem of finding the maximum d can also be solved in polynomial time: use the given solution M₁ to modify the weights of the edges (for example, weight 0 for edges in M₁ and weight 1 for all other edges), giving a new weighted graph, and then find a maximum weight perfect matching in this graph.

This approach has the drawback that we need to provide the first matching. Instead, generally we want to simultaneously find a pair of well-separated perfect matchings, that is, ones differing on at least d edges. Using the usual distinction between “maximal” (local) and “maximum” (global), this leads to two problems. Firstly, the “maximal” separation:

DEFINITION: Maximal-Separated-Perfect-Matchings

INSTANCE:
– Bipartite (unweighted) graph G on (n, n) nodes.

TASK: Find a pair of matchings M₁ and M₂ such that no matching is further from M₁ than M₂ is, and vice versa.

This can be done in polynomial time because we just iterate the solution to Distant-Perfect-Matching, switching which matching is considered the fixed one. Starting from any matching, call it M₁, find the most distant matching, call it M₂, then find the most distant matching from M₂, and so on, terminating when the distance no longer increases, which must happen within O(n) iterations because the distance is an integer between 0 and n.

Another special case of Maximum-Separated-Perfect-Matchings is when d = n, which requires a disjoint pair of perfect matchings. Consider a subgraph of G with the edge set M₁ ∪ M₂. Each vertex now has two distinct edges incident to it; by following these we get disjoint cycles. So the d = n case is equivalent to finding a “Disjoint Vertex Cycle Cover”, a set of disjoint cycles that contain all the vertices. This is known to be solvable in polynomial time by conversion to a matching problem.

In this paper, we discussed the concept of unpredictability of solutions in automated decision support. As a motivating example, we considered a simple assignment problem, which could easily be part of task scheduling in cloud computing. The aim is that unpredictability of task assignments will increase the security of the system by making it harder for malicious agents to guess the locations and time slots of tasks. The most obvious approach to achieving unpredictability, random sampling of solutions, turns out to be both computationally hard and weak. Indeed, uniform sampling is complex and would not necessarily give us the desired diversity of solutions.

Hence, we focus on finding a few diverse solutions; then we can select one of them randomly to achieve unpredictability. We model the task scheduling using bipartite matching. We gave some initial definitions that relate to finding diverse pairs of matchings. In particular, we observed that finding maximally separated matchings is possible in polynomial time, though we have not (yet) answered the question of finding pairs with maximum separation. Also, whilst we found that it is easy to find a pair of non-overlapping solutions, we do not know whether this result generalises to a higher number of solutions.

This is still work in progress, and more research is needed to establish efficient methods for increasing unpredictability of solutions in new and existing decision support systems. For example, here we have discussed only selection from ‘feasible’ solutions, but feasibility could also include solution quality being above some threshold.
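As an illustration of the Distant-Perfect-Matching reweighting and of the alternating procedure for Maximal-Separated-Perfect-Matchings discussed above, here is a minimal Python sketch. It is an assumption-laden sketch rather than the authors' implementation: the function names are hypothetical, a matching is stored as a list `m` with `m[u] = v`, and instead of a maximum weight matching it solves the equivalent minimum cost problem (cost 1 for an edge shared with the fixed matching, 0 for any other edge of G, and a prohibitive cost of n + 1 for non-edges) with a textbook O(n³) Hungarian method.

```python
def hungarian(cost):
    """Minimum-cost assignment for a square cost matrix; returns row -> column."""
    n = len(cost)
    inf = float("inf")
    u, v = [0] * (n + 1), [0] * (n + 1)    # dual potentials for rows and columns
    p, way = [0] * (n + 1), [0] * (n + 1)  # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # augment along the alternating path that was found
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def most_distant_matching(n, edges, m1):
    """Perfect matching of G sharing as few edges as possible with m1.

    m1 must itself be a perfect matching of G, so a solution always exists.
    """
    edge_set = set(edges)
    big = n + 1  # larger than the cost of any perfect matching that uses only edges of G
    cost = [[(1 if m1[a] == b else 0) if (a, b) in edge_set else big
             for b in range(n)] for a in range(n)]
    m2 = hungarian(cost)
    return m2, n - sum(m1[a] == m2[a] for a in range(n))


def maximal_separated_pair(n, edges, start):
    """Alternate the fixed matching until the distance stops increasing."""
    m1 = list(start)
    m2, dist = most_distant_matching(n, edges, m1)
    while True:
        cand, cand_dist = most_distant_matching(n, edges, m2)
        if cand_dist <= dist:
            return m1, m2, dist
        m1, m2, dist = m2, cand, cand_dist
```

For instance, on the hypothetical graph with edges `[(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]` and the fixed matching `[0, 1, 2]`, `most_distant_matching` returns `([1, 0, 2], 2)`, since the edge (2, 2) is forced. A returned distance of n corresponds to the d = n case, in which the two matchings are edge-disjoint.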

References