2. Literature Review
Combinatorial optimization problems with conflict pair constraints have recently received considerable attention from the research community. These include conflict pair versions of the minimum spanning tree problem [
11,
12,
13,
14,
19,
20], shortest path problem [
14], minimum cost matching problem [
14,
17], minimum cost bin packing [
22,
23], and maximum flow problem [
18]. KPCC is yet another member in this class of optimization problems. Conflict pair constraints can easily be absorbed into a quadratic objective function. Thus, these types of problems can be formulated as quadratic combinatorial optimization problems. However, the special structure of the conflict pair constraints makes the problem interesting to study independently.
The multiple choice knapsack problem studied by Chandra et al. [
24] and Nauss [
25] in 1975 is essentially a KPCC, where the underlying graph is a collection of disjoint cliques. The continuous version of this problem was investigated by Ibaraki et al. [
26]. Another special case of KPCC, where
for all
j, was mentioned by Bar-Noy et al. [
27] in the context of resource allocation. The general problem KPCC was introduced by Yamada, Kataoka, and Watanab [
10], and they proposed exact and heuristic algorithms to solve the problem. Hifi and Michrafy [
6] presented different exact algorithms based on MIP formulations. Bettinelli, Cacchiani, and Malaguti [
28] proposed yet another class of the branch and bound algorithm, along with extensive experimental analysis. Moreover, they presented comparisons of MIP formulations using general purpose solvers. Salem et al. [
29] investigated the convex hull of feasible solutions of KPCC, i.e., the KPCC polytope. The authors obtained necessary and sufficient conditions for odd cycle inequalities associated with the stable set problem to be facet-defining for the KPCC polytope. Similar conditions were developed for the cover inequalities of the knapsack polytope in connection with the KPCC polytope. They also analyzed different lifting procedures to obtain additional valid inequalities. Exploiting these inequalities, the authors proposed an effective branch and bound algorithm to solve KPCC and reported the results of extensive computational experiments.
Heuristic algorithms for KPCC have been investigated by many researchers, and the results of extensive experimental analysis with these algorithms are available. Hifi and Michrafy [
5] developed a reactive-local-search-based algorithm and also generated a class of test instances based on the ideas from [
10]. Akeb et al. [
4] proposed local-branching-based heuristics and provided an experimental analysis based on instances generated by [
5]. Hifi and Otmani [
7] proposed a scatter-search-based metaheuristic algorithm along with results of an experimental analysis using test instances from [
5]. Recently, Hifi [
9] proposed an iterative-rounding-search-based algorithm. Their experimental results using test instances from [
5] were promising and were able to obtain the best known solutions for many benchmark instances. Salem et al. [
30] developed a probabilistic tabu search algorithm to solve KPCC and reported extensive computational results.
Pferschy and Schauer [
31] investigated KPCC for several special cases of the graph
G. They showed that KPCC can be solved in pseudo-polynomial time when
G is a tree and more generally when
G is a graph with bounded treewidth or chordal. Fully polynomial-time approximation schemes (FPTASs) are also derived for these classes of graphs. Further, they proved that KPCC remains strongly NP-hard on perfect graphs. It may be noted that the stable set problem is polynomial-time-solvable on perfect graphs. Bandyapadhyay [
1] also considered KPCC on trees, but the algorithm discussed in [
31] is faster for this case. Moreover, [
1] presents pseudopolynomial algorithms for cycles, forests,
k-outerplanar graphs, and interval graphs. In another paper, Pferschy and Schauer [
32] investigated approximation algorithms for KPCC under various restrictions on
G. They showed that KPCC admits FPTASs on weakly chordal graphs and is solvable by PTAS (not FPTASs) on planar graphs.
Kalra et al. [
2] studied another special case of KPCC, where
for all
and
G is a multipartite graph. They proposed a
-approximation algorithm, where
k is the number of partite sets, and proved that the performance bound established is tight. If
and
equal to the degree of
j, for
, KPCC reduces to the maximum independent vertex coverage problem (MIVC). The authors proved approximation hardness results for MIVC and showed that MIVC is NP-hard on bipartite graphs. Further, they established bounds on the integrality for the LP relaxation of a clique-based formulation of MVIP. Bandyapadhyay [
1] investigated another special case of KPCC, where
for all
j. This problem is called the maximum budget independent set problem (MBIS), and the author presented a factor
d polynomial time approximation algorithm for the problem on
-claw free graphs.
In KPCC, if we restrict the budget constraint to be satisfied as an equality, we obtain an instance of the
exact knapsack problem with conflict pair constraints (KPCC-exact). The feasibility version of this problem was investigated by Milanic and Monnot [
3,
33], where it was shown that the problem is strongly NP-complete on bipartite graphs with maximum vertex degree of three. The authors also presented a pseudo-polynomial algorithm for the problem in cographs, chordal graphs, and interval graphs.
A generalization of KPCC involving more than one knapsack constraint was studied by Gabriel [
34], where a Dantzig–Wolfe decomposition scheme was proposed. Atamtürk et al. [
35] used conflict graphs in developing algorithms for 0-1integer programs.
The
quadratic knapsack problem (QKP) [
36] can be stated as
where
is an
matrix.
Although the QKP appears to be more general than KPCC, we now observe that QKP and KPCC are essentially equivalent, in the sense that one is a special case of the other.
Theorem 1. QKP and KPCC are equivalent.
Proof. Define the matrix such that , where M is a large positive number if and otherwise. With this choice of Q, the problem KPCC reduces to QKP. Thus, KPCC is a special case of QKP.
Let us now prove the converse. That is, QKP is a special case of KPCC. This is, however, achieved by appropriate modifications in the well-known reduction of a quadratic unconstrained binary optimization problem (QUBO) to the maximum weight stable set problem [
37,
38] to handle the budget constraint. Thus, KPCC can be viewed as a generalization of QKP. □
For further details on QUBO, refer to [
39]. The major contributions of the paper can be summarized as follows.
We present new complexity results on KPCC along with a thorough state-of-the-art review of the literature. Also, we show that KPCC and QKP are equivalent.
When the conflict graph is a complete bipartite graph, it is shown that KPCC decomposes into two knapsack problems. As a consequence, we have new polynomially solvable (and pseudo-polynomially solvable) special cases of KPCC and have special cases of KPCC that admit FPTASs. These results are extended to more general classes of graphs that accept a blakbox oracle with special properties. On a star (which is a special bipartite graph), the problem is shown to be NP-hard.
A new integer programming formulation that unifies and generalizes existing formulations of KPCC on general graphs is given. The strengths of the LP relaxation of this formulation and of special cases are analyzed theoretically.
Different methods are proposed to tighten our general integer programming formulation, and these methods are compared using the general purpose solver Gurobi.
3. KPCC on Bipartite Graphs
Let
be a bipartite graph with the generic bipartition of its vertex set as
, where
and
. Note that KPCC is trivial to solve when
G is a complete graph and is the same as the knapsack problem when
G is a collection of isolated nodes. When
G is a tree, Pferschy and Schauer [
31] showed that KPCC is NP-hard but solvable in
time and
space. Further, they showed that the algorithm can be modified to obtain a fully polynomial approximation scheme (FPTASs) for KPCC. Trees, in some sense, can be viewed as an extreme case of sparse connected bipartite graphs. Let us now look at KPCC on the complete bipartite graph
. Although KPCC is trivial on a complete graph, this simplicity does not extend to
.
Theorem 2. KPCC is NP-hard even if the conflict graph is a star (i.e., ). Further, on a complete bipartite graph, KPCC can be solved in pseudopolynomial time and it admits FPTASs.
Proof. It is not difficult to reduce the knapsack problem to a KPCC on
and this establishes the first part of the theorem. To prove the second part, note that for any solution
of a KPCC on
, if
for any
then by the conflict pair constraints,
for all
and vice versa. Thus, KPCC on
reduces to two knapsack problems
and
The best solution amongst the solutions of KP1 and KP2 solves KPCC on . Since the knapsack problem can be solved in pseudopolynomial time and it admits FPTASs, the second part of the theorem follows. □
From the above theorem, if KP1 and KP2 are solvable in polynomial time, then the corresponding KPCC on
is also solvable in polynomial time. If
for all
, then obviously KP1 and KP2 are solvable in polynomial time; hence, the corresponding KPCC on
is also solvable in polynomial time. This is interesting since KPCC on a general bipartite graph is NP-hard even if
for all
j [
31]. Deineko and Woeginger [
40] showed that the cross-ratio-ordered bounded knapsack problem on
n variables can be solved in
time. This, together with Theorem 2, leads to the following theorem.
Theorem 3. If the costs and weight for satisfy and , then KPCC on can be solved in polynomial time.
We can also mix conditions for different solvable cases for KP1 and KP2 to obtain solvable instances of KPCC on . For example, if for all and for , then KPCC on can also be solved in polynomial time. Later, we present additional conditions for polynomially solvable special cases of the knapsack problem that can be used in different combinations for KP1 and KP2 to produce polynomially solvable cases of KPCC on .
Let
be an undirected graph such that the subgraphs induced by
and
belong to a family
of graphs with specific properties. We call such graphs
-bipartite (see
Figure 1 for an example). If an
-bipartite graph contain all edges
for
, then it is called a complete
-bipartite graph. Bonamy et al. [
41] studied
-bipartite graphs, where the induced subgraph of
is a collection of isolated nodes and the induced subgraph of
is a
k-degenerate graph. Depending on the properties imposed on the family
, the
-bipartite structure can have algorithmic advantages in solving some optimization problems. For example,
could contain all classes of graphs on which the KPCC can be solved in pseudo-polynomial time. In this case,
includes graphs that are isolated vertices, trees, complete bipartite graphs, complete graphs, chordal graphs, interval graphs, cographs, and circular arc graphs.
Assume that for a -bipartite graph, the bipartition and along with the structure of the subgraphs induced by and are given. Then, the results obtained in the previous sections on the solvability of KPCC can be extended to -bipartite graphs.
Theorem 4. KPCC can be solved in pseudo-polynomial time on a complete -bipartite graph if KPCC can be solved in pseudo-polynomial time on graphs in . Further, KPCC admits FPTASs on G whenever KPCC admits FPTASs on graphs in .
Proof. The proof of the above theorem is similar to that of Theorem 2, except that instead of KP1 and KP2 we will have problems of the type KPCC but with reduced sizes. However, we present the complete proof of the theorem here to facilitate further related discussions. Let
be the subgraph of
G induced by
and
be the subgraph of
G induced by
. Since every vertex in
is connected to every vertex in
, for any solution
of the KPCC, if
for any
then
for all
and vice versa. Thus, KPCC reduces to the following reduced KPCCs.
and
The best solution amongst the solutions of KPCC1 and KPCC2 provides an optimal solution to KPCC. By assuming the properties of , KPCC1 and KPCC2 can be solved in pseudo-polynomial time. The proof of the second part of the theorem also follows in an analogous way. □
Theorem 4 is a proper generalization of Theorem 2.
The results discussed above extend in a natural way to complete multi-partite graphs. However, to take any computational advantage we need to know the partite sets a priori or need to compute them. Bipartite graphs can be recognized in polynomial time but recognizing
p-partite graphs for
is NP-complete [
42]. If there are
p partite sets, then we will be solving
p knapsack problems.
General Bipartite Graphs
We now consider KPCC on a general bipartite graph
, where
and
. Note that we are considering a maximization problem. Consider the maximization version of a combinatorial optimization problem such that the objective function value of its feasible solutions are positive. Let
be any feasible solution to such a problem and
be a corresponding optimal solution. Then,
is said to be
-optimal if
, where
is the objective function value of the solution
. When
, then
is an optimal solution. Consider KP1 and KP2, as indicated in Theorem 2. Let
and
be
-optimal solutions to KP1 and KP2, respectively. Define the solutions
and
, where
Theorem 5. The best solution amongst and is a -optimal solution to KPCC on .
Proof. Let
be an optimal solution to KPCC on
. Define the solutions
and
such that
Let
be the restriction of
to
and
be the restriction of
to
. Then,
and
are feasible solutions of KP1 and KP2, respectively. Then, by the
-optimality of
and
to KP1 and KP2, we have
and
Without a loss of generality, assume
. Then,
□
An approximation algorithm
for a combinatorial optimization problem (in maximization form) is said to be a
δ-fully polynomial time approximation scheme (
-FPTASs) if it guarantees a solution
such that
, where
is an optimal solution and
runs in polynomial time in the input size and
for any
. Thus, a
is precisely an
. Since KP1 and KP2 admits FPTASs, from Theorem 5, KPCC admits
. This result is interesting because the problem is strongly NP-complete on bipartite graphs with maximum vertex degree of three [
3,
33]. Further, when KP1 and KP2 are solvable in polynomial time, KPCC on a general bipartite graph can be solved by a polynomial time 2-approximation algorithm. When
for all
j, Theorem 5 reduces to a corresponding result obtained by Kalra et al. [
2], and for the special case, they showed that the performance bound is asymptotically tight. We summarize these observations in the following corollary.
Corollary 1. KPCC on a bipartite graph admits 2-FPTASs. Further, when KP1 and KP2 are solvable in polynomial time, then a 2-optimal solution to KPCC on a bipartite graph can be obtained in polynomial time, and in this case the bound 2 is asymptotically tight.
The concept of -bipartite graphs can be extended to the more general case of -multipartite graphs. Let be an undirected graph such that the subgraphs induced by for any belong to the family . We call G a -multipartite graph. If for any for , then G is called a complete -multipartite graph. Theorem 4 and the discussions that follow extend in a natural way to complete -multipartite graphs.
4. Integer Programming Formulations
Let us now discuss some integer programming formulations of the KPCC. For each vertex
, let
and the sets
be a given partition of
, where
for any
,
whenever
, and
. Then, KPCC can be formulated as the 0-1 programming problem
The above formulation reduces to the standard integer programming formulation of the KPCC when each
is singleton, i.e.,
for all
i. When
for all
i then
, and the above integer program reduces to
Thus, our general integer programming framework StarIP provides a class of integer programming formulations of KPCC that subsumes two well-studied formulations.
Theorem 6. Among all of the choices of the partition for StarIP, the standard formulation is the strongest and MaxStarIP is the weakest in terms of their LP relaxation values.
Proof. Let
be any feasible solution to the LP relaxation of the standard formulation. Then,
We successfully show that
is a feasible solution to the LP relaxation of StarIP for any partition
. Choose a set
for the vertex
i. Now, by adding the edge inequalities from (
5) for all
such that
, we obtain
Thus, any feasible solution to the standard integer programming formulation of the KPCC is also a solution to the StarIP. Thus, the LP relaxation of the standard formulation is the strongest among the LP relaxation of any formulation within the class of StarIP. The second part of the theorem can be proved in an analogous way. □
Note that, for the LP relaxation of StarIP (for any choice of the partition of
),
for all
i as a feasible solution, provided that
. One way to strengthen or tighten StarIP is by computing upper bounds on the stability number of the subgraph
induced by
. For example, if this subgraph is perfect, we can calculate the stability number exactly. If
is the largest cardinality of a matching in this graph, then
can be replaced by
. Another way to tighten the formulation StarIP is as follows: let
be the objective function value of the LP relaxation of the standard formulation of KPCC restricted to
, where
for all
i. Then,
can be replaced by
. This number can be viewed as a fractional budgeted stability number. We can use other similar upper bounds to strengthen StarIP. For a review of different upper bounds that can be easily calculated, we refer to [
43]. By tightening StarIP this way, we can obtain a better formulation for KPCC.
Let
and
be a given partition of
. We can replace
by
in StarIP to obtain a valid formulation, say StarIP
′, of KPCC. Likewise, we can replace
by
in MaxStarIP to obtain a valid formulation, say MaxStarIP
′, for KPCC. The formulation MaxStarIP
′ has been studied by Hifi et al. [
6]. The coefficient matrices of these modified formulations are sparser. However, tightening the resulting formulations by using upper bounds as discussed above are likely to yield weaker LP relaxations, in general, compared to tightening the corresponding formulation of StarIP and MaxStarIP. Moreover, any possible computational advantage achieved by sparsity over the suspected weaker LP relaxation needs to be analyzed experimentally.
Based on the ideas used in Theorem 2, we now present a single integer programming formulation of KPCC on
.
Note that constraint (
6) guarantees that if any
for
then
for all
. Likewise, constraint (7) guarantees that if any
for
then
for all
.
4.1. Experimental Analysis
The formulation MaxStarIP
′ has been studied and used in algorithms to solve KPCC [
6]. There are also effective branch-and-bound algorithms to solve KPCC for general graphs [
6,
28,
44]. However, to the best of our knowledge, there is no computational study available that compares the different formulations of KPCC. Note that the complexity of KPCC can vary significantly depending on the structure of the underlying graph and the level of tightness of the budget constraint. Also, tight efficiently computable upper bound estimates on the stability number of a graph can affect the efficacy of our formulations StarIP and MaxStarIP. The following
Table 1 provides some easily computable upper bounds on the stability number of the graph
.
In the table above, denotes the maximum degree and denotes the minimum degree of vertices in the graph .
Recall that the integer programming formulation MaxStarIP belongs to the class StarIP. Therefore, it can be strengthened by replacing with an upper bound on the stability number of the subgraph of G induced by . When MaxStarIP is strengthened using the bound UBk obtained from to replace , , we denote the resulting MaxStarIP formulation by MaxStarIPk. Thus, in addition to MaxStarIP, we have the strengthened formulations MaxStarIP1, MaxStarIP2, MaxStarIP3, MaxStarIP4, MaxStarIP5, and MaxStarIP6. The major goals of the experimental analysis were to:
Compare the relative performance of the standard KPCC formulation given in the introduction, MaxStarIP, and its variations MaxStarIPk, using the general purpose solver GUROBI.
Study the impact of the tightness of the budget constraints on the formulations that we compare.
Study the impact of sparsity of G on the formulations that we compare.
Explore the heuristic value of the formulations.
It may be noted that UB1 will be at least as tight as UB2 and that UB4 applies to a connected graph. For these reasons, we will not use UB2 and UB4 in our experiments.
All of the computational experiments were carried out on a PC with the Windows 10 Enterprise 64-bit operating system, Intel(R) Core(TM) i7-3770 CPU @ 3.40 GHz, and 16.0 GB of memory. The calling program was written in Python (Python 3.6.1) and called GUROBI 9.0 using the interface gurobipy. The time limit for GUROBI was set to 3600 s. Most of the test problems were solved to optimality within this time limit, except some large or structured instances.
4.2. Benchmark Instances
Our experiments used benchmark instances selected from Bettinelli et al. [
28] and instances generated using the DIMACS clique graph library [
48]. The instances of Bettinelli et al. [
28] have three data sets: R1, R2, and R10. Each data set consists of eight classes, and each class has many instances. For the data set R1, in the first four classes, the items have a uniformly distributed weight from the range [20, 100], and the knapsack capacity is set to
. The number
n of items is 120, 250, 500, and 1000, respectively. The last four classes have uniformly distributed weights from [250, 500], and the knapsack capacity is set to
. The conflict graphs were generated randomly. For the data set R1, classes 1 to 4 have a total of 360 instances and classes 5 to 8 have a total of 360 instances. Thus, the data set R1 has a total of 720 instances. The data set R3 is similar to R1 except that the budget
b is set to
. Likewise, the data set R10 is almost identical to R1 except that the budget
b is set to
. For details on these instances, we refer to the paper [
28].
For our computational experiments, we selected instances from the first four classes of each of the data sets R1, R3, and R10. We categorized these instances further into small, medium, and large. The small instances consist of conflict graphs with 120 to 250 vertices, the medium instances consist of conflict graphs with 500 vertices, and the large instances consist of conflict graphs with 1000 vertices. From each category, we selected 120 instances using a prescribed selection rule. We refer to the aforementioned class of instances as random data instances.
The DIMACS clique graph library consists of unweighted graphs. We generated two sets of KPCC benchmarks out of these instances, called positively correlated problems and negatively correlated problems. For positively correlated problems, the costs and weights have the same ordering, whereas for the negatively correlated problems, the costs and weights have opposite ordering. We used graphs that consist of 171 to 1024 vertices. These instances are further classified into small, medium, and large sizes. The small instances consist of graphs with 171 to 378 vertices, the medium instances consist of graphs with 400 to 500 vertices, and the large instances consist of 700 to 1024 vertices. There are 23 small instances, 24 medium instances, and 13 large instances. The DIMACS graphs were originally generated as clique benchmarks. To obtain stable set instances, we took the complement of these graphs. Let
be such a complement graph. For each
, we generated the cost
and the weight
as uniformly distributed random integers in the range
. To obtain positively correlated data, we first sort the vectors
and
in non-decreasing order and then apply the same random index permutation to both vectors. Negatively correlated instances are constructed in a similar manner but with opposite ordering. The budgets for the correlated instances are set as follows. First, we generate a random subset
S of the vertices of
G such that the cardinality of
S is the known size of the maximum cardinality stable set on
G and set
. Then, instances are generated by setting the budget equal to the value
b scaled by the factors 0.25, 0.50, and 0.75. The instances we used in our experiments are available at
https://github.com/jasdeepdhahan/kpcc_instances.git.
4.3. Experimental Results
Let us now discuss the experimental outcomes of our analysis. First, we compared the standard KPCC formulation given in the introduction to MaxStarIP and its variations MaxStarIP1, MaxStarIP3, MaxStarIP5, and MaxStarIP6 using all of the test instances.
Table 2 provides some insight into the relative strength of our formulations based on average running times (in seconds). The numbers marked in bold letters correspond to the lowest computational time taken. The experimental results disclose that there is no consistently superior formulation of KPCC among the ones we tested.
The efficacy of various formulations could also be dependent on how tight the budget constraint is and what special structure the underlying graph has.
Table 3 presents the experimental results after classifying the budget constraint in the test instances as tight, very tight, and moderately tight.
Table 4 presents the experimental results after classifying the random test instances based on tightness of the budget constraints as well as the density of the underlying graph. We sometimes refer to a KPCC instance as sparse or dense. A
sparse (dense) KPCC instance has a sparse (dense) underlying graph. The results on the correlated test instances are classified only with respect to budget tightness since the density of the underlying graphs was uniform and generally sparse. These results are presented in
Table 5.
After this categorized analysis, the standard formulation appears as the preferred formulation overall, if one is constrained to recommend only one formulation. However, it may be noted that MaxStarIP appears to be more suited for sparse small instances with the tightest and loosest budgets. In addition, MaxStarIP outperforms the standard formulation on sparse medium and large instances with the tightest budgets. For our medium and large instances, as the underlying graph density increased the standard formulation became more robust than MaxStarIP. It is also important to point out that 8 out of 120 medium-sized instances were not solved to optimality within the time limit provided, either by the standard formulation or by the MaxStarIP type formulations. In addition, 30 of the 120 large instances with the loosest budget were not solved to optimality within the time limit specified by either of the aforementioned formulations.
Now, let us compare the standard formulation and MaxStarIP type formulations over the correlated data. We found that the positively correlated instances were easier to solve than the negatively correlated instances. Increasing the budget within limits generally increased the running time of our formulations. Large negatively correlated instances often hit the time limit set for the experiments. The standard formulation generally outperformed MaxStarIP over our negatively correlated data set. We further observed that MaxStarIP is slightly better than the standard formulation for negatively correlated large instances with the loosest budget. For positively correlated data, MaxStarIP is best suited for medium instances and the standard formulation is the strongest for our small and large instances.
It may also be noted that the standard formulation is the most robust in terms of running times, followed by MaxStarIP1, and there is no clear formulation that wins third place. MaxStarIP1 outperformed our other KPCC formulations over denser and larger random instances, while the standard formulation showed superiority on sparser large random instances. Each of the formulations MaxStartIP, MaxStarIP3, MaxStarIP5, and MaxStarIP6 were found to be the preferred formulation for at least one of the random data set classifications in
Table 4. Based on this, we conclude that there is no one formulation that is best suited for our positively correlated data set. The standard formulation is best suited for our medium and small negatively correlated data. In addition, MaxStarIP3 outperformed the other formulations over our larger negatively correlated instances with the tightest budgets, and MaxStarIP5 outperformed the other formulations in
Table 5 for the remaining large negatively correlated KPCC instances.
The frequency distribution of the fastest running times can provide further insight into the complexity of KPCC that averaging out the running times could not provide. For this analysis, we consider only the test instances that were solved to optimality within the time limit specified. In the event of a tie, any formulations in a tie for the fastest running time were considered to be equally good. From
Table 6, we observe that the standard formulation is the preferred formulation followed by MaxStarIP1, and there is no clear winner for third place, based on frequency analysis.