The Design of Approximation Algorithms 第二章习题

2025-02-14 41 min

Ex 2.1 $k$ -Suppliers

(a) Give a $3$ -approximation algorithm for the $k$ -supplier problem.

设最优解为 $\mathrm{OPT}$ , 假设我们有一个 $3$ -approximation 的算法，它给出的 suppliers 集合为 $S$ , 最优解的集合为 $S^{\ast}$ , 考虑集合 $F$ , 只要 $S$ 中的点能用半径为 $2\mathrm{OPT}$ 的 ball 覆盖住 $S^{\ast}$ 中的点，那么类似于 $k$ -center problem 中的证明，所有的 customer 到最近的 $S$ 中的点的距离不会超过它先到最近的 $S^{\ast}$ 中的点 $s^{\ast}$ 再到能包含 $s^{\ast}$ 的 $S$ 中的点 $s$ , 即 $\forall ~j \in D, d(j,S) \leq d(j, S^{\ast})+2\mathrm{OPT} \leq 3\mathrm{OPT}$ .

显然 $F$ 内部若能用半径为 $2\mathrm{OPT}$ 的 ball 覆盖除 $S$ 以外的所有点就必然能覆盖 $S^{\ast}$ , 这可以用 $2$ -approximation 的 $k$ -center algorithm 解决。

(b) Prove that there is no $\alpha$ -approximation algorithm for $\alpha<3$ unless $\mathrm{P=NP}$ .

对于任意的 $F$ 及其对应的 $d_{ij}$ 组成的子图，我们总能构造出 $D$ 和 $d_{i'j'}$ 使得其内部任意 $k$ 个点的集合可以成为最优解 $S^{\ast}$ , 若能找到一个 $\alpha$ -approximation algorithm 满足 $1 \leq \alpha < 3$ , 那么该算法找出的 $S$ 必然能用半径 $\alpha-1$ 的 ball 覆盖住 $F$ 中的所有点，于是我们用该算法求解 $k$ -center 问题，从而给出一个 $\beta$ -approximation algorithm for $k$ -center problem. 而根据书中 Theorem 2.4, 这说明了 $\mathrm{P=NP}$ .

Ex 2.2 Proof of Lemma 2.8

见 Scheduling Jobs on Identical Parallel Machines.

Ex 2.3 List Scheduling with Precedence Constraints

显然所有的 jobs with precedence constraints 会得到一个 DAG，其中每个点的点权为 $p_{i}$ , 记从入度为 0 的点到出度为 0 的点且拥有最大点权和的路径为 $A$ , 那么一定有最优解 $C_{\max}^{\ast} \geq p(A)$ , 其中 $p(A)$ 表示路径 $A$ 的点权和。而对于位于 $A$ 上的 jobs，list scheduling 算法给出的 schedule 中上一个 job $i$ 的结束时间到下一个 job $j$ 的开始时间如果存有空档，那么这个空档时期所有的 machines 都在工作，否则一定可以提早处理 job $j$ , 记 list scheduling 算法得到的解为 $C_{\max}$ , 那么一定会有 $C_{\max} \leq p(A)+\sum_{i=1}^{n}p_{i} / m$ , 显然最优解 $C_{\max}^{\ast} \geq \sum_{i=1}^{n}p_{i} / m$ , 于是 $C_{\max} \leq 2C_{\max}^{\ast}$ , 因此 list scheduling 是 2-approximation algorithm.

(a) Show that given a polynomial-time $\rho$ -relaxed decision procedure for the problem of scheduling related machines, one can produce a $\rho$ -approximation algorithm for the problem.

记最优解为 $C_{\max}^{\ast}$ , 当 $D < C_{\max}^{\ast}$ 时，根据设定，一个 polynomial-time $\rho$ -relaxed decision procedure 一定会在 polynomial-time 内 states that no schedule of length $D$ is possible for the instance; 当 $D \geq C_{\max}^{*}$ 时，它一定可以返回一个长度不超过 $\rho \cdot D$ 的 schedule. 因此只需二分找到 $C_{\max}^{\ast}$ 便能得到一个长度不超过 $\rho \cdot C_{\max}^{\ast}$ 的 schedule，从而我们可以得到一个 $\rho$ -approximation algorithm.（注意没有 polynomial-time）

(b) Prove that this algorithm is a polynomial-time $2$ -relaxed decision procedure.

假设算法给出了一个 scheduling, 那么根据算法描述, 每个 machine $i$ 的最后一个 job $j_{\max}$ 的开始时间一定有 $S_{j_{\max}} < D$ , 并且它处理每个 job $j$ 所需时间都有 $p_{j} / s_{i} \leq D$ , 那么显然每个 machine 结束工作的时间不会超过 $S_{j_{\max}} + p_{j_{\max}} / s_{i} < 2D$ .

接着证明算法能够 correctly states that no schedule of length $D$ is possible for the instance. 记 job $j$ 的 label 为 $l_{j}$ , 实际处理它的 machine（如果存在）为 $M_{j}$ , machine $i$ 实际处理的 jobs 的集合为 $J_{i}$ , 被 label 为 $i$ 的 job 集合为 $L_{i}$ , 根据算法描述，必然有 $M_{j} \leq l_{j}$ , 因此 job $j$ 最终无法被处理当且仅当所有满足 $i \leq l_{j}$ 的 machine $i$ 在时间 $D$ 以前都不存在 idle time, 即 $\sum_{j \in J_{i}}p_{j} / s_{i} \geq D$ , 又由于 machine $i$ 处理不属于 $L_{i}$ 的 job 只会使总的处理时间减小，因此有 $l_{j}D \leq \sum_{i=1}^{l_{j}}\sum_{j' \in J_{i}}p_{j'} / s_{i} \leq \sum_{i=1}^{l_{j}}\sum_{j' \in L_{i}}p_{j'} / s_{i}$ . 假设存在可以拥有 schedule of length $D$ 的 instance $I$ 使得算法 states that no schedule of length $D$ is possible for the instance, 不妨记满足长度小于等于 $D$ 的 schedule 为 $\mathcal{S}$ , 那么对于任意 job $j$ , 它在 $\mathcal{S}$ 中也一定是被序号小于等于 $l_{j}$ 的 machine 处理的，否则 $j$ 不可能在时间 $D$ 以前处理完，因此有 $L_{i} \subseteq \mathcal{S_{i}}$ 。而由于 $\mathcal{S}$ 的长度不会超过 $D$ , 这说明 $\forall i, \sum_{j \in L_{i}}p_{j} / s_{i} \leq \sum_{j \in \mathcal{S}_{i}}p_{j} / s_{i} \leq D$ , 对于任意 job $j$ 的 label $l_{j}$ , 都有 $\sum_{i=1}^{l_{j}}\sum_{j' \in L_{i}}p_{j'} / s_{i} \leq \sum_{i=1}^{l_{j}}\sum_{j' \in \mathcal{S}_{i}}p_{j'} / s_{i} \leq l_{j}D$ , 这与算法会 states that no schedule of length $D$ is possible for the instance 的条件矛盾，因此不存在反例，所以算法能够 correctly states that no schedule of length $D$ is possible for the instance.

显然算法是 polynomial-time 的，因此算法是一个 polynomial-time $2$ -relaxed decision procedure.

Ex 2.5 Minimum-Cost Steiner Tree

(a) Show that this gives a $2$ -approximation algorithm for the minimum-cost Steiner tree problem.

记最优解的 cost 为 $\mathrm{OPT}$ , Steiner tree 为 $T^{\ast}$ , $T^{\ast}$ 上点一定是由 $R$ 和部分 nonterminals 组成的，考虑删除 $T^{\ast}$ 中的 nonterminals. 不妨将 nonterminal $v_{0}$ 作为根结点，假设它的度为 $k$ ，将与其相连的结点分别记作 $v_{0,1},v_{0,2}, \dots, v_{0,k}$ , 它们与 $v_{0}$ 的边的边权分别为 $w_{1}, w_{2}, \dots, w_{k}$ . 删除边 $(v_{0}, v_{0,1}), \dots, (v_{0}, v_{0,k})$ 和点 $v_{0}$ 会得到 $k$ 个子图，此时我们用 $k-1$ 条边 $(v_{0,1}, v_{0,2}), \dots, (v_{0,k-1}, v_{0,k})$ 将它们连接起来得到一颗新的 Steiner tree, 将这些边的边权记为 $w_{1}', w_{2}', \dots, w_{k-1}'$ .

根据 triangle inequality, 我们有 $\forall i \in [k-1],w_{i}' \leq w_{i} + w_{i+1}$ , 于是新的 Steiner tree 的 cost 与旧的 Steiner tree 的 cost 之差有 $\sum_{i=1}^{k-1}w_{i}' - \sum_{i=1}^{k}w_{i} \leq \sum_{i=2}^{k-1}w_{i} \leq \mathrm{OPT}$ . 我们依次按照上述方式将 $T^{\ast}$ 中所有 nonterminals 删去即可得到一棵点集为 $R$ 的生成树，并且该生成树的 cost 满足 $C - \mathrm{OPT} \leq \mathrm{OPT}$ , 即 $C \leq 2\mathrm{OPT}$ , 而 $R$ 的最小生成树的 cost 显然满足 $C^{\ast} \leq C \leq 2\mathrm{OPT}$ , 于是题述算法是一个 $2$ -approximation algorithm.

(b) Show that this is still a $2$ -approximation algorithm for the minimum-cost Steiner tree problem on the original (incomplete) input graph $G$ .

$G'$ 中任意两点之间的边权为 $G$ 中两点之间最短路径的 cost，因此 $G'$ 中的边满足 triangle inequality, 同时 $G'$ 是一个完全图，因此 $G'$ 中使用 (a) 中算法得到的 Steiner tree 的 cost 不会超过 $G'$ 的最优解的 cost 的 2 倍，只需证明 $G'$ 的最优解也是 $G$ 中的最优解即可。

记 $G$ 中最优解的 Steiner tree 为 $T^{\ast}$ , 点集为 $V^{\ast}$ , 考虑将 $T^{\ast}$ 中任意两点之间的边修改为 $G'$ 中的边。若当前需要修改的边为 $(u,v)$ , 在 $G'$ 中 $(u,v)$ 对应的边与 $G$ 中的不同当且仅当存在 $u$ 与 $v$ 之间的最短路径 cost 小于 $cost(u,v)$ , 因此 $G'$ 中的最优解的 cost 一定不会大于 $G$ 中的最优解，而 $G'$ 的点和最后实际生成的边都来自于 $G$ ，只要 $G'$ 的最优解不存在环，即 $G'$ 实际生成的 Steiner tree 的确是一棵树，那么 $G'$ 的最优解的 cost 不会小于 $G$ 中的最优解，也就是说 $G'$ 中的最优解也是 $G$ 中的最优解。

下面证明 $G'$ 中的最优解 $T'$ 以及 MST 的解在实际生成的图中一定也是 $G$ 中的一个合法解，即 $T'$ 是一棵树。若 $T'$ 中存在环，那么导致环出现的最小结构如下：

其中蓝色部分属于 $G'$ , 红色部分是实际生成的，红色虚线省略了一部分无关的 $G$ 中的点与边。若 $w(v_{3}, v_{5}) < w(v_{4}, v_{5})$ ，那么显然将 $(u_{2}, v_{2})$ 改为 $(u_{1}, v_{2})$ 来将 $v_{2}$ 加入 Steiner tree 中会更优，即上图中的 cost 为

$2w(u_{1}, v_{3})+2w(u_{2}, v_{4})+w(v_{3}, v_{4})+w(v_{3}, v_{5})+w(v_{4}, v_{5})+w(v_{1}, v_{5})+w(v_{2}, v_{5})$

而删去 $(u_{2}, v_{2})$ 改为 $(u_{1}, v_{2})$ 的 cost 为

$2w(u_{1}, v_{3})+2w(u_{2}, v_{4})+w(v_{3}, v_{4})+2w(v_{3}, v_{5})+w(v_{1}, v_{5})+w(v_{2}, v_{5})$

更小，并且此时并不会出现环，反之若 $w(v_{3}, v_{5}) > w(v_{4}, v_{5})$ ，则将将 $(u_{1}, v_{1})$ 改为 $(u_{2}, v_{1})$ 来将 $v_{1}$ 加入 Steiner tree 中会更优，若这两条边相等则任取一种情况将环消去。可以发现， $G'$ 最优解不会存在环，而根据 MST 的性质，它也不会存在环，从而 $G'$ 的最优解展开后也是 $G$ 的最优解， $G'$ 中的 MST 展开后是 $G$ 的合法解，从而 MST 解不会超过 $G$ 中最优解的 2 倍，即它仍然是一个 $2$ -approximation algorithm.

Ex 2.6 Minimum-Degree Spanning Tree

Prove that there can be no $\alpha$ -approximation algorithm for the minimum-degree spanning tree problem for $\alpha < 3 / 2$ unless $\mathrm{P=NP}$ .

反证法，假设存在 $\alpha$ -approximation algorithm $A$ for the minimum-degree spanning tree problem, 我们证明它可以作为 Hamiltonian path 问题的 decider.

对于图 $G$ , 若它存在 Hamiltonian path, 那么它对应的最优解的 max degree 必然是 $\Delta(T^{\ast})=2$ ¹, 假设 $A$ 给出的解为 $T$ , 那么有 $\Delta(T) \leq \alpha \Delta(T^{\ast}) = 2\alpha < 3$ , 由于 $\Delta(T)$ 为整数，因此它只能为 $2$ , 因此对于存在 Hamiltonian path 的图 $G$ 它一定能返回一条 Hamiltonian path.

若 $G$ 不存在 Hamiltonian path, 那么必然有 $\Delta(T^{\ast}) > 2$ , 于是有 $2 < \Delta(T^{\ast}) \leq \Delta(T) \leq \alpha\Delta(T^{\ast})$ , 即 $A$ 一定不会返回一条 Hamiltonian path.

因此根据 $A$ 返回的解可以判断任意图 $G$ 是否存在一条 Hamiltonian path, 而 Hamiltonian path 的判定问题是 NP-hard 的，从而只有 $\mathrm{P=NP}$ 时才可能存在 $\alpha$ -approximation algorithm for the minimum-degree spanning tree problem, 其中 $\alpha < 3 / 2$ .

Ex 2.7 Path Finding

Suppose that an undirected graph $G$ has a Hamiltonian path. Give a polynomial-time algorithm to find a path of length at least $\Omega(\log n / (\log \log n))$ .

在图 $G$ 上跑 Finding minimum-degree spanning trees 中的算法得到一颗生成树 $T$ , 满足 $\Delta(T) \leq 2\mathrm{OPT}+\lceil \log_{2}n \rceil = 4 + \lceil\log_{2}n \rceil$ , 于是有 $\Delta(T) = O(\log n)$ .

下面证明这棵树的直径的长度 $L$ 至少为 $\Omega(\log n / (\log \log n))$ .

构造结点数量为 $n$ 的树 $T'$ , 令根结点的度数为 $\Delta(T)-1$ , 其余非叶子结点的度数为 $\Delta(T)$ （结构与完美二叉树类似）, 此时 $T'$ 的高度 $h' = \Theta\left(\log_{\Delta(T)}n\right) = \Omega\left(\frac{\log n}{\log \log n}\right)$ , 直径为 $L'=2h'=\Omega\left(\frac{\log n}{\log \log n}\right)$ , 显然 $T'$ 的直径 $L' \leq L$ , 因此有 $L \geq \Omega\left( \frac{\log n}{\log \log n} \right)$ .

WARNING
$L' \leq L$ 这一点还是需要证明一下的，不过证明不难，我偷懒不写了 OvO.

Ex 2.8 Local Moves Without Restrictions for MDST

Consider the local search algorithm of Section 2.6 for finding a minimum-degree spanning tree, and suppose we apply a local move to a node whenever it is possible to do so; that is, we don’t restrict local moves to nodes with degree between $\Delta(T) - \ell$ and $\Delta(T)$ . What kind of performance guarantee can you obtain for a locally optimal tree in this case?

猜想是此时的 locally optimal tree $T$ 满足 $\Delta(T)=\mathrm{OPT}$ , 但是算法不能保证在 polynomial time 内终止。然后我没证出来 $\Delta(T)=\mathrm{OPT}$ …

找了 Section 2.6 中的算法原论文 Approximating the minimum degree spanning tree to within one from the optimal degree, 正确答案是 $\Delta(T) \leq \mathrm{OPT} + 1$ , 算法不能保证在 polynomial time 停止。

$\textit{Proof.}$ 记算法停止时的 locally optimal tree 为 $T$ , 令 $\Delta(T)=k$ , 此时满足 $\deg=k$ 的点集为 $S$ , 满足 $\deg=k-1$ 的点集为 $B$ . 若移除 $S$ 、 $B$ 以及与它们相邻的边，则 $T$ 会被划分为一片森林，且拥有至少 $\lvert S \rvert \cdot k + \lvert B \rvert \cdot (k-1) - 2\cdot (\lvert S \rvert + \lvert B \rvert - 1) = (\lvert S \rvert + \lvert B \rvert) \cdot (k-2) - (\lvert B \rvert - 2)$ 棵树。若原图 $G$ 中存在一条不属于 $T$ 且不是被移除的边，能将 $T$ 中两棵树 $T_{1}, T_{2}$ 连接起来，那么将 $T_{1}, T_{2}$ 之间被移除的边替换为这条边可以达成一次 local move, 不满足 $T$ 为此时的 locally optimal tree 的条件，因此 $G$ 中不存在一条能将 $T$ 中两棵树 $T_{1}, T_{2}$ 连接起来的不属于 $T$ 且不是被移除的边，从而任意连通的 spanning tree 在 $S \cup B$ 中的 average degree 至少为 $\frac{(\lvert S \rvert + \lvert B \rvert) \cdot (k-2) - (\lvert B \rvert - 2) + \lvert S \rvert + \lvert B \rvert - 1}{\lvert S \rvert + \lvert B \rvert}=k-1-\frac{\lvert B \rvert - 1}{\lvert S \rvert + \lvert B \rvert}$ , 由于 maximal degree 必然 $\geq$ average degree, 因此 $\mathrm{OPT}\geq k-1-\frac{\lvert B \rvert - 1}{\lvert S \rvert + \lvert B \rvert} \implies \mathrm{OPT} \geq k-1$ , 于是有 $\Delta(T) \leq \mathrm{OPT} - 1$ .

NOTE
感觉一开始的思路就错了，不应该如此自信地认为此时的算法得到的就是最优解的。本题实际上只需把当前的 locally optimal tree 的设定往 Section 2.6 的证明中套就可以了。

Ex 2.9 Minimum-Degree Steriner Tree

As given in Exercise 2.5, in the Steiner tree problem we aew given an undirected graph $G=(V, E)$ and a set of terminals $R \subseteq V$ . A Steiner tree is a tree in $G$ in which all the terminals are connected; a nonterminal need not be spanned. Show that the local search algorithm of Section 2.6 can be adapted to find a Steiner tree whose maximum degree is at most $2\mathrm{OPT}+\lceil \log_{2} n \rceil$ , where $\mathrm{OPT}$ is the maximum degree of a minimum-degree Steiner tree.

从任意一棵 Steiner tree 开始，删去与 $D$ (terminals) 无关的边，仅保留删去后无法保证 $D$ 内部连通的边以及相应的点集，记此时的 Steiner tree 为 $T$ , 它的点集为 $W$ .

记以 $W$ 中两个不同点为端点、边集与 $E(T)$ 的交集为 $\emptyset$ 的一条 path 为 non-tree path, 显然在 $T$ 上加入一条 non-tree path 会构成一个环。一次 local move 的过程如下：将任意一条 non-tree path 加入 $T$ , 在构成的环中寻找一条边 $(u,v)$ ，满足 $\Delta(T) - \log n \leq \max\{\deg(u), \deg(v)\} \leq \Delta(T)$ , 然后将这条边删去，得到一条新的 Steiner tree $T'$ 满足 $\max\{\deg'(u), \deg'(v)\} = \max\{\deg(u), \deg(v)\}-1$ .

当无法进行 local move 时，便得到了一棵 pseudo optimal Steiner tree (POST) $T^{\ast}$ . 可以证明， $\Delta(T^{\ast}) \leq 2\mathrm{OPT} + \lceil \log_{2} n \rceil$ . 证明方法与 Section 2.6 中的 Theorem 2.19 大同小异，也与 Ex 2.8 的证明有着相似的框架，因此这里不再赘述。

Ex 2.10 Monotone and Submodular Function

Let $E$ be a set of items, and for $S \subseteq E$ , let $f(S)$ give the value of the subset $S$ . Suppose we wish to find a maximum value subset of $E$ of at most $k$ items. Furthermore, suppose that $f(\emptyset)=0$ , and that $f$ is monotone and submodular. We say that $f$ is monotone if for any $S$ and $T$ with $S \subseteq T \subseteq E$ , then $f(S) \subseteq f(T)$ . We say that $f$ is submodular if for any $S$ , $T \subseteq E$ , then
$f(S) + f(T) \geq f(S \cup T)-f(S \cap T).$
Show that the greedy $\left( 1 - \frac{1}{e} \right)$ -approximation algorithm of Section 2.5 extends to the problem.

一模一样的算法，只需将函数 $v$ 修改为 $f$ 即可，甚至证明过程也可以直接拿来用（

Ex 2.11 Maximum Coverage Problem

In the maximum coverage problem, we have a set of elements $E$ , and $m$ subsets of elements $S_{1}, \dots, S_{m} \subseteq E$ , each with a nonnegative weight $w_{j} \geq 0$ . The goal is to choose $k$ elements such that we maximize the weight of the subsets that are covered. We say that a subset is covered if we have chosen some element from it. Thus we want to find $S \subseteq E$ such that $\lvert S \rvert = k$ and that we maximize the total weight of the subsets $j$ such that $S \cap S_{j} \not= \emptyset$ .

此处给出的问题定义与 Wikipedia 上给出的不同，但是可以将这个版本等价转化为 Wikipedia 上的问题： $E$ 中 element $e_{i}$ 看作集合 $E_{i}=\{S_{j} \mid e_{i} \in S_{j} \land S_{j} \subseteq E\}$ , $S_{i}$ 则看成 element $s_{i}$ , 目标是选择 $k$ 个集合, 设其下标集合为 $I$ , 记 $S'=\cup_{i \in I}E_{i}$ , 使得 $\sum_{i \colon s_{i} \in S'}w_{i}$ 最大。

类似地，Wikipedia 版本也可以等价转化为这个版本，两个版本的问题等价。下面给出 Wikipedia 版本的算法（符号也沿用 Wikipedia 版本）。

(a) Give a $\left( 1 - \frac{1}{e} \right)$ -approximation algorithm for this problem.

记 $S(i)$ 为第 $i$ 次迭代选择的 subsets 的集合，则第 $i$ 次选择的 subset 为 $S(i)-S(i-1)$ , 其中 $S(0)=\emptyset$ , $E(i)$ 为此时 cover 的 element 集合，value function 为 $f(S(i)) = \sum_{i \colon e_{i} \in E(i)}w_{i}$ . 每次贪心地选出能使 $f(S(i-1) \cup \{S_{i}\})$ 最大的 subset $S_{i}$ 直到 $\lvert S(i) \rvert = k$ .

这里的 $f$ 显然也是 monotone and submodular function, 同时算法步骤与 Section 2.5 中的一致，因此自然也是 $\left( 1-\frac{1}{e} \right)$ -approximation.

(b) Show that if an approximation algorithm with performance guarantee better than $1 - \frac{1}{e} + \epsilon$ exists for the maximum coverage problem for some constant $\epsilon > 0$ , then every NP-complete problem has an $O\left(n^{O(\log \log n)}\right)$ time algorithm. (Hint: Recall Theorem 1.13)

考虑 unweighted ( $\forall i, w_{i}=1$ ) maximum coverage problem 与 unweighted set cover problem. 二者的输入都是 elements 集合 $E$ 与 subsets 集合 $S$ , 另外 MCP 还有输入 $k$ . (a) 中算法同样适用于 unweighted set cover problem, 只需修改算法终止条件为 cover $E$ 中所有元素即可。

下面证明该算法对于 unweighted set cover problem 的 approximation 为 $\ln n+1$ .

记 $x_{i}$ 为第 $i$ 次迭代 cover 的新元素数量， $y_{i} = \sum_{j=1}^{i}x_{j}$ , 以及 $z_{i}=\mathrm{OPT}-y_{i}$ , 其中 $\mathrm{OPT}$ 为 unweighted MCP 的最优解（ $k$ 个 subset 能 cover 的最大元素数量），令 $k^\ast$ 为 unweighted SCP 的最优解（cover $E$ 中所有元素所需 subset 的最小数量）。

则当 $k = k^\ast$ 时，MCP 的最优解可以 cover 的元素数量为 $\mathrm{OPT}=n$ , 上述算法给出的解满足 $z_{k^{\ast}} \leq \frac{n}{e}$ . 当算法迭代 $2k^\ast$ 次后，给出的解满足 $z_{2k^\ast} \leq \frac{n}{e^{2}}$ , 回忆 Theorem 2.16 的证明，又有

$v(S^{i}) \geq \left( 1 - \left( 1 - \frac{1}{k} \right)^{i} \right)v(O) \implies y_{i} \geq \left( 1 - \left( 1 - \frac{1}{k} \right)^{i} \right) \mathrm{OPT}$

于是 $z_{i} \leq \left( 1 - \frac{1}{k^{\ast}} \right)^{i} \cdot n$ . 由于最后求 approximation 时需要除以 $k^{\ast}$ , 因此令 $z_{i} \leq k^{\ast}$ , 由 $\frac{n}{k^{\ast}} \cdot \left( 1 - \frac{1}{k^{\ast}} \right)^{i} \leq \frac{n}{k^{\ast}} \cdot e^{-i/k^{\ast}}$ , 令不等式右边 $\leq 1$ 得到 $i \geq k^{\ast}\ln{\frac{n}{k^{\ast}}}$ . 因此迭代 $t > k^{\ast} \ln{\frac{n}{k^{\ast}}} + k^{\ast} = k^{\ast}\left( \ln \frac{n}{k^{\ast}} + 1\right)$ 次后 $z_{t} \leq 0$ , 算法停止，则算法找到的解不超过 $k^{\ast}\left( \ln \frac{n}{k^{\ast}} + 1\right) \leq k^{\ast} \left( \ln n + 1\right)$ , 从而 approximation 为 $\ln n + 1$ .

若 MCP 存在 $\left( 1-\frac{1}{e} + \epsilon \right)$ -approximation algorithm, 我们可以不断对剩余的 subsets 以及未被 cover 的 elements 运行该算法直至所有 elements 都被 cover 掉。根据上面的分析，此时有 $z_{ik^\ast} \leq n\left( \frac{1}{e} - \epsilon \right)^{i}$ , 令不等式右边 $< 1$ 得到 $i \geq \frac{\ln n}{1 - \ln(1-\epsilon \cdot e)}$ , 于是该算法的 approximation 为 $\frac{\ln n}{1 - \ln(1-\epsilon \cdot e)} = c \ln n$ , 其中 $c < 1$ . 根据 Theorem 1.13, 可以推出每个 NP-complete problem 都存在一个 $O\left(n^{O(\log \log n)}\right)$ -time deterministic algorithm 的结论，得证。

Ex 2.12 Matroid

(a) Given an undirected graph $G=(V,E)$ , show that the forests of $G$ form a matroid.

$S$ 是 $G$ 的一个 forest.

Axiom 1: 若 forest $S \in \mathcal{I}$ , 由于 $S$ 的子集 $S'$ 也是一个 forest, 所以 $S'$ 也属于 $\mathcal{I}$ .
Axiom 2: 若 forset $S,T \in \mathcal{I}$ $S, T \in I$ , 且 $\lvert S \rvert < \lvert T \rvert$ $∣ S ∣ < ∣ T ∣$ , 则 $T$ $T$ 包含的连通块数量少于 $S$ $S$ . 只需证明至少存在一条边 $e \in T-S$ $e \in T - S$ 且 $e$ $e$ 的两个端点不在 $S$ $S$ 中的同一连通块内，则 $S \cup \{e\}$ $S \cup {e}$ 会得到一个新的 forest, 从而有 $S \cup \{e\} \in \mathcal{I}$ $S \cup {e} \in I$ .
- 若 $e \in T - S$ 是 $S$ 的连通块内部的边，那么给 $S$ 加上 $e$ 会得到一个环，由于 $T$ 也是一个 forest, 因此环上至少有一条边 $e' \in S-T$ , 将 $e'$ 移除可以得到一个新的 forest. 不断进行上述操作，若不存在 $e \in T-S$ 满足 $e$ 的两个端点不在同一连通块内，那么最终可以得到 $T$ , 但是这与 $\lvert T \rvert > \lvert S \rvert$ 矛盾，因此至少存在一条边 $e \in T-S$ 且 $e$ 的两个端点不在 $S$ 中的同一连通块内，Axiom 2 满足。
从而 $G$ 的所有 forest 可以构成一个 matroid.

(b) Show that for any matroid, every base of the matroid has the same number of ground elements.

若 $S_{1}, S_{2}$ 都是 matroid $(E, \mathcal{I})$ 的 base, 且 $\lvert S_{1} \rvert < \lvert S_{2} \rvert$ , 根据 axiom 2, 必然存在 $e \in S_{2}-S_{1}$ 使得 $S_{1} \subset S_{1} \cup \{e\} \in \cal{I}$ , 与 $S_{1}$ 是 base 矛盾，从而 $\lvert S_{1} \rvert \geq \lvert S_{2} \rvert$ . 若 $\lvert S_{1} \rvert > \lvert S_{2} \rvert$ , 同样会与 $S_{2}$ 是 base 矛盾，因此只能是 $\lvert S_{1} \rvert = \lvert S_{2} \rvert$ , 于是任意 matroid 的 base 都拥有相同数量的 ground element.

(c) For any given matroid, suppose that for each $e \in E$ , we have a nonnegative weight $w_{e} \geq 0$ . Give a greedy algorithm for the problem of finding a maximum-weight base of a matroid.

\begin{array}{ll} \\ \\ \textbf{Input } & \text{matroid } (E, \mathcal{I}), \text{weight vector } \mathbf{w} \\ 1 & W \coloneqq \emptyset \\ 2 & \text{Sort } E \text{ into monotonically decreasing order by weight} \\ 3 & \textbf{For each } e \text{ in } E \\ 4 & \textbf{do} \\ 5 & \text{\qquad} \textbf{if } W \cup \{e\} \text{ in } \mathcal{I} \textbf{ then} \\ 6 & \text{\qquad\qquad} W \coloneqq W \cup \{e\} \\ 7 & \text{\qquad} \textbf{fi} \\ 8 & \textbf{od} \\ 9 & \text{return } W \end{array}

Ex 2.13 Maximum-Value Base of Matroids

(a) Prove that for a locally optimal solution $S$ , $f(S) \geq \frac{1}{2}\mathrm{OPT}$ .

略，直接证 (b).

(b) Use this to prove that for any locally optimal solution $S$ , $f(S) \geq \frac{1}{2}\mathrm{OPT}$ .

由 Ex 2.12 (b) 的结论，不妨设 matroid $(E, \mathcal{I})$ 的 base 拥有的 ground element 数量为 $k$ . 记 $g$ 为 locally optimal solution $S$ 到 maximum-value base $O$ 的 bijection.

令 $T = \{ e \mid g(e)=e \}, M = S - T, M' = O - T, m = \lvert M \rvert = \lvert M' \rvert, \rho_{e}(S) = f(S \cup \{ e \}) - f(S)$ , 由 $f$ 是 nondecreasing submodular function, 有

f(O) \leq f(S) + \sum_{e \in O - S}\rho_{e}(S)

根据 submodularity, 有²

$\rho_{e}(S) = f(S \cup \{ e \}) - f(S) \leq f\left(\left(S - \{ g^{-1}(e) \} \right) \cup \{ e \} \right) - f\left( S - \{ g^{-1}(e) \} \right),$

又由于 $S$ 的 locally optimality, 有

f\left(\left(S - \{ g^{-1}(e) \} \right) \cup \{ e \} \right) - f\left( S - \{ g^{-1}(e) \} \right) \leq f\left( S \right) - f\left( S - \{ g^{-1}(e) \} \right).

从而

\begin{aligned} f(O) & \leq f(S) + \sum_{e \in O - S}\rho_{e}(S) \\ & \leq f(S) + \sum_{e' \in S - O} \left( f(S) - f(S - \{ e' \}) \right) \\ & \leq f(S) + \left( f(S) - f(S - \{ e_{1}' \}) \right) + \left( f(S - \{ e_{1}' \}) - f(S - \{ e_{1}', e_{2}' \}) \right) \\ &\quad + \dots + \left( f(M + \{ e_{m}' \}) - f(M) \right) \\ & = f(S) + f(S) - f(M) \\ & \leq 2f(S) \end{aligned}

即 $f(S) \geq \frac{1}{2}\mathrm{OPT}$ , 得证。

(c) For any $\epsilon > 0$ , give a variant of this algorithm that is a $\left( \frac{1}{2} - \epsilon \right)$ -approximation algorithm.

每次迭代时寻找 set $P \subseteq \mathcal{I}$ 满足 $\lvert P \rvert = k, \lvert P - S \rvert = \lvert S - P \rvert \leq \epsilon k$ , 并且 $f(P) > f(S)$ , 就将 $S$ 替换为 $P$ .

Ex 2.14 Edge-Disjoint Paths

Show that this greedy algorithm is an $\Omega(1 / \ell)$ -approximation algorithm for the edge-disjoint paths problem in directed graphs.

记 $O$ 为最优解 route 的 $(s,t)$ 点对的集合， $S$ 为题述算法找到的 $(s,t)$ 点对集合， $P(O)$ 与 $P(S)$ 分别为最优解与题述算法 route 的路径集合。

令 $O' = O - S$ , 若 $p' \in P(O')$ 满足 $\exists p \in P(S), p \cap p' \neq \emptyset$ 则将 $p'$ route 的点对 $(s',t')$ 归入集合 $O_{1}'$ , 显然有 $O_{1}' \subseteq O'$ , 设 $O_{2}' = O' / O_{1}' \subseteq O'$ .

对于 $p_{2}' \in P(O_{2}')$ , 由于它未与 $S$ 中 route 的任意一条 path 冲突且没有被算法 route, 因此必然有 $\lvert p_{2}' \rvert > \ell$ , 这样的点对数量不会超过 $m / \lvert p_{2}' \rvert \leq \ell$ , 即 $\lvert O_{2}' \rvert \leq \ell$ . 因为 $\lvert S \rvert \geq 1$ , 因此 $\lvert O_{2}' \rvert \leq \ell \lvert S \rvert$ .

对于 $P(O_{1}')$ 中的路径，考虑将其对应到与 $P(S)$ 中的路径存在交集的边上。因为 $O_{1}' \subseteq O$ , 因此 $P(O_{1}')$ 中的路径是两两不相交的，记 $E(p)$ 为路径 $p \in P(O_{1}')$ 对应到的边集，即 $E(p) = \{ e \mid e \in p \cap p', \forall p' \in P(S) \}$ , 于是有 $\forall p, p' \in P(O_{1}'), p \neq p', E(p) \cap E(p') = \emptyset$ . 考虑 $P(S)$ 中的路径 $p$ , 根据刚才的分析， $\forall e \in p$ , $e$ 最多出现在一个 $E(p')$ 中，其中 $p' \in P(O_{1}')$ , 对应唯一一对 $(s,t)$ 点对，于是 $p$ 对应的 $O_{1}'$ 中的点对数量至多为 $\lvert p \rvert \leq \ell$ , $P(S)$ 对应的 $O_{1}'$ 中的点对数量至多为 $\ell \lvert S \rvert$ , 由于 $O_{1}'$ 每一个点对都在 $P(S)$ 中有对应，因此 $\lvert O_{1}' \rvert \leq \ell \lvert S \rvert$ .

根据 $O'$ 的定义有 $\lvert O / O' \rvert \leq \lvert S \rvert$ , 因此

\lvert O \rvert \leq \lvert S \rvert + \lvert O' \rvert = \lvert S \rvert + \lvert O_{1}' \rvert + \lvert O_{2}' \rvert \leq (1 + 2\ell) \lvert S \rvert,

从而算法的 approximation 为 $\lvert S \rvert / \lvert O \rvert = \Omega\left( \frac{1}{1 + 2\ell} \right) = \Omega\left(1 / \ell\right)$ .

Ex 2.15 NP-Completeness of Edge Coloring

Prove that there is no $\alpha$ -approximation algorithm for the edge coloring problem for $\alpha < \frac{4}{3}$ unless $\text{P=NP}$ .

假设存在 $\alpha$ -approximation algorithm $A$ 满足 $\alpha < \frac{4}{3}$ , 那么它可以用来判断 $\Delta=3$ 的图是否是 $3$ -edge-colorable: 当 $\Delta=3$ 的图 $G$ 是 $3$ -edge-colorable 时，显然 $\mathrm{OPT}=3$ , 那么算法给出的解 $\leq \alpha \mathrm{OPT} < 4$ , 由于 edge coloring problem 的解是整数，因此算法的解只能是 $3$ ; 当 $G$ 不是 $3$ -edge-colorable 时，有 $\mathrm{OPT} \geq 4$ , 于是 $A$ 的解满足 $\geq \mathrm{OPT} \geq 4$ ; 因此当 $A$ 返回 $3$ 时说明 $\Delta=3$ 的图 $G$ 是 $3$ -edge-colorable 的，当 $A$ 返回值大于 $3$ 时说明图 $G$ 不是 $3$ -edge-colorable 的，而根据 Theorem 2.22, $A$ 的存在可以推出 $\text{P=NP}$ .

Ex 2.16 Edge Coloring on Bipartite Graphs

Give a polynomial-time algorithm for finding a $\Delta$ -edge-coloring of $G$ .

按照顺序给 $G$ 中的边染色。考虑当前要染色的边 $(u,v)$ , 设 $u, v$ 序号最小未使用的颜色分别为 $c_{u}, c_{v}$ :

若 $c_{u}=c_{v}$ 则直接给当前边染色 $c_{u}$ ;
若 $c_{u} \neq c_{v}$ , 不妨设 $c_{u} < c_{v}$ . 找到 $v$ 的边集中染色为 $c_{v}$ 的边，设其为 $(u', v)$ , 考虑将其染色为 $c_{u}$ , 同样的方式将 $u'$ 当前被染色为 $c_{u}$ 的边改为染色 $c_{v}$ , 继续下去相当于找到一条以 $v$ 为起点的颜色为 $c_{u}, c_{v}$ 交替出现的增广路，由二分图的性质， $u$ 不可能出现在这条增广路上，否则与 $c_{u}$ 是 $u$ 序号最小未使用的颜色矛盾（根据算法描述， $u$ 所使用的颜色序列只会是以 $1$ 为起点的一段连续区间），因此可以将这条增广路上的颜色互换 $c_{u} \leftrightarrow c_{v}$ , 此时可以将 $(u,v)$ 染色为 $c_{u}$ .

显然上述算法使用的颜色数量不会超过 $\Delta$ , 并且时间复杂度为 $O(nm)$ .

Notes

Ex 2.1 的解法参考：Intro to Approximation and Randomized Algs 2015/2016 - 02-hw-sol，它的主要思想来源于 A Unified Approach to Approximation Algorithms for Bottleneck Problems
Ex 2.5 更严谨、formal 的解法可以参考：Lecture 2 (Stanford University — CS261: Optimization)
Ex 2.7 可以阅读：
- An approximation algorithm for finding a long path in Hamiltonian graphs
- How to find long paths efficiently
Ex 2.8 见原论文 Approximating the minimum degree spanning tree to within one from the optimal degree 的 Section 5 The $\Delta^{\ast}+1$ algorithm.
Ex 2.9 见原论文 Approximating the minimum degree spanning tree to within one from the optimal degree 的 Section 3 The Steiner Problem.
Ex 2.11 (b) 的证明参考：CS 598CSC: Approximation Algorithms - Lecture 3

符号设定见 Finding minimum-degree spanning trees. ↩
见 An analysis of approximations for maximizing submodular set functions—I Proposition 2.2 (ii’). ↩

近似算法 TDAA