HW1
1. Exercise 2.10 of Convex Optimization
Solution set of a quadratic inequality. Let \(C \subseteq R^n\) be the solution set of a quadratic inequality,
\[
C = \{x \in R^n \mid x^T A x + b^T x + c \le 0\},
\]
with \(A\in S^n, \quad b\in R^n, \quad c\in R\).
(a) Show that \(C\) is convex if \(A\succeq 0\) (which means \(A\) is positive semidefinite).
(b) Show that the intersection of \(C\) and the hyperplane defined by \(g^T x + h = 0\) (where \(g \ne 0\)) is convex if \(A + \lambda gg^T \succeq 0\) for some \(\lambda \in R\). Are the converses of these statements true?
Proof
a. It is sufficient to show that \(f(x) = x^T A x + b^T x + c\) is a convex function, since \(C\) is the \(0\)-sublevel set of \(f\) and sublevel sets of convex functions are convex. That is, we need
\[
f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta) f(y), \quad \forall x, y \in R^n,\ \theta \in [0, 1],
\]
which, after expanding and cancelling the linear terms, is
\[
\theta(1-\theta)(x-y)^T A (x-y) \ge 0,
\]
and this is true since \(A\) is positive semidefinite.
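As a quick numerical sanity check of this identity (a sketch with numpy; the dimension, seed, and random data are arbitrary choices, not part of the exercise):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T                      # A = M M^T is positive semidefinite
b = rng.standard_normal(n)
c = rng.standard_normal()

def f(x):
    return x @ A @ x + b @ x + c

x, y = rng.standard_normal(n), rng.standard_normal(n)
for theta in np.linspace(0, 1, 11):
    gap = theta * f(x) + (1 - theta) * f(y) - f(theta * x + (1 - theta) * y)
    rhs = theta * (1 - theta) * (x - y) @ A @ (x - y)
    # the convexity gap equals theta(1-theta)(x-y)^T A (x-y), which is >= 0 for PSD A
    assert np.isclose(gap, rhs) and rhs >= 0
```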
b. Let
\[
H = \{x \in R^n \mid g^T x + h = 0\}.
\]
For \(x\in H\) and any \(\lambda \in R\), \(x\in C\) is equivalent to
\[
x^T A x + b^T x + c + \lambda (g^T x + h)^2 \le 0,
\quad\text{i.e.}\quad
x^T (A + \lambda gg^T) x + (b + 2\lambda h g)^T x + (c + \lambda h^2) \le 0,
\]
whose solution set we denote by \(\tilde{C}\).
By part (a), if \(A + \lambda gg^T \succeq 0\) for some \(\lambda \in R\), then \(\tilde{C}\) is convex.
Then \(C \cap H = \tilde{C} \cap H\) is the intersection of two convex sets, so it is convex.
The converse is false. Consider the case
\[
A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad b = 0, \quad c = 0, \quad g = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad h = 0,
\]
where \(C=R^2\) and \(H = \{ (x_1, x_2) \mid x_2 = 0\}\). Then \(C \cap H\) is obviously convex, but \(A + \lambda gg^T = \begin{bmatrix} -1 & 0 \\ 0 & -1 + \lambda \end{bmatrix}\) can never be positive semidefinite.
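A small numerical check of this counterexample (a sketch; the range of \(\lambda\) values scanned is an arbitrary choice):

```python
import numpy as np

A = np.diag([-1.0, -1.0])        # C = {x : -x1^2 - x2^2 <= 0} = R^2
g = np.array([0.0, 1.0])         # H = {x : x2 = 0}

for lam in np.linspace(-10, 10, 201):
    eigs = np.linalg.eigvalsh(A + lam * np.outer(g, g))
    # the smallest eigenvalue is always -1, so A + lam*g*g^T is never PSD
    assert eigs.min() <= -1.0 + 1e-12
```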
2. Exercise 2.18 of Convex Optimization
Invertible linear-fractional functions. Let \(f: R^n \to R^n\) be the linear-fractional function
\[
f(x) = (Ax+b)/(c^Tx+d), \quad \text{with domain } \{x \mid c^Tx+d > 0\}.
\]
Suppose the matrix
\[
Q = \begin{bmatrix} A & b \\ c^T & d \end{bmatrix}
\]
is nonsingular. Show that \(f\) is invertible and that \(f^{-1}\) is a linear-fractional mapping. Give an explicit expression for \(f^{-1}\) and its domain in terms of \(A, b, c\), and \(d\).
Solution
Let \(y = (Ax+b)/(c^Tx+d)\); then, in homogeneous coordinates,
\[
\begin{bmatrix} y \\ 1 \end{bmatrix} = \frac{1}{c^Tx+d} \begin{bmatrix} A & b \\ c^T & d \end{bmatrix} \begin{bmatrix} x \\ 1 \end{bmatrix} = \frac{1}{c^Tx+d}\, Q \begin{bmatrix} x \\ 1 \end{bmatrix}.
\]
Since \(Q\) is invertible and \(c^Tx+d > 0\), we get
\[
Q^{-1} \begin{bmatrix} y \\ 1 \end{bmatrix} = \frac{1}{c^Tx+d} \begin{bmatrix} x \\ 1 \end{bmatrix}.
\]
Then, writing \(Q^{-1} = \begin{bmatrix} A' & b' \\ c'^T & d' \end{bmatrix}\),
\[
x = \frac{A' y + b'}{c'^T y + d'}, \qquad c'^T y + d' = \frac{1}{c^Tx+d} > 0,
\]
so \(f\) is bijective from its domain \(\{x \mid c^Tx+d > 0\}\) onto \(\{y \mid c'^Ty+d' > 0\}\), and \(f^{-1}(y) = (A'y+b')/(c'^Ty+d')\) is again a linear-fractional function, determined by \(Q^{-1}\).
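A numerical sanity check of this inverse (a sketch with numpy; the dimension, seed, and the value \(d = 2\) are arbitrary choices made so that a random \(Q\) is nonsingular and \(c^Tx+d>0\) at the test point):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)
d = 2.0

Q = np.block([[A, b[:, None]], [c[None, :], np.array([[d]])]])
assert abs(np.linalg.det(Q)) > 1e-8           # Q nonsingular

Qi = np.linalg.inv(Q)
Ai, bi = Qi[:n, :n], Qi[:n, n]                # blocks A', b' of Q^{-1}
ci, di = Qi[n, :n], Qi[n, n]                  # blocks c', d' of Q^{-1}

def f(x):
    return (A @ x + b) / (c @ x + d)

def finv(y):
    return (Ai @ y + bi) / (ci @ y + di)

x = rng.standard_normal(n) * 0.1              # small x keeps c^T x + d > 0
assert c @ x + d > 0
assert np.allclose(finv(f(x)), x)             # f^{-1}(f(x)) = x
```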
Schur complement (Matrix Analysis, 0.8.5)
1. For \(A\) invertible, the Schur complement \(s = d - c^TA^{-1}b\) is nonzero (since \(\det Q = s \det A \ne 0\)), and
\[
Q^{-1} = \begin{bmatrix} A^{-1} + s^{-1}A^{-1}bc^TA^{-1} & -s^{-1}A^{-1}b \\ -s^{-1}c^TA^{-1} & s^{-1} \end{bmatrix},
\]
so the blocks of \(Q^{-1}\) give \(f^{-1}\) explicitly in terms of \(A, b, c, d\).
2. If \(d \ne 0\), similarly, the Schur complement \(S = A - bc^T/d\) is invertible (since \(\det Q = d \det S \ne 0\)), and we get
\[
Q^{-1} = \begin{bmatrix} S^{-1} & -S^{-1}b/d \\ -c^TS^{-1}/d & 1/d + c^TS^{-1}b/d^2 \end{bmatrix}.
\]
So, again, the blocks of \(Q^{-1}\) give \(f^{-1}\) explicitly.
3. If \(A\) is not invertible and \(d = 0\). (Ah, I'm going crazy, where does this theorem even come from? I'm about to compute myself to death!)
Since
\[
Q = \begin{bmatrix} A & b \\ c^T & 0 \end{bmatrix}
\]
is nonsingular, we have:
- \(rank(A) = n-1\), and the kernel of \(A\) is one-dimensional, spanned by some \(v \ne 0\) (so its elements are \(\lambda v\)),
- \(rank([A \quad b]) = n\),
- \(c^Tv\ne 0\),
- \(b \ne 0\).
Then we conclude that \(A + bc^T\) is invertible! (A numerical check of this case is sketched at the end of this part.)
Otherwise, there would be some nonzero vector \(u\) such that \(Au + b(c^Tu) = 0\). But since \(rank([A \quad b]) > rank(A) = n-1\), \(b\) is not in the range of \(A\), so we deduce \(Au = 0\) and \((c^Tu)b = 0\). Hence \(u = \lambda v\) for some \(\lambda\), and then \(0 = \lambda (c^Tv) b \Rightarrow \lambda = 0\), i.e. \(u = 0\), a contradiction.
Now we can proceed: denote \(\tilde{A} = A + bc^T\), and \(Q^{-1}\) can be written in terms of \(\tilde{A}^{-1}\) as before. (Ah, I really don't want to write this out anymore, just get the idea! I don't even know what this is for.)
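A small numerical check of this third case (a sketch; the particular \(A, b, c\) below are an arbitrary instance satisfying the four conditions above, with \(rank(A)=n-1\) and \(d=0\)):

```python
import numpy as np

n = 4
A = np.diag([1.0, 2.0, 3.0, 0.0])            # rank(A) = n-1, kernel spanned by v = e_n
b = np.array([0.0, 0.0, 0.0, 1.0])           # b not in range(A), so rank([A b]) = n
c = np.array([1.0, 1.0, 1.0, 1.0])           # c^T v = 1 != 0
d = 0.0

Q = np.block([[A, b[:, None]], [c[None, :], np.array([[d]])]])
assert abs(np.linalg.det(Q)) > 1e-10                     # Q is nonsingular
assert abs(np.linalg.det(A + np.outer(b, c))) > 1e-10    # so A + b c^T is invertible, as claimed
```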
3. Convex hull / Sparse Representation of a Polytope
Reference: T. T. Cai and A. Zhang, "Sparse representation of a polytope and recovery of sparse signals and low-rank matrices", IEEE Transactions on Information Theory, 2014.
Please give the sparse representation of the polytope \(A = \{x \in R^n : \|x\|_\infty \le \theta, \|x\|_1 \le s\theta\}\).
For any \(v \in R^n\), define the set of sparse vectors \(U(\theta, s, v) \subset R^n\) by
\[
U(\theta, s, v) = \{u \in R^n : supp(u) \subseteq supp(v),\ \|u\|_0 \le s,\ \|u\|_1 = \|v\|_1,\ \|u\|_\infty \le \theta\}.
\]
Any \(v\in A\) can be represented as
\[
v = \sum_i \lambda_i u_i, \quad \lambda_i \ge 0, \quad \sum_i \lambda_i = 1, \quad u_i \in U(\theta, s, v).
\]
Solution
In the paper, the main idea of the proof is to fix some components of \(v\) and view the remaining components as a vector \(\tilde{v}\), which gives a problem analogous to the original one in a lower dimension. The technique of decomposing the vector onto the edges of the intersection of a cube and a hyperplane is then used to reduce the dimension by at least one. The conditions on \(j\) are imposed just to make sure that the decomposition of \(\tilde{v}\) satisfies the conditions of \(U(\theta, s, \tilde{v})\).
But here, we give another expression and deduce a slightly stronger conclusion.
Without loss of generality, we can assume that \(\theta=1\), and just consider the case \(v \succeq 0\).
For some \(v\in A\) with \(\|v\|_1 = t\), write \(t = k + r\) where \(k = \lfloor t \rfloor\) and \(r = t - k \in [0, 1)\), and define
\[
P(t) = P(k, r) = \{u \in R^n : k \text{ components of } u \text{ are } 1, \text{ one component is } r, \text{ and the rest are } 0\}.
\]
Indeed, \(P(t)\) is the set of extreme points of \(conv\, U(1, s, v)\).
And points in \(P(k, r)\) are obviously \(\lceil t\rceil\)-sparse vectors.
E.g. for some \(v\) with \(\|v\|_1 = 1.5\) and \(n = 3\), \(P(1, 0.5) = \{ [1, 1/2, 0], [1, 0, 1/2], [0, 1, 1/2], [1/2, 1, 0], [1/2, 0, 1], [0, 1/2, 1] \}\), just like the figure in the paper.
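For concreteness, \(P(1, 0.5)\) in \(R^3\) can be enumerated directly from the definition (a small sketch; the helper `P` is just an illustration, not from the paper):

```python
from itertools import permutations

def P(k, r, n):
    """All vectors with k entries equal to 1, one entry equal to r, and the rest 0."""
    base = [1.0] * k + [r] + [0.0] * (n - k - 1)
    return {tuple(p) for p in permutations(base)}

print(sorted(P(1, 0.5, 3)))
# six points: every placement of one 1 and one 1/2 in R^3, matching the set listed above
```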
We claim that \(v\) can be expressed as a convex combination of points of \(P(k, r)\).
Consider \(v=[a_1, \dots , a_n] \in A\). If \(v\notin P(t)\), then it has at least two components not in \(\{0, 1\}\); WLOG denote them \(a_1, a_2 \notin \{0, 1\}\).
- If \(a_1+a_2 \le 1\), then
\[
v = \lambda_1 u_1 + \lambda_2 u_2,
\]
where \(\lambda_1 = \frac{a_1}{a_1+a_2}, \lambda_2=\frac{a_2}{a_1+a_2}\)
and \(u_1 = [a_1+a_2, 0, a_3, ..., a_n], u_2 = [0, a_1+a_2, a_3, ..., a_n]\).
- If \(1< a_1+a_2 < 2\), then
\[
v = \lambda_1 u_1 + \lambda_2 u_2,
\]
where \(\lambda_1 = \frac{1-a_1}{2-a_1-a_2}, \lambda_2=\frac{1-a_2}{2-a_1-a_2}\)
and \(u_1 = [a_1+a_2-1, 1, a_3, ..., a_n], u_2 = [1, a_1+a_2-1, a_3, ..., a_n]\).
It's obvious that \(\lambda_1+\lambda_2 = 1\), \(\lambda_1, \lambda_2 \ge 0\), \(\|u_i\|_\infty \le 1\), and \(\|u_i\|_1=t\) for \(i=1, 2\). And the number of components of \(u_1, u_2\) that are not in \(\{0, 1\}\) is strictly smaller than that of \(v\).
So, by induction, we conclude that \(v\) can be expressed as a convex combination of points in \(P(t)\).
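The induction above translates directly into a short recursive procedure (a sketch; the function name `decompose` and the example vector are my own choices for illustration):

```python
import numpy as np

def decompose(v, tol=1e-12):
    """Write v (with 0 <= v_i <= 1) as a convex combination of vectors whose entries
    are all 0 or 1 except at most one, following the two cases in the proof above."""
    v = np.asarray(v, dtype=float)
    frac = [i for i in range(len(v)) if tol < v[i] < 1 - tol]
    if len(frac) < 2:                      # v already lies in P(t)
        return [(1.0, v)]
    i, j = frac[0], frac[1]
    a1, a2 = v[i], v[j]
    u1, u2 = v.copy(), v.copy()
    if a1 + a2 <= 1:
        u1[i], u1[j] = a1 + a2, 0.0
        u2[i], u2[j] = 0.0, a1 + a2
        l1 = a1 / (a1 + a2)
    else:
        u1[i], u1[j] = a1 + a2 - 1, 1.0
        u2[i], u2[j] = 1.0, a1 + a2 - 1
        l1 = (1 - a1) / (2 - a1 - a2)
    l2 = 1 - l1
    return [(l1 * w, u) for (w, u) in decompose(u1)] + \
           [(l2 * w, u) for (w, u) in decompose(u2)]

v = np.array([0.7, 0.5, 0.3])              # ||v||_inf <= 1, ||v||_1 = 1.5
terms = decompose(v)
assert np.isclose(sum(w for w, _ in terms), 1.0)      # weights sum to 1
assert np.allclose(sum(w * u for w, u in terms), v)   # convex combination recovers v
```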