In machine learning, one often wants to uniformly approximate an arbitrary continuous function arbitrarily well using polynomials, neural networks, or something else. But in the case of complex-valued functions, this is more difficult. For example, a limit of holomorphic functions in the topology of uniform convergence on compact sets is always holomorphic, so holomorphic functions cannot directly approximate non-holomorphic functions. But in this post, I will show how one can approximate arbitrary continuous functions indirectly, starting from something like a holomorphic function.
I will state and prove the result in full generality, which means I will work with uniform algebras on compact Hausdorff spaces, but I will not need any deep theory of uniform algebras in order to state and prove the result. I will try to make everything here self-contained.
I came up with the statement and the proof of the result myself.
Motivation:
I am not too much of a fan of neural networks. Even though neural networks perform well in practice, they do not behave in a way that is very appealing to pure mathematicians. For example, if you feed purely imaginary inputs into a vanilla neural network with tanh activation and no bias, you will just obtain a random vector from some multivariate Cauchy distribution as output.
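To see concretely what I mean, here is a minimal sketch (the architecture here is my own assumption: a bias-free two-layer tanh network with real Gaussian weights). Since tanh(iy) = i·tan(y), purely imaginary inputs stay purely imaginary at every layer, and the tangent of a spread-out phase produces heavy, Cauchy-like tails in the output.

```python
import numpy as np

# A minimal sketch (assumed architecture: a bias-free two-layer tanh network with
# real Gaussian weights).  Since tanh(i*y) = i*tan(y), purely imaginary inputs stay
# purely imaginary, and the outputs pick up heavy, Cauchy-like tails from tan.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((64, 8))
W2 = rng.standard_normal((4, 64))

x = 1j * rng.standard_normal((8, 10_000))        # purely imaginary inputs
y = np.tanh(W2 @ np.tanh(W1 @ x))                # no bias terms anywhere

print(np.allclose(y.real, 0))                    # True: the output is purely imaginary
print(np.quantile(np.abs(y.imag), [0.5, 0.99, 0.999]))  # heavy tails in |Im y|
```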
Polynomials, on the other hand, behave in a way that is appealing to mathematicians, so it would be nice if multivariate polynomials had a stronger presence in machine learning. I have personally trained polynomial machine learning models, and they seem to behave mathematically in the sense that if we train the model multiple times with different initializations, we tend to end up with the same trained model. The polynomial models that I have trained are somewhat sophisticated too, since they can have multiple layers, but I have not developed them as thoroughly as neural networks have been developed.
To demonstrate how polynomials may be useful in machine learning, one may want to resort to a uniform approximation theorem for polynomials, and the Stone-Weierstrass approximation theorem generalizes such uniform approximation from polynomials to fairly arbitrary rings of real-valued functions on a compact Hausdorff space. In this post, we shall give another generalized uniform approximation theorem, this time for uniform algebras, which are certain closed algebras of continuous complex-valued functions on compact Hausdorff spaces.
In this post, we shall use quantum states to overcome the limitations of some classes of complex-valued functions in approximating arbitrary continuous functions. This use of quantum states suggests a possible way in which one could use quantum states and partial traces to improve the performance of machine learning models.
Uniform algebras:
Suppose that X is a compact Hausdorff space. Let C(X) denote the collection of all continuous functions f:X→C. Give C(X) the norm ∥∗∥ defined by ∥f∥ = max{|f(x)| : x∈X}. Then C(X) is a Banach algebra. A closed subalgebra A of C(X) is said to be a uniform algebra if A contains all constant functions and whenever x,y∈X, x≠y, there is some f∈A with f(x)≠f(y).
Example 0: If X is a compact Hausdorff space, then C(X) is always a uniform algebra.
Example 1: If K is a compact subset of C^n, then let A be the set of all continuous functions f:K→C which are holomorphic on the interior of K. Then A is a uniform algebra.
Example 2: Suppose that B is a commutative Banach algebra with identity. Let X denote the set of all nonzero continuous algebra homomorphisms ϕ:B→C. Then X becomes a compact Hausdorff space in the weak*-topology. If a∈B, then define a continuous function ^a:X→C by setting ^a(ϕ)=ϕ(a). Then the closure of {^a:a∈B} in C(X) is a uniform algebra.
Suppose that X is a compact Hausdorff space, A⊆C(X) is a linear subspace, and V is a finite dimensional complex vector space. Then let A⊗V denote the set of all functions f:X→V such that L∘f∈A whenever L:V→C is linear.
Lemma: Suppose that X is a compact Hausdorff space and A⊆C(X) is a linear subspace that contains all constant functions and separates points, i.e., whenever x,y∈X and x≠y, there is some f∈A with f(x)≠f(y). Then whenever x0∈U⊆X and U is open, there is some finite dimensional complex inner product space V and some f∈A⊗V with f(x0)=0 and ∥f(y)∥>1 whenever y∈X∖U.
Proof: The proof is a standard compactness argument. For each y∈X∖U, let fy∈A be a function with fy(x0)=0 and |fy(y)|>1; such a function exists since we may pick g∈A with g(y)≠g(x0) and set fy=c·(g−g(x0)) for a large enough scalar c. Let Uy={x∈X:|fy(x)|>1} for each y∈X∖U. Each Uy is open and contains y, so by compactness there are y1,…,yn∈X∖U where X=U∪Uy1∪⋯∪Uyn. Define f:X→C^n by setting f(x)=(fy1(x),…,fyn(x)). Then f(x0)=0, and if y∈X∖U, there is some k with y∈Uyk; in this case |fyk(y)|>1, so ∥f(y)∥>1 as well. Q.E.D.
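As a toy numerical illustration of the Lemma (my own example, not part of the proof): take X to be the unit circle, A the polynomials, x0 = 1, and U a small arc around x0. The grid of points y below stands in for the finitely many yk produced by the compactness argument.

```python
import numpy as np

# Toy check of the Lemma on the unit circle (an illustration, not a proof):
# f_y(z) = 2(z - x0)/(y - x0) vanishes at x0 and has modulus 2 at y; stacking
# these for finitely many y outside U gives a vector-valued f with f(x0) = 0
# and ||f|| > 1 everywhere off U.
theta = np.linspace(0, 2 * np.pi, 2000, endpoint=False)
X = np.exp(1j * theta)                               # the unit circle, sampled
x0 = 1.0
in_U = np.abs(np.angle(X)) < 0.1                     # U: a small arc around x0

ys = X[~in_U][::50]                                  # finitely many y_k outside U
F = np.array([2 * (X - x0) / (y - x0) for y in ys])  # row k: f_{y_k} evaluated on X
norms = np.linalg.norm(F, axis=0)                    # ||f(x)|| at each sample point

print(norms[np.isclose(X, x0)])                      # ≈ 0: f vanishes at x0
print(norms[~in_U].min())                            # > 1 (in fact ≥ 2) off U
```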
Recall that a density operator is a positive semidefinite trace 1 linear operator. Given a finite dimensional complex inner product space V, let D(V) denote the collection of all density operators A:V→V, and let L(V) denote the collection of all linear operators from V to V. Suppose that V,W are finite dimensional complex inner product spaces. Then the partial trace is the unique linear mapping TrW:L(V⊗W)→L(V) subject to the condition that TrW(R⊗S)=R⋅Tr(S) whenever R:V→V,S:W→W are linear.
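Here is a minimal numerical sketch of the partial trace, assuming operators on V⊗W are written as matrices in the product basis with the V index first; the last line checks the defining property TrW(R⊗S)=R⋅Tr(S).

```python
import numpy as np

# Partial trace Tr_W : L(V ⊗ W) -> L(V), for matrices in the product basis
# (V index first, W index second).
def partial_trace_W(M, dim_V, dim_W):
    T = M.reshape(dim_V, dim_W, dim_V, dim_W)    # T[i, a, j, b] = <i a| M |j b>
    return np.einsum('iaja->ij', T)              # sum over the W indices a = b

# Check the defining property Tr_W(R ⊗ S) = R * Tr(S) on random complex matrices.
rng = np.random.default_rng(0)
R = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
print(np.allclose(partial_trace_W(np.kron(R, S), 3, 2), R * np.trace(S)))  # True
```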
Theorem: Suppose that X is a compact Hausdorff space, A⊆C(X) is a uniform algebra, V is a finite dimensional complex inner product space, and f:X→D(V) is a continuous function. Then whenever ϵ>0 and ∥∗∥ is a matrix norm, there is some finite dimensional complex inner product space W and some g∈A⊗V⊗W where if x∈X, then

∥f(x) − TrW(g(x)⋅g(x)*)/∥g(x)∥₂²∥ < ϵ.
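Before the proof, here is a toy numerical illustration of the mechanism in the theorem (my own example, not the construction used in the proof): take X to be the closed unit disc, A the uniform algebra of Example 1, V = W = C², and g(z) = e1⊗w1 + z·e2⊗w2, whose coordinates are polynomials in z. The normalized partial trace TrW(g(z)g(z)*)/∥g(z)∥₂² works out to diag(1, |z|²)/(1+|z|²), a continuous but non-holomorphic D(V)-valued function obtained from purely holomorphic ingredients.

```python
import numpy as np

# A toy instance of the ratio Tr_W(g g*) / ||g||_2^2 (my own example, not the
# proof's construction): V = W = C^2 and g(z) = e1 ⊗ w1 + z * e2 ⊗ w2, so every
# coordinate of g is a polynomial in z and lies in the disc algebra of Example 1.
def partial_trace_W(M, dim_V, dim_W):
    return np.einsum('iaja->ij', M.reshape(dim_V, dim_W, dim_V, dim_W))

def g(z):
    e1, e2 = np.eye(2)            # basis of V
    w1, w2 = np.eye(2)            # basis of W
    return np.kron(e1, w1) + z * np.kron(e2, w2)

z = 0.3 + 0.4j
v = g(z)
rho = partial_trace_W(np.outer(v, v.conj()), 2, 2) / np.vdot(v, v).real
print(rho)                        # ≈ diag(1, |z|^2) / (1 + |z|^2): not holomorphic in z
```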
Proof: Suppose that U is an open cover of X. For each y∈X, choose some Vy∈U containing y. By the Lemma (after rescaling), there is a natural number d(y) and some fy∈A⊗C^d(y) with fy(y)=0 and ∥fy(z)∥∞>2 whenever z∈X∖Vy. Now let Uy={x∈X:∥fy(x)∥∞<1/2}. Each Uy is open and contains y, so by compactness there are y1,…,yn where X=Uy1∪⋯∪Uyn. Let Uj=Uyj and Vj=Vyj.
Define fk=fyk. Let e be a new unit vector orthogonal to each of the spaces C^d(yk); formally, we adjoin one extra orthonormal coordinate.
Let Zk=C^d(yk)⊕⟨e⟩ for each k, and let Z=Z1⊗⋯⊗Zn. Let W0 be a finite dimensional complex inner product space, and let t:{y1,…,yn}→V⊗W0 be a function such that TrW0(t(yk)t(yk)*)=f(yk) for all k; such W0 and t exist since every density operator admits a purification (for instance, one may take W0=V). Let N be a positive integer. Define a function h:X→Z by setting h(x)=(e+f1(x)^N)⊗⋯⊗(e+fn(x)^N) where the power fk(x)^N is taken coordinatewise.
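Here is a short numerical sketch of this purification step, assuming W0=V and the product-basis conventions from the partial-trace sketch above.

```python
import numpy as np

# Purification: given a density operator rho on V, produce a vector t in V ⊗ W0
# (here W0 = V) with Tr_W0(t t*) = rho, via the spectral decomposition of rho.
def purify(rho):
    w, U = np.linalg.eigh(rho)                   # rho = sum_i w[i] * u_i u_i^*
    w = np.clip(w, 0, None)                      # guard against tiny negative eigenvalues
    return sum(np.sqrt(wi) * np.kron(U[:, i], U[:, i].conj()) for i, wi in enumerate(w))

def partial_trace_W(M, dim_V, dim_W):
    return np.einsum('iaja->ij', M.reshape(dim_V, dim_W, dim_V, dim_W))

rho = np.array([[0.7, 0.2j], [-0.2j, 0.3]])      # a density operator on C^2
t = purify(rho)
print(np.allclose(partial_trace_W(np.outer(t, t.conj()), 2, 2), rho))  # True
```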
If S is a finite set of natural numbers, then let S[k] denote the k-th smallest element of S. Set [n]={1,…,n} for each natural number n. For each j∈[n] and S⊆[n], let vj,S be unit vectors that are pairwise orthogonal, so that if (j,S)≠(k,T), then ⟨vj,S,vk,T⟩=0.
We shall now construct a linear function I that maps Z to some other vector space.
Suppose that S⊆[n] with |S|=m, that xj∈C^d(yj) for j∈S, and that xj=e otherwise. Then set I(x1⊗⋯⊗xn)=∑j∈[n]∖S t(yj)⊗xS[1]⊗⋯⊗xS[m]⊗vj,S, and extend I linearly. Define g=I∘h.
Observe that g(x)=∑S⊆[n]∑j∈[n]∖S t(yj)⊗fS[1](x)^N⊗⋯⊗fS[|S|](x)^N⊗vj,S. We include the vectors vj,S to make sure that the summands are all orthogonal.
Let PTr denote the partial trace where we trace out all factors in the tensor product except for V. Then

PTr(g(x)g(x)*)=∑S⊆[n]∑j∈[n]∖S f(yj)⋅∥fS[1](x)^N∥₂²⋯∥fS[|S|](x)^N∥₂².
Now suppose that ϵ>0, and assume that the open cover U was chosen fine enough that whenever U∈U and x,y∈U, we have ∥f(x)−f(y)∥<ϵ; such a cover exists since f is continuous.
Fix x∈X and pick k with x∈Uk. If j is an index with ∥f(x)−f(yj)∥≥ϵ, then x∉Vj (since yj∈Vj and f varies by less than ϵ on Vj), so ∥fj(x)∥∞>2.
When N is large, the dominant terms in the sum for PTr(g(x)g(x)*) are indexed by sets S where k∉S (since ∥fk(x)∥∞<1/2, including k in S only shrinks the weight) but where S contains every index j with ∥f(x)−f(yj)∥≥ϵ (since for such j we have ∥fj(x)∥∞>2, so omitting j from S loses a factor of at least 4^N in the weight). For such S, every index j∈[n]∖S appearing in the sum satisfies ∥f(x)−f(yj)∥<ϵ, so PTr(g(x)g(x)*)/∥g(x)∥₂² is essentially a weighted average of values f(yj) that all lie within ϵ of f(x). Therefore PTr(g(x)g(x)*)/∥g(x)∥₂² uniformly approximates f(x) once N is large enough.
Q.E.D.
Conclusion
The proof of the above theorem consists of a standard compactness argument that any mathematician who specializes in analysis should be able to come up with. The result also applies over the field of real numbers, since the argument uses nothing special about the complex numbers; for the real numbers, however, the result is a consequence of the Stone-Weierstrass theorem. The Stone-Weierstrass theorem also applies to quaternion-valued functions, so it is unnecessary to state the result for the quaternions (and the proof would need to be reworked in that setting, since quaternionic tensor products do not behave well). One should therefore not be too surprised by the above theorem.
The above theorem may also be a corollary of a known result on Banach algebras (or something like that), but I was not able to find an appropriate reference.