There doesn't seem to be anything surprising about there being many orthogonal steering vectors for layer 8 that result in similar behaviour (e.g. coding). The neural network between layers 8 and 16 can be seen as a nonlinear function that transforms the outputs of layer 8 into the outputs of layer 16. Because this function is nonlinear, it is not guaranteed to map orthogonal inputs to orthogonal outputs; i.e. there might exist multiple orthogonal inputs that it maps to almost collinear outputs. And what your algorithm for learning ...
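As a toy illustration of the point above (the vectors and the ReLU nonlinearity are my own minimal example, not anything from the model in question), even a single ReLU can map two exactly orthogonal inputs to nearly collinear outputs:

```python
import numpy as np

def relu(x):
    # Elementwise ReLU, the standard nonlinearity.
    return np.maximum(x, 0.0)

def cosine(a, b):
    # Cosine similarity: 0 for orthogonal, 1 for collinear vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two orthogonal inputs (dot product is 1*5 + (-5)*1 = 0).
u = np.array([1.0, -5.0])
v = np.array([5.0, 1.0])

print(cosine(u, v))                # 0.0  -- inputs are orthogonal
print(cosine(relu(u), relu(v)))    # ~0.98 -- outputs are nearly collinear
```

Here ReLU clips the negative coordinates, collapsing both inputs onto the positive quadrant; a deep stack of such layers has many more opportunities to fold distinct orthogonal directions onto similar downstream activations.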