OK, so wait, then the 4D is the 'transient state'. The shape 'lives' in the 4D by moving through it.
I'm not quite sure what you're saying here. There are two ways you can think about the animation above. One way is that you have a fixed light source in the 4-d space and the 4-d hypercube is rotating. You're looking at a 3-d "screen" onto which the shadow of the rotating hypercube is projected.
The other way you can think of it is that the 4-d cube is stationary. Then you walk "around" it in the 4-d space so that you can see it from different sides.
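To make the first picture concrete, here is a minimal sketch of one way such an animation frame could be computed. It is not necessarily how the animation above was made. The function names, the choice of the x-w plane for the rotation, and the light position `light_w = 3.0` are all assumptions chosen for illustration.

```python
import itertools
import math

def hypercube_vertices(n=4):
    """All 2**n corners of the n-cube, with every coordinate equal to -1 or +1."""
    return list(itertools.product((-1.0, 1.0), repeat=n))

def rotate_xw(point, angle):
    """Rotate a 4-d point in the x-w plane; y and z are left unchanged."""
    x, y, z, w = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * w, y, z, s * x + c * w)

def project_to_3d(point, light_w=3.0):
    """Shadow of a 4-d point on the 3-d 'screen' w = 0.

    The light sits at (0, 0, 0, light_w) (an assumed position). The line
    from the light through the point meets w = 0 at the returned location.
    """
    x, y, z, w = point
    scale = light_w / (light_w - w)
    return (x * scale, y * scale, z * scale)

def shadow_frame(angle):
    """One frame: rotate every corner of the hypercube, then project it."""
    return [project_to_3d(rotate_xw(v, angle)) for v in hypercube_vertices()]

if __name__ == "__main__":
    for corner in shadow_frame(math.pi / 6):
        print(tuple(round(coord, 3) for coord in corner))
```

The second picture gives the same frames: rotating the cube by some angle while the light and screen stay fixed produces the same shadow as keeping the cube fixed and rotating the light and screen by the opposite angle.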
---
If it's true that a 3-cube is a 'face' of a 4-cube, is a 4-cube a face of a 5-cube, etc.?
Yes.
Is it generally true that an n-cube is a face of an (n+1)-cube, and if so, how large can n be?
Yes. And n can be as large as you like.
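If you want to check this for small n, here is a rough sketch. It relies on the standard fact that you get a face of the n-cube by fixing one coordinate at 0 or at 1, which gives 2n faces, each a copy of the (n-1)-cube. The helper name `facets` is made up for this example.

```python
import itertools

def facets(n):
    """Yield each (n-1)-dimensional face of the unit n-cube as a list of vertices.

    A face is obtained by fixing one coordinate (axis) to 0 or to 1.
    """
    vertices = list(itertools.product((0, 1), repeat=n))
    for axis in range(n):
        for value in (0, 1):
            yield [v for v in vertices if v[axis] == value]

if __name__ == "__main__":
    for n in range(1, 7):
        faces = list(facets(n))
        # Each face should have 2**(n-1) vertices, like an (n-1)-cube.
        print(f"{n}-cube: {len(faces)} faces, {len(faces[0])} vertices each")
```

For n = 4 this finds 8 faces with 8 vertices each, which are the eight 3-cubes bounding the 4-cube. Nothing in the construction depends on n being small.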