Software Skinning

December 1st, 2008

This paper discusses different methods to achieve real time skeletal animation and presents C++ code samples to demonstrate how to create your own skeletal animation systems. To see this stuff in action you can d/l the latest version of the GODZ Engine Demo and view the benchmarks (make sure the UseHWSkinning is set to false in the SkeletalMeshes.lua config file).
The goal is to take the data exported from the authoring software (such as
3D Studio Max/Character Studio) and animate a character in real time. The
content of the animation data should be composed of animation keys containing
position and rotation data for the character. Skinning information should also
be present within the content- so we can determine which vertices are
associated to a bone. Certain vertices within the mesh may be associated with
more than one bone so it is important that the weight associated with the
vertex is present also (if only one bone is influencing the vertex, then the
weight is 1). Below we define the BlendWeight structure:

	typedef unsigned int udword;  

	//Represents a weighted vertex
	struct GODZ_API BlendWeight
	{
		BlendWeight(float weight, udword blendVertexId);  

		udword vertexIndex;	//blend vertex id
		float weight;		//amount of influence exerted by the bone
	};


In this image we examine an arm of a character. Notice how right where the
joints met, we see vertices are being shared between two different bones. This
is where vertex blending is normally needed the most so that tears and other
effects don’t occur. So if a vertex is bound to more than 1 bone your engine
needs to perform vertex blending.

In 3D studio max, the first key in an animation is given relative to the
reference pose (this is the pose the model was in when it was bound to the
Biped). So when we play an animation sequence, we use the reference pose as the
‘base’ for our transformations. We do this by storing the initial matrix for
the bone within a matrix in the bone structure. Below we define a bone:

	struct Bone
	{
		Bone();
		Bone(const char* Name);  

		//Adds a weighted vertex that will be influenced by this bone
		void AddWeightedVertex(float weight, udword blendVertexId);  

		//Adds the bone as a child of this bone
		void AddBone(int boneIndex);  

		//returns the name of this bone
		const char* GetName() const;  

		//checks to see if this bone is assigned as a child of this bone
		bool IsChild(int boneIndex);  

		//Returns the parent of this bone
		Bone* GetParent();  

		//Assigns a parent to this bone
		void SetParentBone(int parentIndex);  

		//update the current transform of this bone
		//(and all children attached to this bone).
		//[in] dt - delta time
		//[in] animController - updates this bone with the current
		//animation track
		void Transform(float dt, AnimController *animController);  

		rstring m_sName;			//name of this bone
		WMatrix16f m_init;			//initial pose
		WMatrix16f m_offset;			//local transform matrix
		//final matrix of this bone (includes parent)
		WMatrix16f m_final;
		int parentIndex;			//parent of this bone
		std::vector<int> children;	//children of this bone
		//verts affected by this bone
		std::vector<BlendWeight> m_blendVerts;
	};

Notice the bone has 3 different matrices. The Matrix, m_init, stores the initial
transformation for the bone. This is the position of the bone when it was bound
to the model given in world coordinates. We use this matrix to transform
vertices within the local frame of the bone. Then we transform the vertex to
it’s world position using the world matrix, m_final, which is a combination of
the offset matrix and the the parent bone’s final transformation matrix. The
offset matrix (m_offset) is basically the difference between the previous
offset of the bone and it’s new position. For instance, let’s suppose the bone,
at key 0, was positioned at (25,4,5) and at key 2, the bone is positioned at
(28,9,7). The offset matrix has (3,5,2) for it’s translation. So for the final
position of this bone, we can multiply the offset matrix times the parent’s
bone world position to dertermine the bone’s location for this frame (m_final).
Thus, we end up with an hierachial animation chain. First, we compute the
matrix of the parent bone and then proceed downwards through the hierarchy,
transforming child bones. Below we demonstrate this process:

//you may just want to start out the root at an arbitrary world location
//or leave it at identity
Root Bone (Pelvis)
	m_final.MakeIdentity();
	//my offset from previous frame * parent location
	Spine 03
		m_final = m_offset * rootBone->m_final;
		Spine 02
			m_final = m_offset * spine03->m_final;
			.....

Below we show a model in it’s reference pose. Notice how the skeleton
neatly fits/overlaps the vertices of the mesh. This is where we get the initial
transformation matrix for our bones (m_init).

Let’s define a basic Matrix class used by the bone,etc.

	struct WMatrix
	{
		union
		{
			struct
			{
				float _11, _12, _13, _14;
				float _21, _22, _23, _24;
				float _31, _32, _33, _34;
				float _41, _42, _43, _44;
			};
			float m[4][4];
			float m_afEntry[16];
		};  

		//creates an identity matrix by default. Otherwise, a uniform
		//scale matrix is created.
		WMatrix(float scale=1)
			: _11(scale), _12(0), _13(0), _14(0),
			  _21(0), _22(scale), _23(0), _24(0),
			  _31(0), _32(0), _33(scale), _34(0),
			  _41(0), _42(0), _43(0), _44(1)
		{
		}  

		//rotates the vector by this matrix
		inline void InverseRotateVect( float *pVect ) const
		{
			float vec[3];  

			vec[0] = pVect[0]*_11 + pVect[1]*_12 + pVect[2]*_13;
			vec[1] = pVect[0]*_21 + pVect[1]*_22 + pVect[2]*_23;
			vec[2] = pVect[0]*_31 + pVect[1]*_32 + pVect[2]*_33;  

			memcpy( pVect, vec, sizeof( float )*3 );
		}  

		inline void InverseTranslateVect( float *pVect ) const
		{
			pVect[0] -=_41;
			pVect[1] -=_42;
			pVect[2] -=_43;
		}  

		void MakeIdentity()
		{
			WMatrix tm;
			memcpy(&m, &tm.m, sizeof(float) * 16);
		}  

		//transforms the vector relative to this matrix's
		//reference frame (4x3 mul)
		void Mul(float *v) const
		{
			float x0 = v[0], y0 = v[1], z0 = v[2];  

			//multiplies vec by the right, up, forward,
			//and translation components of the matrix
	        	v[0] = x0 * m[0][0] + y0 * m[1][0] + z0 * m[2][0]
			+ m[3][0];
		   	v[1] = x0 * m[0][1] + y0 * m[1][1] + z0 * m[2][1]
			+ m[3][1];
			v[2] = x0 * m[0][2] + y0 * m[1][2] + z0 * m[2][2]
			+ m[3][2];
		}  

		//Sets the translation parts of this matrix
		void PreTranslate(const float *v)
		{
			_41 = v[0];
			_42 = v[1];
			_43 = v[2];
		}  

		//matrix mul routine
		WMatrix WMatrix::operator*(const WMatrix &other) const
		{
			WMatrix newMatrix;
			int column = 0;
			for (int i = 0; i < 4; i++) // rows
			{
				for (int j=0;j<4;j++) //each column
				{
					float val = 0;
					for (int k=0;k<4;k++)
					{
						val += m[i][k] * other.m[k][j];
					}  

					newMatrix.m[i][j] = val;
				}
			}
			return newMatrix;
		}  

	}

During an animation sequence we adjust the m_offset matrix for the bone using
animation keys to help us interpolate the animation. Below we define the
structure of an animation key.

	//key frame data
	struct AnimKey
	{
		WVector pos;   //transform of the bone at this position
		Quat4f rot;    //quaternion x,y,z,w
		WVector scale; //scale of the bone
		int time;      //time of the key exported from authoring program  

		AnimKey()
			: time(0)
		{}
	};

The key frame data can be taken from the authoring program verbatim. Even
though the time field may not be intuitive, as long as it has a range from 0
-> n it does not matter. We just need to be able to take the times and
figure out how far along in the animation we are so we can interpolate between
2 keys. Below is typical C++ code to perform this task:

void AnimationInstance::Update(AnimController *animControl, Bone *bone)
{
	AnimSequence* currentSequence = 0;
	udword index = animControl->GetSequence();  

	assert(index lt; sequences.size());
	currentSequence = &sequences[index];  

	if (currentSequence == NULL)
	{
		if (sequences.size()>0)
		{
			//get default pose
			currentSequence=&sequences[0];
		}
		else
		{
			return;
		}
	}  

	//compute the current time
	float time = animControl->GetTime();  

	if (time > currentSequence->GetMaxTime())
	{
		//reset to the beginning of the animation...
		time=0.0f;
	}  

	animControl->SetTime(time);  

	//find the start and end keys
	AnimKey start, end, key;  

	//TODO: speed up finding bones associated with a track
	AnimTrack* track = currentSequence->GetTrackFor(bone);
	assert(track);
	track->GetKeys(time, start, end);  

	//determine if we need to interpolate between the keys....
	if (start.time == time)
	{
		key = start;
	}
	else if (start.rot == end.rot)
	{
		//if the rotation is the same, we can;t slerp so....
		float flerpvalue = float(time-start.time)/
			float(end.time-start.time);  

		key = start;  

		//interpolate position
		key.pos.x = start.pos.x + flerpvalue * (end.pos.x - start.pos.x);
		key.pos.y = start.pos.y + flerpvalue * (end.pos.y - start.pos.y);
		key.pos.z = start.pos.z + flerpvalue * (end.pos.z - start.pos.z);
	}
	else
	{
		assert(end.time > start.time);
		float flerpvalue = float(time-start.time)/
			float(end.time-start.time);  

		//interpolate between the start and end keys
		SlerpQuaternions(start.rot,end.rot,flerpvalue,key.rot);  

		//interpolate position
		key.pos.x = start.pos.x + flerpvalue * (end.pos.x - start.pos.x);
		key.pos.y = start.pos.y + flerpvalue * (end.pos.y - start.pos.y);
		key.pos.z = start.pos.z + flerpvalue * (end.pos.z - start.pos.z);  

	}		  

	WMatrix16f frame;
	QuaternionToMatrix(key.rot,frame);
	frame.PreTranslate(key.pos);
	bone->m_offset = frame;  

}

Looking at the above code sample you may have gotten confused and asked- “What
is an animation controller” or “What is an Animation Instance?”. These are just
methods that you can use to organize the animation data so that you can have
multiple characters that reuse the same skinning information but play different
animations. This concept is normally referred to as mesh instancing. Although
it is possible to reuse bones and the same vertex/index buffers we simply clone
this data for each entity that is going to use the animated mesh for the best
performance. Each animation instance has it’s own set of bones, vertex, and
index data.

Now I need to describe the Animation Controller in better detail. The
controller controls which animation sequence is being played and tells the
animation instance (the object that computes the bone offsets) what time to
use. Each entity has it’s own animation controller and instance. The animation
controller simply computes when the bone transformations should take place. We
do not want to update the vertex/index buffers every frame- that would be SLOW
so what we do is compute the animation frame rate which may be 30-60fps, etc.
Below is the formula to compute animation frame rate:

m_fCurrentAnimRate += deltatime * m_fAnimRate;

If the current frame rate is greater than 30 fps or whatever you desire then
you tell the animation instance to update the skeletal mesh instance.

Now the final part is computing the final transformation of the bone and
transforming the vertices relative to the bone. There are two ways to
accomplish this- we can go about transforming the vertices similar to how
vertex shaders work- each vertex knows all the bones that deform it. We iterate
through all the matrices that affect the vertex and transform it. This way is
‘okay’ if the meshes only have 1 bone influence per vertex. However, the best
way is to allow the bone to transform all the vertices it affects. For
completeness sake, we will go over both methods.

First, let’s review the code you need to transform a vertex relative to a bone
and then back to it’s final position.

void BoneMul(Bone *bone, WVector *v, float weight)
{
	//Transform the vertex relative to the bone
	bone->m_init.InverseTranslateVect(&v->x);
	bone->m_init.InverseRotateVect(&v->x);  

	//Transform the vertex to it's final position
	WVector localPos = *v;
	bone->m_final.Mul(&localPos.x);	  

	//multiply the vertex by it's weight for this bone
	v->x = localPos.x * weight;
	v->y = localPos.y * weight;
	v->z = localPos.z * weight;
}

The bone multiply routine first makes the vertex relative to the bone’s initial
position. Recall that an animated bone’s offset matrix is given relative to
it’s intial pose. Nice huh? So then all we have to do is transform the vertex
to it’s final position using the world matrix for the bone. Finally, we use the
weight to figure out how much of the trasnformation applies. If the vertex only
has 1 bone affecting it, the weight is equal to 1. However, if the vertex is
affected by more than 1 bone, the weight value will vary.
Now I will present a function to transform all the vertices in the mesh relative
to a world matrix (this may be the matrix of the entity displaying the mesh).
The BlendVertex struct is used here- it represents a vertex that contains a
list of weights for all the bones that affect it. We will not go over it’s
struct definition since this is not the optimial method however, it is
important we demonstrate the differences between the two methods:


 void Bone::Transform(float dt, AnimController
*animControl)
	{animControl->Update(this, dt);		  

	if (parentIndex > -1)
	{
		Bone* parent = skMesh->GetBone(parentIndex);
		m_final =  m_offset *parent->m_final;
	}
	else
	{
		m_final = m_init * m_offset;
	}  

	//update child transformations
	vector<int>::iterator
	childIter;for(childIter=children.begin();childIter!=children.end();
	childIter++)
	{
		udword childIndex =   

		(*childIter); //update child
		transformations Bone*  child=
		skMesh->GetBone(childIndex);child->Transform(dt, animControl);
	}
}  

void SkelMeshInstance::ComputeSkinVerts(WMatrix16f &worldMatrix)
{
	ModelResource* resource =
	GetModelResource(0);resource->LockVertexBuffer();	  

	SkeletalMesh *skMesh = SafeCast(mesh);
	vectorlt;VertexDuplication*> &duplicates = skMesh->GetVertexDuplication();  

	//blend vertices
	size_t vertices = blendVerts.size();
	for(udword i=0;i;<vertices;i++)
	{
		BlendedVertex *bv = &blendVerts[i];  

		//get all the 'clone' verts and update their world locations
		VertexDuplication *vd = duplicates[i];
		//assert(vd->indices.size() > 0);  

		//get this vertex world pos
		WVector pos(vd->pos.x,vd->pos.y,vd->pos.z);  

		//TODO... compute normals using 3x3 matrices...  

		udword maxInfl = (udword)bv->GetNumInfluences();
		if (maxInfl == 1)
		{
			BlendWeight *bw = &bv->GetInfluence(0);
			assert(bw->weight == 1);
			Bone *bone = &bones[bw->boneIndex];
			BoneMul(bone, &pos, bw->weight);
		}
		else
		{
			float LastWeight=0.0f;	  

			//we are only using this as an output vertex
			pos.x=0;
			pos.y=0;
			pos.z=0;  

			//blend between several matrices...
			for (udword j=0;j<maxInfl-1;j++)
			{
				BlendWeight *bw = &bv->GetInfluence(j);
				Bone *bone = &bones[bw->boneIndex];
				LastWeight = LastWeight + bw->weight;  

				WVector worldvert(vd->pos.x,vd->pos.y,vd->pos.z);
				BoneMul(bone,&worldvert,bw->weight);
				pos.x += worldvert.x;
				pos.y += worldvert.y;
				pos.z += worldvert.z;
			}  

			LastWeight = 1.0f - LastWeight;   

			// Now that we have the calculated weight, add
			//in the final influence
			WVector worldvert(vd->pos.x,vd->pos.y,vd->pos.z);
			BlendWeight *lw = &bv->GetInfluence(maxInfl-1);
			Bone *bone = &bones[lw->boneIndex];
			BoneMul(bone,&worldvert,LastWeight);
			pos.x += worldvert.x;
			pos.y += worldvert.y;
			pos.z += worldvert.z;
		}  

		//update all the vertex duplicates
		size_t verts = vd->indices.size();
		for (udword k=0;k<verts;k++)
		{
			int index = vd->indices[k];
			Vertex &vec = resource->GetVertex(index);
			vec.x = pos.x;
			vec.y = pos.y;
			vec.z = pos.z;  

			//TODO: rotate the normals....
		}
	} // loop  

	resource->UnlockVertexBuffer();
}

The code above is not the most efficient. It may be okay for transforming
vertices that have only 1 bone but when we start transforming meshes that
require a lot of blending we get into trouble fast. The reason is because the
above code is not optimized for the cache. The BlendVertex iterates through
each individual matrix that affects it to compute it’s final position. The
fastest and most cache friendly way to go about transforming the vertex is to
compute it’s final position right after we transform the bone- this way we are
not constantly pushing/popping huge matrices off the processor. For meshes that
are 3,000+ this is a very huge savings. To give you an example, you might get
200 fps using the above method for a mesh containing over 3,000 vertices and 52
bones. Using the below method, you may see an improvement of about 3 times in
performance.

 void Bone::Transform(float dt, AnimController
*animControl)
	{animControl->Update(this, dt);		  

	if (parentIndex > -1)
	{
		Bone* parent = skMesh->GetBone(parentIndex);
		m_final =  m_offset *parent->m_final;
	}
	else
	{
		m_final = m_init * m_offset;
	}  

	//m_box.TransformBy(m_offset); //transform the bounding box  

	//transform vertices...
	ModelResource *resource = skMesh->GetModelResource(0);  

	if (parentIndex == -1)
		{ //Root bone - reset
		verts Vertex  *v=
		&resource->GetVertex(0); intnumVerts=resource->GetNumVertices();
		for(int i=0;i<numVerts;i++)
		{
			v[i].x=0;
			v[i].y=0;
			v[i].z=0;  

			//TODO: zero out normals
		}
	}  

	size_t numWeights = this->m_blendVerts.size();
	for(udword i=0;i<numWeights;i++)
	{
		BlendWeight &bw = m_blendVerts[i];
		if (bw.weight < 0.000001f)
			continue;
		VertexDuplication *vd = mesh->GetVertexDuplicate(bw.vertexIndex);  

		//get this vertex world pos
		WVector pos(vd->pos.x,vd->pos.y,vd->pos.z);  

		BoneMul(this, &pos, bw.weight);
		//Log("weight %f vertex %d\n", bw.weight, bw.vertexIndex);  

		//update all the vertex duplicates
		size_t verts = vd->indices.size();
		for (udword k=0;k<verts;k++)
		{
			int index = vd->indices[k];
			BaseVertex *vec = resource->GetVertex(index);
			vec->x += pos.x;
			vec->y += pos.y;
			vec->z += pos.z;
		}
	}  

	//update child transformations
	vector<int>::iterator childIter;
	for(childIter=children.begin();childIter!=children.end();childIter++)
	{
		udword childIndex = (*childIter);  

		//update child transformations
		Bone* child = skMesh->GetBone(childIndex);
		child->Transform(dt, animControl);
	}
}

The above method could even be optimized further if we took the vertices and
transform them in parallel by simply taking the vertices and stuffing them into
a matrix and let the CPU perform the matrix * matrix multiply in one step or
use parellel vertex processing (SIMD). But I think you get the picture of how
we can skin meshes using platform indendant code and get very good performance.

References:

Skeletal Animation by Brett Porter
Geometry Skinning/Blending and Vertex Lighting

  1. Alex
    January 14th, 2009 at 08:45 | #1

    Thanks for this document. Can you introduce (shortly) how use animations blending in this case?

  2. Asger Friis-Vigh
    February 28th, 2009 at 11:49 | #2

    Great article. Nicely informative. Thanks.

  3. April 25th, 2009 at 13:51 | #3

    Fantastic. care to share your sources :) ?

  1. No trackbacks yet.