My big question has been: “How much more efficient could #define macros be than safe, by-the-book code?” My preliminary test used this struct (Scalar is defined as double):
struct VectorDefines {
Scalar x, y, z;
VectorDefines( ) { }
VectorDefines( const Scalar& _X, const Scalar& _Y, const Scalar& _Z ) : x( _X ), y( _Y ), z( _Z ) { }
};
The test calls each of the following operations once per loop iteration:
#define ADD(s,l,r) s.x = l.x + r.x; s.y = l.y + r.y; s.z = l.z + r.z
#define DOT(l,r) (l.x*r.x + l.y*r.y + l.z*r.z)
#define CROSS(s,l,r) s.x = l.y*r.z - l.z*r.y; s.y = l.z*r.x - l.x*r.z; s.z = l.x*r.y - l.y*r.x;
Against this class and its free functions:
class VectorIdeal {
Scalar mX, mY, mZ;
public:
inline Scalar& x() { return mX; }
inline Scalar& y() { return mY; }
inline Scalar& z() { return mZ; }
inline const Scalar& cx() const { return mX; }
inline const Scalar& cy() const { return mY; }
inline const Scalar& cz() const { return mZ; }
VectorIdeal( ) { }
VectorIdeal( const Scalar& _X, const Scalar& _Y, const Scalar& _Z ) : mX( _X ), mY( _Y ), mZ( _Z ) { }
};
VectorIdeal operator + ( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return VectorIdeal( _Left.cx() + _Right.cx(), _Left.cy() + _Right.cy(), _Left.cz() + _Right.cz() );
}
Scalar Dot( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return _Left.cx()*_Right.cx() + _Left.cy()*_Right.cy() + _Left.cz()*_Right.cz();
}
VectorIdeal Cross( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return VectorIdeal( _Left.cy()*_Right.cz() - _Left.cz()*_Right.cy(),
_Left.cz()*_Right.cx() - _Left.cx()*_Right.cz(),
_Left.cx()*_Right.cy() - _Left.cy()*_Right.cx() );
}
The class version follows proper code style (at least, my idea of proper style). At first, the results were questionable: the define method was taking more time than the proper way (roughly 1000 clock cycles versus 800). This was because I was constructing the vectors inline in the test code, and those constructor expressions were being copied into every place the macro used its argument (which I believe says something about the risk of horrible inefficiency with defines). I fixed that, and the next ratio blew my mind:
Defines : 1031
Ideal : 10630
That is, the defines were supposedly taking ~1/10 the processor time. I was about to go through my next ray-tracer iteration, CIMPR III, and re-hack the code when I remembered that I had run the speed test on Debug compiler settings. D’oh. Release settings give:
Defines : 66
Ideal : 69
Or, with more iterations of the test loop:
Defines : 824
Ideal : 886
So, all is good with the world and I can write safe code.
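The early slowdown with the defines comes down to textual substitution: the preprocessor pastes an argument expression at every place the macro uses it, so a constructor call passed as an argument runs once per component. Here is a minimal sketch of that effect, not the original test code. CountedVector, gConstructed, DotViaMacro and DotViaFunction are hypothetical names made up for the illustration.

```cpp
typedef double Scalar;

// Hypothetical counter, only here to make the hidden work visible.
static int gConstructed = 0;

struct CountedVector {
    Scalar x, y, z;
    CountedVector( Scalar x_, Scalar y_, Scalar z_ ) : x( x_ ), y( y_ ), z( z_ ) {
        ++gConstructed;  // count every construction
    }
};

// Same DOT macro as in the test above.
#define DOT(l,r) (l.x*r.x + l.y*r.y + l.z*r.z)

// Function equivalent: each argument is evaluated exactly once.
inline Scalar Dot( const CountedVector& l, const CountedVector& r ) {
    return l.x*r.x + l.y*r.y + l.z*r.z;
}

// Each temporary is pasted three times by the macro, so six
// CountedVector objects are constructed for one dot product.
Scalar DotViaMacro( ) {
    return DOT( CountedVector( 1, 2, 3 ), CountedVector( 4, 5, 6 ) );
}

// Only two CountedVector objects are constructed here.
Scalar DotViaFunction( ) {
    return Dot( CountedVector( 1, 2, 3 ), CountedVector( 4, 5, 6 ) );
}
```

Both versions return the same value; only the number of constructor calls differs. Building the vectors once outside the loop and passing the named variables removes the difference, which matches the fix described above.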
My second little project was just playing with particles and gradient fields:
http://img176.imageshack.us/img176/5720/47105234py4.jpg
http://img201.imageshack.us/img201/4462/79783795ut6.jpg
Basically, the particles start in the square between (-1,-1) and (1,1), and their velocities are modified by a function over R2; in both of these cases it is a combination of trig functions.
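A rough sketch of that update loop might look like the following. The field used here (sin y, cos x) is only a placeholder, since the exact trig combinations behind the two images are not given. Particle, Field, MakeGrid and Step are hypothetical names, and the velocity-then-position update order is an assumption too.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef double Scalar;

struct Particle {
    Scalar px, py;  // position
    Scalar vx, vy;  // velocity
};

// Placeholder field over R2 (an assumption, not the field from the images).
inline void Field( Scalar x, Scalar y, Scalar& fx, Scalar& fy ) {
    fx = std::sin( y );
    fy = std::cos( x );
}

// n*n particles at rest, evenly spaced from (-1,-1) to (1,1). Assumes n >= 2.
std::vector<Particle> MakeGrid( int n ) {
    std::vector<Particle> particles;
    for ( int row = 0; row < n; ++row ) {
        for ( int col = 0; col < n; ++col ) {
            Particle p;
            p.px = -1.0 + 2.0 * col / ( n - 1 );
            p.py = -1.0 + 2.0 * row / ( n - 1 );
            p.vx = 0.0;
            p.vy = 0.0;
            particles.push_back( p );
        }
    }
    return particles;
}

// One time step: the field value at each particle's position changes its
// velocity, then the new velocity moves the particle.
void Step( std::vector<Particle>& particles, Scalar dt ) {
    for ( std::size_t i = 0; i < particles.size(); ++i ) {
        Particle& p = particles[i];
        Scalar fx, fy;
        Field( p.px, p.py, fx, fy );
        p.vx += fx * dt;
        p.vy += fy * dt;
        p.px += p.vx * dt;
        p.py += p.vy * dt;
    }
}
```

Calling Step repeatedly and plotting each position after every call would trace out the particle paths; swapping the body of Field changes the pattern.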
The final little project is a physics simulation of a jelly ball. It is basically a ring of springs with volume calculation and normal forcing. I even got bored and put in some crappy accumulation anti-aliasing.
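One way to read "ring of springs with volume calculation and normal forcing" in 2D is sketched below: the "volume" is the area enclosed by the ring, and a pressure-like force pushes each edge along its outward normal whenever that area drops below a rest value. This is an interpretation under those assumptions, not the actual jelly ball code. Point, RingArea, RingForces and all the parameter names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Enclosed area via the shoelace formula; positive for a counter-clockwise ring.
double RingArea( const std::vector<Point>& ring ) {
    double twice = 0.0;
    const std::size_t n = ring.size();
    for ( std::size_t i = 0; i < n; ++i ) {
        const Point& a = ring[i];
        const Point& b = ring[( i + 1 ) % n];
        twice += a.x * b.y - b.x * a.y;
    }
    return 0.5 * twice;
}

// Force on every point of a counter-clockwise ring: a Hooke spring along each
// edge, plus a pressure term along each edge's outward normal.
std::vector<Point> RingForces( const std::vector<Point>& ring,
                               double restLength, double stiffness,
                               double restArea, double pressureGain ) {
    const std::size_t n = ring.size();
    std::vector<Point> force( n, Point{ 0.0, 0.0 } );
    // Positive when the ring is squashed below its rest area.
    const double pressure = pressureGain * ( restArea - RingArea( ring ) );
    for ( std::size_t i = 0; i < n; ++i ) {
        const std::size_t j = ( i + 1 ) % n;
        const double dx = ring[j].x - ring[i].x;
        const double dy = ring[j].y - ring[i].y;
        const double len = std::sqrt( dx * dx + dy * dy );
        if ( len == 0.0 ) continue;  // degenerate edge, no direction to push along
        // Spring: pulls the endpoints together when the edge is stretched.
        const double s = stiffness * ( len - restLength ) / len;
        force[i].x += s * dx;  force[i].y += s * dy;
        force[j].x -= s * dx;  force[j].y -= s * dy;
        // Pressure: (dy, -dx) is the outward normal scaled by the edge length,
        // so longer edges receive more force; split it between both endpoints.
        force[i].x += 0.5 * pressure * dy;  force[i].y -= 0.5 * pressure * dx;
        force[j].x += 0.5 * pressure * dy;  force[j].y -= 0.5 * pressure * dx;
    }
    return force;
}
```

Feeding these forces into any simple integrator (velocity plus damping, then position) each frame should give the wobble; the two gains are tuning knobs rather than physical constants.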
Other than that, my time is spent working on a ray-tracer, this time with spatial sorting for speed’s sake. I am just using octrees for now, but hopefully will man up and figure the k-d tree out some day.