Archive for the 'Rambles' Category

Rays be Damned

December 27, 2008

My big question has been, “How much more efficient could using #define statements be than safe, by-the-book code”. My preliminary code tested (Scalar is defined as double):

struct VectorDefines {
Scalar x, y, z;
VectorDefines( ) { }
VectorDefines( const Scalar& _X, const Scalar& _Y, const Scalar& _Z ) : x( _X ), y( _Y ), z( _Z ) { }
};

By calling one of each of the following operations once in a loop:

#define ADD(s,l,r)  s.x = l.x + r.x;  s.y = l.y + r.y;  s.z = l.z + r.z
#define DOT(l,r) (l.x*r.x + l.y*r.y + l.z*r.z)
#define CROSS(s,l,r) s.x = l.y*r.z – l.z*r.y;  s.y = l.z*r.x – l.x*r.z;   s.z = l.x*r.y – l.y*r.x;

Against

class VectorIdeal {
Scalar mX, mY, mZ;
public:
inline Scalar& x() { return mX; }
inline Scalar& y() { return mY; }
inline Scalar& z() { return mZ; }
inline const Scalar& cx() const { return mX; }
inline const Scalar& cy() const { return mY; }
inline const Scalar& cz() const { return mZ; }
VectorIdeal( ) { }
VectorIdeal( const Scalar& _X, const Scalar& _Y, const Scalar& _Z ) : mX( _X ), mY( _Y ), mZ( _Z ) { }
};
VectorIdeal operator + ( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return VectorIdeal( _Left.cx() + _Right.cx(), _Left.cy() + _Right.cy(), _Left.cz() + _Right.cz() );
}
Scalar Dot( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return _Left.cx()*_Right.cx() + _Left.cy()*_Right.cy() + _Left.cz()*_Right.cz();
}
VectorIdeal Cross( const VectorIdeal& _Left, const VectorIdeal& _Right ) {
return VectorIdeal(  _Left.cy()*_Right.cz() – _Left.cz()*_Right.cy(),
_Left.cz()*_Right.cx() – _Left.cx()*_Right.cz(),
_Left.cx()*_Right.cy() – _Left.cy()*_Right.cx() );
}

Which follows proper code formatting (at least, my way of proper formatting). At first, results were questionable. That is, the define method was taking more time than the proper way (about 1000/800 clock cycles ratio). This was because I was instantiating the vectors in the test code and that expansion was being copied through the defines (which I believe says something about the risk of horrible inefficiency with defines). I fixed that, and the next ratio blew my mind:

Defines : 1031
Ideal : 10630

That is, the defines were supposedly taking ~1/10 the processor time. I was about to go through my next ray-tracer iteration, CIMPR III, and re-hack the code when I remembered that I had run the speed test on Debug compiler settings. D’oh. Release settings gives:

Defines : 66
Ideal : 69

Or increasing the iterations of the test loop:

Defines : 824
Ideal : 886

So, all is good with the world and I can write safe code.

My second little project was just playing with particles and gradient fields:

http://img176.imageshack.us/img176/5720/47105234py4.jpg

http://img201.imageshack.us/img201/4462/79783795ut6.jpg

Basically the particles start between (-1,-1) and (1,1) and their velocities are modified by a function over R2, in these cases both combinations of trig functions.

The final little project is a physics one simulating a jelly ball. Basically a ring of springs with volume calculation and normal forcing. I even got bored and put in some crappy accumulation anti-aliasing.

Other than that my time is spent working on a ray-tracer, this time with spacial sorting for speed’s sake. I am just using Octrees for now, but hopefully will man up and figure the kd tree out some day.

Starting From Scratch

September 20, 2008

I have produced a model format which depends upon deflating the various coordinate data and packaging that into a nice container for easy management (including a string name for the planned resource manager/resource packager). The specifications are a WIP, but are as follows (note HTML massacred the nice alignment. Also, the number in brackets is the size in bits):

===================================================================================
STATIC MODEL V0
===================================================================================
[32] Identification
This is the identification of the file, “tm m”.
[8] Version
The version of the file format. The value zero indicates this model format.
[8] Length of Name
The number of characters in the following model name.
[8*Delta] Name
A character-string of length Delta where Delta is the previous field. This
is used when the file is loaded to reference it upon loading.

[4] LOD Count
The number of levels of detail to be specified.
[32*3] Minimum Bounding Box
The smallest vertex position coordinates in IEEE floating-point format,
three 32-bit values: x, y, and z respectively.
[32*3] Maximum Bounding Box
The biggest vertex position coordinates in IEEE floating-point format,
three 32-bit values: x, y, and z respectively.
{Repeat for Each LOD}
[5] LOD for Positions
[5] LOD for Normals
[5] LOD for UVs
[32] Number of Verticies
[32] Number of Polygons
{Repeat for Each Vertex}
[LODp*3] Vertex Position
The fields of this vector are signed.
[LODn*3] Vertex Normal
The fields of this vector are signed.
[LODuv*2] Vertex UV
{End Repeat}

{Repeat for Each Polygon}
[Omega*3] Polygon Vertex Indicies
{End Repeat}
{End Repeat}
[?] Padding
Aligns to the next byte.
[128] Checksum
Just an XORing of all bytes up to and including the padding. Gives a high
probability the data is valid if passes.
[8] EOF Confirmation
The value ‘E’, indicating a safe end of file was reached.

===================================================================================
FAST STATIC MODEL V0
===================================================================================
[32] Identification
This is the identification of the file, “tm m”.
[8] Version
The version of the file format. The value one indicates this model format.
[8] Length of Name
The number of characters in the following model name.
[8*Delta] Name
A character-string of length Delta where Delta is the previous field. This
is used when the file is loaded to reference it upon loading.

[8] LOD Count
The number of levels of detail to be specified.

{Repeat for Each LOD}
[32] Number of Verticies
[32] Number of Polygons
{Repeat for Each Vertex}
[32*3] Vertex Position
Three IEEE 32-bit floating-point values.
[32*3] Vertex Normal
Three IEEE 32-bit floating-point values.
[32*2] Vertex UV
Two IEEE 32-bit floating-point values.
{End Repeat}

{Repeat for Each Polygon}
[32*3] Polygon Vertex Indicies
{End Repeat}
{End Repeat}

===================================================================================
2D MATERIAL v0
===================================================================================
[32] Identification
This is the identification of the file, “tm t”.
[8] Version
The version of the file format. The value zero indicates this format.
[8] Length of Name
The number of characters in the following model name.
[8*Delta] Name
A character-string of length Delta where Delta is the previous field. This
is used when the file is loaded to reference it upon loading.

[4] Top Diffuse Texture Dimension (Two’s Exponent)
The base dimension of the texture to be stored. Id est, if the texture
is 512px by 512px, the value 9 is stored (2^9 = 512). The value zero is
invalid, and anything greater than 11 (2^11 = 2048) is most likely invalid.
[1] Stored MipMaps
If this is 1, mipmaps will be stored, otherwise only the base full texture
is used.
[4] Compression of Diffuse Map and MipMaps
0 – No compression
1 – JFIF (JPEG) lossy compression
2 – Lossless compression
3 – Bit depression
4-15 – Reserved
[4] Top Emission Map Texture Dimension
Same idea as the diffuse map dimension, but for the emission map (emission
maps are generally lower-quality). The value zero specifies that no emission
map is used.
[1] Stored MipMaps for Emission
If this is 1, mipmaps will be stored, otherwise only the base full texture
is used. Generally this is not used, but can be for high-quality textures.
If the emission map dimension is zero this is ignored.
[4] Compression of Emission Maps
Same values as “Compression of Diffuse Map and MipMaps” but for the emission
map. If the emission map dimension is zero this is ignored.
[4] Top Normal Map Texture Dimension
Same idea as the diffuse map dimension, but for the normal map. If zero no
normal mapping is used.
[1] Stored MipMaps for Normal Map
If this is 1, mipmaps will be stored, otherwise only the base full texture
is used. If the normal map dimension is zero this is ignored.
[4] Compression of Normal Maps
Same values as “Compression of Diffuse Map and MipMaps” but for the normal
map. If the normal map dimension is zero this is ignored.

{Repeat for Each LOD of Diffuse}
[?] Texture Data
This data depends on the compression field.
{End Repeat}
{Repeat for Each LOD of Emission}
[?] Texture Data
This data depends on the compression field.
{End Repeat}
{Repeat for Each LOD of Normal}
[?] Texture Data
This data depends on the compression field.
{End Repeat}
[?] Padding
Aligns to the next byte.
[128] Checksum
Just an XORing of all bytes up to and including the padding. Gives a high
probability the data is valid if passes.
[8] EOF Confirmation
The value ‘E’, indicating a safe end of file was reached.

{END OF FILE}

The philosophy behind the static model is that it will compress well while giving predictable results (LOD-wise). The fast static model is meant to be easily loaded into memory (ideally models are shipped/uploaded with the former and converted to the latter). The texture file is going to employ my own attempt at fourier series (JPEG-like) compression and the loss-less (non-patented) method used by the PNG format. The uncompressed format is for debugging primarily, and bit depression is simply reducing the number of bits per channel. The rest are reserved in case I want to bring along alpha maps or something later. I have not included alpha maps as is because they usually cause a dirty render with depth-sorting, which I don’t want to deal with at the moment.

Eventually multiple model/material files will be stored in a resource file and compressed further.

I may add my mesh optimization idea into a new model format which would be tasked with loading quickly and having the optimizations pre-calculated.

The resource file, later, will specify exactly what data is already calculated and still must be. For example, to personify the resource loader: “this model has its LOD and optimizations done, load it quickly. This one is compressed and needs to have its optimizations and LOD calculated and stored for further runs. This one is compressed, but so rarely used just leave it compressed, but on load calculate its LODs every time”.

Bah, in less than 9 hours I move to UCLA.

OH MY GOD FILTER KEYS FREAKED ME OUT WITH THE BEEP.

Until later, this shall all have to wait.

Particle Collision Optimization

July 29, 2008

This is just something that I had in the back of my head. I assumed it would speed things up and, well, it definitely did.

The problem is a bunch of particles which are supposed to repel only at a close distance, usually for something like fluid approximation, maybe just basic physics, I don’t know. I just wanted to do it. Anyway, it was all I did today (given my wonderful strep throat), and the results are promising:

The red line is is the number of frames per second (peaks at 175, after first five seconds only drops down to 109) and the blue line is the number of particles in the system. The method used is done by solving the new velocity of all particles (given any close interactions), repelling them, solving the new velocity at that position then going back to the original position and using an average of the two new accelerations/velocities, so this is with four system passes (now that I think about it, I could probably combine two…); however, this is not the purpose of it all.

If every point were checked with every other point, the number of distance checks would be N*N, or at 1,500 particles, 2.25 million. My system seems to only have to perform at most 8 per particle (though it could be anywhere from 0 to infinity). The system is divided into a grid and only adjacent grid blocks are tested for distance.

And finally, it working:

The blue dots are particles, the red blocks grid sections. The grid blocks get more red the more particles that are in them (I probably should have disabled grid rendering before benchmarking this… d’oh). Eventually the screen would fill and the ones that leave would be deleted (quite efficiently, I might add). As you can see, each particle testing only adjacent grid blocks probably averages… what, 6? Awesomeness, if you ask me. The grid size is also a parameter, including the grid location. The particle processing functions are callbacks. IE I set the first function, iterate the system, set the second function to move the particles, iterate, etc.

My only sadness comes from the fact that if the repel distance (AKA interaction distance) is greater that the size of one grid block, nothing outside the adjacent 9 cells will be considered. This might be useful for a RTS game engine. It is fast and makes it easy to find objects near one another (though it is limited).

This was done using a quad-core Q6600 CPU at 3.1 GHz, though because it is only single-threaded for now it utilizes 25% of that and the memory use peaked at no greater than 11mb. Double precision was used.

Oh man sneezing with strep throat sucks.

Mesh Optimization Pt. 1

July 25, 2008

I am going to test out two methods for optimizing complex mesh rendering.

Primarily, I am going to work on a method I have used in the past involving storing multiple polygon lists given the angle the model is viewed from. Essentially the model is stored as size separate versions: top, bottom, left, right, front, rear. If the camera is behind the object, the rear list is called et cetera. This is a crude methods but it is simple and from experience works somewhat well (75% of mesh rendered at a time using 4.5x the memory footprint). The effect is almost linear optimization (dot products are fixed CPU cost).

The second method has two branches. IE, optimization via polygon subtraction and addition with groups based on their dot product with stochastic vectors which are pre-processed and depending on how effective they are (IE how many polygons belong to them) possibly eliminated. Polygons belonging to no group will be added to a list which is rendered every time.

These methods assume the camera at no point enters the bounding volume, in this case AABB, of the object. If the camera does enter that close, a list containing all polygons will be called.

There is also possibility for SSE optimization.

Pleasure

June 16, 2008

Perhaps there exists a bright-line which dictates that pleasures inevitably result in more pain.  Perhaps (worldly) pleasures are simply best avoided.

Then, however, what becomes of life? Does not the sandwich provide pleasure? It is therefore reasonable to conclude that not all pleasure is to be avoided.

How does one differentiate? Can postulation really provide a means for determining if something will provide more pleasure than pain? Is one not almost constantly a poor observer to make that call?

Biology must have this down on some level. Naturally humans seek pleasure over pain. Humans wish to reproduce and generally that which facilitates this desire is pleasurable.

So, then, does one simply follow biology? Urges? Wants? What if one happens to have enough information to know that the most likely result will be more pain than pleasure? Should one follow urges which are rationalized to be short-lived in pleasure yet long-lived in pain?

Questions into a void, I cast.