So first let’s try the Google farmhash hash function, using a few fast hash table implementations.

We will run a benchmark similar to the one we did for integers a while back, but with strings. Actually, let’s add an integer version as well.

It turns out that for integer hash table implementations the identity function (f(x)=x) is preferred in most cases. This trades desirable properties such as a uniform distribution for speed, a trade-off that works well for integers. In essence, lookups are so fast that we can afford the increased number of collisions a naive hash function will give us.
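To make the trade-off concrete, here is a minimal sketch in C. It is not taken from any of the benchmarked libraries: the names `hash_identity`, `hash_mix`, `bucket_of` and `buckets_used` are made up for illustration, the table is assumed to have a power-of-two number of buckets, and the mixing function is the 64-bit finalizer from MurmurHash3, used here only as a stand-in for a "proper" hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Identity hash: the key is its own hash value, so hashing costs nothing. */
static inline uint64_t hash_identity(uint64_t key)
{
    return key;
}

/* A mixing hash for comparison (the 64-bit finalizer from MurmurHash3).
 * It costs a few multiplications and shifts per key. */
static inline uint64_t hash_mix(uint64_t key)
{
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    key *= 0xc4ceb9fe1a85ec53ULL;
    key ^= key >> 33;
    return key;
}

/* Map a hash value to a bucket. Assumption: n_buckets is a power of two. */
static inline size_t bucket_of(uint64_t hash, size_t n_buckets)
{
    assert(n_buckets != 0 && (n_buckets & (n_buckets - 1)) == 0);
    return (size_t)(hash & (n_buckets - 1));
}

/* Count how many distinct buckets the keys 0, stride, 2*stride, ... occupy.
 * Returns 0 if the scratch allocation fails. */
static size_t buckets_used(uint64_t (*hash)(uint64_t), uint64_t stride,
                           size_t n_keys, size_t n_buckets)
{
    unsigned char *seen = calloc(n_buckets, 1);
    size_t used = 0;

    if (seen == NULL)
        return 0;
    for (size_t i = 0; i < n_keys; i++) {
        size_t b = bucket_of(hash((uint64_t)i * stride), n_buckets);
        if (!seen[b]) {
            seen[b] = 1;
            used++;
        }
    }
    free(seen);
    return used;
}
```

With 64 buckets, `buckets_used(hash_identity, 1, 64, 64)` spreads the sequential keys 0..63 over all 64 buckets, while `buckets_used(hash_identity, 64, 64, 64)` puts every key in a single bucket, because all multiples of 64 share the same low bits. That second case is the kind of collision pile-up the identity function risks in exchange for its speed.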

Integrating CMake with unit tests, including testing for memory leaks.

We’re going to create a dummy “state” module, so let’s invent some dummy functionality and test it.

[test/state.c]

Continuing to look at the performance of data structures across different implementations and languages, let’s look at a simpler but very common container, the dynamic array.

The interesting performance properties of a dynamic array are constant, O(1), lookup time and amortized constant insertion and deletion at the end of the array. Random insertion, on the other hand, is linear, O(n), and typically something you want to avoid.
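These properties can be sketched with a small growable array of `int` in C. This is an illustrative sketch, not one of the implementations being compared: the type `int_vec` and the functions `vec_push`, `vec_insert` and `vec_free` are hypothetical names, and the initial capacity of 8 and the doubling growth factor are assumptions chosen for simplicity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *data;   /* heap storage, NULL while empty */
    size_t len;  /* number of elements in use */
    size_t cap;  /* number of elements allocated */
} int_vec;

/* Append at the end: amortized O(1), because the capacity doubles
 * whenever the array is full. Returns 0 on success, -1 on failure. */
static int vec_push(int_vec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Insert at an arbitrary position: O(n), because every element after
 * pos has to be shifted one slot to the right. */
static int vec_insert(int_vec *v, size_t pos, int value)
{
    assert(v != NULL);
    if (pos > v->len)
        return -1;
    if (vec_push(v, value) != 0)   /* make room for one more element */
        return -1;
    memmove(&v->data[pos + 1], &v->data[pos],
            (v->len - 1 - pos) * sizeof *v->data);
    v->data[pos] = value;
    return 0;
}

/* Release the storage and reset the array to its empty state. */
static void vec_free(int_vec *v)
{
    assert(v != NULL);
    free(v->data);
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}
```

Lookup is plain indexing, `v.data[i]`, which is where the O(1) access comes from. Starting from `int_vec v = {0};`, a hundred calls to `vec_push` trigger only five reallocations (capacities 8, 16, 32, 64, 128), whereas each `vec_insert(&v, 0, x)` moves every existing element.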

I think the lack of simple and basic dynamic data structures is a major reason why people sometimes struggle with programming in C.

To take a simple example, the naive task of splitting a string into words is near trivial in C++ using the standard library, while in C you are challenged with how to dimension the resulting array. The likely solution in most cases is to declare a fixed-size array with a MAX_WORDS constant size, and later, when the parsing is done and you know the number of words, declare a new array into which you copy the result, and then return. Hardly an elegant solution, and you end up making difficult assumptions about how many words a line can realistically contain.
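For contrast, here is a sketch of the same task with a growable array instead of a MAX_WORDS limit. It is only one possible approach: `word_list`, `word_list_push`, `split_words` and `word_list_free` are names invented for this example, words are assumed to be separated by whitespace, and each word is copied into its own heap allocation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char **items;  /* heap-allocated, NUL-terminated copies of the words */
    size_t len;    /* number of words stored */
    size_t cap;    /* number of slots allocated */
} word_list;

/* Append one word pointer, doubling the capacity when the list is full. */
static int word_list_push(word_list *wl, char *word)
{
    if (wl->len == wl->cap) {
        size_t new_cap = wl->cap ? wl->cap * 2 : 8;
        char **p = realloc(wl->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        wl->items = p;
        wl->cap = new_cap;
    }
    wl->items[wl->len++] = word;
    return 0;
}

/* Split line on whitespace and append a copy of each word to out.
 * No upper bound on the word count is needed up front.
 * Returns 0 on success, -1 if an allocation fails. */
static int split_words(const char *line, word_list *out)
{
    assert(line != NULL && out != NULL);
    const char *p = line;

    while (*p) {
        while (*p && isspace((unsigned char)*p))
            p++;                          /* skip leading whitespace */
        if (!*p)
            break;
        const char *start = p;
        while (*p && !isspace((unsigned char)*p))
            p++;                          /* scan to the end of the word */
        size_t n = (size_t)(p - start);
        char *word = malloc(n + 1);
        if (word == NULL)
            return -1;
        memcpy(word, start, n);
        word[n] = '\0';
        if (word_list_push(out, word) != 0) {
            free(word);
            return -1;
        }
    }
    return 0;
}

/* Free every word and the list storage itself. */
static void word_list_free(word_list *wl)
{
    for (size_t i = 0; i < wl->len; i++)
        free(wl->items[i]);
    free(wl->items);
    wl->items = NULL;
    wl->len = 0;
    wl->cap = 0;
}
```

The caller starts with `word_list wl = {0};`, calls `split_words(line, &wl)`, reads `wl.len` and `wl.items[i]`, and finishes with `word_list_free(&wl)`. There is no second copy into a right-sized array and no guess about how many words a line may hold.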

A short post about some habits with compiling C.

For some reason, as far as I can tell, people do not share their experience with, and views of, C so much. At least not in blogs. That is a bit surprising, given that the TIOBE index, for example, would rank it as the most popular programming language. Perhaps the typical C programmer simply isn’t very active in communities like these.