GMS 2 Custom Hasmap implementation in GML

Homunculus

Member
Introduction

tl;dr: i implemented my own version of a ds_map an run some benchmarks

While trying benchmark a storage solution for the data of a recent project that required a lot of ds_maps, I noticed that empty / semi-empty ds_maps have a very high memory footprint.

I decided to experiment a bit and ended up with custom hashmap implementation that you will find attached to this post as a local asset package (yay!) to be imported in GMS2.

I run a few performance / memory tests with this and, as you can imagine, ds_maps outperform my implementation in almost every possible way. Emphasis on almost though, because it yielded some really interesting results I want to share.

Maybe someone will find this information useful. Needless to say I don't suggest using this in a real project unless you have a very specific reason to.

Benchmark results ( Mac VM target | Runtime 2.2.3.344 )

Memory (RAM)

1. increased memory usage after creating 10'000 empty maps:
hashmap -> 2.3mb | ds_map -> 44.65mb


2. increased memory usage after adding 10'000 words to empty maps
hashmap -> 3.85mb | ds_map -> 0.6mb

Performance

3. adding 10'000 different words
hashmap -> 119.44ms (* 43.32ms) | ds_map -> 6.28ms

4. fetching 100'000 values
hashmap -> 317.84ms | ds_map -> 50.29ms

5. deleting 10'000 entries
hashmap -> 39.8ms | ds_map -> 5.3ms

6. generating a ds_list of all the keys (total of 10'000)
hashmap -> 5.97ms | ds_map -> 957.68ms


* my implementation allows preallocating buckets, the second value in test 3 refers to preallocating as many buckets as words

I highlighted in green the most interesting results. Judging by test 1, it is possible that ds_maps allocate a large amount of memory on initialization in order to avoid rehashing and resizing. This in turn slows down dramatically traversing the ds_map looking for keys (as shown in test 6).

The difference in memory required by the empty maps though becomes negligible and clearly in favor of ds_maps the more entries get added as shown in test 2. There has to be something like a "break even" at around a few hundred entries.

Implementation info

Hash function:
FNV-1a (with modulo % for reducing)
Collision handling: separate chaining (using arrays as linked lists)
Bucket size: bucket size is doubled automatically based on a load factor of 1 (by default). Shrinking is not handled automatically but can be done by using map_compact, reducing the number of buckets to the ideal load factor.

Local asset package download
 
Last edited:

YellowAfterlife

ᴏɴʟɪɴᴇ ᴍᴜʟᴛɪᴘʟᴀʏᴇʀ
Forum Staff
Moderator
This in turn slows down dramatically traversing the ds_map looking for keys
The likely cause is that each ds_map_find_next call re-traverses the key list [comparing each key to the given one] to find the specified key before it can get the one after it.
 
Top