1. Hey! Guest! The 35th GMC Jam will take place between November 28th, 12:00 UTC - December 2nd, 12:00 UTC. Why not join in! Click here to find out more!
    Dismiss Notice

GMS 2 Custom Hasmap implementation in GML

Discussion in 'Advanced Programming Discussion' started by Catan, Jul 19, 2019.

  1. Catan

    Catan Member

    Joined:
    Jun 20, 2016
    Posts:
    768
    Introduction

    tl;dr: i implemented my own version of a ds_map an run some benchmarks


    While trying benchmark a storage solution for the data of a recent project that required a lot ds_maps, I noticed that empty / semi-empty ds_maps have a very high memory footprint.

    I decided to experiment a bit and ended up with custom hashmap implementation that you will find attached to this post as a local asset package (yay!) to be imported in GMS2.

    I run a few performance / memory tests with this and, as you can imagine, ds_maps outperform my implementation in almost every possible way. Emphasis on almost though, because it yielded some really interesting results I want to share.

    Maybe someone will find this information useful. Needless to say I don't suggest using this in a real project unless you have a very specific reason to.

    Benchmark results ( Mac VM target | Runtime 2.2.3.344 )

    Memory (RAM)


    1. increased memory usage after creating 10'000 empty maps:
    hashmap -> 2.3mb | ds_map -> 44.65mb


    2. increased memory usage after adding 10'000 words to empty maps
    hashmap -> 3.85mb | ds_map -> 0.6mb

    Performance

    3. adding 10'000 different words
    hashmap -> 119.44ms (* 43.32ms) | ds_map -> 6.28ms

    4. fetching 100'000 values
    hashmap -> 317.84ms | ds_map -> 50.29ms

    5. deleting 10'000 entries
    hashmap -> 39.8ms | ds_map -> 5.3ms

    6. generating a ds_list of all the keys (total of 10'000)
    hashmap -> 5.97ms | ds_map -> 957.68ms


    * my implementation allows preallocating buckets, the second value in test 3 refers to preallocating as many buckets as words

    I highlighted in green the most interesting results. Judging by test 1, it is possible that ds_maps allocate a large amount of memory on initialization in order to avoid rehashing and resizing. This in turn slows down dramatically traversing the ds_map looking for keys (as shown in test 6).

    The difference in memory required by the empty maps though becomes negligible and clearly in favor of ds_maps the more entries get added as shown in test 2. There has to be something like a "break even" at around a few hundred entries.

    Implementation info

    Hash function:
    FNV-1a (with modulo % for reducing)
    Collision handling: separate chaining (using arrays as linked lists)
    Bucket size: bucket size is doubled automatically based on a load factor of 1 (by default). Shrinking is not handled automatically but can be done by using map_compact, reducing the number of buckets to the ideal load factor.

    Local asset package download
     
    Last edited: Jul 19, 2019
    GMWolf and NeZvers like this.
  2. YellowAfterlife

    YellowAfterlife ᴏɴʟɪɴᴇ ᴍᴜʟᴛɪᴘʟᴀʏᴇʀ Forum Staff Moderator

    Joined:
    Apr 21, 2016
    Posts:
    2,430
    The likely cause is that each ds_map_find_next call re-traverses the key list [comparing each key to the given one] to find the specified key before it can get the one after it.
     
    Catan likes this.

Share This Page

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice