GameMaker Generating string codes from other strings

Viktor1031

Guest
Hi.
I have an array of about 500 words.
words[241]="Hello"
words[214]="And"
...
and so on...

I want to generate a unique string/code from those 500 words.

I don't need to decrypt the code later, so the only important thing is that the same 500 words, put through a function, give the same code every time.

I would like the generated string to be about 4-7 characters long.
How would I do this?
 

SoVes

Member
You can probably get away with a binary flag for each letter a word contains. So you'd do something like CodeForWord = string(hasA + hasB*2 + hasC*4 ...). Double letters can just be ignored. Then you just add CodeForWord[1] + CodeForWord[2] + ... and you'll get a unique code for the list.
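
A minimal sketch of the letter-flag idea, assuming GameMaker 2.3+ function syntax. The function names are hypothetical, and summing the masks as numbers (rather than concatenating strings) is only one reading of the post. Keep in mind that anagrams such as "listen" and "silent" get the same mask and that a sum ignores word order, so different lists can end up with the same code.

```gml
/// Hypothetical helper: one bit per letter a-z that occurs in the word.
function word_letter_mask(_word) {
    var _mask = 0;
    var _lower = string_lower(_word);
    for (var i = 1; i <= string_length(_lower); i++) {
        var _c = ord(string_char_at(_lower, i)) - ord("a");
        if (_c >= 0 && _c < 26) {
            _mask |= (1 << _c); // a = bit 0, b = bit 1, ...; repeated letters change nothing
        }
    }
    return _mask;
}

/// Hypothetical helper: add up the masks of every word and return the total as a string.
function words_mask_sum(_words) {
    var _sum = 0;
    var _count = array_length(_words);
    for (var i = 0; i < _count; i++) {
        _sum += word_letter_mask(string(_words[i]));
    }
    return string(_sum);
}
```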
 
Viktor1031

Guest
I tried using string_set_byte_at and string_get_byte_at to mix everything up, but it seems like string_get_byte_at sometimes gets things wrong...
 
Viktor1031

Guest
I will try something like SoVes's letter-flag idea, but I'm still open to other methods if anyone can help.
 
Homunculus

Guest
In short, you need a hash, if I understood your question correctly. If the order is not important (that is, you want the same hash even if the same words are in different places), you may want to use a ds_list instead of an array, so that you can sort the strings and always generate the hash from the same ordered list.
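
For the order-independent case, the sorting step could look roughly like this. It is a sketch that assumes GameMaker 2.3+ function syntax; the function name is hypothetical, and the caller is responsible for destroying the returned list.

```gml
/// Hypothetical helper: copy the words into a ds_list and sort it,
/// so the same set of words always ends up in the same order.
function words_sorted_list(_words) {
    var _list = ds_list_create();
    var _count = array_length(_words);
    for (var i = 0; i < _count; i++) {
        ds_list_add(_list, string(_words[i]));
    }
    ds_list_sort(_list, true); // true = ascending
    return _list; // call ds_list_destroy() on this when done
}
```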

As for the hash itself, you could write all the words (sorted, if that's relevant) to a buffer as buffer_string and call buffer_md5 on it to get the hash. An MD5 is obviously longer than 4-7 characters, but that's for a reason: you need to avoid generating the same hash for different word sets as much as possible, and with just a few characters you have a high risk of collisions. You can truncate the MD5 though, if you really want to enforce that length.
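
A minimal sketch of the buffer approach, assuming GameMaker 2.3+ function syntax (in older versions the body would go in a script using argument0). The function name and the initial buffer size are made up for illustration.

```gml
/// Hypothetical helper: hash an array of words, in array order, with buffer_md5.
function words_hash(_words) {
    var _buf = buffer_create(1024, buffer_grow, 1);
    var _count = array_length(_words);
    for (var i = 0; i < _count; i++) {
        // buffer_string writes the text plus a null terminator,
        // so consecutive words stay separated in the buffer
        buffer_write(_buf, buffer_string, string(_words[i]));
    }
    // Hash only the bytes actually written, not the whole allocation
    var _hash = buffer_md5(_buf, 0, buffer_tell(_buf));
    buffer_delete(_buf);
    return _hash; // 32 hexadecimal characters
}
```

If a shorter code is really needed, something like string_copy(words_hash(words), 1, 7) keeps the first 7 characters, at the cost of a much higher chance of collisions.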

Edit: an MD5 of the concatenated string would probably work as well, but consider using a separator character between the words.
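
The concatenation variant might look like this. It is a sketch under the assumption that "|" never appears inside a word; the function name is hypothetical. The separator matters because without it ["ab", "c"] and ["a", "bc"] both concatenate to "abc" and would hash identically.

```gml
/// Hypothetical helper: join the words with a separator and hash the result.
function words_hash_joined(_words) {
    var _joined = "";
    var _count = array_length(_words);
    for (var i = 0; i < _count; i++) {
        _joined += string(_words[i]) + "|"; // "|" is an assumed separator
    }
    return md5_string_utf8(_joined);
}
```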
 
Viktor1031

Guest
Sounds like a fast method, I will do this instead. The order is relevant: if the same 500 words are in the same places, I always want to get the same unique code.
 
Viktor1031

Guest
Yeah, now that I think about it, 32 characters is fine. For now, using a bit more memory is a good trade for a lower risk of collisions.
 