Legacy GM Loading and searching in a 100k word list

Appsurd

Member
Hi guys,

Before I forget, I use GMS1.4. For a couple of days I'm struggling with the following situation. I want to create a game like Wordfeud or Ruzzle where players can select or enter certain letters and I want to check whether the word they used is a valid word. Therefore I found online a free word list (not quite a real dictionary, since it contains words only and no explanations). All words in the txt file are seperated using enters. I also converted the txt file to a csv file (and also made a json version). I tried the following possibilities.

Try 1.
Code:
- Load the word_list.csv  from a webserver. ~10sec
- Split word_list on enter and save it to array
- Array contains all words
Result: Not possible, arrays may contain only 30k elements

Try 2.
Code:
- Load the word_list.csv  from a webserver. ~10sec
- Split word_list on enter and save it to ds_list ~50sec
- ds_list contains all words
Result: Works, but in practise terribly slow due to csv->list conversion. I personally guess this is due to the fact that in each iteration, you need to increase the size of the list, which makes it slow.

Try 3.
Code:
- Load the word_list.json  from a webserver. ~10sec
- Convert json to ds_map with json_decode
- Convert ds_map to ds_list ~20sec
- ds_list contains all words
Result: the reason I use a ds_list and not a ds_map is that searching in the keys of a ds_map is too slow to use in-game. Furthermore, this method is undesirable due to having to convert ds_map to ds_list every time the user starts the game. Naturally, the downloading will take place only once, but since I can't save a ds_list and load a ds_list directly from a file, this seemed a viable solution but isn't after all.

Try 4.
Code:
- Enter all words into a online mysql database
- At every word, use a PHP script to check whether the given word exists in the database
- Return if it exists or not
Result: This method works absolutely fine in a game of Wordfeud, but is a disaster in a Ruzzle game. Depending on the user's internet, the game 'shocks' every now and then because the game is waiting to get answer from the server. (Don't worry, checking if a word exist in a mysql database is super fast, like less than 0.1 sec) So this is not a viable option.

Try 5.
Code:
Use https://marketplace.yoyogames.com/assets/448/dictionary
Result: This solution was recommended to me, but isn't what I'm looking for. The problem with this asset is that I cannot change nor tweak the contents of the word list. But moreover, I cannot add support for multiple languages.

================================================================================

And that's about it. At the moment, I really don't know what to do. I'll put down a summary of what I think I require in-game. If anybody knows some fantastic method, please let me know :)

- Need to either download the file from a server or add it as an Included File to the game. Simply sending a PHP request to a database makes the game lag on fast-paced games.
- Word list must be in ds_list format since ds_map searching is too slow, array cannot be big enough (and searching isn't possible at all) and I'm unaware of any possible formats.
 

YellowAfterlife

ᴏɴʟɪɴᴇ ᴍᴜʟᴛɪᴘʟᴀʏᴇʀ
Forum Staff
Moderator
Result: the reason I use a ds_list and not a ds_map is that searching in the keys of a ds_map is too slow to use in-game.
Depends on how you search. If the map is structured like
Code:
{ "word1": 1, "word2": 1 }
then you would be able to check if the word is on the list via a simple ds_map_exists check.
 

Appsurd

Member
Depends on how you search. If the map is structured like
Code:
{ "word1": 1, "word2": 1 }
then you would be able to check if the word is on the list via a simple ds_map_exists check.
It is :)
But I thought I tried that, and it was really slow. Well, any time larger than 0.1 ms is too long, since I want to prevent shocking gameplay.

EDIT

Hmm, it's in the format
Code:
{"50088": "supermarkets", "44884": "scripture", ...
Perhaps if I change outside GM, then download, and then search (because ds_map_exists seems to be pretty fast after all).
I'll come back to you asap!

EDIT 2

Changing the format into the "right' form was luckily only a matter of seconds due to my Python script :)
It seems all is working fine, I'll confirm this (after some mobile tests) within a few days
 
Last edited:

SSJCoder

Member
It is :)
But I thought I tried that, and it was really slow. Well, any time larger than 0.1 ms is too long, since I want to prevent shocking gameplay.

EDIT

Hmm, it's in the format
Code:
{"50088": "supermarkets", "44884": "scripture", ...
Perhaps if I change outside GM, then download, and then search (because ds_map_exists seems to be pretty fast after all).
I'll come back to you asap!

EDIT 2

Changing the format into the "right' form was luckily only a matter of seconds due to my Python script :)
It seems all is working fine, I'll confirm this (after some mobile tests) within a few days
If you were to do this using lists, it would probably be wise to separate all the words in groups, based on eg. the first three letters.

so you would have about 15,000 lists, and the first three letters would define the list to look through. In this case, you would check the first letter, then the second letter, then the third, and get a group based on a mathematical calculation. Then would it actually (in my opinion) be practicable.

(though this requires a longer time to setup all the words and the lists they are in, and a decent algorithm)
 

Appsurd

Member
(though this requires a longer time to setup all the words and the lists they are in, )
Which is not what I want, but I'm starting to understand that it won't be super fast either way :) I consider this a viable option, but I assume 15,000 lists is a little over the top. I would personally split it into like 50 lists.

At the moment, on Windows, the option suggested by @YellowAfterlife seems to work.

Code:
Dictionary size: 106,000

Downloading from webserver: 2000 ms
Convert json to map: 1000 ms

Searching using ds_map_exists: 10,000 searches take 15 ms :D

Saving ds_map using secure_save: 600 ms
Loading ds_map using secure_load: 1100 ms
The upper two are one-time only. The searching is fast. And loading and saving, well, it's not a big deal, I'll just through in some animations or so :)
I will test it on mobile soon
 

dannyjenn

Member
Result: Not possible, arrays may contain only 30k elements
Can't you just use 4 arrays?

since I can't save a ds_list and load a ds_list directly from a file
Why can't you just use ds_list_write() and ds_list_read()? Too slow? Too big?

And borrowing from @SSJCoder's idea, what kind of searching algorithm are you using? A linear search doesn't make much sense for words beginning with "z" or whatever. If you want it all in the same list then you could keep track of the indices of the first "b" word, the first "c" word, the first "d" word, etc. and then jump to the correct position before beginning the linear search.

Or, what you could do is use 26 different ds_maps (one for each initial letter) and see if that runs fast enough.
 

Xer0botXer0

Senpai
Loads words into a buffer until buffer has reached maximum size that array can support, add to array, continue loading save into second array, etc.
In fact regarding searching use a 2D array with what DannyJenn and SSJ is saying, first dimension represents the first character of the words within the second dimension.
 

SSJCoder

Member
The upper two are one-time only. The searching is fast. And loading and saving, well, it's not a big deal, I'll just through in some animations or so :)
I will test it on mobile soon
Cool, so you have it working

I consider this a viable option, but I assume 15,000 lists is a little over the top
The more lists, the faster the search.

50 Lists = ~2000 word checks per request (100k / 50)
15k lists = ~7 word checks per request (285x faster)

Then again, it seems you already have everything working, but this is probably how it would work if you implemented it using lists.

(Oh and by the way, in case you were wondering how I got 15,000, 25^3 = 15,625, since the alphabet has about 25 letters, and since there are 3 letters that we're using as an index in this case, 25^3 gives us total number combinations)
 
Last edited:
J

Jomar

Guest
Anyone can give me advice/teach me how to make a word search game like ruzzle using a bluetooth for multiplayer game.
 
Top