GM:S 1.4 Loading and searching in a 100k word list

Discussion in 'Programming' started by Appsurd, Aug 14, 2017.

  1. Appsurd

    Appsurd Member

    Joined:
    Jun 20, 2016
    Posts:
    231
    Hi guys,

    Before I forget, I use GMS1.4. For a couple of days I'm struggling with the following situation. I want to create a game like Wordfeud or Ruzzle where players can select or enter certain letters and I want to check whether the word they used is a valid word. Therefore I found online a free word list (not quite a real dictionary, since it contains words only and no explanations). All words in the txt file are seperated using enters. I also converted the txt file to a csv file (and also made a json version). I tried the following possibilities.

    Try 1.
    Code:
    - Load the word_list.csv  from a webserver. ~10sec
    - Split word_list on enter and save it to array
    - Array contains all words
    
    Result: Not possible, arrays may contain only 30k elements

    Try 2.
    Code:
    - Load the word_list.csv  from a webserver. ~10sec
    - Split word_list on enter and save it to ds_list ~50sec
    - ds_list contains all words
    
    Result: Works, but in practise terribly slow due to csv->list conversion. I personally guess this is due to the fact that in each iteration, you need to increase the size of the list, which makes it slow.

    Try 3.
    Code:
    - Load the word_list.json  from a webserver. ~10sec
    - Convert json to ds_map with json_decode
    - Convert ds_map to ds_list ~20sec
    - ds_list contains all words
    
    Result: the reason I use a ds_list and not a ds_map is that searching in the keys of a ds_map is too slow to use in-game. Furthermore, this method is undesirable due to having to convert ds_map to ds_list every time the user starts the game. Naturally, the downloading will take place only once, but since I can't save a ds_list and load a ds_list directly from a file, this seemed a viable solution but isn't after all.

    Try 4.
    Code:
    - Enter all words into a online mysql database
    - At every word, use a PHP script to check whether the given word exists in the database
    - Return if it exists or not
    
    Result: This method works absolutely fine in a game of Wordfeud, but is a disaster in a Ruzzle game. Depending on the user's internet, the game 'shocks' every now and then because the game is waiting to get answer from the server. (Don't worry, checking if a word exist in a mysql database is super fast, like less than 0.1 sec) So this is not a viable option.

    Try 5.
    Code:
    Use https://marketplace.yoyogames.com/assets/448/dictionary
    
    Result: This solution was recommended to me, but isn't what I'm looking for. The problem with this asset is that I cannot change nor tweak the contents of the word list. But moreover, I cannot add support for multiple languages.

    ================================================================================

    And that's about it. At the moment, I really don't know what to do. I'll put down a summary of what I think I require in-game. If anybody knows some fantastic method, please let me know :)

    - Need to either download the file from a server or add it as an Included File to the game. Simply sending a PHP request to a database makes the game lag on fast-paced games.
    - Word list must be in ds_list format since ds_map searching is too slow, array cannot be big enough (and searching isn't possible at all) and I'm unaware of any possible formats.
     
  2. YellowAfterlife

    YellowAfterlife ᴏɴʟɪɴᴇ ᴍᴜʟᴛɪᴘʟᴀʏᴇʀ Forum Staff Moderator

    Joined:
    Apr 21, 2016
    Posts:
    2,404
    Depends on how you search. If the map is structured like
    Code:
    { "word1": 1, "word2": 1 }
    then you would be able to check if the word is on the list via a simple ds_map_exists check.
     
  3. Appsurd

    Appsurd Member

    Joined:
    Jun 20, 2016
    Posts:
    231
    It is :)
    But I thought I tried that, and it was really slow. Well, any time larger than 0.1 ms is too long, since I want to prevent shocking gameplay.

    EDIT

    Hmm, it's in the format
    Code:
    {"50088": "supermarkets", "44884": "scripture", ...
    
    Perhaps if I change outside GM, then download, and then search (because ds_map_exists seems to be pretty fast after all).
    I'll come back to you asap!

    EDIT 2

    Changing the format into the "right' form was luckily only a matter of seconds due to my Python script :)
    It seems all is working fine, I'll confirm this (after some mobile tests) within a few days
     
    Last edited: Aug 14, 2017
  4. SSJCoder

    SSJCoder Member

    Joined:
    Aug 12, 2017
    Posts:
    126
    If you were to do this using lists, it would probably be wise to separate all the words in groups, based on eg. the first three letters.

    so you would have about 15,000 lists, and the first three letters would define the list to look through. In this case, you would check the first letter, then the second letter, then the third, and get a group based on a mathematical calculation. Then would it actually (in my opinion) be practicable.

    (though this requires a longer time to setup all the words and the lists they are in, and a decent algorithm)
     
  5. Appsurd

    Appsurd Member

    Joined:
    Jun 20, 2016
    Posts:
    231
    Which is not what I want, but I'm starting to understand that it won't be super fast either way :) I consider this a viable option, but I assume 15,000 lists is a little over the top. I would personally split it into like 50 lists.

    At the moment, on Windows, the option suggested by @YellowAfterlife seems to work.

    Code:
    Dictionary size: 106,000
    
    Downloading from webserver: 2000 ms
    Convert json to map: 1000 ms
    
    Searching using ds_map_exists: 10,000 searches take 15 ms :D
    
    Saving ds_map using secure_save: 600 ms
    Loading ds_map using secure_load: 1100 ms
    
    The upper two are one-time only. The searching is fast. And loading and saving, well, it's not a big deal, I'll just through in some animations or so :)
    I will test it on mobile soon
     
  6. dannyjenn

    dannyjenn Member

    Joined:
    Jul 29, 2017
    Posts:
    568
    Can't you just use 4 arrays?

    Why can't you just use ds_list_write() and ds_list_read()? Too slow? Too big?

    And borrowing from @SSJCoder's idea, what kind of searching algorithm are you using? A linear search doesn't make much sense for words beginning with "z" or whatever. If you want it all in the same list then you could keep track of the indices of the first "b" word, the first "c" word, the first "d" word, etc. and then jump to the correct position before beginning the linear search.

    Or, what you could do is use 26 different ds_maps (one for each initial letter) and see if that runs fast enough.
     
  7. Xer0botXer0

    Xer0botXer0 Member

    Joined:
    Jun 29, 2016
    Posts:
    670
    Loads words into a buffer until buffer has reached maximum size that array can support, add to array, continue loading save into second array, etc.
    In fact regarding searching use a 2D array with what DannyJenn and SSJ is saying, first dimension represents the first character of the words within the second dimension.
     
  8. SSJCoder

    SSJCoder Member

    Joined:
    Aug 12, 2017
    Posts:
    126
    Cool, so you have it working

    The more lists, the faster the search.

    50 Lists = ~2000 word checks per request (100k / 50)
    15k lists = ~7 word checks per request (285x faster)

    Then again, it seems you already have everything working, but this is probably how it would work if you implemented it using lists.

    (Oh and by the way, in case you were wondering how I got 15,000, 25^3 = 15,625, since the alphabet has about 25 letters, and since there are 3 letters that we're using as an index in this case, 25^3 gives us total number combinations)
     
    Last edited: Aug 14, 2017
    andev likes this.
  9. andev

    andev Member

    Joined:
    Jul 2, 2017
    Posts:
    444
     
  10. Jomar

    Jomar Member

    Joined:
    Sep 11, 2017
    Posts:
    14
    Anyone can give me advice/teach me how to make a word search game like ruzzle using a bluetooth for multiplayer game.
     

Share This Page

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice