
GMS 2 Getting Japanese text from a txt file.

Discussion in 'Programming' started by Kyrieru, Jan 20, 2019.

  1. Kyrieru

    Kyrieru Member

    Joined:
    Jul 24, 2017
    Posts:
    71
    Following some advice I was able to read from a .txt file and get a dialogue system working with English.
    Code:
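    if (file_exists(path + file))
    {
        // Open the file for reading and keep the handle in read.
        read = file_text_open_read(path + file);
        var num = 0;
        while (!file_text_eof(read))
        {
            // Read the rest of the current line as a string.
            str[num] = file_text_read_string(read);
            // Give message[] an empty entry at the same index as str[].
            message[num] = "";
            // Skip past the line break, then move on to the next entry.
            file_text_readln(read);
            num++;
        }
        file_text_close(read);
    }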
    
    However, even after adding character ranges to the font, Japanese text read from a file displays as □□□□□□□

    Yet, if I just write something directly like this,

    Code:
    draw_text(x,y,"かなづかい")
    It displays it correctly, using the same font.
    What string functions would cause the text to get messed up?
     
  2. FrostyCat

    FrostyCat Member

    Joined:
    Jun 26, 2016
    Posts:
    4,617
    It's the way you saved the text file that got you messed up.

    When you save text containing material outside the 7-bit ASCII range, make sure you set the encoding to UTF-8. That's the safe bet for most European languages aside from English, and your only safe bet for non-Latin scripts. In Notepad this is set in the Save dialog. In others like Atom, Notepad++ or Sublime, you will see a dedicated menu for this.

    In whatever form this functionality takes, what it does is add a short header to the file signalling that it contains UTF-8 content: the bytes 0xEF, 0xBB, 0xBF, in that order. This is called the byte order mark, or BOM for short. (Some editors list plain UTF-8 and UTF-8 with BOM as separate options; if yours does, pick the one with the BOM.) It tells compliant readers (the GMS file_text_*() function set is one of them) that UTF-8 multi-byte sequences are used ahead and that bytes above 0x7F are not to be taken alone. Without it, you get the nonsense characters that crop up when each byte of a multi-byte sequence is taken at face value instead of in combination with adjacent bytes.
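
    If you want to confirm from inside GameMaker whether a file really starts with the bytes 0xEF, 0xBB, 0xBF, something like the following should do it. Treat it as a rough sketch rather than tested code: it assumes path and file are the same variables as in your loading code, and buf and has_bom are names made up for illustration. It uses the buffer functions to look at the raw bytes, so no text decoding gets in the way.
    Code:
    if (file_exists(path + file))
    {
        // Load the whole file as raw bytes (assumes the load succeeds).
        var buf = buffer_load(path + file);
        var has_bom = false;
        // A UTF-8 BOM is exactly three bytes: 0xEF, 0xBB, 0xBF (239, 187, 191).
        if (buffer_get_size(buf) >= 3)
        {
            has_bom = (buffer_peek(buf, 0, buffer_u8) == 239)
                   && (buffer_peek(buf, 1, buffer_u8) == 187)
                   && (buffer_peek(buf, 2, buffer_u8) == 191);
        }
        // Free the buffer once the check is done.
        buffer_delete(buf);
        show_debug_message("UTF-8 BOM present: " + string(has_bom));
    }
    If this reports that no BOM is present, re-save the file with the BOM option in your editor and run it again.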
     
  3. Kyrieru

    Kyrieru Member

    Joined:
    Jul 24, 2017
    Posts:
    71
    That's 2 for 2. What're you, some kinda genius that solves all my problems?

    Thanks for the quick response man. You've helped a lot.
     
