Localizing / Consolidate text?

Neptune

Member
Is this still reasonably up to date for language localization in 2.3+?
Shaun talks about his "ini file" method around 3:00.

Any information or suggestions before I dive in to something like this are appreciated.
 

Neptune

Member
One problem I see with Shaun's method is that it requires a ton of instance variables if you have things that draw a lot of different bits of text... Or you'd be reading from the ini files 60x/second, which doesn't sound like a great idea.

I wonder how effective a script with a giant 'switch' would be? 🤔
And how big would the switch have to get before accessing it 60 or X*60 times per second causes issues? I would imagine hundreds of thousands of cases before performance is visibly impacted, but I don't really have proof.

[edit]
This doesn't lag with everything else that's running on my $500 desktop, so... I guess I'll just try a switch.
Can't imagine I'd have more than a couple thousand texts anyway.
GML:
if (keyboard_check(ord("H")))
{
    // Rough stress test: 35,000 random branches per step while H is held
    var cc = 0;
    repeat (35000)
    {
        cc = choose(0, 1);
        if (cc) { count += 1; }
    }
}
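For reference, a minimal sketch of what the switch-script idea could look like. The name get_text, the ids, the strings, and global.interact_key_name are all made up for illustration; the only point is that each case can build its string from live values at call time.

```gml
/// Hypothetical switch script: returns the text for a numeric id.
/// All ids and strings below are invented for this sketch.
function get_text(_id)
{
    switch (_id)
    {
        case 0: return "Play";
        case 1: return "Exit";
        // A case can mix in live values, since it runs as normal code
        case 2: return "Press " + string(global.interact_key_name) + " to talk";
        // Make missing ids easy to spot on screen
        default: return "<missing text " + string(_id) + ">";
    }
}
```

It would be called like draw_text(x, y, get_text(2)). Supporting several languages this way would presumably need one such script (or one nested switch) per language.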
 

Mert

Member
The best option is to have a struct to hold them.

GML:
#macro lang global.language

global.language = {

    menu : {
        play : "oyna",
        exit : "çıkış"
    },
    level_1 : {
        do_this : "Git felan filan"
    }

}
Usage:
GML:
draw_text(x,y,lang.menu.play);
And then depending on the language, you may load an ini file and replace the struct's values. You shouldn't create variables for every language and hold them in-memory.
 
There's zero need to use ini files. I would highly recommend against using inis for basically anything. JSON is much more suited for this purpose. You create your JSON file, populate it with languages, and populate those with strings. You can load JSON into a struct no-fuss through GML and everything is preserved, so no need to mess around with inis.
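A rough sketch of the JSON route, assuming GMS 2.3.1 or later (where json_parse is available) and a JSON file shaped like the global.language struct above. The helper name load_language and the file name "lang_tr.json" are made up for illustration.

```gml
/// Hypothetical helper: loads a JSON file into global.language.
/// Returns true on success, false if the file is missing.
function load_language(_filename)
{
    if (!file_exists(_filename)) return false;

    // Read the whole file into a string
    var _buffer = buffer_load(_filename);
    var _json = buffer_read(_buffer, buffer_string);
    buffer_delete(_buffer);

    // json_parse turns JSON objects into structs and JSON arrays into arrays.
    // It throws on malformed JSON, so a real project may want try/catch here.
    global.language = json_parse(_json);
    return true;
}
```

After load_language("lang_tr.json"), the same draw_text(x, y, lang.menu.play) call would pick up whatever the file contains, provided the keys match.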
 

Neptune

Member
What do you think if a large portion of my texts are like this (started trying the switch method):
1610680296650.png
Or this:
1610680355265.png

As you can see, a lot of my text is parsed with special characters or reference code...
For that fact alone, I think a switch-script might be best suited. Otherwise I'd be frankensteining texts together like get_text(3) + button + get_text(4) + item_name + get_text(5)
 

rytan451

Member
JSON files don't play well with source control, unlike INI files. There are places where INI files are overused, but I recommend either creating your own text format for localization (plurals are particularly difficult) or using INIs for their simplicity. For example, Czech uses a different plural for 2-4 than for other numbers. Here's a more professional look at localization (from the developers of one of my favorite games): https://factorio.com/blog/post/fff-244
 

Neptune

Member
I don't think Factorio is quite the same as translating an adventure RPG... I wish I only had to translate a HUD and digital clock though 🤣

I'm thinking if my texts were not parsed with specials and references, some JSON stuff would probably work fine, but in this case maybe a dirty switch is the way to go.
Either way, I'll leave it unsolved and see if someone has a better solution for my situation.
 
How does JSON play any worse than INI with source control? I've had zero issues myself. It's set up almost identically to actual code, so I just don't see how. INIs are fine for games with a small amount of text, but with larger games like RPGs and adventure games, the amount of text gets to be unmanageable unless you can organize it to a far greater degree than INIs allow. Creating your own format is fine, but it's likely to look a lot like JSON after you're done if you're doing something with a lot of text, and since GM has built-in (optimized) functions for JSON...

Specials meaning like codes for text colors, effects, etc.? Wouldn't that be in the text itself and thus won't matter where you're getting text since it'll all be a string at the end?
 

Neptune

Member
Right, the specials are just dotted in the dialogues and such to trigger events, player choice conditionals, play a sound effect etc.
But as you can see in my example above (the second picture), string(sound_effect) wouldn't execute properly if it was read from a file... Right? Now I'm unsure o_O
I think the string would end up containing "string(sound_effect)" and not "X"
 

Pixel-Team

Master of Pixel-Fu
I'm working on a node based text mapper with localization support. It's based on one I saw on itch.io. It exports to JSON. This is good for getting text for every button, and every piece of dialog, and branching conversations. It also supports adding data on nodes for custom behaviors. I will be making it free when it's ready, and it's very close to a 1.0.
 
I've always used some kind of a tag system. Commands are parsed and replaced in-game with the necessary code on-the-fly.
For example, this:
GML:
draw_text(x, y, "Hello! How are you, " + player_name + "?");
Would be written as something like
Hello! How are you, [player_name]?
It's a bit of work up-front, but it makes working on dialogue so much faster that I couldn't imagine doing it otherwise now. It saves so much headache.
 
If tags are something in GM, I've never used them. Can you explain what [player_name] is?
It's just a string. It's parsed in-game. What I do is:
  • Find the starting position of the tag. In my case, I use square brackets, so the starting position would be string_pos("[", _string)
  • Find the ending position of the tag. In my case, it would be string_pos("]", _string)
  • Check the text inside the tags against a switch that tells the code what to replace the tag with
  • Delete the tag in the displayed text
  • Repeat until no tags are left
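The steps above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the function name text_apply_tags is made up, and it looks tags up in a struct instead of the switch described above, purely to keep the example short.

```gml
/// Hypothetical helper: replaces [tag] markers in _text with values
/// taken from the struct _values (e.g. { player_name : "Neptune" }).
function text_apply_tags(_text, _values)
{
    var _out = "";
    var _open = string_pos("[", _text);
    while (_open > 0)
    {
        // Keep everything before the "[" and drop it (and the "[") from the input
        _out += string_copy(_text, 1, _open - 1);
        _text = string_delete(_text, 1, _open);

        var _close = string_pos("]", _text);
        if (_close == 0)
        {
            // Unmatched "[": keep it literally and stop looking for tags
            _out += "[";
            break;
        }

        // Text between the brackets is the tag name
        var _tag = string_copy(_text, 1, _close - 1);
        _text = string_delete(_text, 1, _close);

        if (variable_struct_exists(_values, _tag))
            _out += string(variable_struct_get(_values, _tag));
        else
            _out += "[" + _tag + "]"; // unknown tag: leave it visible

        _open = string_pos("[", _text);
    }
    return _out + _text;
}
```

A call might look like draw_text(x, y, text_apply_tags("Hello! How are you, [player_name]?", { player_name : player_name })). Tags that trigger sounds or events rather than inserting text would need their own handling at the point where the tag name is known.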
 

rytan451

Member
For short snippet translation, using JSON isn't exactly a great idea. For example, for English, the word "minute" has two variants: singular and plural. For other languages (like Czech), there can be more than two variants. The devs of Factorio ran into this problem.

An old localization file for English used to have these lines:

Code:
minute1=minute
minute2-4=minutes
minute5+=minutes
But for Czech, had these lines:

Code:
minute1=minuta
minute2-4=minuty
minute5+=minut
Notice that the Czech translation has a different plural for 2-4 than for 5 and up.

There are many plural localization rules. The devs of Factorio found it easier to create their own format for snippet translation. This is what they ended up with for Romanian:

Code:
minutes=__1__ __plural_for_parameter_1__[1]__minut__[ends in 01-19]__minute__[rest]__de minute__
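To make the older three-key scheme concrete, here is a small sketch of how a game could pick the right key for a count. The function name plural_key is hypothetical, and the ranges are just the three shown in the minute1 / minute2-4 / minute5+ lines above; a real rule set would differ per language.

```gml
/// Hypothetical helper: builds the lookup key for a count,
/// using the 1 / 2-4 / 5+ ranges from the example lines above.
function plural_key(_base, _count)
{
    if (_count == 1) return _base + "1";
    if (_count >= 2 && _count <= 4) return _base + "2-4";
    return _base + "5+"; // everything else, including 0
}
```

For example, plural_key("minute", 3) gives "minute2-4", which would then be looked up in whichever file or struct holds the current language.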
As for source control: because JSON doesn't allow a trailing comma at the end of objects or arrays, any commit adding data to the end of an object or array has a chance of getting a merge conflict with another commit adding data to the end of the same object or array. In comparison, INI doesn't use comma delimiters. So, if you add data to the end of a section (and for localization, there's probably only one section per language, or maybe one section per file), there is no chance of any merge conflict.

Sure, INI is less powerful than JSON. But, in my eyes, that's a plus. When programming, it's usually a good idea to use the least powerful tool that can accomplish the task at hand.

This is why INI plays better with git than JSON:
Before:
Code:
{
  "a": "A"
}
Branch A:
Code:
 {
-  "a": "A"
+  "a": "A",
+  "b": "B"
 }
Branch B:
Code:
 {
-  "a": "A"
+  "a": "A",
+  "c": "C"
 }
Merge branch B into branch A:

Code:
{
<<<<<<< Branch A
  "a": "A",
  "b": "B"
=======
  "a": "A",
  "c": "C"
>>>>>>> Branch B
}
In comparison, here's the same thing in INI:

Original:

Code:
[Example]
a=A
Branch A:
Code:
 a=A
+b=B
Branch B:
Code:
 a=A
+c=C
Merge branch B into branch A:

Code:
 a=A
+b=B
+c=C
Note that since there were no line deletions in the INI, there was no merge conflict. However, with the line deletion in JSON, there was a merge conflict.
 

Neptune

Member
Take your fancy git and shove it up your a... I mean, yeah.. git, I use git, and not my early 2000s flash stick and dropbox 😂
 

rytan451

Member
I use git for two major reasons and one minor reason. The first reason is that I have to work with other people. Without Git, that's pretty hard. The second reason is that it's easier to experiment. If I git commit regularly, then as long as the Git repository stays intact, I can mess up my codebase as much as I like without any permanent repercussions. After all, if I break anything, I can always roll back to the last good commit. The third, if minor, reason is that it saves so much space in comparison with backing up the whole project multiple times (though since computers have so much memory, that's much less important nowadays).
 
Yes, I read that article. That method of localization/translation is only feasible for projects with a minimal amount of text. You're not translating individual words in an RPG/Adventure game; you're translating a novel's worth of dialogue.

Those merge conflicts are human error, not a fault of JSON. If you're consistently having that issue, you need to be more careful in making your commits. Alternatively, you can take a page from Haskell and place the commas at the start of the next line:
JSON:
{
    "name": "John"
    , "age": 21
    , "job": "Mechanic"
}
 