
Precisely chaining audio assets for an ASR effect?

DaveInDev

Member
Hi,

I searched in the docs, but I did not find the answer:
Is there a way to precisely chain audio assets?
(more precisely than just testing audio_exists() in a step event and triggering the next sound when the previous one dies).
What I want to achieve is a 3-part sound simulating an ASR envelope:
I'd like to play a single (A)ttack sound, followed by a looped (S)ustain sound that can last any amount of time, and then ended by a (R)elease sound.

As I said before, using audio_exists() in a step event leaves an audible gap between sounds.
Another possibility is to use alarms to trigger the next sound just a little before the previous one dies; but then there is an audible overlap giving a temporary increase in volume: since the end of one sound is nearly the same as the beginning of the next one, the overlap effectively doubles the result...

In fact, the best solution would be to be able to navigate within one single sound asset, setting an internal loop in the form of two time points: the entrance and the exit of the loop.

Like this:
[attached image: waveform of a single sound with the loop entrance and exit points marked]

So I tried to use audio_sound_set_track_position with an alarm to loop backwards regularly, but it also creates a sound glitch because the loop in and out points are not guaranteed to fall on a zero crossing of the sound wave, giving a discontinuity in the resulting waveform...
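Roughly, what I tried looks like this (just a sketch to illustrate the idea, with made-up loop times and a placeholder snd_engine asset):

Code:
// Create event (sketch): play the sound, loop exit point assumed at ~3 s
snd_inst = audio_play_sound(snd_engine, 10, false);
alarm[0] = 3 * 60;                              // jump back just before 3 s (assuming 60 fps)

// Alarm 0 event (sketch): jump back to the loop entrance at 2 s
audio_sound_set_track_position(snd_inst, 2.0);  // loop entrance
alarm[0] = 1 * 60;                              // the loop section is 1 s long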

Any idea on how to achieve this properly?

PS: note that it's not only a question of a volume or gain envelope: the starting sound can have a different "texture" from the looping sound and from the ending sound, but they chain together nicely and continuously.
 

DaveInDev

Member
I saw that doc, but all buffers seem to do is load raw or PCM data into a buffer, work on it and finally turn it into a sound that can be used like any other sound asset. But then... what more can I do with these buffers that I cannot do with regular sound assets?
 

kburkhart84

Firehammer Games
I saw that doc, but all buffers seem to do is load raw or PCM data into a buffer, work on it and finally turn it into a sound that can be used like any other sound asset. But then... what more can I do with these buffers that I cannot do with regular sound assets?
If I'm not mistaken, you make an audio queue, and then add buffers to the end of it. So you would be able to turn your separate sections into buffers, and then just add them in the order you want.
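Something like this, roughly (untested sketch; buf_pcm is a placeholder buffer that you'd have already filled with the whole 3-part sound as raw 16-bit stereo PCM at 44100 Hz):

Code:
// Create event (sketch): build a play queue and feed it sections of one PCM buffer
var bytes_per_sec = 44100 * 2 * 2;          // 44100 frames * 2 channels * 2 bytes each

head_len = 2 * bytes_per_sec;               // 2 s attack
sus_ofs  = head_len;
sus_len  = 1 * bytes_per_sec;               // 1 s sustain section
tail_ofs = sus_ofs + sus_len;
tail_len = 3 * bytes_per_sec;               // 3 s release

q = audio_create_play_queue(buffer_s16, 44100, audio_stereo);
audio_queue_sound(q, buf_pcm, 0, head_len);        // queue the attack...
audio_queue_sound(q, buf_pcm, sus_ofs, sus_len);   // ...and one sustain section
q_inst = audio_play_sound(q, 10, false);           // the queue index plays like a sound

The offsets and lengths are in bytes, which is what audio_queue_sound works with, so everything can stay inside one buffer.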
 

Slow Fingers

Member
I don't know your specific case, but usually these kinds of 'sustain loops', as you call them, are very short in duration, unlike in your graph. With the 'head' and 'tail' being longer, you can set them at a higher priority than the body of the sound and be assured you won't have a 'sustain' sound left after the tail dies. It does require 3-part files, though; I'm not too sure how to approach all this from a single-file thing...
 

DaveInDev

Member
I don't know your specific case, but usually these kinds of 'sustain loops', as you call them, are very short in duration, unlike in your graph. With the 'head' and 'tail' being longer, you can set them at a higher priority than the body of the sound and be assured you won't have a 'sustain' sound left after the tail dies. It does require 3-part files, though; I'm not too sure how to approach all this from a single-file thing...
Well, typically, I'll have a 2-second head, a 3-second tail and a 1-second loop. But I still do not understand how these priorities ensure perfectly continuous playback between, for example, the head and the sustain parts.
 

Slow Fingers

Member
Well, typically, I'll have a 2-second head, a 3-second tail and a 1-second loop. But I still do not understand how these priorities ensure perfectly continuous playback between, for example, the head and the sustain parts.
Let me try to explain a little bit better, mate!
If you think about it, the problematic part is the 'sustain', as it's the only sample of the 3 for which we don't know how long it will be played or where it will be cut. The head and tail will most likely be triggered by key press/release events. Approaching the problem like this, the 2 things to ensure are that the sound doesn't "pop" between samples and that the "sustain" part never, ever overlaps the head and tail.
The overlapping part is what I was talking about in the previous post. Given what you tell me, this is not a problem, though. If you have, say, a 0.5-second "sustain" and a 1-second "tail", even if the tail triggers at 0.3 s into the sustain sample, the sustain will never go beyond it, since the tail is 1 s long (longer than the previous sample) and at a higher priority than the "sustain" sample (which makes it "play over" lower-priority sounds).
And to avoid popping, just follow this simple editing rule, which is what will make your edits appear seamless (I was an audio engineer for 10 years, I did this a time or two, lol):
If all your samples are spliced at zero crossings, you'll never get a "pop" or weird artifacts. That's also why the shorter the sustain, the better.
[attached image: waveform samples spliced at a zero crossing]
At the studio I used to work at, we took "snapshots" of sounds like that, and most of them were way shorter than a full second. It depends on the complexity of the sound, I guess, but you can always add small fluctuations with coded effects, like pitch modulation, to make short samples appear much longer and more complex.

Anyway, hope that made a bit of sense
 

DaveInDev

Member
@Slow Fingers thanks man. That's funny, because I'm also a musician (as a serious hobby) and I often record and mix stuff, so I am aware of these pops/clicks and the need to split on a zero crossing. Usually, when making my own samples, I even add very quick fades in or out, a few samples long, to make sure that no click will be heard. But then I also deliberately overlap successive samples by the size of these small fades in/out, so the transitions are very smooth.

Of course, I would like to do this here, but the tools offered by GMS2 (in terms of sound timing) are limited (though interesting). As @kburkhart84 suggested, I had a look at buffers and queues and they are part of the answer, because they allow you to queue "sound clips" extracted from the same file/buffer. So I can push the head clip + 1 sustain clip, and as soon as the head has played (I use the async event to know that), I push another sustain clip, and so on. If I want to stop the whole sound, at the following async event I push a tail clip and wait for it to finish completely.

It works. The only problem is that I must always have 1 clip already queued while the previous one is playing (the queue must not be empty when the async event is called, otherwise there is a big glitch). And apparently, I cannot shorten the clips already queued in any way, so the "release" is not triggered immediately when I ask for it... But it does work.

EDIT: I have an idea: to be able to end the sustain even in the middle of a clip and jump directly to the tail clip, I will push the sustain clip in little chunks instead of one piece, so it will trigger several async events during its playback.
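For the record, the Audio Playback async event looks roughly like this (simplified sketch with placeholder names: q is the play queue, buf_pcm the PCM buffer with the section offsets/lengths, and release_requested / tail_queued are flags I set elsewhere):

Code:
// Async Audio Playback event (sketch): fires each time a queued section finishes
if (async_load[? "queue_id"] == q)
{
    if (release_requested)
    {
        if (!tail_queued)
        {
            audio_queue_sound(q, buf_pcm, tail_ofs, tail_len);   // queue the release once...
            tail_queued = true;                                   // ...then let the queue run dry
        }
    }
    else
    {
        // keep one sustain section queued ahead so the queue is never empty
        audio_queue_sound(q, buf_pcm, sus_ofs, sus_len);
    }
}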
 

Slow Fingers

Member
It works. The only problem is that I must always have 1 clip already queued while the previous one is playing (the queue must not be empty when the async event is called, otherwise there is a big glitch). And apparently, I cannot shorten the clips already queued in any way, so the "release" is not triggered immediately when I ask for it... But it does work.
Hmmm, that does sound like a problem. But yeah, GM and audio in general could be way better. We are far from having a built-in Wwise 😂
Just having automatic cross-fades built in would be SUCH a step forward. Pretty sure it wouldn't be that hard to implement either.
But yeah, the smaller the chunks, the better the result will be. Maybe you are familiar with the impulse responses used by guitar and bass players. Those files are like 0.0001 seconds long. That's all that's needed to get an accurate representation of a sound's rendition in a given room. Unless those sustain sounds are actual melodies, they will most likely be/sound like processed white noise, so you really don't need a long sample. I'd try to shorten it to a couple of ms, or as short as possible, and see if it improves things.
Glad to know I'm not speaking Chinese here :)
 

kburkhart84

Firehammer Games
Just having automatic cross-fades built in would be SUCH a step forward. Pretty sure it wouldn't be that hard to implement either.
The sound system I'm working on already has working cross-fading. It also has what I call "swap-fading" so you can switch to the exact same position in a different track, like if you wanted to have something that is a more intense version of the same music, seamlessly, the same way cross-fading happens (or instantly if you so wish). It also has stuff for creating variety in sounds, like automatic variation of pitch and volume, as well as having multiple sounds be available for a single sound effect (like multiple footstep sounds for more variety than just pitch and volume changes). I still have to finish up a few more features, and get some documentation done.

Just like the input system (and some other stuff as well), I also feel they are sorely lacking, so I'm working to fix them, at least to the extent I can.
 

Slow Fingers

Member
The sound system I'm working on already has working cross-fading. It also has what I call "swap-fading" so you can switch to the exact same position in a different track, like if you wanted to have something that is a more intense version of the same music, seamlessly, the same way cross-fading happens (or instantly if you so wish). It also has stuff for creating variety in sounds, like automatic variation of pitch and volume, as well as having multiple sounds be available for a single sound effect (like multiple footstep sounds for more variety than just pitch and volume changes). I still have to finish up a few more features, and get some documentation done.

Just like the input system (and some other stuff as well), I also feel they are sorely lacking, so I'm working to fix them, at least to the extent I can.
Seems very nice. I usually don't mess with audio in GM since I have, like, a $10k Pro Tools HD rig at home and can already do pretty much anything I want right in the box. And I also tend to leave audio for last, just because of the insane build-time increase, even with empty sound assets.
Given the quality of the samples nowadays, my workflow pretty much consists of working in MIDI and then tweaking the VST to get my sound. That way I can reuse/tweak stuff with different samples super easily. If we had a MIDI implementation with VST support, we could do just about anything, but that is pure fantasy.

And I don't think the docs for your input system are lacking at all; they're super professional and in line with the GML documentation nomenclature. What is lacking is something more visual, like a YouTube tutorial for dumb people like me 😂
(plus, I'm not really a newbie, so I'm guessing younger/inexperienced people could find it intimidating, maybe)
 

kburkhart84

Firehammer Games
What is lacking is something more visual, like a YouTube tutorial
You know... I have no idea why I never thought of this. It will require me to pick up some skills that I don't have, but I can most certainly do something like that.

EDIT***

I was more referring to the systems in GMS, not my docs, but thanks for the compliment anyway :)
 

DaveInDev

Member
Glad to know I'm not speaking Chinese here :)
Oh no, you're not; I can talk recording, audio processing, impulse responses, convolution reverb, VSTs, etc. for hours 😁
And I will also work on my samples, FX and so on in external tools.
But I still need a way to handle and chain them properly in GMS2, especially for these variable-length sounds I'd like to obtain with an ASR approach. I will use this, for example, for spaceship propulsion: a starting "explosion" sound for the ignition of the rocket, then a looping propulsion sound for as long as you press "forward" and the rocket is thrusting, and a "stop engine" sound when you release the forward key. (Excuse my bad English.)

The sound system I'm working on already has working cross-fading. It also has what I call "swap-fading" so you can switch to the exact same position in a different track,
Out of curiosity, are you developing this in GML or in an external DLL and another language?
 

kburkhart84

Firehammer Games
Oh no, you're not; I can talk recording, audio processing, impulse responses, convolution reverb, VSTs, etc. for hours 😁
And I will also work on my samples, FX and so on in external tools.
But I still need a way to handle and chain them properly in GMS2, especially for these variable-length sounds I'd like to obtain with an ASR approach. I will use this, for example, for spaceship propulsion: a starting "explosion" sound for the ignition of the rocket, then a looping propulsion sound for as long as you press "forward" and the rocket is thrusting, and a "stop engine" sound when you release the forward key. (Excuse my bad English.)



Out of curiosity, are you developing this in GML or in an external DLL and another language?
It's all GML, all for GameMaker games. As I'm designing it, though, it doesn't seem like it would fit what you are trying to do. It will handle fading between music tracks, for example, but I haven't yet decided if I'm going to mess with loop points, etc., like what you are describing. I might, if I can figure out how, and possibly if you can get a buffer out of a sound asset. It's meant more for normal gameplay stuff, though, and not the low-level control you seem to be looking for. I'm not fully ruling it out, though.

In the specific case of a rocket, you could probably get "close enough" as far as the transitions between the intro, loop and stopping sound go. You could just use the gain-changing function to go between them and cross-fade so that it's smooth. The gain-changing functions also give you an argument for time, and the gain changes over that time, so you could easily fade out one sound in 100 milliseconds and, right then, start the next sound at zero gain and fade it to full volume in 100 milliseconds. That would get around the clicking from not being at a zero point on the waveform, and it would likely be close enough to be good enough for the game, especially since there would be other stuff going on.
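In rough code, the loop-to-ending transition could look like this (just a sketch; snd_engine_tail is a placeholder asset and loop_inst is the instance you got back when starting the loop):

Code:
// Key Release <Up> event (sketch): swap the engine loop for the "stop engine" sound
audio_sound_gain(loop_inst, 0, 100);               // fade the running loop to silence over 100 ms

tail_inst = audio_play_sound(snd_engine_tail, 10, false);
audio_sound_gain(tail_inst, 0, 0);                 // start the tail at zero volume...
audio_sound_gain(tail_inst, 1, 100);               // ...and fade it up over the same 100 ms

alarm[0] = 6;                                      // ~100 ms at 60 fps
// Alarm 0 event: audio_stop_sound(loop_inst);     // stop the (now silent) loop

The alarm is only there to clean up the faded-out loop instance; the transition itself is done entirely by the two gain ramps.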
 

DaveInDev

Member
The gain-changing functions also give you an argument for time,
I missed this one! audio_sound_gain can be progressive! Nice. It could be a great help with my problem of chaining sounds smoothly.
I can see how you set it right after audio_play_sound for the fade-in, but for the fade-out, do you approximate the end of the sound with an alarm? Are alarms that precise? The problem with sound is that it is timed by the sound card, not by the CPU, so for long sounds there could be a discrepancy, though it should be negligible.
 

kburkhart84

Firehammer Games
I missed this one! audio_sound_gain can be progressive! Nice. It could be a great help with my problem of chaining sounds smoothly.
I can see how you set it right after audio_play_sound for the fade-in, but for the fade-out, do you approximate the end of the sound with an alarm? Are alarms that precise? The problem with sound is that it is timed by the sound card, not by the CPU, so for long sounds there could be a discrepancy, though it should be negligible.
I'm with you. I thought I was going to have to do some LERPing with the gain to get the fading for my system but then I saw this and it made my code that much simpler.

So, I don't think you will ever get very precise... but since the sound is at 0 volume, it doesn't matter if you are a few milliseconds off anyway. I personally would just set an alarm (whichever timing method you use: variables, actual alarm events, whatever), or even just use current_time in some step event and check whether you've passed a point. Let's say I'm doing a 250-millisecond cross-fade (not instant, but close). If the "intro" is 500 milliseconds, then I need to start cross-fading when current_time passes the intro's start time + 250. So once that if check passes, use audio_sound_gain to fade the intro to silence, and then play the looping part, starting at zero volume and progressing to full volume over 250 milliseconds. Of course you can tweak the times to get the effect right as you need. And once the intro sound is done, you don't even have to stop it, since it wasn't looping.

Now, going from the looping part to the ending part is quite similar. However, the beginning of the cross-fade happens on demand instead of being timer-based. But you still use a timer to actually stop the looping sound once its volume is at zero. Besides swapping that around, the process is basically the same.

Remember that precision is really not going to be super important if you are doing things this way. In neither case would you ever hear popping, because the sound is always at (or really close to) zero volume when the popping could happen, and there is always another sound at full volume (or also really close) that would hide it away.
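As a sketch in GML, with the numbers from that example (snd_intro and snd_loop are placeholder assets):

Code:
// When the intro starts (e.g. a Key Press event) -- sketch
intro_inst  = audio_play_sound(snd_intro, 10, false);
intro_start = current_time;                        // current_time is in milliseconds
crossfading = false;

// Step event: with a 500 ms intro and a 250 ms cross-fade, start fading at start + 250
if (!crossfading && current_time >= intro_start + 250)
{
    audio_sound_gain(intro_inst, 0, 250);          // intro fades to silence (no need to stop it)
    loop_inst = audio_play_sound(snd_loop, 10, true);
    audio_sound_gain(loop_inst, 0, 0);             // looping part starts silent...
    audio_sound_gain(loop_inst, 1, 250);           // ...and fades to full over the same 250 ms
    crossfading = true;
}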
 

DaveInDev

Member
Remember that precision is really not going to be super important if you are doing things this way. In neither case would you ever hear popping, because the sound is always at (or really close to) zero volume when the popping could happen, and there is always another sound at full volume (or also really close) that would hide it away.
Right!

I finally managed to do what I wanted with the audio queue, so the chaining is very precise and I get no pops at all. I just split the looping area into chunks of approx. 44100/20 samples (1 period of a low 20 Hz sine) and I correct each chunk's cut point so it lands close to a zero crossing. That way, the release trigger is responsive enough. Finding the zero crossings was tricky; I tried many different waves, especially FM ones with a lot of high harmonics, and I wrote this progressive search, which works nicely (no pops while chaining any sample, without any fades):

Code:
function find_buffer_zero(buff, pos_init)
{
    // find the next zero crossing of 16-bit stereo PCM audio stored in a buffer (positions are in bytes)
  
    var val_l, val_r, val_m, val_prev = 0;
    var pos, l = buffer_get_size(buff);
    var look_forward;    // how many samples to explore
    var range_to_zero = 1024;    // how close to the zero should I be ? (double at each pass)
  
    pos_init = 4 * (pos_init div 4); // align on audio sample

    while(range_to_zero <= 32768)
    {
        look_forward = 44100/20; // explore at least the period of a 20Hz sine wave (lowest sound, you must have a crossing in this range !)
        val_prev = 0;            // reset between passes so the previous pass's last sample doesn't leak into this one
      
        buffer_seek(buff, buffer_seek_start, pos_init);
        pos = pos_init;
      
        while(pos < l)
        {
            val_l = buffer_read(buff,buffer_s16);
            val_r = buffer_read(buff,buffer_s16);
            val_m = (val_l + val_r) div 2;
      
            if((val_m == 0) and (val_prev == 0))
                return(pos);
          
            if((sign(val_m) != sign(val_prev)) and (abs(val_m - val_prev) <= range_to_zero))
            {
                // crossing found
                if(pos == 0)
                    return(pos);
                else
                {
                    if(abs(val_m) > abs(val_prev))    // choose the closest one
                        pos -= 4;
                      
                    return(pos);
                }
            }
      
            val_prev = val_m;
            pos += 4;
      
            if(--look_forward < 0) break;
        }
      
        range_to_zero *= 2;
    }
  
    return(pos_init);    // nothing found, return the init pos....
}
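And for reference, this is roughly how I use it when queueing the next sustain chunk (sketch with placeholder names: q is the play queue, buff the PCM buffer, and sus_pos / sus_start / sus_end are byte positions of the loop area that I keep in the instance):

Code:
// Async Audio Playback event (sketch): queue the next sustain chunk, cut on a zero crossing
var chunk_bytes = 4 * (44100 div 20);                     // ~1 period of a 20 Hz wave, in bytes
var cut = find_buffer_zero(buff, sus_pos + chunk_bytes);  // snap the cut point to a zero crossing

audio_queue_sound(q, buff, sus_pos, cut - sus_pos);
sus_pos = cut;
if (sus_pos >= sus_end) sus_pos = sus_start;              // wrap back to the start of the loop area
                                                          // (real code should also clamp the chunk to the loop end)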
 

DaveInDev

Member
I might, if I can figure out how, and possibly if you can get a buffer out of a sound asset.
I just read this sentence again because I ran into this problem: for the moment, you cannot edit a sound asset through a buffer, right?
Moreover, it seems that audio buffers only work on uncompressed PCM, whereas audio assets can be stored compressed as (free) OGG or (commercial) MP3, streamed or in memory.

At least you have audio_get_type, which tells you whether a sound ID is streamed (and hard to get into a buffer) or in memory.
 

kburkhart84

Firehammer Games
I just read this sentence again because I ran into this problem: for the moment, you cannot edit a sound asset through a buffer, right?
Moreover, it seems that audio buffers only work on uncompressed PCM, whereas audio assets can be stored compressed as (free) OGG or (commercial) MP3, streamed or in memory.

At least you have audio_get_type, which tells you whether a sound ID is streamed (and hard to get into a buffer) or in memory.
Indeed, I haven't been able to find anything internal to GMS that allows this, even if the sound is in the right format. That's why the loop-point thing is on my TODO list but on hold unless something changes. I'm possibly going to try some kind of thing where I cross-fade into a second track (the same music, though) and see how close it can get. If I'm cross-fading it, and since the track's position can be set accurately, I might be able to get it close enough to be viable. It would require the end of the track to have a bit of the first part of the loop tacked on to get the cross-fade to work, but I'll mess with it.

I think the only thing you can do is fill the buffers from your own data. So you would need to do the OGG decoding yourself if that's the format, or use the much easier WAV format, but keep the files external (the Included Files section works) instead of importing them as sound assets.
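For a plain 16-bit stereo WAV in Included Files, it could be as simple as this (sketch; I'm assuming a canonical 44-byte header and 44100 Hz data here, real code should parse the RIFF chunks instead of hard-coding that):

Code:
// Create event (sketch): load the raw file and hand its PCM data to the audio functions
buf_pcm = buffer_load("engine_full.wav");              // placeholder file in Included Files
var header_bytes = 44;                                  // canonical PCM WAV header (assumed)
var data_bytes   = buffer_get_size(buf_pcm) - header_bytes;

// either turn it into a normal playable sound...
snd_from_buf = audio_create_buffer_sound(buf_pcm, buffer_s16, 44100, header_bytes, data_bytes, audio_stereo);

// ...or queue sections of it, as discussed above
// audio_queue_sound(q, buf_pcm, header_bytes, data_bytes);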
 

DaveInDev

Member
I think the only thing you can do is fill the buffers from your own data. So you would need to do the OGG decoding yourself if that's the format, or use the much easier WAV format, but keep the files external (the Included Files section works) instead of importing them as sound assets.
Yes, in the end it's not a problem, because sound assets are already imported files, so I can just keep them in the Included Files directory instead. The only difference is that the user has access to these files. But if I want to "protect" them, I can still store them as raw data, even all in one big sound file, not in any known format, just PCM and possibly lightly "encrypted".
 