As I see it, timers are a pretty valid approach, though maybe not the ones created via timelines or alarms.
You might want to take into account delta_time and/or current track position rather than the number of frames, or else the game might desynchronise.
Now, how to make the timers less tedious: I would generally divide the track into "sections" and try to determine how many "beats" are in each section. For example, I could determine that in the section from 0:34.254352 to 1:15.6245634 there are 96 beats spread uniformly. That means that every time 1/96th of that section passes (about 0.43 seconds here), another "beat" should trigger whatever the game has in store. Some tracks might play around with tempo, so determining timings for these might be harder, but the general idea remains the same: identify the timings of "beats" in the track, then decide what to do with specific beats (e.g. shooting a bullet at the player on each beat in the given section, or maybe moving to another position every fourth beat).
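To make the section idea a bit more concrete, here is a rough sketch in Python (not GML, just to show the arithmetic). The names `Section` and `beats_crossed` are made up for illustration, and I'm assuming you can read the current track position in seconds every frame (in GameMaker, something like audio_sound_get_track_position should give you that). The point is that you compare the previous and current track positions and fire every beat that falls in between, so a dropped frame doesn't swallow a beat.

```python
import math
from dataclasses import dataclass


@dataclass
class Section:
    """A stretch of the track with uniformly spaced beats (hypothetical helper)."""
    start: float  # section start, in seconds of track position
    end: float    # section end, in seconds of track position
    beats: int    # number of beats spread uniformly over the section

    def beat_length(self) -> float:
        # Duration of one beat in seconds.
        return (self.end - self.start) / self.beats


def beats_crossed(section: Section, prev_pos: float, cur_pos: float) -> list:
    """Return indices of beats whose time lies in (prev_pos, cur_pos].

    Beat k happens at section.start + k * beat_length, for k = 0 .. beats - 1.
    """
    length = section.beat_length()
    # Smallest beat index strictly after prev_pos.
    first = math.floor((prev_pos - section.start) / length) + 1
    # Largest beat index at or before cur_pos.
    last = math.floor((cur_pos - section.start) / length)
    # Clamp to the beats that actually belong to this section.
    first = max(first, 0)
    last = min(last, section.beats - 1)
    return list(range(first, last + 1))


# The section from the example above: 96 beats between 0:34.254352 and 1:15.6245634.
example = Section(start=34.254352, end=75.6245634, beats=96)
```

Each frame you would call `beats_crossed(section, previous_position, current_position)` and run your beat logic once per returned index. Because everything is derived from the track position, it doesn't matter how many frames actually passed.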
You still need to decide what happens at which beat, and the more complex the patterns you want to make, the more fine-grained your beat management will need to be. But at least now you think in terms of a sequence of discrete steps, which you can reposition in time if the need arises, rather than wobbly timestamps that are a pain to change, especially if they appear in multiple places.
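Deciding "what happens at which beat" could then be a small table of rules keyed by beat number instead of timestamps. Again a Python sketch with made-up names (`RULES`, `actions_for_beat`); the "shoot" and "move" actions just mirror the examples above and stand in for whatever your game actually does.

```python
# Each rule is (every, offset, action): the action fires on beats
# where beat % every == offset. All names here are hypothetical.
RULES = [
    (1, 0, "shoot"),  # shoot a bullet on every beat
    (4, 0, "move"),   # move to another position every fourth beat
]


def actions_for_beat(beat, rules):
    """Return the names of all actions that should fire on this beat."""
    return [action for every, offset, action in rules if beat % every == offset]


# List what happens on the first eight beats of a section.
for beat in range(8):
    print(beat, actions_for_beat(beat, RULES))
```

If the section's start time or beat count changes later, the rules stay the same, since they only refer to beat numbers.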
Hope this helps.