Concept

Voice

One playback instance. What sound.play() returns.

TL;DR

A Voice is a single, live playback. It owns its own gain stage and source node, fires lifecycle cues, can be faded or stopped at any time, and tears itself down on natural end. You don't construct it; you receive it from sound.play().

API surface

Voice — public interface ts
class Voice {
  readonly id: number;
  readonly priority: number;
  readonly bus: string | undefined;
  readonly sourceName: string | undefined; // name of the Sound that spawned it
  readonly startedAt: number;              // engine.now at spawn
  readonly playbackRate: number;           // current rate
  readonly isPaused: boolean;
  readonly spatializer: Spatializer | undefined;

  readonly ended: Promise<void>;           // resolves on natural end, stop, or abort

  fade(opts: { to: number; duration: number; curve?: FadeCurve }): Promise<void>;
  pause(): void;                           // captures offset
  resume(): void;                          // resumes from captured offset
  setPlaybackRate(rate: number, opts?: { duration?: number; curve?: FadeCurve }): void;
  stop(opts?: { fade?: number }): void;     // ~8 ms click-free fade by default
  cues(): AsyncIterableIterator<'started' | 'paused' | 'resumed' | 'ended'>;
  level(): { rms: number; peak: number };   // live amplitude readout (lazy analyser tap)
}

Recipes

Random pitch + volume jitter

ts ts
// Stacked SFX with variation so they don't sound robotic.
for (let i = 0; i < 6; i++) {
  engine.sound('hit').play({
    pitch:  { jitter: 0.08 },        // ±8% playback rate
    volume: { jitter: 0.05 },        // ±5% gain
  });
}
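As a rough illustration of what the jitter options mean numerically, the sketch below treats jitter as a uniform, symmetric, multiplicative spread around the base value. That distribution is an assumption made for illustration; the engine may shape its randomness differently. The helper name applyJitter is hypothetical and not part of the engine.

```typescript
// Hypothetical helper, not part of the engine API.
// Assumption: jitter is a uniform multiplicative spread in [-jitter, +jitter).
function applyJitter(
  base: number,
  jitter: number,
  random: () => number = Math.random,
): number {
  const spread = (random() * 2 - 1) * jitter; // map [0, 1) onto [-jitter, +jitter)
  return base * (1 + spread);
}

// With jitter 0.08, a base playback rate of 1 lands somewhere in [0.92, 1.08).
const jitteredRate = applyJitter(1, 0.08);
// With jitter 0.05, a base gain of 0.7 lands somewhere in [0.665, 0.735).
const jitteredGain = applyJitter(0.7, 0.05);
```

Passing the random source as a parameter keeps the helper deterministic under test.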

Fade-in on start

The fadeIn option on play() is the dual of the click-free stop fade: the voice ramps from 0 to its target volume over the configured window. It is a convenience for ambient layers that should drift in instead of popping in.

ts ts
// "Drop in smoothly" — voice ramps from 0 → volume over fadeIn seconds.
// The dual of the click-free stop fade.
engine.sound('ambience').play({ loop: true, volume: 0.7, fadeIn: 0.5 });
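For readers curious how such a ramp can be scheduled with plain Web Audio, here is a minimal sketch. It assumes a linear ramp on the voice's gain parameter; the curve the engine actually uses is not stated here. FadeInParam and scheduleFadeIn are hypothetical names; the two methods on the interface are the standard AudioParam automation calls, so a real GainNode.gain satisfies it.

```typescript
// Structural subset of the Web Audio AudioParam (a real GainNode.gain fits).
interface FadeInParam {
  setValueAtTime(value: number, startTime: number): unknown;
  linearRampToValueAtTime(value: number, endTime: number): unknown;
}

// Hypothetical helper: ramp gain from 0 to `volume` over `fadeIn` seconds.
// Assumption: a linear curve; the engine may use a different shape.
function scheduleFadeIn(
  gain: FadeInParam,
  now: number,
  volume: number,
  fadeIn: number,
): void {
  if (fadeIn <= 0) {
    gain.setValueAtTime(volume, now); // no fade requested: jump straight to volume
    return;
  }
  gain.setValueAtTime(0, now); // anchor the ramp at silence
  gain.linearRampToValueAtTime(volume, now + fadeIn);
}
```

In a browser this would be called as scheduleFadeIn(gainNode.gain, ctx.currentTime, 0.7, 0.5) just before starting the source.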

Async iterator for cues

ts ts
const v = engine.sound('intro').play();
for await (const cue of v.cues()) {
  if (cue === 'started') analytics.send('intro:start');
  if (cue === 'ended')   ui.advance();
}

AbortSignal cancellation

ts ts
const ac = new AbortController();
const v = engine.sound('alert').play({ signal: ac.signal });

// Anywhere — close a modal, route change, etc.
ac.abort();
await v.ended;                       // resolves immediately
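The wiring between an AbortSignal and a stop call can be sketched as follows. This is an illustration of the general pattern, not the engine's internals; stopOnAbort and AbortSignalLike are hypothetical names. The interface is a structural subset of the standard AbortSignal.

```typescript
// Structural subset of the standard AbortSignal.
interface AbortSignalLike {
  readonly aborted: boolean;
  addEventListener(
    type: 'abort',
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Hypothetical helper: call `stop` as soon as the signal aborts.
function stopOnAbort(signal: AbortSignalLike, stop: () => void): void {
  if (signal.aborted) {
    stop(); // already aborted before playback was wired up
    return;
  }
  // `once` lets the listener be released after it fires.
  signal.addEventListener('abort', () => stop(), { once: true });
}
```

Checking signal.aborted first matters because an already-aborted signal never fires its abort event again.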

Pause / resume on blur

pause() tears down the current source node and remembers the sample offset. resume() rebuilds the source and starts from that offset — for looping voices this round-trips through the loop region.

ts ts
// Pause-on-blur: keep the voice alive across menu/modal transitions.
const v = engine.sound('intro').play({ loop: true });

window.addEventListener('blur', () => v.pause());
window.addEventListener('focus', () => v.resume());

// resume() picks up at the offset captured on pause().
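To make the offset capture concrete, the following sketch computes where in the buffer a voice would be at pause time. It assumes playback began at buffer offset 0 and ran at a constant rate since startedAt; the engine's real bookkeeping (rate ramps, earlier pauses) is not documented here. capturedOffset and LoopRegion are hypothetical names.

```typescript
interface LoopRegion {
  loopStart: number; // seconds into the buffer
  loopEnd: number;   // seconds into the buffer
}

// Hypothetical helper: buffer position (seconds) at time `now`.
// Assumptions: playback started at offset 0 and the rate never changed.
function capturedOffset(
  startedAt: number,
  now: number,
  rate: number,
  duration: number,
  loop?: LoopRegion,
): number {
  const played = (now - startedAt) * rate; // buffer seconds consumed so far
  if (!loop) return Math.min(played, duration);
  if (played < loop.loopEnd) return played; // still on the first pass
  const span = loop.loopEnd - loop.loopStart;
  // Past the first loopEnd: wrap the overshoot back into the loop region.
  return loop.loopStart + ((played - loop.loopStart) % span);
}
```

For a looping voice the wrapped value always falls inside [loopStart, loopEnd), which is the offset a rebuilt source would need in order to continue from the same point in the loop.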

Loop crossfade (off by default)

AudioBufferSourceNode's native loop is a hard cut from loopEnd back to loopStart. If those points don't sit on a zero crossing, you get a click every loop. Setting loopCrossfade spawns a parallel buffer source one crossfade-window before the boundary and equal-power-ramps between them — same trick a sample editor would offer.

ts ts
// Off by default. Set loopCrossfade > 0 to splice an equal-power
// crossfade at the loop boundary — masks the click from non-zero-crossing
// loop regions without needing to re-edit the asset.
const v = engine.sound('music-bed').play({
  loop: true,
  loopStart: 0.04,        // skip a fade-in
  loopEnd: 31.96,         // and fade-out
  loopCrossfade: 0.05,    // 50 ms equal-power overlap
});

// Cost: one extra AudioBufferSourceNode + GainNode per loop iteration.
// Ignored when loop is false, or when the region is shorter than 2× the
// crossfade window (silent fallback to native hard-cut loop).
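The two ideas in this recipe, equal-power gain curves and the 2× fallback rule, can be sketched as pure functions. The cosine and sine pair below is the common textbook equal-power curve; whether the engine uses exactly this pair is an assumption. Both function names are hypothetical.

```typescript
// Equal-power crossfade gains for progress t in [0, 1].
// The squares of the two gains always sum to 1, so perceived power stays level.
function equalPowerGains(t: number): { fadeOut: number; fadeIn: number } {
  const x = Math.min(1, Math.max(0, t)); // clamp progress to [0, 1]
  return {
    fadeOut: Math.cos((x * Math.PI) / 2), // outgoing source: 1 → 0
    fadeIn: Math.sin((x * Math.PI) / 2),  // incoming source: 0 → 1
  };
}

// Mirrors the fallback rule stated above: the crossfade is skipped when the
// window is not positive or the loop region is shorter than 2× the window.
function crossfadeApplies(
  loopStart: number,
  loopEnd: number,
  loopCrossfade: number,
): boolean {
  return loopCrossfade > 0 && loopEnd - loopStart >= 2 * loopCrossfade;
}
```

A linear crossfade would dip in loudness at the midpoint, where both gains are 0.5; the equal-power pair sits at about 0.707 each there.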

Live playback-rate ramp

setPlaybackRate automates the underlying playbackRate AudioParam. Pitch and tempo move together — for pitch-preserving stretch see pitch & time-stretch.

ts ts
// Slow-mo sting on a boss intro — ramp the rate over 800 ms.
const v = engine.sound('boss-stinger').play();
v.setPlaybackRate(0.6, { duration: 0.8, curve: 'easeOut' });
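Because pitch and tempo are coupled, it can help to know how far a given rate shifts the pitch. The conversion is the standard equal-temperament relation, 12 semitones per doubling of rate; the helper names are hypothetical.

```typescript
// Pitch shift in semitones produced by a playback rate (1 = unchanged).
function rateToSemitones(rate: number): number {
  return 12 * Math.log2(rate);
}

// Inverse: playback rate needed for a given shift in semitones.
function semitonesToRate(semitones: number): number {
  return Math.pow(2, semitones / 12);
}

// A rate of 0.6 lowers the pitch by roughly 8.8 semitones and stretches
// the sound to about 1.67× its original length (1 / 0.6).
const slowMoShift = rateToSemitones(0.6);
```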

Per-voice level readout

voice.level() returns { rms, peak } as linear values in [0..1]. The first call lazily attaches an AnalyserNode to the voice's gain stage — no cost when nothing reads. Powers the 'quietest' voice-stealing strategy and any UI that needs per-voice meters; see the rhythm-metronome example for a runnable demo.

ts ts
// Per-voice peak meter — useful for "show the loudest voice" UI,
// or for driving custom voice-stealing rules beyond the built-in strategies.
function tick() {
  for (const v of engine.activeVoices()) {
    const lv = v.level();
    if (lv.peak > 0.95) flashClipIndicator(v.id);
  }
  requestAnimationFrame(tick);
}
requestAnimationFrame(tick);
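For reference, rms and peak over a block of time-domain samples can be computed as below. This is a sketch of the standard definitions, assuming samples in the range -1 to 1 such as those an AnalyserNode writes via getFloatTimeDomainData; how the engine windows or smooths its readout is not specified here. measureLevel is a hypothetical name.

```typescript
// Hypothetical helper: linear rms and peak of a block of samples.
function measureLevel(samples: Float32Array): { rms: number; peak: number } {
  if (samples.length === 0) return { rms: 0, peak: 0 };
  let sumSquares = 0;
  let peak = 0;
  samples.forEach((s) => {
    sumSquares += s * s;
    const magnitude = Math.abs(s);
    if (magnitude > peak) peak = magnitude; // largest absolute sample
  });
  // Root of the mean of the squares.
  return { rms: Math.sqrt(sumSquares / samples.length), peak };
}
```

Peak reacts to single samples, which suits clip indicators; rms tracks average energy, which is closer to perceived loudness and a better basis for a "quietest voice" comparison.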

Live spatial control

The Spatializer is exposed on the Voice. Hold the reference and call setPan or setPosition from your render loop. The property is typed Spatializer | undefined, so guard before calling.

ts ts
// Hold the voice and steer the spatializer as the source moves.
const v = engine.sound('engine').play({
  loop: true,
  spatializer: { position: [0, 0, 0] },
});

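let alive = true;
v.ended.then(() => { alive = false; });   // stop steering once the voice is gone

requestAnimationFrame(function tick() {
  if (!alive) return;                      // don't keep the loop (or the ref) past ended
  v.spatializer?.setPosition(player.x, 0, player.z);
  requestAnimationFrame(tick);
});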

Stop semantics

voice.stop() applies a short click-free fade-out before the source node actually stops. Web Audio cuts buffers mid-waveform on source.stop(), which produces an audible click whenever the cut does not land on a zero crossing. The default ~8 ms ramp on the gain stage suppresses that without being audible as a fade. Pass { fade: 0 } for a sample-accurate hard cut, or set voice.stopFade on the engine to override the default. The same behaviour applies to every stop path: explicit stop(), AbortSignal aborts, region-timer expiry, and concurrency-driven stealing.
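The ramp-then-stop sequence described above can be sketched against plain Web Audio primitives. This is an illustration under assumptions, not the engine's code: it uses a linear ramp, and clickFreeStop, StopGainParam, and StoppableSource are hypothetical names. The interfaces are structural subsets of AudioParam and AudioBufferSourceNode.

```typescript
// Structural subset of AudioParam (a real GainNode.gain fits).
interface StopGainParam {
  value: number;
  cancelScheduledValues(cancelTime: number): unknown;
  setValueAtTime(value: number, startTime: number): unknown;
  linearRampToValueAtTime(value: number, endTime: number): unknown;
}

// Structural subset of AudioBufferSourceNode.
interface StoppableSource {
  stop(when?: number): void;
}

// Hypothetical helper: fade the gain to 0, then stop the source at ramp end.
// Assumption: linear ramp; 0.008 s mirrors the ~8 ms default described above.
function clickFreeStop(
  gain: StopGainParam,
  source: StoppableSource,
  now: number,
  fade: number = 0.008,
): void {
  if (fade <= 0) {
    source.stop(now); // hard cut, no ramp
    return;
  }
  gain.cancelScheduledValues(now);      // drop any pending automation
  gain.setValueAtTime(gain.value, now); // anchor the ramp at the current level
  gain.linearRampToValueAtTime(0, now + fade);
  source.stop(now + fade);              // stop only once the gain has reached 0
}
```

Scheduling source.stop() for the end of the ramp, instead of calling it immediately, is what keeps the cut from landing mid-waveform at full level.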

Pitfalls

Don't hold a Voice ref past ended.
Once ended resolves, the source/gain are disconnected. Calling stop() is a no-op; fade() resolves immediately.
pause() rebuilds the source on resume().
The Web Audio API doesn't expose pause on a buffer source, so we stop the source and re-spawn it at the captured offset. For very tight cue chains, prefer fade({ to: 0, duration }) over pause().
