1 Overview

1.1 Introduction

Grainstorm has four tracks. Each track loops continuously as long as it is active and a sound is loaded.

While the track loops, short snippets (grains) are extracted from the loaded sound, an envelope is applied to them, and they are written to an output buffer.

The READ OFFSET is the position from which the current grain is read. If it moves along the sound at the same velocity at which output is sent to digital-to-analog conversion (SPEED 1.0), and there are no gaps between successively written grains, the resulting sound is the same as the original sound. If the velocities differ (SPEED not equal to 1.0) and overlapping grains are written to the output buffer, the frequency components of the grains interfere in phase periodically in the resulting sound, with a period determined by the DENSITY parameter.
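As a rough illustration of the paragraphs above, the following Python sketch overlap-adds enveloped grains into an output buffer. It is not Grainstorm's implementation: the function names, the Hann envelope and the 50% overlap used in the example are assumptions, chosen so that a speed of 1.0 reproduces the input exactly.

```python
import math

def hann(n):
    """Periodic Hann window; two copies shifted by n/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def granulate(sound, grain_size, hop, speed):
    """Overlap-add grains into an output buffer (hypothetical helper).

    `hop` is the spacing of written grains in samples (sample rate / DENSITY).
    The read offset advances by speed * hop for every new grain.
    """
    env = hann(grain_size)
    out = [0.0] * len(sound)
    read_offset = 0.0
    write_pos = 0
    while write_pos + grain_size <= len(out):
        start = int(read_offset)  # truncate to a whole sample position
        if start + grain_size > len(sound):
            break
        for i in range(grain_size):
            out[write_pos + i] += sound[start + i] * env[i]
        write_pos += hop
        read_offset += speed * hop
    return out
```

With `grain_size=64`, `hop=32` and `speed=1.0` the interior of the output equals the input. With `speed=0.5` the overlapping grains come from different read positions, so their frequency components no longer line up.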

A large part of Grainstorm is about obtaining evolving spectra by modulating parameters that affect how grains are read from the input sound, how they are written to the output buffer and how phases align when grains overlap. Most of these parameters can be modulated by an LFO.

Grainstorm has some effects that manipulate individual grains, both in the time domain and in the spectral domain. The spectral domain effects are known as Phase Vocoder and Cross Synthesis effects. Cross Synthesis blends two sounds into one hybrid sound.

1.2 Application Structure

The four tracks each have three LFOs and three Envelope Followers.

It is possible to synchronize the loop durations of all tracks and LFOs.

Each track has three effect queues, one for the grain by grain- (2), one for the mono- (6) and one for the stereo effects (7). Each effect is appended to the corresponding queue when activated. The last activated effect will be the last one to be processed.

Here is what happens when a track computes sound:

1. Read the grain, apply PRE GAIN -> Pass it through grain by grain effects -> Phase Vocoder if active -> Cross Synthesis if active -> Apply grain envelope -> Write grain to output buffer -> Do the same with the next grain until no more grains are to be processed
2. Mono Effects on output buffer
3. Stereo Effects on output buffer, apply POST GAIN

Note: Stereo effects have been separated from the mono effects for performance reasons.
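The processing order listed above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: a single channel is modelled, effects are plain functions on lists of samples, and all names (`run_queue`, `compute_track`) are hypothetical.

```python
def run_queue(queue, data):
    """Apply effects in activation order: the last activated effect runs last."""
    for effect in queue:
        data = effect(data)
    return data

def compute_track(grains, envelope, grain_fx, mono_fx, stereo_fx,
                  pre_gain=1.0, post_gain=1.0):
    """`grains` is a list of (write_position, samples) pairs."""
    length = max(pos + len(samples) for pos, samples in grains)
    out = [0.0] * length
    for pos, samples in grains:
        grain = [s * pre_gain for s in samples]           # read grain, PRE GAIN
        grain = run_queue(grain_fx, grain)                # grain by grain effects
        # Phase Vocoder and Cross Synthesis would run here if active.
        grain = [s * e for s, e in zip(grain, envelope)]  # grain envelope
        for i, s in enumerate(grain):                     # write to output buffer
            out[pos + i] += s
    out = run_queue(mono_fx, out)                         # mono effects
    out = run_queue(stereo_fx, out)                       # stereo effects
    return [s * post_gain for s in out]                   # POST GAIN
```

Because each queue is processed front to back, swapping the activation order of two effects in the same queue generally changes the result.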

2 Controls

2.1 Navigation

The four tracks can be accessed by the upper horizontal buttons, numbered 1 - 4.

Track-specific synthesis parameters are accessed by the vertical buttons numbered 1 - 7 inside the track. They are arranged in sections in the order in which the tracks read parameters when computing sound.

Sections have subsections which can be browsed by the arrows or entered directly by tapping the text inside the arrows:


[Figure: Sections and subsections of Track 1]

2.2 General Controls

Power button (upper right, outside the track) Turns the audio engine on/off.

Settings button (upper, outside the track) Opens the general settings.

A short double tap on a rotary control or slider resets its value to the initial state.

A tap on the title of a control opens an input field.

A short double tap on the waveform changes the loop mode.

The waveform can be pinch zoomed.

2.3 Track and LFO Specific Controls

Eject button Opens the Android file picker to load a sound. Decoding can be interrupted by pressing the back button on your device.

Mic button Starts and stops microphone recording.

Sync button Activates or deactivates syncing on the specific track, LFO or delay.

Back/forward arrows Change the sync factor.

Settings button (inside the track) Opens some track-specific settings.

Power button Activates or deactivates the track or LFO.

SKIP BACK Jumps to the start of the loop.

STOP Sets the playback speed to zero.

PLAY Sets the playback speed to its last value.

SKIP FORW Jumps to the end of the loop.

REV Inverts the playback direction.

VEL DOWN Slows down by the sync factor.

VEL UP Speeds up by the sync factor.

SYNC Triggers syncing (see 2.5, Syncing).

2.4 Loop Modes

Loop Mode is changed by a short double tap on the waveform.

Three Loop Modes are available:

When the offset position marker is blue, the read offset is set to the beginning of the loop when it passes the end of the loop, and to the end of the loop when it passes the beginning.

When it is orange, looping is in ping-pong mode and the playback direction is reversed when the read offset passes either of the loop boundaries. Additionally, the read direction of the sound data is reversed.

When it is green, only the playback direction is reversed when the read offset passes either of the loop boundaries. The read direction of the sound data is left untouched.

The second mode gives smoother loops when random read or write offset is inactive (and playback speed is 1.0), while the third mode gives smoother loops when it is active.

When TAPE MODE is active and looping is in the third mode (green loop endpoints), bidirectional looping is done.
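The boundary handling of the loop modes can be sketched as below. The function name and its arguments are hypothetical, and the sketch only covers what happens to the read offset and the playback direction. The difference between the orange and the green mode (the read direction of the sound data itself) is not modelled.

```python
def advance(offset, step, loop_start, loop_end, reverse_at_boundary):
    """Move the read offset by `step` and handle the loop boundaries.

    reverse_at_boundary=False models the blue mode (jump to the other end).
    reverse_at_boundary=True models the orange and green modes, in which
    the playback direction flips at a boundary.
    Returns the new (offset, step) pair.
    """
    offset += step
    passed_end = offset > loop_end
    passed_start = offset < loop_start
    if reverse_at_boundary:
        if passed_end:
            return loop_end, -step
        if passed_start:
            return loop_start, -step
    else:
        if passed_end:
            return loop_start, step
        if passed_start:
            return loop_end, step
    return offset, step
```

For a loop from 0 to 10, `advance(9, 2, 0, 10, False)` jumps back to the loop start and keeps the direction, while `advance(9, 2, 0, 10, True)` stays at the loop end and flips the sign of the step.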

2.5 Syncing

SYNC Triggers syncing from the calling track, LFO or delay.

Sync button Activates or deactivates syncing on the specific track, LFO or delay.

Back/forward arrows Change the sync factor.

SYNC modifies the playback speed, LFO rate or delay time of all tracks, LFOs and delays with syncing enabled, so that their loop durations match the loop duration of the calling track, LFO or delay. If the resulting playback speed, LFO rate or delay time would be too small or too large, it is multiplied or divided by the sync factor of the calling track, LFO or delay until it is in the appropriate range.

If STOP, PLAY, VEL DOWN, VEL UP, SKIP or REV is pressed on a track, LFO or delay that has syncing enabled, the same event is sent to all tracks, LFOs and delays that have syncing enabled.
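A minimal sketch of this range correction, assuming a hypothetical allowed range `min_rate` .. `max_rate` (the manual does not state the actual limits) and a unit whose loop lasts `natural_duration` seconds at rate 1.0:

```python
def synced_rate(natural_duration, target_duration, sync_factor,
                min_rate, max_rate):
    """Rate at which a unit's loop lasts as long as the caller's loop,
    moved into the allowed range in steps of the caller's sync factor."""
    if sync_factor <= 1.0:
        raise ValueError("sync_factor must be greater than 1")
    if natural_duration <= 0.0 or target_duration <= 0.0:
        raise ValueError("durations must be positive")
    rate = natural_duration / target_duration
    while rate > max_rate:   # too fast: divide by the sync factor
        rate /= sync_factor
    while rate < min_rate:   # too slow: multiply by the sync factor
        rate *= sync_factor
    return rate
```

With a sync factor of 2 the corrected loop duration stays a power-of-two multiple of the caller's loop duration, so both loops still line up periodically.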

2.6 Sharing of Sound Input between Tracks

By default each track routes to itself what has been imported from a sound file or recorded. The input routing can be changed inside the track settings. DENSITY, SPEED and loop positions are shared, while all other parameters and the effects can be adjusted independently for each track that operates on the same sound.
This may be useful for processing different parts of the spectrum individually, for example by lowpass filtering the signal on one track and highpass filtering it on another, or for layering different pitches.

3 Synthesis Parameters

The following descriptions are in the order in which synthesis parameters appear in the UI from top to bottom for the main sections and left to right for the subsections. The UI is arranged in the order in which parameters are read by the tracks for sound synthesis.

3.1 Granulation 1

GRAINSIZE determines the size of the grain. Range 5ms to 1s.

DENSITY the rate at which new grains are triggered in Hz. Range 1-500 Hz.

ENV CYCLES means envelope cycles. Determines how often the wavetable containing the grain envelope is iterated through when the envelope is applied to the grain.

RND READ adds a random offset to the current read position inside the sampled sound. DEVIATION adds randomness to grain triggering. Both cancel the periodicity of phase interferences of overlapping grains in the outputbuffer. The tradeoff is introduction of noise.

PITCH changes the overall pitch of the resulting sound by simple resampling of individual grains. For a pitch increase of one octave, the grain is read from twice as many samples as its grain size, with every second sample skipped.
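The resampling described for PITCH can be sketched as follows. The function name is hypothetical, and the linear interpolation used for non-integer read positions is an assumption; the manual only describes the octave case, in which every second sample is skipped.

```python
def read_grain(sound, start, grain_size, semitones):
    """Read `grain_size` output samples, stepping through `sound` at the
    pitch ratio 2 ** (semitones / 12), with linear interpolation."""
    ratio = 2.0 ** (semitones / 12.0)
    last = len(sound) - 1
    grain = []
    for i in range(grain_size):
        pos = start + i * ratio
        k = min(int(pos), last)
        frac = pos - int(pos)
        nxt = min(k + 1, last)  # clamp at the end of the sound
        grain.append(sound[k] * (1.0 - frac) + sound[nxt] * frac)
    return grain
```

At +12 semitones the ratio is exactly 2, so a grain of 4 samples is read from positions 0, 2, 4 and 6 of the source.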

GRAINS, SILENCE to skip grains.

Note: When the app is in stereo mode all randomized granulation parameters give a stereo effect, because of different randomized values on each channel.

3.2 Granulation 2

BANDLIMITED ENVELOPE a bandlimited grain envelope is computed. Please also see 3.7, Grain Envelopes.

ENVELOPE ONLY / NO INPUT the grain is not filled with samples and only consists of the grain envelope. Please also see 3.7, Grain Envelopes.

INTEGER ENVCYCLES stands for integer envelope cycles. If deactivated only fractions of the grain envelope may be applied. The UI shows a rounded value, but the real value of ENV CYCLES may be a fraction. Deactivation distorts the sound, but may be interesting when ENV CYCLES is LFO modulated.

3.3 Grain Sequencing

Granulation2 also includes the controls for a simple sequencer which is exposed in Granulation3. It can be used to build sequences of grains each with individual pitch/grainsize/gain. STEPS determines the length of the sequence. VEL UP/DOWN are routed to DENSITY, which is multiplied or divided by sync factor, such that grains are triggered at a higher or lower rate, which makes the sequence run faster or slower. On receiving the SYNC event DENSITY is modified so that the duration of the sequence matches the loop duration of the calling unit and vice versa.

Note: Use the slider at the bottom of Granulation3 to scroll the screen.

3.4 Random Pitch

Adds a random offset to the current pitch. As an example, when SEMITONES is set to 6, MIN to -1 and MAX to 1, the pitch of the current grain is randomly chosen to be one tritone below the current pitch, the current pitch or one tritone above the current pitch.

PITCH MIN and PITCH MAX additionally add a random pitch offset that is not restricted to semitones.
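The SEMITONES example above corresponds to picking a random integer between MIN and MAX and multiplying it by SEMITONES. A small sketch, with a hypothetical function name and Python's `random` module standing in for whatever random source the app uses:

```python
import random

def random_pitch_offset(semitones, min_steps, max_steps, rng=random):
    """Pick an integer multiple of `semitones` between min_steps and
    max_steps (inclusive) as pitch offset for the current grain."""
    return semitones * rng.randint(min_steps, max_steps)
```

With `semitones=6`, `min_steps=-1` and `max_steps=1` the only possible offsets are -6, 0 and +6 semitones.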

3.5 Grain by Grain Effects

RESON is a basic resonant bandpass filter. Filters each grain with randomized center frequency and bandwidth (Q).

RINGMOD Ring modulates each grain with randomized modulation rates.

REVERB A simple grain by grain reverb. State of the reverb is not reset for new grains. Previous grains still resonate, such that different grainsizes, densities and envelopes have a qualitative impact on the reverb.

TAPE MODE When activated the sound is not granulated. Everything granulation related is skipped. SPEED is restricted to 0.1 - 2.0x (ten times slowed down to two times sped up). Lower or higher values are ignored. When SPEED is unequal to 1.0 highest quality resampling is done, which is less expensive at small integer ratios. (SPEED at 0.1, 0.15, 0.2, ... , 1.95, 2.0). FADE can be used to smooth the loop. Please watch the waveform to understand how it works.

Grain by Grain Effects included in Upgrade

BUZZ fills the grain with PARTIALS harmonics of a fundamental with frequency CPS. OFFSET determines from which harmonic to start. BRIGHTNESS determines the loudness of higher harmonics.

MODAL either passes an impulse with loudness PRE GAIN or the grain with loudness IN GAIN through a bank of resonators at frequencies set to the resonances (modes) of some well known sound objects.

VCO fills the grain with the output of up to three bandlimited oscillators. Each oscillator can be detuned in relation to CPS which results in beating. When the app is in stereo mode DETUNELR results in binaural beating. Pulsewidth (PW) can be (LFO) modulated when the oscillator waveform is either PULSE or RAMP. This effect has its own FILTER instance, which can be controlled from the grain by grain FILTER section and is activated with the FILTER switch.

CPS of BUZZ, MODAL and VCO takes its value from PDETECT if FOLLOW is enabled in either of these. PDETECT has to be active.

If HOLD is active in BUZZ, MODAL and VCO the current pitch is held.

FILTER filters each grain with one of several resonant filters, with sweepable CUT/CF on a grain per grain basis. In conjunction with VCO this creates grains in analogue subtractive style.

PDETECT Does a pitch detection on the unprocessed source sound before any of the grain by grain effects is invoked. SMOOTH smooths rapid changes of the detected pitch between successive pitch detections. PRELP is routed to a low pass filter by which the signal is filtered before pitch detection, which may give a more precise result. BOUNDA and BOUNDB can be adjusted to keep the detected pitch in a certain range. TRANSPOSE transposes the detected pitch in octaves.

Note: Pitch detection is done with each new grain and should only be used at low grain densities.

3.6 Phase Vocoder, Cross Synthesis


Before describing each effect in the two categories, here is a short description of the algorithms that underlie both, along with some implementation details.

Phase Vocoder and Cross Synthesis effects manipulate the spectral representation of the grain. The term spectral representation is used instead of spectrum, because all effects should affect its spectrum, but not all do this by manipulation of the spectral representation of the signal. For the same reason the term frequency components is used instead of frequencies.

To transform the grain from its time representation into its spectral representation, a Fast Fourier Transform (FFT) is taken of it. The FFT is limited to grain sizes that are a power of two in samples. Therefore, when PV or Cross Synthesis is active, GRAINSIZE is ignored and the grain size can only be changed with the FFT SIZE parameter, which is given in samples. It has a huge influence on the resulting sound. Larger FFT sizes usually give better sound quality, but are computationally more demanding.

When both PV and Cross Synthesis are active, there is only one FFT size for both, so that the forward FFT and the inverse FFT (the transformation of the signal from its spectral representation back to its time representation) are not performed twice. The output of the PV can thus be passed directly to the input of Cross Synthesis.

Fast Fourier Transform refers to an optimized algorithm for computing the real-to-complex (r2c) discrete Short Time Fourier Transform (STFT). The result of the forward r2c STFT of a grain n samples long is n/2+1 complex numbers. After conversion from rectangular to POLAR FORM, these give the magnitudes and phases of n/2+1 frequency components, increasing in frequency from 0 Hz to samplerate/2 Hz in steps of samplerate/n Hz. The magnitude is how much of the corresponding frequency component is present in the source signal at the point the grain was extracted from, and the phase is the phase of that frequency component. The inverse complex-to-real (c2r) STFT takes the n/2+1 complex numbers computed by the forward STFT and reconstructs the grain.
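To make the n/2+1 components and their frequencies concrete, here is a small Python sketch. It uses a naive discrete Fourier transform instead of an FFT, which gives the same result but is much slower, and all function names are hypothetical.

```python
import cmath
import math

def r2c_dft(grain):
    """Naive real-to-complex DFT: returns n/2+1 complex numbers."""
    n = len(grain)
    return [sum(grain[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n // 2 + 1)]

def polar_form(spectrum):
    """Convert rectangular complex numbers to (magnitude, phase) pairs."""
    return [(abs(c), cmath.phase(c)) for c in spectrum]

def bin_frequencies(n, sample_rate):
    """Frequencies of the n/2+1 components, in steps of sample_rate / n."""
    return [k * sample_rate / n for k in range(n // 2 + 1)]
```

For a grain of 8 samples at a sample rate of 8000 Hz, the five components lie at 0, 1000, 2000, 3000 and 4000 Hz. A cosine that completes exactly one cycle within the grain only has a non-zero magnitude in the second component (1000 Hz).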

Phase Vocoder and Cross Synthesis effects consist in taking the forward STFT, doing mostly simple manipulations of the computed magnitude/phase pairs and then doing the inverse STFT on consecutive grains. The term Phase Vocoder has probably been chosen because the manipulations were done on the phases and the Fourier Transform can be seen as a bank of filters as used by the time domain Vocoder.

Phase Vocoder

PHASE CORRECTION I and II try to eliminate phase interferences at playback speeds unequal to 1.0 in between successive grains by adjusting the phases of the frequency components of the current grain in relation to the phases of the frequency components of the previous grain such that they align continuously.

FORMANT STRETCH computes a spectral envelope, widens or narrows it and reapplies it to the grain.

RANDOM PHASE sets the phase of each frequency component of the grain to a random value. It is a simple manipulation, but we like it. It gives a real texture, if you define texture as having a continuous and a random element.

ZERO PHASE Sets the phase of each frequency component of the grain to zero. Gives a metallic sound.

OSC BANK Only keeps the loudest frequency components of the grain, while setting all others to zero. The phase of the kept frequency components can either be the original phase, zero or continuous.
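The simpler of these manipulations are easy to express on a list of (magnitude, phase) pairs, one pair per frequency component. The following sketch uses hypothetical function names, picks random phases uniformly between -pi and pi as an assumption, and covers only the "original phase" option of OSC BANK.

```python
import math
import random

def random_phase(pairs, rng=random):
    """RANDOM PHASE: keep magnitudes, replace every phase by a random value."""
    return [(mag, rng.uniform(-math.pi, math.pi)) for mag, _ in pairs]

def zero_phase(pairs):
    """ZERO PHASE: keep magnitudes, set every phase to zero."""
    return [(mag, 0.0) for mag, _ in pairs]

def osc_bank(pairs, keep):
    """OSC BANK: keep the `keep` loudest components, zero the magnitudes
    of all others. Phases are left as they are."""
    order = sorted(range(len(pairs)), key=lambda i: pairs[i][0], reverse=True)
    kept = set(order[:keep])
    return [(mag, ph) if i in kept else (0.0, ph)
            for i, (mag, ph) in enumerate(pairs)]
```

After such a manipulation the pairs would be converted back to rectangular form and passed to the inverse transform to obtain the processed grain.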

Cross Synthesis

Cross Synthesis effects blend two sounds into one hybrid sound. This is not to be confused with mixing, which just adds the sounds.

When it is enabled on a track, which we call source, the MODULATOR track, which is the track to blend the source with, does not produce any output by itself and stops computation when it has done its part in providing the current grain in spectral representation, which is then taken by the source to finish computation.

Changing grain size or fft size on the modulator does not have any effect, because fft sizes have to be equal and the fft size of the source is taken.

POLAR FORM does simple interchanges of the magnitudes or phases of source or modulator.

CEPSTRUM shapes the spectrum of the source by the spectral envelope of the modulator.

CEPSTRUM WHITE additionally whitens the source, that is, the spectrum of the source is first flattened by its own spectral envelope before the spectral envelope of the modulator is applied.

INTERPOLATION interpolates between the spectral envelopes of modulator and source. The interpolation parameter can be LFO modulated.

The algorithm for getting the spectral envelope used by these effects is called "Cepstrum Method for Spectral Envelope Estimation". The CUT OFF parameter determines the smoothness of the spectral envelope with lower cutoff giving a smoother spectral envelope.

VOCODER packs the result of the FFT into CHANNELS, with magnitudes averaged and interpolated between source and modulator. Phases are not modified.

Vocoder and cepstrum based algorithms have their time domain counterpart, being the mono effects vocoder for the first and the lpc based vocoder of the pro upgrade for the second.

CONVOLUTION convolves the signals of source and modulator by doing a fast convolution as in the well known convolution reverbs, with the difference that convolution reverbs do not granulate and the impulse response is static and usually longer, while in CONVOLUTION the impulse response (the modulator signal) is time varying and shorter (up to 16384 samples, about 0.4 seconds long).

3.7 Grain Envelopes

Grainstorm has two options to create a grain envelope, ENV1 and ENV2.

ENV2 is to draw an envelope with a shape that resembles an ADSR curve.

ENV1 consists of an inner envelope, which is the main envelope, and an outer envelope by which the inner envelope is shaped. The DEPTH parameter determines how much of the outer envelope is applied to the inner envelope and the CYCLES parameter determines how many times the outer envelope is applied. This way complex, but symmetric waveforms can be created. The available waveforms for outer and inner envelope are common waveforms from sound synthesis and others which are used in granular synthesis that do not have discontinuities. Waveforms that have the prefix "FULL" are bipolar, while waveforms that do not have this prefix are unipolar.

Waveforms that have discontinuities cause distortion. The app can compute bandlimited envelopes (GRANULATION2, BANDLIMITED ENVELOPE), which smooths discontinuities. The computation of bandlimited envelopes is expensive and it is recommended to enable it only when you have an envelope with edges and you want to hear a bandlimited version.

It is possible to disable filling the grain with sampled sound and use only the envelope as content for the grain (GRANULATION2, ENVELOPE ONLY). This option is experimental. It often results in silence, in particular with large grain sizes at low density. For example, if grainsize is set to 1 second and density to 1 Hz, the result is the envelope waveform oscillating at 1 Hz, considerably lower than the 18-20 Hz that humans start to perceive as sound. There may also be a strong DC component, more so with waveforms that are only positive. If the inner envelope is set to a bipolar waveform, grains are less likely to sum up to a signal with only DC.

3.8 LFOs

LFOs with syncing enabled are synced to the loop of the track, such that the duration of one LFO cycle matches the loop duration of the track that calls SYNC. The other direction also works: the loop durations of the tracks with enabled syncing are synced to one LFO cycle by pressing the SYNC button of the LFO.

Each parameter that can be LFO modulated has its own modulation range, which can be adjusted by BOUND A and BOUND B. On application start BOUND A and BOUND B are both set to the same initial value and when the LFO gets activated there will not be any modulation, because the modulation range is zero. Modulation starts when BOUND A and BOUND B have different values.
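One way to picture the modulation range is a linear mapping of the LFO output onto the interval between BOUND A and BOUND B. The sketch below assumes a unipolar LFO value between 0 and 1 and a hypothetical function name; the app's actual mapping may differ.

```python
def modulated_value(lfo_value, bound_a, bound_b):
    """Map a unipolar LFO value in [0, 1] onto the range BOUND A .. BOUND B."""
    return bound_a + (bound_b - bound_a) * lfo_value
```

If both bounds are equal, the range is zero and the result is the same for every LFO value, which is why no modulation is heard right after application start.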

You can choose between four common and a custom waveform.

Note: When an LFO controls a grain parameter at a high rate while DENSITY is low, it may sound as if the LFO were running slowly or not at all. For example, if the LFO rate is 20 Hz and DENSITY is 20 Hz, the LFO phase at the point it is looked up for the corresponding grain parameter is the same for each new grain. DENSITY should be at least 40 Hz in this case for satisfying results (an LFO rate to density ratio of 1:2 or, better, 1:4). In other words: for grain parameters the LFO phase is looked up (sampled) at relatively long intervals, and a grain density whose Nyquist frequency is lower than the LFO rate produces LFO aliasing.
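The aliasing described in this note can be reproduced by listing the LFO phase (as a fraction of one cycle) at the moments new grains are triggered. The function name is hypothetical.

```python
def lfo_phases_at_grains(lfo_rate_hz, density_hz, grain_count):
    """LFO phase (0..1) seen by each of the first `grain_count` grains."""
    return [(lfo_rate_hz * k / density_hz) % 1.0 for k in range(grain_count)]
```

With an LFO rate of 20 Hz and a DENSITY of 20 Hz every grain sees phase 0.0, so the parameter never changes. At a DENSITY of 80 Hz the grains see the phases 0.0, 0.25, 0.5 and 0.75, which traces the LFO waveform at four points per cycle.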

LFO waveform editor

The number of points is changed with SEGMENTS.

There are three options to connect points: polynomial, linear and rectangular.

A short double tap on the editor window adjusts the positions of points according to QUANT.

Waveform editor can be pinch zoomed.


3.9 Envelope Followers

Envelope followers have two modes, PEAK and MS. MS stands for mean square and gives a smoother envelope.

If ADD is active the envelope will be added to the current value of the parameter that is set as target. If SUB is active it will be subtracted.
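As a simplified illustration, the two measures and the ADD/SUB options can be written as below. Computing one value per audio buffer is an assumption made for brevity, and the function names are hypothetical. Envelope followers commonly also smooth the result over time, which is not shown here.

```python
def peak(buffer):
    """PEAK: largest absolute sample value in the buffer."""
    return max(abs(s) for s in buffer)

def mean_square(buffer):
    """MS: average of the squared sample values in the buffer."""
    return sum(s * s for s in buffer) / len(buffer)

def apply_follower(param_value, envelope, mode):
    """Add the envelope to the target parameter (ADD) or subtract it (SUB)."""
    if mode == "ADD":
        return param_value + envelope
    if mode == "SUB":
        return param_value - envelope
    raise ValueError("mode must be 'ADD' or 'SUB'")
```

A single loud sample fully determines the PEAK value, while it only contributes one term to the MS average, which is one reason the MS envelope moves more smoothly.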

When TRACK1 - TRACK4 is set to a track different than the track on which the envelope follower is active, for performance reasons the control signal is the last computed buffer of that track (one buffer delay), otherwise it is what arrives at the effect in.

3.10 Effects

SSB MOD is based on "Single Sideband Modulation". The effect is also called "Frequency Shifter". It is similar to ring modulation, but keeps only one of the two sidebands, so that all frequency components are shifted by the carrier frequency. Carrier frequencies below audio rate result in a phasing effect, but only if some dry signal is mixed with the wet signal.

VOCODER blends the output of two tracks as does Cross Synthesis. It takes as input the computed output buffers. (Granulation is already done at this point). The track whose sound is to be blended with is indicated as MODULATOR. Advantage is taken of multithreading, and the source track waits for the modulator track to precompute its part in the effect. If the modulator is inactive the effect is bypassed.
Unlike Cross Synthesis the modulator track still produces output and its POSTGAIN can be reduced if this is undesired.
In commercial vocoders there is usually an integrated synth which acts as source, while the signal which is routed into the vocoder (often a voice) acts as modulator. Grainstorm allows you to take any two sounds and do some granular manipulations before they are used as source or modulator.

CREVERB Convolution Reverb. Takes as impulse response what is inside the loop of the track from where it is to be loaded. Ten seconds maximum impulse response length.
It should be noted that any sound can be tried as impulse response, not only ambient resonances, but care has to be taken, because the result may be very loud. Starting with a short impulse response or a stream of very short grains at low density as impulse response is good practice.

COMPRESSOR allows sidechain compression. As with the envelope followers, if the track to take the control signal from is different than the track on which the compressor is active, the last computed output buffer of that track is taken as control signal, such that it has a delay of one buffer. Because of multithreading the compressor cannot wait until output is ready on another track.

Effects included in Upgrade

We only mention the uncommon effects. It should be clear what the other effects do.

DELAY and the delays of MULTIDELAY have a HOLD switch. When activated FEEDBACK is set to 1.0 and no more input is inserted into the delaylines, resulting in a loop of what is currently in the delay.
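The effect of HOLD can be shown with a minimal feedback delay line. The class below is a sketch with hypothetical names, not the app's delay, and it outputs only the delayed (wet) signal.

```python
class Delay:
    """Single feedback delay line with a HOLD switch."""

    def __init__(self, length, feedback):
        self.buf = [0.0] * length
        self.pos = 0
        self.feedback = feedback
        self.hold = False

    def tick(self, x):
        """Process one input sample and return the delayed sample."""
        y = self.buf[self.pos]
        if not self.hold:
            # Normal operation: new input plus attenuated feedback.
            self.buf[self.pos] = x + self.feedback * y
        # With HOLD active the stored sample is left unchanged, which equals
        # a feedback of 1.0 with no new input, so the content loops forever.
        self.pos = (self.pos + 1) % len(self.buf)
        return y
```

After feeding the samples 1, 2, 3 into a delay of length 3 and switching `hold` on, the output repeats 1, 2, 3 indefinitely, regardless of the input.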

Delaytime can be synced to trackloops and LFO cycles and vice versa as described above.
SMOOTH determines interpolation time between internal delaybuffers when delaytime is changed.

LPC VOCODER is based on Linear Predictive Coding. It computes a resonant filter from the modulator by lpc estimation and passes the source through that filter. ORDER determines the order of the filter. When WHITE is active the so called residual of the source is taken, which is obtained by lpc analysis, not the source as it is. The internet has further information on lpc.

SPECTRAL FILTER tries to slow down spectral evolution by filtering each of the channels of successive FFTs taken from the signal with special resonant low pass filters. A shape can be drawn with the horizontal axis representing frequency and the vertical axis representing the sharpness of the filters. (A higher curve means the filter smoothes spectral variations more sharply).

SPECTRAL DELAY delays the spectrum. RAMP UP means linearly increasing delay from low to high frequencies, RAMP DOWN means linearly decreasing delay from low to high frequencies and DISPERSION applies a random delay to each frequency. It is not possible to draw an arbitrary delay curve as in SPECTRAL DELAY2, because SPECTRAL DELAY is based on convolution and on impulse responses that are computed from scratch.

SPECTRAL DELAY2 delays channels of successive FFTs taken from the signal. An arbitrary curve can be drawn with the horizontal axis representing frequency and the vertical axis representing delay. If FOLLOW is active, individual FFT channels are delayed according to their magnitude.

PVAMPS Phase Vocoder Amplitudes. Selects the loudest channels of successive FFTs of the incoming signal to drive up to six bandlimited oscillators at the frequencies of the computed loudest channels. SMOOTH smoothes rapid variations of channel indices in between successive FFTs.

SPECTRAL FILTER, SPECTRAL DELAY2 and PVAMPS work on the computed outputbuffer. The granulation part of the app is finished at the point these effects are invoked. As they are Phase Vocoder effects, the signal on which they operate has to be granulated, and they do their own granulation internally. These effects introduce a delay of 256 samples.

FM Six Frequency Modulation units, each consisting of up to six operators. Chords can be built by adjusting SEMITONES for each unit. Their frequency is changed in relation to CPS. DETUNE and FB affect beating. FB is routed to the gain of a low pass filter by which the self modulation signal of some of the operators is filtered. DETUNELR gives binaural beating if the app is in stereo mode. INDEX of modulation can be set as envelope follower target. With FOLLOW enabled, pitch is taken from PDETECT.

PDETECT does the same as the pitch detector from the grain effects, with the difference that it takes as input the outputbuffer after granulation and before the mono effects instead of the unmodified source sound.

4 Miscellaneous

4.1 Audio buffer size, control rate and latency

Control rate depends on audio buffer size, as all control parameters are read before each new audio buffer is computed. The buffer is then sent to the DAC. This means that by reducing the audio buffer size the application gets more responsive, at the cost of slightly increased computational overhead. The smallest possible buffer size differs from device to device, and you have to experiment to find a buffer size that works. On some devices there may be software dependent irregularities in the period at which the DAC requests new buffers, which results in clipping. In such cases larger buffer sizes have to be used.
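Since control parameters are read once per audio buffer, the control rate follows directly from sample rate and buffer size. A small sketch with hypothetical function names:

```python
def control_rate_hz(sample_rate, buffer_size):
    """How often per second control parameters are read."""
    return sample_rate / buffer_size

def buffer_duration_ms(sample_rate, buffer_size):
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * buffer_size / sample_rate
```

At 48000 Hz a buffer of 256 samples gives a control rate of 187.5 Hz and a buffer duration of about 5.3 ms, while a buffer of 1024 samples gives about 46.9 Hz and 21.3 ms.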

4.2 Load

The bottom status bar shows the total time the audio engine has taken to provide new buffers, as a percentage of the total time that was available to provide them. The value is updated every 250 ms. It should never get close to 100%, otherwise audio clipping will occur. As the value gives the sum over 250 ms, there may have been xruns in between even though the value indicates less than 100%.

4.3 Performance

The audio engine is multithreaded, taking advantage of the fact that modern CPUs have several cores across which work can be distributed. Each channel of each track runs in its own thread. If your CPU has several fast cores, full load on all four tracks in stereo may be possible. Load should stay the same regardless of whether one or multiple tracks are active. If you encounter glitches, it may help to deactivate the internet connection and close other applications that run in the background and may temporarily require access to the CPU.

4.4 RAM Usage

In its raw state, when no audio data is loaded, Grainstorm uses about 50 MB of RAM on a cell phone and about 70 MB on a 10 inch tablet. As all audio data is loaded into RAM for fast access, a five minute, 16 bit, 48 kHz stereo file increases these values by about 60 MB. If Android is running low on memory it shuts down applications. If you encounter problems with large audio files (like the app being randomly shut down), close other applications to free memory or switch to mono mode. If nothing helps, you will have to use shorter audio files.
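The figure of about 60 MB can be checked with a short calculation. The sketch assumes that the decoded audio is held in RAM at the bit depth of the file; the function name is hypothetical.

```python
def pcm_bytes(seconds, sample_rate, channels, bits_per_sample):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * sample_rate * channels * (bits_per_sample // 8)
```

Five minutes of 16 bit, 48 kHz stereo audio come to 300 * 48000 * 2 * 2 = 57,600,000 bytes, which is roughly 60 MB. The same file in mono needs half of that.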

4.5 MIDI

If you use a physical controller, connect it from the Main menu under "MIDI Connection" or for virtual controllers/software choose Grainstorm as MIDI Receiver from their preferences.

Learning mode is entered by pressing the small rectangle at the bottom of the left sidebar. Currently only "Note On" events for the buttons and "Control Change" events for rotary controls/sliders are supported. Assignable controls change color when learning mode is active. When you touch any of these, a pop up window opens. When you move a slider or press a button on your controller, or when any MIDI event is received, the window should indicate the channel and control value received. Press commit to assign.

Some controllers send two "Note On" events, one when a button gets pressed and one when a button gets released. There is an optional switch in the Settings Menu to handle these controllers.

If you own the upgrade, the LFOs can be synced to an external MIDI clock signal. If this is active on an LFO, its phase is no longer updated internally. Instead it is updated when the app receives MIDI clock events. The phase advance can be modified with the "MIDI x" parameter. For example, if your controller is set to 120 BPM and this parameter is set to 0.1, the LFO will complete 12 cycles in one minute.
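The example above is consistent with one LFO cycle per quarter note when "MIDI x" is 1.0. Under that assumption, and using the 24 clock messages per quarter note defined by the MIDI standard, the phase advance per clock event can be sketched as follows (function names are hypothetical):

```python
MIDI_CLOCKS_PER_QUARTER_NOTE = 24  # fixed by the MIDI standard

def phase_advance_per_clock(midi_x):
    """Fraction of an LFO cycle added per received MIDI clock event,
    assuming one cycle per quarter note at midi_x = 1.0."""
    return midi_x / MIDI_CLOCKS_PER_QUARTER_NOTE

def lfo_cycles_per_minute(bpm, midi_x):
    """LFO cycles completed per minute at the given tempo."""
    clocks_per_minute = bpm * MIDI_CLOCKS_PER_QUARTER_NOTE
    return clocks_per_minute * phase_advance_per_clock(midi_x)
```

At 120 BPM and a "MIDI x" value of 0.1 this gives 12 cycles per minute, matching the example.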