OK...not long back, I did an explanation of what a synthesizer is made up of in the course of a series of posts. So rather than sorting around to find that, I figured it might make sense to do a better essay on the concept. Ergo...
Synthesizers, for as long as they've been around, really only consist of four 'parts'. In fact, you could extend this concept to even some of the early electronic instruments through a little bit of conceptual stretching.
Those parts? 'Generators', 'Modifiers', 'Controllers', and 'Processors'. Now, yes, in a few cases there's devices that overlap a couple of categories, but by and large everything in a synthesizer fits into these basic types. So, what this essay is about is explaining a few things about these four parts, why they have to be there, and how to use them effectively.
'Generators'
Anything that creates an initial audio signal goes into this category. Obviously, modules such as oscillators fit here, but so do noise sources, samplers, dedicated modules like drums or drone modules, and various exotic widgets like physical modelers and such. If you get some sort of sound from it, it fits here.
Now, one thing that people neglect is that, in order to really make these sources cook, certain ones need doubling, in particular simpler VCOs. This is because when you double a sound, you bring a fairly complicated set of circumstances into play, all of which relates to a desirable level of imperfection...with the end-result being described by a familiar term: 'chorusing'.
Usually, when we talk about chorusing, we're going to be discussing the electronic effect, which isn't quite the same thing. In that case, a sound goes into a circuit where there's a bypass circuit (the 'dry' channel) and a very short modulated delay (the 'wet' channel). The delay for this purpose tends to be too short for us to hear it as a typical delay effect, but when its time is modulated in various amounts and varying frequencies, we seem to hear a 'thicker' sound. Why this happens is because the modulated delay creates the necessary imperfection to the sound.
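To make the 'dry plus modulated delay' idea a bit more concrete, here's a rough sketch of a chorus in Python. Everything about it is an assumption for illustration only: the function name, the 15 ms base delay, the 3 ms modulation depth, the 0.8 Hz modulation rate, and the 50/50 mix are not taken from any particular unit.

```python
import math

def chorus(dry, sample_rate=48000, base_delay_ms=15.0, depth_ms=3.0,
           rate_hz=0.8, mix=0.5):
    """Blend a signal with a copy read through a short, LFO-modulated delay.

    All default values are illustrative assumptions, not measurements.
    """
    out = []
    for n, x in enumerate(dry):
        # The delay time wobbles slowly around the base delay (the 'wet' path).
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        delay_ms = max(base_delay_ms + depth_ms * lfo, 0.0)
        pos = n - delay_ms * sample_rate / 1000.0
        if pos < 0:
            wet = 0.0  # nothing has reached the end of the delay line yet
        else:
            i = int(pos)
            frac = pos - i
            nxt = dry[i + 1] if i + 1 < len(dry) else dry[i]
            # Linear interpolation between samples for fractional delay times.
            wet = (1.0 - frac) * dry[i] + frac * nxt
        # 'dry' channel plus 'wet' channel.
        out.append((1.0 - mix) * x + mix * wet)
    return out
```

With `mix=0.0` you only get the bypass path back, which is a quick way to check that the thickening really comes from the modulated copy.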
But in the case of multiple instruments or voices, the imperfection arises from the fact that these sources are never identical from one to the other, nor can they be played precisely the same. That is, in fact, where the term originates; in vocal music, having a single voice or a scant few voices on a part doesn't sound the same as what results when you have, say, a chorus of 20-30 voices on the same part. Certainly, it's not a case of increasing the volume, since the aim of the conductor is to maintain the dynamic level that a given score calls for. What actually gets increased is a certain indeterminacy; no one's attack is precisely identical, different voices have slightly different timbral spectra, infinitesimal mistunings always happen, and so on, and none of this is ever 100% predictable. Anything with a simple waveform complement, simple transient complement, and the like works the same.
Like VCOs, for instance. When you read accounts of early synth designers, you always find them musing on what made their synth so 'musical', and invariably they wind up talking about tiny imperfections...component mismatches, design compromises and so on...that they pin down as the reason. And there's a lot of truth to this; back when the Minimoog was still in its initial production run up into the early 1980s, devotees of these synths discussed how certain serial number runs sounded 'better' than others (in fact, they still talk about this!) and a lot of that came down to tiny 'mistakes' that, in the end-analysis, weren't all that 'mistaken' after all!
But even if you make these things to precise tolerances (such as the Curtis and SSM chipsets), you still have to contend with 'operator error', from which you can still get plenty of accidental (or deliberate) misadjustments that result in that same voice-doubling outcome. And this is why, if you have one VCO in a build, having two or three...even of the same model...is even better. We all have heard that super-fat Moog bass sound (you know the one) that you get from a tiny amount of detuning of one VCO against the others...or more recently, the Roland 'Supersaw', which is a circuit that reliably emulates the 'problem' that would be the actual cause of that huge, sweeping sound.
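Here's a small sketch of what that 'tiny amount of detuning' amounts to numerically. The helper names, the naive (non-band-limited) sawtooth, and the 7-cent figure are my own assumptions, picked only to show the slow beating you get when two nearly identical oscillators are summed.

```python
def detune(freq_hz, cents):
    """Shift a frequency by a number of cents (100 cents = 1 semitone)."""
    return freq_hz * 2 ** (cents / 1200)

def naive_saw(freq_hz, t):
    """Naive sawtooth in the range [-1, 1); good enough for illustration."""
    phase = (freq_hz * t) % 1.0
    return 2.0 * phase - 1.0

def doubled_voice(freq_hz, cents, num_samples, sample_rate=48000):
    """Sum one oscillator with a slightly detuned twin."""
    f2 = detune(freq_hz, cents)
    return [0.5 * (naive_saw(freq_hz, n / sample_rate) +
                   naive_saw(f2, n / sample_rate))
            for n in range(num_samples)]

# Two fundamentals 7 cents apart at 110 Hz drift against each other
# a little less than once every two seconds.
beat_hz = detune(110.0, 7) - 110.0
```

The fundamentals beat at `beat_hz`, and each pair of harmonics beats at a multiple of that, which is a big part of why the result sounds like it's constantly moving.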
But note: this will not work with anything that has a complex and constantly-changing timbral component, numerous transient elements, etc. You can't 'chorus' noise, for example, in this way, because noise consists of differently-weighted spectra in a constant, rapid state of change and, as a result, there's nothing there to 'line up' so that proper doubling can happen. Or a sample, because there's too much going on altogether to get a cohesive doubled result. No, in those cases you actually should be using a time-based chorus effect to achieve the desired doubling result by using the modulated delay to cause the sound to act against a 1:1 copy of itself with a tiny time offset.
One more point: generators that output most anything beyond noise (as well as a few noise generators, in fact) have several ways to be controlled. Either a control voltage at a steady voltage level is used for this, or control voltages of changing levels of various sorts, the latter being what we refer to as 'modulation'. In generators, this tends to be something related to pitch, but can also involve synchronization of waveform start-points and, in the case of a number of more elaborate sound sources, the actual spectra of the generator itself. Plus, with samplers, dedicated drum modules, and the like, you also have the on-off digital gates and triggers that make the sound itself start (and/or stop). Even modulating one generator with another at audio frequencies is fair game and, actually, that method of cross-modulative synthesis is a big part of the 'West Coast' sound as pioneered by Don Buchla all the way back at the beginning of synthesizers as we know them.
'Modifiers'
Now that you have that audio signal, you're going to want to screw around with it. And anything that alters the different parts of an audio signal fits into this category. Even something as dirt-simple as a ring modulator, which has been around for decades and actually originates in radio technology from decades prior to the creation of electronic music, is a modifier. In this circuit's case, two signals get combined to generate a set of 'sum' and 'difference' frequencies derived from the sounds' fundamentals and harmonics. And yeah, this sounds reeeeeally modified!
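The 'sum' and 'difference' business is easiest to see with two plain sine waves. This is an idealized sketch (a straight multiplication, which is the textbook model of a ring modulator and not a model of any real diode-ring circuit); the function names are made up for the example.

```python
import math

def sine(freq_hz, num_samples, sample_rate=48000):
    """Generate a sine wave as a list of samples."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

def ring_mod(carrier, modulator):
    """Idealized ring modulation: multiply the two signals sample by sample."""
    return [c * m for c, m in zip(carrier, modulator)]

def sidebands(f1_hz, f2_hz):
    """For two sine inputs, the output holds only these two frequencies."""
    return (abs(f1_hz - f2_hz), f1_hz + f2_hz)

# 440 Hz against 100 Hz leaves 340 Hz and 540 Hz; neither input survives.
low_hz, high_hz = sidebands(440, 100)
```

Feed it anything richer than a sine and every harmonic of one input does this against every harmonic of the other, hence the clangorous result.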
Then there's filters and waveshapers, which are two sides of the same coin. Waveshaping involves all sorts of methods of altering the incoming waveform; since the harmonic content of a waveform determines its waveshape and, hence, its sound, the methods of altering the shape of the waveform tend to increase or restructure the harmonic content we hear. Folding the waveform creates various types of timbral shifts, or you can use various methods of 'degrading' the purer waveform to create clipping, waveform stepping, and so on which usually result in distortive timbral changes. On the other hand, filters work by removing parts of the incoming waveform, often also increasing the amplitude of a certain harmonic or set of harmonics in that signal by electronically forcing the filter into 'resonance' at a given frequency and by a given amount. But filters do the inverse of waveshapers, and are key to 'subtractive' synthesis, or what we tend to term 'East Coast'. And the 'West Coast' method tended to emphasize waveshaping, of course, since Buchla et al.'s methods of synthesis were based on building up very complex spectra and then 'gating' these without resonance playing much of a part.
Speaking of which, that main West Coast device is known as the 'timbral gate'. With these, you classically have a voltage-controlled amplifier and a non-resonant filter (low-pass, as a rule) controlled in tandem. With this strange modifier, the amplitude AND timbre fall under the same control signal; the idea here is that this would emulate the decay of a sound if it were produced by a physical instrument. In physical devices, as the overall amplitude of a sound diminishes, so does the higher harmonic content along a similar decay curve. Don Buchla's idea here was to create a way to electronically mimic that behavior and to make his timbral gates have a somewhat-familiar sort of sonic behavior; to this day, people still refer to vactrol-based lowpass gates as having a certain 'woody' sound, like tuned percussion, or describe them as having a classic 'plook'-type character to their behavior on incoming sounds.
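As a rough sketch of the 'one control signal, two jobs' idea: below, a single control value (0 to 1) sets both the gain and the cutoff of a non-resonant one-pole low-pass filter. This is only an assumption-laden toy; it makes no attempt to model a vactrol's sluggish response, and the cutoff range (40 Hz to 8 kHz) is invented for the example.

```python
import math

def lowpass_gate(audio, control, sample_rate=48000,
                 min_cutoff_hz=40.0, max_cutoff_hz=8000.0):
    """Toy lowpass gate: one control value drives amplitude AND brightness.

    The cutoff range and the one-pole filter are illustrative assumptions.
    """
    out = []
    state = 0.0
    for x, cv in zip(audio, control):
        cv = min(max(cv, 0.0), 1.0)
        # Same control value, job 1: sweep the cutoff exponentially.
        cutoff = min_cutoff_hz * (max_cutoff_hz / min_cutoff_hz) ** cv
        coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        state += coeff * (x - state)   # non-resonant one-pole low-pass
        # Same control value, job 2: scale the amplitude.
        out.append(cv * state)
    return out
```

Drive `control` with a decaying envelope and the sound gets duller as it gets quieter, which is the physical-instrument behavior described above.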
And about VCAs...yes, these are also modifiers. But instead of altering timbre, they alter amplitude. In a way, you could view them as 'level filters'...controlling the amplitude of an incoming signal according to a certain control signal, in much the same way as a filter controls the passage of a signal's frequency bandwidth according to the control over its cutoff frequency. In fact, both VCFs and VCAs are the prime 'customers' for what envelope generators output as their control signals, and LFO modulation of a VCA changes amplitude in the same way as timbre changes when a VCF is modulated in the same way...or, just as well, VCO frequencies (and so forth) from the selfsame LFO (or envelope generator). This is also what makes VCAs invaluable for modifying control signal amplitudes, such as LFO amplitudes so that vibrato or tremolo modulation signals can build or drop in intensity when passed through a VCA controlled by another LFO or EG.
But it's important to remember that there are two distinct types of VCAs, and you really can't use one in place of the other!
Linear VCAs are optimal for controlling the amplitude of control voltages, such as modulation signals from an LFO or the height of an envelope. These VCAs treat their control signals in a linear fashion: if you want the throughput amplitude to increase by 10%, just feed 10% more voltage to the VCA's control input. This tends to make sense when you want a 1:1 degree of modulation signal control. And since these are more optimal for control signal modification, most linear VCAs are also DC-coupled, meaning that they can pass signals whose frequencies extend all the way down to DC, as well as anything else of a lower frequency than audio. But these can also be used with audio, especially for basic mixing processes before signals reach the final processing stage.
For that stage, you have to have exponential VCAs. These tend to react to control signals in a 'law of squares'-type of manner; the resulting curve of amplification is shaped exponentially, hence the name. Now, why these are a must-have for the final parts of the signal chain has to do with how we perceive loudness. Our hearing processes are set up so that we also perceive changes in apparent volume, or loudness, as an exponential psychoacoustic function. So when an envelope decays that's controlling an exponential VCA, the passthrough signal's level appears to our ears to change in volume in a 'more correct' manner. Loud sounds are clearly loud, while soft sounds are clearly soft, and so forth. Yes, you can also use a linear VCA there...but if you do, then you have to use an exponential control source to get it to behave in the same way. Otherwise, output sounds passed through a linear VCA, controlled with a linear EG, lack a lot of 'punch' and the end-result is that your synthesizer sounds...well, pretty lame, without a lot of loudness differentiation to the listener's perspective. Because of this important usage, exponential VCAs tend to NOT pass DC or much of anything below 1-2 Hz, because DC in an audio signal results in an annoying issue known as 'DC offset'. This issue can damage amps, speakers, give false level readings when recording, and so on, and it's very much NOT desirable. Note also that this 'DC offset' is not the same as a 'DC offset voltage' coming from some sort of control module. In THAT case, you want that extra DC amount to define a certain level or tuning or whatever. But outside of the synth...nuh-uh. Not good. Also, this is why a goodly number of output modules incorporate some sort of DC isolation, to prevent stray DC from escaping into your audio chain outside of the synth. So, make sure your VCAs are exponential and AC-coupled (usually, they are) if they're going to be at the very end of your synth's signal chain!
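To put numbers on the difference between the two VCA types, here's a minimal sketch. The control range of 0 to 1 and the 60 dB span of the exponential response are assumptions chosen for the example, not the specs of any real module.

```python
def clamp01(cv):
    """Keep a control value inside the assumed 0..1 range."""
    return min(max(cv, 0.0), 1.0)

def linear_vca(signal, cv):
    """Gain rises in direct proportion to the control value."""
    return signal * clamp01(cv)

def exponential_vca(signal, cv, range_db=60.0):
    """Equal steps of control give equal steps in decibels.

    range_db is an assumed span; cv=1 is unity gain, cv=0 is -range_db.
    """
    gain_db = (clamp01(cv) - 1.0) * range_db
    return signal * 10 ** (gain_db / 20.0)

# At half control the linear VCA is at half amplitude (about -6 dB),
# while the exponential one is already 30 dB down.
half_linear = linear_vca(1.0, 0.5)
half_expo = exponential_vca(1.0, 0.5)
```

A linearly falling envelope into `exponential_vca` therefore drops by the same number of decibels per unit of time, which is much closer to how a decay sounds 'right' to the ear.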
'Controllers'
This is a huge category of devices of different sorts, and not everything that seems like a 'control' module actually is that. In fact, anything that involves logic actually belongs as part of the 'modifier' group, although what they modify are gate/trigger timing signals and not audio (although you can use logic devices as a type of audio waveshaper, too). Actual controllers are the devices in a synthesizer that make the other three main components do what it is that they do. And actually, this group can be split even further into three significant subcategories. We'll treat each one in turn...
First, there are things that are really, actually, controllers. Devices such as sequencers, keyboards, photocell controls, FSRs...and a whole host of other things that output control signals under more or less manual control, these make up this group. The idea with all of these is that a determinate output, under the synthesist's control, is being generated by these devices. But also, indeterminate control devices fit here, too; the whole gamut of randomness modules go in this slot because, while that behavior isn't usually directly under the synthesist's control, the synthesist has made the programming choice to include control via whatever sort of random factor that they know the module tends to be capable of, ergo it's just as much a 'control option' as using a keyboard, joystick, etc. A Euclidean sequencer is a good example of this: while the output of such a module can seem to have a randomness to it, it's a 'gamed' randomization under a certain degree of control by the synthesist by their choice of control functions applied to or within it. Even sample-and-hold modules fed by totally random, indeterminate signals such as white noise still have a given behavior by how the synthesist chooses to control the randomness generated by the noise generator through the 'pseudorandomness' of the S&H. So, if you make 'setting A' on a device and know it'll do 'action B' every time (more or less, in the case of random devices), you're dealing with a controller. One other key controller is the quantizer; a quantizer is actually a type of sample-and-hold circuit with a very determinate pitch-scaled output which can transform incoming changing voltages of any type to held voltages to control other modules that require fixed control voltages, such as VCOs. But if you feed white noise to a quantizer...well, you still get scalar steps, although the distribution of those will be random albeit specifically pitched, and not the same 'pseudorandom' output of an unscaled sample-and-hold.
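The sample-and-hold and quantizer behavior described above can be sketched in a few lines. Note the assumptions: the 1 V/octave scaling and the major-scale default are mine (the essay doesn't specify either), and the seeded random generator just stands in for a noise source.

```python
import math
import random

def sample_and_hold(source, clock):
    """Grab the source whenever the clock goes low-to-high; hold otherwise."""
    held, prev, out = 0.0, False, []
    for value, tick in zip(source, clock):
        if tick and not prev:      # rising edge of the clock
            held = value
        prev = tick
        out.append(held)
    return out

def quantize(volts, scale=(0, 2, 4, 5, 7, 9, 11)):
    """Snap a voltage to the nearest scale note (assumes 1 V/octave)."""
    semitones = volts * 12.0
    octave = math.floor(semitones / 12.0)
    candidates = [12 * o + s
                  for o in (octave - 1, octave, octave + 1) for s in scale]
    nearest = min(candidates, key=lambda c: abs(c - semitones))
    return nearest / 12.0

# Noise into a clocked S&H gives random steps; the quantizer then forces
# those steps onto pitches from the scale.
rng = random.Random(1)                        # stand-in for a noise source
noise = [rng.uniform(0.0, 2.0) for _ in range(16)]
clock = [n % 4 == 0 for n in range(16)]       # one clock pulse every 4 steps
stepped = sample_and_hold(noise, clock)
pitched = [quantize(v) for v in stepped]
```

The values in `stepped` land anywhere; the values in `pitched` are just as unpredictable in order, but every one of them is a note in the scale.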
Then we have modulation sources. In this class, you find modules that run on their own or via a control signal of another sort, and which output control voltage signals as a result. LFOs, envelope generators, function generators, sample-and-holds fed by determinate signals (such as repeating LFO waveforms, envelopes, etc) to create 'stepped' curves...all of these are modulation sources, along with a few other specialized examples which behave in much the same way.
Last, there's timing sources. This gamut of devices is comprised of everything in a modular synth that outputs the various on-off gates or triggers in some way for the use of modules that require these to do what they do. Envelope generators, for example, require triggers or gates to fire (and gates specifically to deal with 'hold' behavior) and 'one-shot' through their envelope parameters. Clock generators and modulators create and alter timing signals for all sorts of actions, ranging from synchronization of larger processes, to clocking sequencer or sample-and-hold stepping, to generating pulses for logic. But again, logic circuits are NOT controllers. Instead, the various gates, combiners, etc actually operate in much the same way as the varying modifier devices in the audio chain, to alter the fundamental behavior of inputted timing pulses. Case in point: the AND gate. In this case, you have two inputs for timing pulses and one output that generates a pulse only when the logic case for the gate is met. If there's a timing signal at only the A input or only the B input, nothing happens; only when A AND B see a pulse at the same time does the gate output its pulse. Because logic gates and their relatives all function in this manner, they are actually something akin to a 'timing filter'...and, as such, they're modifier devices.
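The 'timing filter' idea is easy to show with two clocks. In this little sketch (names and clock divisions are just examples), gates are modeled as lists of True/False values, one per step.

```python
def and_gate(a, b):
    """Output is high only on steps where BOTH inputs are high."""
    return [bool(x) and bool(y) for x, y in zip(a, b)]

steps = range(12)
clock_a = [n % 2 == 0 for n in steps]   # pulse on every 2nd step
clock_b = [n % 3 == 0 for n in steps]   # pulse on every 3rd step

combined = and_gate(clock_a, clock_b)
hits = [n for n, gate in enumerate(combined) if gate]
# hits -> [0, 6]: only the steps where both clocks line up get through
```

Two busy clocks go in, one sparse pattern comes out; nothing was generated, only filtered.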
This isn't the only example of how modifiers can exist in the controller 'gamut'. Control signals, especially periodically-repeating ones, can be waveshaped in various ways not unlike how the same processes work in the audio chain. A good example is rectification, which results in very distorted audio results by altering the waveform's polarization to shift all of the waveform above the DC level. But in control waveforms, such as from an LFO, the result actually alters the waveshape in the same way, but the result when this is used as a modulation signal is actually quite different, since the signal has been 'half-waved' to create an 'above-zero-set-point' modulation signal. It's also possible to invert this (another modifier...and in this case, usable in audio to cause phase cancellation effects by combining a 'normal' and 'inverted' signal) and cause all of the modulation curve behavior to happen downward, below the zero-set. And of course, the example mentioned above of linear VCAs and their uses on modulation signal amplitudes.
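Here's what those control-signal modifiers do to a bipolar LFO, sketched with plain lists. Both the half-wave and full-wave flavors of rectification are shown, since modules offer either or both; the function names are invented for the example.

```python
import math

def half_wave(cv):
    """Keep the part above zero; everything below is flattened to zero."""
    return [max(v, 0.0) for v in cv]

def full_wave(cv):
    """Flip the part below zero upward, so the whole curve sits above zero."""
    return [abs(v) for v in cv]

def invert(cv):
    """Mirror the curve around zero."""
    return [-v for v in cv]

# One cycle of a bipolar sine LFO swinging between -1 and +1.
lfo = [math.sin(2.0 * math.pi * n / 64) for n in range(64)]

upward_only = half_wave(lfo)            # modulation only pushes upward
downward_only = invert(upward_only)     # same shape, only pushes downward
cancelled = [a + b for a, b in zip(lfo, invert(lfo))]   # normal + inverted
```

`cancelled` comes out as all zeros, which is the phase cancellation mentioned above in its most extreme form.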
People seem to ignore quite a bit about controller modules. And that's a mistake; controller modules are an important part of the 'dark arts' behind making sounds that behave with incredible complexity. By creating multiple control layers, it's possible to generate elaborate control methods that can result in sounds that, by themselves, qualify as whole compositions. For instance...take three LFOs that have voltage control capability. Feed the first one into the CV input of the second, then that into the CV input of the third. The output from the third will then behave in a very non-repetitive curve...or perhaps more repetitive...depending on how the various LFO frequencies were set. Now, feed that last output into a multiple (we'll get to those), and split it to three comparators, which generate a timing signal when a voltage threshold gets crossed. Set each one to a different threshold, feed their gate/trigger outputs to three different EGs. Then have those EGs control the amplitude of three exponential VCAs that are being fed by different and complex audio signals. Voila! You're starting off into the domain of 'generative music'...albeit, a rather simple part of that. But this illustrates why it's important to NOT neglect the wide range of controller possibilities. They bring the fun into your modular's _fun_ctionality!
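For the curious, the first half of that patch (three chained LFOs into three comparators) can be sketched as below. All the numbers are assumptions: the base rates, the modulation depths, the 1 kHz control rate, and the thresholds were picked arbitrarily, and the envelope generators and VCAs that would follow are left out.

```python
import math

def chained_lfos(base_hz=(0.11, 0.37, 0.83), depth_hz=(0.0, 0.25, 0.6),
                 control_rate=1000, seconds=10):
    """LFO 1 modulates the rate of LFO 2, which modulates the rate of LFO 3.

    Returns the output of the third LFO. All rates/depths are made up.
    """
    phases = [0.0, 0.0, 0.0]
    out = []
    for _ in range(control_rate * seconds):
        cv = 0.0
        for i in range(3):
            # Each LFO's rate is pushed around by the previous LFO's output.
            freq = max(base_hz[i] + depth_hz[i] * cv, 0.0)
            phases[i] = (phases[i] + freq / control_rate) % 1.0
            cv = math.sin(2.0 * math.pi * phases[i])
        out.append(cv)
    return out

def comparator(signal, threshold):
    """Gate is high whenever the signal is above the threshold."""
    return [v > threshold for v in signal]

def rising_edges(gate):
    """Step indices where a gate goes from low to high (the 'triggers')."""
    return [i for i in range(1, len(gate)) if gate[i] and not gate[i - 1]]

# The 'multiple': one control curve split to three comparators.
curve = chained_lfos()
triggers = [rising_edges(comparator(curve, th)) for th in (-0.5, 0.0, 0.5)]
```

Each list in `triggers` is where one envelope generator would fire; change any of the LFO rates and all three timing patterns shift together.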
'Processors'
Now, these also don't get a lot of respect. Processors are the 'everything else' that takes signals from the audio or control chains and makes them into something...else. These are different from modifiers in that they don't actually impart any change to the signal, but that they change the way the signal(s) behaves. The simplest one is, yep, the multiple. A multiple actually replicates a signal fed into it and sends copies back out. Buffered multiples, of course, are active devices which contain circuits that precisely duplicate and regenerate signals fed into them...but even a passive multiple, which has no active circuitry at all, still fits into the definition of a 'processor' because of what it does.
Mixers, also...these do the opposite by combining signals into a single signal, either monophonic, stereophonic, or even crazier sorts of spatializations. But these don't change the signals fed into them, optimally...the signals are all still there, still audible, just in a composite form at the mixer's output. Anything that works this way fits here. Also, anything that works in the context of signal mixing, such as panning, crossfading, muting, auxiliary signal send/return actions, group level control...all of these functions fit the criteria of processing since, again, the signals' character isn't being changed...only how they behave in the signal chain. And this works for both control signals and audio, since the objective doesn't involve changing what's present, only combining it, even if the resultant combination might appear different on its face value.
Also, anything that is a time-based processor, such as a delay, reverb, or chorus counts as a processor. The signal inputted to these devices is still technically intact beneath the process imposed on them by the module; even super-deep reverbs, while smearing out the sound's transients, are still outputting the original sound even though the overall temporal contour has been altered by generating hundreds or thousands of early and late-reflection copies. And delays, of course, just create singular copies of the signal and repeat them at given intervals for a given period of time.
But processors that actually alter the timbral character of a signal...in short, devices that could just as well be placed into the 'modifier' category...these aren't processors per se, but modifiers, albeit modifiers that are more appropriate at the end of a signal chain. This would include phase shifters, flangers, equalizers, and the like. And that's where this gets a little touchy, because some devices that should be pure processors DO alter the character of a sound. Spring reverbs, for example, are more akin to modifiers than a nice, clean, high-bit-resolution digital reverb because they impart coloration, whereas the digital reverb can alter the temporal factors without 'true' change to the sonic character of the inputted signal. Tape delays, also, when used to impart a tape saturation character in addition to their time-based use, fit here.
Then there's the last...and first...bits of the synth: external modules. Again, these fit in the category of 'processors' because their job isn't to alter the character of the sound, just to either get it into or out of the synthesizer environment. In the case of input modules, they step the signal's level up, and in some cases also derive some control signals from it, whether through envelope followers that track...to some degree...the incoming signal's amplitude and generate a control voltage from that, pitch-to-voltage converters that turn the incoming signal's pitch into a control voltage, or comparators that fire a gate or trigger when the signal crosses a certain amplitude threshold. In all of these cases, the incoming signal isn't being changed sonically, just used as a source from which the signals can be derived. Output modules are simpler still: they just step the signal level back down to the 'real world' line-level standard, maybe with the addition of a level control or maybe an auxiliary input.
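An envelope follower plus a comparator, as found on many input modules, can be approximated like this. It's a common textbook approach (rectify, then smooth with separate attack and release times), not a description of any specific module, and the 5 ms / 100 ms times and 0.2 threshold are assumed values.

```python
import math

def envelope_follower(audio, sample_rate=48000,
                      attack_ms=5.0, release_ms=100.0):
    """Track the amplitude of a signal: rectify it, then smooth it.

    Attack and release times are illustrative assumptions.
    """
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    level, out = 0.0, []
    for x in audio:
        rectified = abs(x)
        # Rise quickly when the input gets louder, fall slowly when it fades.
        coeff = attack if rectified > level else release
        level = coeff * level + (1.0 - coeff) * rectified
        out.append(level)
    return out

def gate_from_envelope(envelope, threshold=0.2):
    """Comparator: gate goes high while the envelope is above the threshold."""
    return [v > threshold for v in envelope]
```

Note that the audio itself passes through untouched; the control voltage and the gate are derived alongside it.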
But in actuality, anything that gets a signal of some sort in or out of a modular synthesizer environment is a processor. MIDI, for example, has to be turned into the requisite CV and gate/trigger signals for the synth to be able to make any sense of the incoming MIDI control signal. As such, MIDI usually isn't part of the modular environment (although a few cases do seem to exist, they themselves also do MIDI processing internally to effect the necessary signal compatibility) and has to be turned into the proper signal. In essence, this is a second sort of 'processor'; while processing in the audio chain tries to NOT affect the signal character, MIDI, OSC, etc. have to be processed into something a synthesizer can use in the first place. As such, the function of these devices is more akin to a 'translator', even though they 'process' their respective incoming signals into something else as far as signal format. Despite that, the information being 'processed' isn't actually being changed informationally, just as processors in an audio chain also avoid changing the audio's 'information'. So, ultimately, they're both a 'controller' and a 'processor' all at the same time, falling into that tiny category of 'a few different things at the same time' which I mentioned back at the beginning.
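As a bare-bones illustration of that 'translator' role, here's how a MIDI note number might be turned into a pitch CV and a gate. The 1 V/octave scaling and the choice of note 60 as 0 V are assumptions on my part (real converters differ on the reference point), and the function names are hypothetical.

```python
def midi_note_to_cv(note, reference_note=60):
    """Convert a MIDI note number to volts, assuming 1 V/octave.

    reference_note is the note that maps to 0 V (an assumed convention).
    """
    return (note - reference_note) / 12.0

def note_on_to_cv_gate(note, velocity):
    """Translate a Note On message into (pitch CV, gate state).

    A Note On with velocity 0 is treated as a Note Off, so the gate is low.
    """
    return midi_note_to_cv(note), velocity > 0
```

The note number and the on/off state arrive as MIDI data and leave as voltages; the information is the same, only the format has changed.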
So, aside from how to power all of this crap and what to put it in, that's the four parts of the synthesizer. And yes, this even applies to synthesizers that are purely digital confabulations, because while the physical devices might be absent, the coding still contains data which contains these four functionalities. So, by keeping this in mind, and knowing what you need to do in terms of where YOU want to go with your instrument, hopefully this pile of info can help you in properly allocating what needs to be in a modular system, how to potentially assemble it into something that works like an instrument, and how to 'get gud' when you're staring down that panel of knobs, wires, lights, and patchcords. Yeah, long essay, I know...but useful, hopefully!