wrote this a few nights ago.. thinking to add a simple walkthru to the end. feedback welcome :)
having a go at making your own custom sounds may be easier than you anticipated. this article describes a few basic considerations and methods for doing so from a synthesis perspective.
creating sounds with synthesis requires a few concepts, so that we can translate our intent into the tools we have available in an informed way. perceptually, we categorise sounds as pitched (tuned) or unpitched (noisy). in pitched sounds we look for a fundamental periodicity in the signal. generally, natural sound sources are complex, or frequency rich, accompanying our perceived fundamental with a full spectrum of frequencies that define the timbre.
quite often in electronics and in nature, the spectrum consists of integer multiples of the fundamental frequency, known as the harmonic series. the harmonics are reinforced and cancelled out by various reactions within an acoustic system to produce a characteristic timbre. dense mediums or materials that transmit sound, such as water or metal, can have a perceptible frequency dependent effect on transmission. this is apparent in the "chirp" sound of striking a stretched steel cable, or the sound of tightening a guitar string. recognising the presence and absence of the harmonic series is informative when analysing sounds you intend to recreate.
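to make the harmonic series concrete, here is a minimal python sketch that sums sinewaves at integer multiples of a fundamental. the function names, the sample rate and the amplitude list are my own assumptions for illustration, not part of any tool mentioned here.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed)

def harmonic_frequencies(fundamental_hz, count):
    """Return the first `count` harmonics: 1x, 2x, 3x ... the fundamental."""
    return [fundamental_hz * k for k in range(1, count + 1)]

def harmonic_tone(fundamental_hz, amplitudes, seconds=1.0, sample_rate=SAMPLE_RATE):
    """Sum sinewaves at integer multiples of the fundamental.

    amplitudes[0] scales the fundamental, amplitudes[1] the 2nd harmonic, etc.
    """
    n = int(seconds * sample_rate)
    out = []
    for i in range(n):
        t = i / sample_rate
        sample = sum(
            a * math.sin(2.0 * math.pi * fundamental_hz * (k + 1) * t)
            for k, a in enumerate(amplitudes)
        )
        out.append(sample)
    return out

# amplitudes falling off as 1/k give a rough sawtooth-like timbre
tone = harmonic_tone(110.0, [1.0 / k for k in range(1, 9)], seconds=0.5)
```

changing the amplitude list while keeping the fundamental fixed changes the timbre but not the perceived pitch, which is worth hearing for yourself.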
we'll assume you have a free wav editor like wavosaur or audacity. fine if you have something better. if you haven't done so, taking a bit of time to experiment with soundfiles will help you to translate what you hear and imagine into data. using spectral analysis will help to confirm your awareness of harmonicity in timbre. using a parametric equaliser to boost or cut a specific frequency in an audio signal will help you to listen more critically for spectral content.
unless you have other recourse, obtain some free synthesizers in vst format, such as the elementary "synth1" vst by ichiro toda. you will also need a vst host application; hermann seib's vst host and tobybear's minihost are minimal applications that will suffice for triggering a note, and for adjusting and recording the sound. either should include instructions for using the vst .dlls.
elementary synthesizers like synth1 offer a signal generator stage consisting of one or more oscillators, a filter stage for boosting or removing parts of the spectrum, and an amplifier stage which is shaped by an envelope. the oscillators usually offer a few primitive waveform contours, and the filters usually provide lowpass, highpass and bandpass modes which allow you to select the more desirable part of the signal for the sound you are creating.
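if it helps to see the oscillator, filter and amplifier chain as data, here is a rough python sketch. it is not how synth1 or any particular vst is implemented; the naive sawtooth, the one-pole lowpass, the linear envelope and every parameter value are assumptions chosen to keep it short.

```python
import math

def saw(freq_hz, n, sample_rate=44100):
    """Naive (not band-limited) sawtooth in the range -1..1."""
    return [2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0 for i in range(n)]

def one_pole_lowpass(signal, cutoff_hz, sample_rate=44100):
    """Simple lowpass: each output moves a fraction of the way toward the input."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in signal:
        y += coeff * (x - y)
        out.append(y)
    return out

def adsr(n, attack, decay, sustain_level, release):
    """Linear ADSR envelope; attack, decay and release are lengths in samples."""
    sustain_len = max(0, n - attack - decay - release)
    env = [i / attack for i in range(attack)]
    env += [1.0 - (1.0 - sustain_level) * i / decay for i in range(decay)]
    env += [sustain_level] * sustain_len
    env += [sustain_level * (1.0 - i / release) for i in range(release)]
    return env[:n]

def voice(freq_hz, n, cutoff_hz=1200.0, sample_rate=44100):
    """Oscillator -> filter -> amplifier shaped by an envelope."""
    osc = saw(freq_hz, n, sample_rate)
    filtered = one_pole_lowpass(osc, cutoff_hz, sample_rate)
    env = adsr(n, n // 10, n // 10, 0.6, n // 5)
    return [s * e for s, e in zip(filtered, env)]

note = voice(220.0, 44100)  # one second of a filtered saw at 220 hz
```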
in addition to periodic waveforms, hopefully the synth includes white noise. many sounds of the natural world can be approximated simply using noise and a filter, with some creative shaping or modulation of the filter and amplifier parameters. the difference between a footstep in snow and one on gravel can be a simple matter of adjusting the filter cutoff frequency.
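a small python sketch of the noise plus filter idea: white noise through a simple lowpass, shaped by a decaying amplitude. the cutoff values and the 50 ms decay are guesses to experiment with, not measured settings for any particular surface.

```python
import math
import random

def noise_burst(cutoff_hz, seconds=0.25, decay_seconds=0.05,
                sample_rate=44100, seed=0):
    """White noise -> one-pole lowpass -> exponential amplitude decay."""
    rng = random.Random(seed)  # seeded so the burst is repeatable
    n = int(seconds * sample_rate)
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    decay_per_sample = math.exp(-1.0 / (decay_seconds * sample_rate))
    out, y, amp = [], 0.0, 1.0
    for _ in range(n):
        y += coeff * (rng.uniform(-1.0, 1.0) - y)  # lowpass the noise
        out.append(y * amp)                        # apply the envelope
        amp *= decay_per_sample
    return out

dark = noise_burst(800.0)     # duller, more muffled burst
bright = noise_burst(6000.0)  # brighter, grittier burst
```

the only difference between the two bursts is the cutoff, so listening to them side by side is a quick way to judge how far one parameter can take you.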
if a sound is too complex to create with one instance, careful layering to composite a more complex signal is often convincing.
the periodic waveforms on a basic synth can also be used to emulate acoustic sounds, perhaps with a few strategic spots of attention to detail. modulating the pitch with an envelope or lfo so that it rises and falls can be very expressive if an abstract or cartoon aesthetic is acceptable.
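as a sketch of pitch modulation, this python snippet glides a sinewave from one frequency to another. the exponential glide and the function name are my assumptions; a synth's pitch envelope may follow a different curve.

```python
import math

def pitch_sweep(start_hz, end_hz, seconds=0.5, sample_rate=44100):
    """Sinewave whose frequency glides exponentially from start_hz to end_hz."""
    n = int(seconds * sample_rate)
    out, phase = [], 0.0
    for i in range(n):
        progress = i / max(1, n - 1)  # 0.0 at the start, 1.0 at the end
        freq = start_hz * (end_hz / start_hz) ** progress
        out.append(math.sin(2.0 * math.pi * phase))
        # accumulate phase so the frequency can change without clicks
        phase = (phase + freq / sample_rate) % 1.0
    return out

rise = pitch_sweep(200.0, 800.0)  # cartoon "whoop" upward
fall = pitch_sweep(800.0, 200.0)  # and back down
```

accumulating phase, rather than computing sin(2*pi*f*t) with a changing f, is the design choice that keeps the waveform continuous while the pitch moves.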
after introductory use of basic synths for emulation, you are likely to wish there was more crossover between noise and periodic signals. there are two readily available techniques to achieve this. the first is to use noise with a bandpass filter with a very high Q or resonance, so that the only frequencies passed are in a very narrow range, approximating a pitched signal. the other is to use modulation on a periodic waveform to make it less stable and less overtly synthetic. neither of these is particularly satisfying.
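the first technique can be sketched in python with a two-pole bandpass. the coefficients below follow the widely circulated "audio eq cookbook" bandpass form (constant 0 db peak gain); the function names, the Q value and the peak normalisation are assumptions of mine.

```python
import math
import random

def bandpass(signal, center_hz, q, sample_rate=44100):
    """Two-pole bandpass (biquad) with unity gain at the center frequency."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the middle feedforward term is zero
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def pitched_noise(center_hz, q, seconds=1.0, sample_rate=44100, seed=0):
    """White noise through a narrow bandpass, normalised to a peak of 1."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(int(seconds * sample_rate))]
    filtered = bandpass(noise, center_hz, q, sample_rate)
    peak = max(abs(s) for s in filtered) or 1.0
    return [s / peak for s in filtered]

whistle = pitched_noise(440.0, 50.0)  # higher q sounds more pitched
```

lowering q widens the band and lets the result drift back toward plain filtered noise, so q acts as a rough "pitchedness" control.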
before proceeding, it is worth noting that you should feel empowered after exploring simple oscillators and filters. any discretised sound can be decomposed into sinewaves (each a single frequency), and the depth of additive synthesis, or summing oscillators, is made evident by the dramatic effects heard in the industry for decades. if you don't think white noise through a bandpass sounds like a footstep in the snow, try additional filtering or additional layering. removing highs and lows creates a feel for where the sound is located. when you believe that you can sufficiently approximate your targets with these tools, you are ready to create.
foley effects are often easy to produce by using a wave editor to massage similar source material into the sound you want. time stretching, pitch bending, and filtering are powerful transformative tools, and the context of multimedia allows us to be very forgiving in what source we bend to what purpose, eg. the classic godzilla roar, produced by rubbing the strings of a double bass with a resin-coated glove.
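the crudest form of pitch bending is plain resampling, sketched here in python. note the assumption: this changes pitch and duration together, like slowing down a tape, whereas an editor's dedicated time stretch or pitch shift usually changes one while preserving the other.

```python
def resample(signal, speed):
    """Read `signal` at `speed` times the normal rate (linear interpolation).

    speed < 1.0 lowers the pitch and lengthens the sound,
    speed > 1.0 raises the pitch and shortens it.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    out = []
    pos = 0.0
    while pos < len(signal) - 1:
        i = int(pos)
        frac = pos - i
        # blend the two neighbouring samples
        out.append(signal[i] * (1.0 - frac) + signal[i + 1] * frac)
        pos += speed
    return out

slow = resample([0, 1, 2, 3, 4], 0.5)  # twice as many steps, an octave down
fast = resample([0, 2, 4, 6, 8], 2.0)  # half as many steps, an octave up
```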
if you are short on devices or sources to record, and not able to convincingly render with synth primitives, another audio synthesis approach is to physically model the phenomenon. this is where the author's own line of freeware vst tools comes into play ;)
some familiarity with the vst synthesizer paradigm is recommended, so that you understand basic filter modes and ADSR envelopes. these tools are unfamiliar to most musicians, so it is better not to be challenged by the medium before going ahead.
"physical modeling" synthesis uses various approaches to try and recreate or emulate various acoustic systems. the same mass-springs used to model cloth in 3d simulation can be used to model acoustic interactions. the array of physical modeling resources is not overwhelming because most of the sounds people want, such as circular membranes, are still intensive to model. in many cases, such as the plucked string, the system does translate to an abstraction well enough to achieve popularity.
the more common plucked string emulation models the string as a one-dimensional waveguide with two termini. the acoustic transmission is modeled with a delay, with damping and dispersion filters in the loop, as in the karplus-strong plucked string model and its extensions. with additional processing and mixing, this technique makes at least a fascinating approximation of an acoustic string that may be sufficiently convincing when heard in context.
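a minimal python sketch of the basic karplus-strong idea follows: a delay line one period long, excited with noise, with a two-point average acting as the damping filter. it leaves out the dispersion filter and any extra processing, and the damping value and function name are assumptions.

```python
import random
from collections import deque

def pluck(freq_hz, seconds=1.0, damping=0.996, sample_rate=44100, seed=0):
    """Basic Karplus-Strong pluck: noise-filled delay line with averaging."""
    rng = random.Random(seed)
    # the delay length sets the pitch: one period of the fundamental
    period = max(2, int(round(sample_rate / freq_hz)))
    line = deque(rng.uniform(-1.0, 1.0) for _ in range(period))
    out = []
    for _ in range(int(seconds * sample_rate)):
        first = line.popleft()
        out.append(first)
        # average with the next sample (a gentle lowpass), then damp slightly
        line.append(damping * 0.5 * (first + line[0]))
    return out

string = pluck(220.0)  # roughly an A below middle C
```

because the delay is a whole number of samples, the pitch is only approximate at high frequencies; that is one of the things the fuller models refine.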
i have created a series of (free, widely used and acknowledged so safe) vst resources modeling general acoustic phenomena useful for foley. for instance, "friction vst" uses a simple parameterised equation to model catch-slip surface interaction, augmented with some resonators to add definition. the presets on the friction vst approximate squeaking doors, hinges, axles, sliding a finger on glass, and skidding tires to some efficacy. customising the presets to a required sound will take some independent effort, but if you have followed this article this far, it ought to be feasible :) of course, these are simple, generalised simulations which have finite capabilities ;)
the more widely appreciated models are "virtual machine" and "fauna". virtual machine applies cyclic parameterisation to noise sources and filters to achieve an array of motor and device sounds, from car engines to blenders to nameless industrial abominations.
fauna is a simple vocal tract model composed of five segments which are parameterised as reflective junctions. instead of a source oscillator like a conventional synthesizer, "air" is injected into the tube at one end. some of it is reflected back to this point, where a mass-spring "reed" simulates the glottis. because of the simple build, "fauna vst" can sound a bit like a sock puppet version of a real animal, but modulation adds enough detail to the sound for an often convincingly organic result.
as fauna is also the most abstract instrument, let's pick through a patch so that you have a better idea of how to customise creature vocalisations: