I'll go over more formats in a minute, but first let's start with Ogg Vorbis (Ogg is the 'container', Vorbis is the audio codec standard). Vorbis is nice in that it's unencumbered by patents (Open Source), but in my tests it's only on par with MP3 when the MP3 was made with an older codec, or with a codec used outside the settings it's optimized for (Fraunhofer is optimized for 320k CBR & higher pre-Mp3Pro; LAME, post 3.97.xx, is optimized for VBR, with V0 to V2 being the 'standards' as per the presets). A few people I know at the BBC deride the Vorbis codec, but you'll find many Open Source supporters prefer it.
Now on to other codecs...
Most of the lossless codecs support up to 32bit files as source (INT or FLT depends on codec), though ALS/SLS (the MPEG-4 lossless and scalable-to-lossless codecs) seem to support only 24bit. Some encoders & decoders have a hard time with even 24bit files and higher samplerates, but most current codecs support up to at least 192kHz (FLAC, for instance, supports linear sample rates from 1Hz to 655350Hz in 1Hz increments). So lossless can have the best & worst aspects of whatever the source was, as it should decode to a bitstream identical to the source.
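As a rough illustration of what "decodes to an identical bitstream" means in practice, here is a minimal Python sketch that hashes the PCM before encoding and after decoding. Note the assumptions: zlib is only a stand-in for a real lossless audio codec (it is not FLAC, ALS or ALAC), and the names `pcm16_bytes`, `is_bit_identical` and the test ramp are made up for the example.

```python
import hashlib
import struct
import zlib


def pcm16_bytes(samples):
    """Pack integer samples as little-endian signed 16-bit PCM."""
    return struct.pack("<%dh" % len(samples), *samples)


def is_bit_identical(original: bytes, decoded: bytes) -> bool:
    """Compare two PCM byte strings by their SHA-256 digests."""
    return hashlib.sha256(original).digest() == hashlib.sha256(decoded).digest()


# Hypothetical test signal: 4096 samples that stay inside the 16-bit range.
source = pcm16_bytes([i * 7 % 32768 - 16384 for i in range(4096)])

packed = zlib.compress(source)      # stand-in for a lossless audio encoder
restored = zlib.decompress(packed)  # stand-in for the matching decoder

print(is_bit_identical(source, restored))
```

The same comparison should work with a real codec if you decode back to raw PCM at the source's bit depth and samplerate and hash only the sample data, not the file headers.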
When dealing with Mpeg4 audio codecs specifically, there are several types of 'default' audio encodings specified in the Mpeg4 spec. AAC (Mpeg4 'part3' lossy audio compression) is the most commonly known Mpeg4 audio codec, but it is actually 5 different 'flavors' or 'guidelines' for encoding. I think that Sony may have also bifurcated the AAC format with its own variant.
AAC supports up to 192kHz with most encoders & decoders, and the files do not have a 'bit-depth'. Instead it's a block-based codec where each block decodes to 1024 time-domain samples but is actually stored as transform coefficients (and unlike MP3 below, each frame stands alone and doesn't depend on other frames before it). So AAC files can be decoded to any bit-depth, and the result will depend entirely on the decoder and output device (more on this later). In fact it seems to follow that AAC files can be encoded from sources of more than 16 bits as well. I would assume most encodings are 16bit/44.1kHz unless otherwise stated, though when combined with h.264 & surround formats you may have higher samplerates & bit-depths (and may also have other audio encoding layers for the surround codecs).
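To make the "no bit-depth in the file" point concrete, here is a small Python sketch of the last step in a decoder: the transform produces floating-point samples, and the output word length is simply whatever the decoder scales them to. The function name `quantize` and the sample values are assumptions for illustration, and real decoders may also apply dither or noise shaping, which this leaves out.

```python
def quantize(sample: float, bits: int) -> int:
    """Scale a float sample in [-1.0, 1.0] to a signed integer of `bits` width."""
    full_scale = 2 ** (bits - 1)
    value = round(sample * full_scale)
    # Clip so that +1.0 does not overflow the largest positive code.
    return max(-full_scale, min(full_scale - 1, value))


# Hypothetical decoder output for four samples.
decoded = [0.0, 0.5, -0.25, 1.0]

as_16 = [quantize(s, 16) for s in decoded]
as_24 = [quantize(s, 24) for s in decoded]

print(as_16)  # [0, 16384, -8192, 32767]
print(as_24)  # [0, 4194304, -2097152, 8388607]
```

The same decoded floats give a 16bit or a 24bit stream depending only on the `bits` argument, which is the sense in which the bit-depth belongs to the decoder rather than the file.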
Technically speaking, Mpeg4 is a set of standard 'methods' that define how to package various types of media together (video, audio etc), but you also get support for things like VRML, XML and other object data. One of those methods is MPEG-4 Part 14, otherwise known as "MP4" or the ".mp4" file extension. Mpeg4 part14 aka. ".mp4" is a 'wrapper' (container) for whatever you choose to include inside (or alongside, if you're just containing the reference). Normally an MP4 will contain Mpeg4 data from the other defined 'methods' within Mpeg4, but it can contain whatever the maker chooses (including MP3 or even Vorbis-encoded audio data). For 'mp4' and 'm4a' type audio files you'll typically find one of the AAC variants is used.
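The 'wrapper' idea is easy to see at the byte level. MP4 is built on the ISO base media file format, where the file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type code. The Python sketch below walks the top-level boxes of a buffer. The helper names (`iter_boxes`, `make_box`) and the tiny synthetic sample are made up for the example; a real file would have more boxes (and nested ones inside `moov`), which this doesn't descend into.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (type, payload) for each top-level box in the buffer."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # malformed or truncated box, stop rather than guess
        yield box_type.decode("ascii", "replace"), data[offset + header:offset + size]
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic two-box buffer, not a playable file.
sample = make_box(b"ftyp", b"M4A \x00\x00\x00\x00") + make_box(b"mdat", b"\x00" * 4)

for name, payload in iter_boxes(sample):
    print(name, len(payload))
```

Nothing in the box header says what codec is inside, which is why the container can carry AAC, MP3 or anything else the maker chooses.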
Apple uses a different extension for their 'Mpeg4 audio' files: M4A or ".m4a". M4A files are identical to ".mp4" files, though typically ".m4a" indicates that the file is non-protected, whether it's from Apple/iTunes (ie, from iTunes Plus) or someone else. The older FairPlay DRM encoded iTunes format is M4P or ".m4p" and is always 'protected' (encumbered by DRM). Apple also has a lossless format, ALAC, which I believe uses linear prediction encoding methods similar to FLAC.
Mpeg4 also has the lossless codec called ALS (".als" file extension for Mpeg4 Lossless). ALS, the Mpeg4 lossless layer, uses an integer format with 32bits of data (the first bit is assumed to be 1 and is thus used to specify block type: either a zero block or a normal block, ie, one that contains data). Based on what I've read, output formats seem to be typically anywhere from 16bit to 32bit depending on the decoder, and I would assume the input format can be whatever the encoder allows. I've never used ALS so I don't know for sure.
The ALAC (Apple Lossless) format seems to be found in either 16bit or 24bit form, but I'm not an iTunes user so can't comment more. I also don't author iTunes content (at least right now) so can't say much there either. One would presume at least 24bit input is available, especially considering Apple's marketing of itself as a cutting edge content production platform, but I would assume most encodings are 16bit/44.1kHz unless otherwise stated.
Additionally, MP3 or ".mp3" files are "MPEG1 audio Layer III" files, defined in the Mpeg1 ISO spec. By the original standard, MP3 files can use 32kHz, 44.1kHz & 48kHz samplerates and bitrates from 32kbit/s up to 320kbit/s. Also, the MP3 format is NOT 16bit (or 24bit etc)...
Mpeg1 audio Layer III defines 32bit frame headers, but bit depth is NOT stored in the frame header. The output from an MP3 is generated from the transform coefficients in the file, and the output bit-depth is entirely up to the decoder you use. 24bit decoders are available that implement replay-gain properly, so you get full 24bit output scaled to your intended output gain. Many decoders that implement 24bit output also allow 32bit output, int or flt, depending on implementation (int would simply pad with 0's). And interestingly enough, you can feed some MP3 encoders with a 24bit file (or even 32bit linear, but it will be converted to 32bit FLT) for improved encoding dynamics, especially on percussion (this will depend on encoding options & encoder tuning!)
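If you want to see for yourself that there is no bit-depth field, the 32bit header is small enough to unpack by hand. Here is a Python sketch that reads the fields of an MPEG-1 Layer III frame header. The function name `parse_header` and the example header word are my own choices for illustration; it only handles MPEG-1 Layer III and ignores the fields it doesn't need (private bit, mode extension, copyright, original, emphasis).

```python
# Bitrate and samplerate tables for MPEG-1 Layer III, indexed by the header fields.
# None marks the "free format" / reserved index values.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def parse_header(word: int) -> dict:
    """Decode the main fields of a 32-bit MPEG-1 Layer III frame header."""
    if (word >> 21) & 0x7FF != 0x7FF:
        raise ValueError("no frame sync")
    version = (word >> 19) & 0x3   # 0b11 = MPEG-1
    layer = (word >> 17) & 0x3     # 0b01 = Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("not an MPEG-1 Layer III header")
    return {
        "crc_protected": ((word >> 16) & 0x1) == 0,  # bit is 0 when a CRC follows
        "bitrate_kbps": BITRATES_KBPS[(word >> 12) & 0xF],
        "sample_rate_hz": SAMPLE_RATES_HZ[(word >> 10) & 0x3],
        "padding": bool((word >> 9) & 0x1),
        "channel_mode": (word >> 6) & 0x3,           # 0 = stereo
    }


# Example header word: 128 kbit/s, 44.1 kHz, stereo, no padding.
fields = parse_header(0xFFFB9000)
print(fields)
```

Every bit of the header is accounted for by sync, version, layer, bitrate, samplerate, padding, channel and flag fields, so the word length of the output really is left to the decoder.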
MP3's major sound quality compromises come not just from the lossy heuristic algorithm (derived from ASPEC) but also from the bitstream compatibility with Mpeg1 Layer 2. The short answer is that the MP3 frames are adapted to fit into the Mpeg1/Layer 2 standard's transport, and this results in the aliasing, smearing & pre-echo when decoded (a result of the hybrid time domain (filter bank) / frequency domain (MDCT) model). The Hydrogenaudio forums have more info on this than I could ever absorb myself, so feel free to defer there for more info.
Now that's what can be encoded & decoded, but you also have to consider the output stream (from media player to soundcard)...
ASIO will depend on the soundcard maker, and most people here should be relatively familiar with it.
Directsound output will usually not take more than 16bits within the AC97 spec (16bits up to 48kHz) and is usually fixed @ 48kHz with an SRC algorithm always-on to handle lower samplerate data. This means when using directsound drivers (or WAV drivers that do not bypass the OS's directaudio/directsound stack) you are ALWAYS hearing the SRC, even with material where the source is the same as the output format (ie, 48kHz>48kHz). This results in a 'clouding'/smearing of the audio material & a noticeable shift in the phase response (ripple in the frequency domain). The newer "HDAudio" spec allows up to 24bits/192kHz (96kHz is more typical), and I don't know all the details on the sound stack in Vista & Win7. Something tells me that they probably do 24bit/96kHz 'all the time' (inline SRC again) to account for modern HD video formats, but I haven't coded directsound recently enough to know what is actually exposed. Most soundcards that support "HDAudio" tend to be consumer affairs, with horribly cheap converters & poor frequency response/time response.
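For a sense of what an always-on SRC has to do, here is the arithmetic as a tiny Python sketch. It only computes the resampling ratio between a source rate and a fixed output rate; `resample_ratio` is a made-up helper name, and nothing here models the actual filter or what any particular driver stack does internally.

```python
from fractions import Fraction


def resample_ratio(source_hz: int, output_hz: int) -> Fraction:
    """Output samples produced per input sample, as a reduced fraction."""
    return Fraction(output_hz, source_hz)


cd_to_48k = resample_ratio(44100, 48000)
same_rate = resample_ratio(48000, 48000)

print(cd_to_48k)  # 160/147
print(same_rate)  # 1
```

The 44.1kHz to 48kHz case comes out as 160/147, so the converter has to interpolate at an awkward non-integer ratio, and the quality of that interpolation filter is what you end up hearing. A ratio of 1 means there is nothing to convert in principle; whether the stack actually skips its filter in that case is a separate question this sketch can't answer.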
Obviously with Windows, ASIO is usually your best bet (note that my RME's WAV drivers seem to bypass the directsound stack even on XP and are actually bit-transparent, and I suspect that Scope's may be the same).
Core Audio on the Mac is supposed to be bit-transparent & pain-free for the end user and professional alike, though I could probably dig on Apple's site a bit, then the Hydrogenaudio forums, and find a few niggles here & there. I would assume the biggest concern here is when using a Mac's onboard soundcard (and for 3rd party soundcards, defer to their literature). I've never noticed any issues with my RME under OSX; files are bit-transparent as long as I don't adjust gain on the soundcard's routing interface from nominal (0dBfs change).
And portable media players? There are some that fare relatively well in listening tests (especially older/larger devices) but iPods and the like are NOT for critical listening, they're for convenience. Use whatever formats & compression settings sound transparent 'enough' to you when listening with these players, which is usually lower than what you would choose for studio monitors & headphones (which means you can get by with smaller files if you encode for your portable player directly.)