Just guessing, but it's quite likely that you have the vocals on several channels in parallel, at least during mixing.
If you don't account for every (ultra) small delay (latency) of each process operating on the vocals, you will end up with exactly what you observed.
There's nothing wrong with your original source, but for some obscure reason it comes out flat and a bit washed out, doesn't it?
As an exercise (hands-on beats reading), take a good yet simple vocal/piano track (something like Norah Jones) and put it on 2 channels of the mixer, with a module on one channel that delays it in the sub- to low-millisecond range. PhaseFixer, for example.
Play both channels simultaneously and listen to how the voice changes while you slowly increase the delay.
You'll be stunned at what each tenth of a millisecond does...
The reason is phase cancellation of certain parts of the signal that are very important for the impression of a human voice.
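To put rough numbers on that: summing a signal with a delayed copy of itself acts like a comb filter, with the deepest cancellations at odd multiples of 1/(2 x delay). Here is a minimal sketch of that arithmetic, assuming two equally loud channels and a pure delay between them (the function name is just made up for illustration):

```python
def notch_frequencies(delay_ms, f_max_hz=20000.0):
    """Frequencies (Hz) that cancel when a signal is summed with a copy
    of itself delayed by delay_ms (equal levels, pure delay assumed)."""
    delay_s = delay_ms / 1000.0
    notches = []
    k = 0
    while True:
        # Cancellation happens where the delay equals an odd number of half periods.
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > f_max_hz:
            break
        notches.append(f)
        k += 1
    return notches

# Print the first three notches in the audible range for a few small delays.
for delay in (0.1, 0.2, 0.5, 0.8):
    print(delay, "ms:", [round(f) for f in notch_frequencies(delay)[:3]])
```

With a delay of 0.8 ms the first notch lands at about 625 Hz, right in the body of a voice, and every extra tenth of a millisecond moves the whole set of notches.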
And yes, the STW verbs by Warp69 are incredibly precise in that context.
Once you've understood the process on a 'perfect' recording, examine your processing chain for possible unintended latencies.
For example, the TransientDesigner introduces 0.8 ms.
If in doubt, do the two-channel mixer test as above: the 'dry' channel gets the delay, and the FX channel to be measured goes straight into the mixer.
If the channels play without artifacts at zero delay, the effect adds no latency. Otherwise, increase the delay until the artifacts vanish, and you'll know exactly how long the processing takes.
You could modify the setup by inverting the phase of one channel, so that the signal cancels out and the null serves as an indicator, but I'd rather suggest doing it 'by ear'.
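If you want to see the inverted-phase variant as numbers, here is a small simulation of it. It is only a sketch under loud assumptions: the 'effect' is a hypothetical one that does nothing but delay the audio by 38 samples, the 'dry' source is plain noise, and the sample rate is assumed to be 48 kHz. A real effect also changes the sound, so the null will never be perfect, but the deepest one should still show up at the matching delay.

```python
import random

SAMPLE_RATE = 48000  # Hz, assumed project sample rate

def residual_energy(dry, wet, delay):
    """Energy left over when the dry channel, delayed by `delay` samples,
    is subtracted from the wet channel (the phase-inverted sum)."""
    total = 0.0
    for n in range(len(wet)):
        delayed_dry = dry[n - delay] if n >= delay else 0.0
        total += (wet[n] - delayed_dry) ** 2
    return total

def find_latency_samples(dry, wet, max_delay):
    """Delay in samples that gives the deepest null."""
    return min(range(max_delay + 1),
               key=lambda d: residual_energy(dry, wet, d))

# Stand-in signals: noise as the dry source, and a hypothetical effect
# that only delays the audio by 38 samples.
rng = random.Random(0)
dry = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_latency = 38
wet = [0.0] * true_latency + dry[:-true_latency]

found = find_latency_samples(dry, wet, max_delay=100)
print(found, "samples =", round(found * 1000.0 / SAMPLE_RATE, 3), "ms")
```

Note that this searches in whole samples only, so at 48 kHz the resolution is about 0.02 ms.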
You'll gain quite some experience of how phase problems actually sound this way.
Don't forget that this process works in both directions: what makes your voices sound thin can also beef them up (to a degree), as signals may just as well add up due to phase problems.
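The same simple model shows both directions. Below is a sketch of the level you get at a single frequency when two equally loud copies are summed, compared to one channel alone, again assuming a pure delay between them (the function name is hypothetical):

```python
import math

def sum_gain_db(freq_hz, delay_ms):
    """Level in dB, relative to one channel alone, at freq_hz when a signal
    is summed with an equally loud copy of itself delayed by delay_ms."""
    magnitude = 2.0 * abs(math.cos(math.pi * freq_hz * delay_ms / 1000.0))
    if magnitude == 0.0:
        return float("-inf")  # total cancellation
    return 20.0 * math.log10(magnitude)

# With 0.8 ms between the channels, some frequencies nearly vanish
# while others come out louder than a single channel.
for f in (100, 625, 1250, 1875):
    print(f, "Hz:", round(sum_gain_db(f, 0.8), 1), "dB")
```

In this model the boost tops out at about 6 dB (where the delay is a whole number of periods), while the cuts can go all the way down, which is why 'to a degree' is the right caveat.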
At least it's another proof that all those low ms latency discussions are completely pointless...
cheers, Tom