A synchronized spontaneous otoacoustic emission paradigm was used to record responses over 80 ms intervals following a click stimulus. The recorded responses were decomposed into basic waveforms by adaptive approximation with a matching pursuit algorithm. High-resolution time-frequency distributions of signal energy were calculated and revealed three types of components: (1) purely evoked components lasting less than 5 ms, (2) longer-lasting, decaying components with exponentially decreasing amplitude, and (3) long-lasting, stable components. The frequency distributions of components of different durations were similar, with most components falling within the 1-2 kHz interval. It is shown that the presence of long-lasting components may influence estimates of the latency of evoked emissions, especially at higher frequencies, where the evoked part is very short.
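The matching pursuit decomposition mentioned above can be illustrated with a minimal sketch. It is not the implementation used in the study: the sampling rate, the tiny Gabor dictionary, the fixed atom phase, and the function names (`gabor_atom`, `build_dictionary`, `matching_pursuit`) are all assumptions chosen for illustration, and the test signal is synthetic. The sketch only shows the greedy principle: at each step the dictionary atom with the largest inner product with the residual is selected and subtracted.

```python
import numpy as np

def gabor_atom(n, center, freq, width, fs):
    """Unit-norm Gabor atom: Gaussian envelope times a cosine (phase fixed for simplicity)."""
    t = np.arange(n) / fs
    g = np.exp(-np.pi * ((t - center) / width) ** 2) * np.cos(2 * np.pi * freq * (t - center))
    return g / np.linalg.norm(g)

def build_dictionary(n, fs, centers, freqs, widths):
    """Build every (center, freq, width) combination; returns the atoms and their parameters."""
    params = [(c, f, w) for c in centers for f in freqs for w in widths]
    atoms = np.array([gabor_atom(n, c, f, w, fs) for c, f, w in params])
    return atoms, params

def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: pick the best-matching atom, subtract it, repeat."""
    residual = signal.astype(float).copy()
    picks = []
    for _ in range(n_iter):
        products = atoms @ residual              # inner product with every atom
        k = int(np.argmax(np.abs(products)))     # best-matching atom
        picks.append((k, float(products[k])))
        residual -= products[k] * atoms[k]       # remove its contribution
    return picks, residual

# Hypothetical setup: 80 ms window sampled at an assumed 16 kHz.
fs = 16000
n = int(0.080 * fs)
atoms, params = build_dictionary(
    n, fs,
    centers=[0.005, 0.040],   # seconds
    freqs=[1500.0, 3000.0],   # Hz
    widths=[0.002, 0.040],    # seconds (Gaussian scale)
)

# Synthetic signal: one short high-frequency component plus one long-lasting component.
short_idx = params.index((0.005, 3000.0, 0.002))
long_idx = params.index((0.040, 1500.0, 0.040))
signal = 2.0 * atoms[short_idx] + 1.0 * atoms[long_idx]

picks, residual = matching_pursuit(signal, atoms, n_iter=2)
for k, coef in picks:
    center, freq, width = params[k]
    print(f"freq={freq:.0f} Hz, center={center * 1e3:.0f} ms, "
          f"width={width * 1e3:.0f} ms, amplitude={coef:.2f}")
```

In this toy case the two selected atoms differ mainly in their width parameter, which is the kind of quantity a duration-based grouping of components (short evoked versus long-lasting) would rely on. A realistic analysis would need a much larger dictionary that also varies the atom phase.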