About this resource:
The TED-LIUM corpus (mirrored here) is English-language TED talks, with transcriptions, sampled at 16kHz. It contains about 118 hours of speech.
The original page requests that you cite the following paper if you make use of this corpus:
A. Rousseau, P. Deléglise, and Y. Estève, "TED-LIUM: an automatic speech recognition dedicated corpus",
in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), May 2012.