Neural machine translation systems are usually trained on large corpora consisting of pairs of pre-translated sentences. The paper Unsupervised Machine Translation Using Monolingual Corpora Only by Guillaume Lample, Ludovic Denoyer, and Marc'Aurelio Ranzato proposes an unsupervised neural machine translation system that can be trained without such parallel data. The authors offer two motivations for their work:

- To translate between languages for which large parallel corpora do not exist.
- To provide a strong lower bound that any semi-supervised machine translation system is expected to exceed.

Note: What is a corpus (plural corpora)?

In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed). Corpora are used for statistical analysis and hypothesis testing, for checking occurrences, and for validating linguistic rules within a specific language territory. A corpus may contain texts in a single language (a monolingual corpus) or text data in multiple languages (a multilingual corpus).

Overview of unsupervised translation system

The unsupervised translation scheme has the following outline:

- The word-vector embeddings of the source and target languages are aligned in an unsupervised manner.
- Sentences from the source and target language are mapped to a common latent vector space by an encoder, and then mapped to probability distributions over sentences in the target or source language by a decoder.
- A de-noising auto-encoder loss encourages the latent-space representations to be insensitive to noise.

The key idea here is to build a common latent space between languages. I shall describe these components in the following sections; a rough sketch of how they fit together is shown below.
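The following is a minimal sketch of that outline, not the architecture used in the paper: a single shared encoder maps sentences from either language into one latent space, and a per-language decoder maps latent states back to distributions over words. All module choices, sizes, and names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # assumed toy vocabulary size, shared across languages
EMBED_DIM = 64
HIDDEN_DIM = 128

class SharedEncoder(nn.Module):
    """Maps a batch of token ids (from either language) to latent vectors."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        states, _ = self.rnn(self.embed(tokens))
        return states                              # (batch, seq_len, HIDDEN_DIM)

class Decoder(nn.Module):
    """Maps latent vectors to a distribution over words at each position."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(HIDDEN_DIM, HIDDEN_DIM, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, latent):                     # latent: (batch, seq_len, HIDDEN_DIM)
        states, _ = self.rnn(latent)
        return self.out(states).log_softmax(dim=-1)   # per-word log-probabilities

encoder = SharedEncoder()                          # shared between both languages
decoders = {"src": Decoder(), "tgt": Decoder()}    # one decoder per language

# Encoding a toy source-language batch and decoding it back into the source
# language is the auto-encoding path; decoding with decoders["tgt"] instead
# would be the translation path.
tokens = torch.randint(0, VOCAB_SIZE, (2, 7))      # batch of 2 sentences, length 7
latent = encoder(tokens)
log_probs = decoders["src"](latent)
print(log_probs.shape)                             # torch.Size([2, 7, 1000])
```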
In the figure from the paper, on the left, the model is trained to reconstruct a sentence from a noisy version of it in the same language: x is the target, C(x) is the noisy input, and \hat{x} is the reconstruction. At each step, the LSTM decoder outputs a probability distribution over words, which should be interpreted as the distribution of the next word according to the decoder. The probability the decoder assigns to a sentence is then the product of the probabilities computed for each word in this manner.
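As a toy illustration of that last point, here is a short sketch with made-up numbers and an assumed three-word vocabulary: the decoder's per-step distributions are taken as given, and the sentence is scored by multiplying the probability assigned to each of its words.

```python
import math

# Assumed per-step distributions over a toy vocabulary {"a": 0, "b": 1, "c": 2}.
step_distributions = [
    [0.7, 0.2, 0.1],   # decoder's distribution over the 1st word
    [0.1, 0.6, 0.3],   # decoder's distribution over the 2nd word
    [0.2, 0.2, 0.6],   # decoder's distribution over the 3rd word
]
sentence = [0, 1, 2]   # token ids of the sentence being scored

# Product of per-word probabilities, accumulated in log space for stability.
log_prob = sum(math.log(dist[word]) for dist, word in zip(step_distributions, sentence))
print(f"{math.exp(log_prob):.3f}")   # 0.7 * 0.6 * 0.6 = 0.252
```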
De-noising auto-encoders

De-noising auto-encoders were introduced by Vincent et al. (2008), who provided numerous justifications, one of which is particularly illuminating. A de-noising auto-encoder is a function optimized to map a corrupted sample from some dataset back to the original, un-corrupted sample.
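To make this concrete, here is a minimal de-noising auto-encoder sketch on toy vectors rather than sentences; the architecture, noise model, and hyper-parameters are illustrative assumptions, not those of the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
data = torch.rand(256, 16)                       # toy "clean" samples x

model = nn.Sequential(                           # encoder + decoder
    nn.Linear(16, 8), nn.ReLU(),                 # encode to a smaller latent code
    nn.Linear(8, 16),                            # decode back to the input space
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(200):
    corrupted = data + 0.3 * torch.randn_like(data)   # C(x): additive noise
    reconstruction = model(corrupted)                  # \hat{x}
    loss = loss_fn(reconstruction, data)               # compare against the clean x
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final reconstruction loss: {loss.item():.4f}")
```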