Source: Computer Speech & Language
Author(s): Young-Bum Kim, Karl Stratos, Ruhi Sarikaya
In this paper, we introduce a simple unsupervised framework for pre-training hidden-unit conditional random fields (HUCRFs), i.e., learning initial parameter estimates for HUCRFs prior to supervised training. Pre-training is generally important for models with non-convex training objectives such as deep neural nets. Our framework exploits the model structure of HUCRFs to make effective use of unlabeled data. The key idea is to use the separation of HUCRF parameters between observations and labels: this allows us to pre-train observation parameters independently of label parameters on resources such as unlabeled data or data labeled for non-target tasks. Pre-training is achieved by creating pseudo-labels from such resources. In the case of unlabeled data, we cluster observations and use the resulting clusters as pseudo-labels. Observation parameters can be trained on these external resources and then transferred to initialize the supervised training process on the target labeled data. Experiments on various sequence labeling tasks demonstrate that the proposed pre-training method consistently yields significant improvements in performance. The core idea could be extended to other learning techniques, including deep learning. We applied the proposed technique to recurrent neural networks (RNNs) with a long short-term memory (LSTM) architecture and obtained similar gains.
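The sketch below illustrates the pre-training idea described in the abstract under simplifying assumptions: the HUCRF is replaced by a per-token two-layer softmax model so that the split between observation parameters (features to hidden units) and label parameters (hidden units to labels) fits in a few lines, and the data arrays are randomly generated stand-ins for token feature vectors. The actual paper trains a full HUCRF with sequence-level objectives; only the overall recipe is shown here: cluster unlabeled observations to obtain pseudo-labels, pre-train the observation parameters on them, then transfer those parameters to initialize supervised training on the target labeled data.

```python
# Minimal sketch of the pre-training recipe (not the paper's HUCRF code).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, n_out, W_obs=None, n_hidden=16, lr=0.1, epochs=200):
    """Two-layer softmax: observation weights W_obs (features -> hidden units)
    and output weights W_out (hidden units -> labels), trained by gradient descent."""
    n_feat = X.shape[1]
    if W_obs is None:
        W_obs = 0.01 * rng.standard_normal((n_feat, n_hidden))
    W_out = 0.01 * rng.standard_normal((n_hidden, n_out))
    Y = np.eye(n_out)[y]
    for _ in range(epochs):
        H = np.tanh(X @ W_obs)                # hidden-unit activations
        P = softmax(H @ W_out)
        G = (P - Y) / len(X)                  # cross-entropy gradient w.r.t. logits
        W_out -= lr * H.T @ G
        W_obs -= lr * X.T @ ((G @ W_out.T) * (1 - H ** 2))
    return W_obs, W_out

# Hypothetical data: many unlabeled observations, a small labeled target set.
X_unlab = rng.standard_normal((1000, 20))
X_train = rng.standard_normal((50, 20))
y_train = (X_train[:, 0] > 0).astype(int)

# 1) Create pseudo-labels by clustering the unlabeled observations.
pseudo = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X_unlab)

# 2) Pre-train observation parameters on the pseudo-label task; the
#    cluster-side output weights are discarded.
W_obs_pre, _ = train(X_unlab, pseudo, n_out=8)

# 3) Transfer: initialize the supervised model's observation parameters with
#    the pre-trained values, then train on the labeled target data.
W_obs, W_out = train(X_train, y_train, n_out=2, W_obs=W_obs_pre)
```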