Recurrent Neural Networks (RNNs) and Word2Vec.
## Resources
- Overview Articles:
  - Unreasonable Effectiveness of RNNs (http://karpathy.github.io/2015/05/21/rnn-effectiveness/) `article:easy`
  - Deep Learning, NLP, and Representations (http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/) `article:medium`
  - Understanding LSTM Networks (http://colah.github.io/posts/2015-08-Understanding-LSTMs/) `article:medium`
- Stanford cs224n: Deep NLP (https://www.youtube.com/playlist?list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6) `course:medium` (replaces cs224d)
- TensorFlow Tutorials (https://www.tensorflow.org/tutorials/word2vec) `tutorial:medium` (start at Word2Vec and continue through the next two pages)
- Deep Learning Resources (http://ocdevel.com/podcasts/machine-learning/9)
## Deep NLP pros
- Handles language complexity & nuance
  - Feature learning instead of manual feature engineering
  - Salary depends on degree*field (an interaction), not degree + field (see the sketch after this list)
  - Multiple layers: pixels => lines => objects
  - Multiple layers of language
- One model to rule them all; end-to-end (E2E) models
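The degree*field bullet is about feature interactions: a linear model over additive, hand-engineered features cannot represent salary = degree*field, while a deep net can learn that interaction on its own. A minimal sketch with made-up synthetic salary data (all numbers and feature names below are assumptions for illustration, not from the episode):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up synthetic data: salary depends on degree*field (an interaction),
# not on degree + field.
degree = rng.integers(1, 4, size=1000)            # 1=BS, 2=MS, 3=PhD
field = rng.choice([1.0, 1.5, 2.0], size=1000)    # field pay multiplier
salary = 40_000 * degree * field + rng.normal(0, 5_000, size=1000)

def rmse_of_linear_fit(X, y):
    """Least-squares fit with a bias column; return the RMSE."""
    X = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sqrt(np.mean((X @ w - y) ** 2))

additive = np.column_stack([degree, field])                     # degree + field only
interaction = np.column_stack([degree, field, degree * field])  # hand-engineered interaction

print("additive features RMSE:   ", round(rmse_of_linear_fit(additive, salary)))
print("interaction feature RMSE: ", round(rmse_of_linear_fit(interaction, salary)))
```

The additive model needs a human to hand-craft the `degree * field` column; a multi-layer network can learn an equivalent feature in its hidden layers, which is the feature-learning point above.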
## Sequence vs non-sequence
- DNN = ANN = MLP = feed-forward
- RNNs for sequence data (time series)
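A quick sketch of what "feed-forward" means, in numpy with made-up shapes: one fixed-size input flows through the layers once, with no loop and no memory between examples (the recurrence in the next section is what adds that memory):

```python
import numpy as np

rng = np.random.default_rng(0)

# Feed-forward (MLP): a fixed-size input flows through the layers once;
# there is no loop and no memory of previous inputs.
x = rng.normal(size=4)                        # one fixed-size example
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

h = np.tanh(W1 @ x + b1)                      # hidden layer
y = W2 @ h + b2                               # output layer
print(y)                                      # 2 output scores for this one input
```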
## RNNs
- Looped hidden layers; the network learns nuance from combinations of features
- Carries information through time: the basis of language models (minimal recurrence sketch below)
- Translation, sentiment analysis, classification, POS tagging, NER, ...
- Seq2seq, encoder/decoder
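A minimal sketch of the recurrence (numpy, all sizes made up): the same weights are reused at every time step, and the hidden state carries information forward through the sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Vanilla RNN recurrence: the same weights are applied at every time step,
# and the hidden state h carries information forward through the sequence.
T, input_dim, hidden_dim = 5, 4, 8          # made-up sizes
xs = rng.normal(size=(T, input_dim))        # one input sequence of length T

W_xh = rng.normal(size=(hidden_dim, input_dim))
W_hh = rng.normal(size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                    # initial hidden state
for x_t in xs:                              # the "looped hidden layer"
    h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)

print(h.shape)  # (8,): a summary of the whole sequence, e.g. for classification
```

In a seq2seq / encoder-decoder setup, an encoder RNN like this hands its final hidden state (or all of its states) to a decoder RNN that generates the output sequence, e.g. a translation.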
## Word2Vec (https://www.tensorflow.org/tutorials/word2vec)
- One-hot (sparse) vectors don't capture word similarity (plus sparse = expensive to compute)
- Word embeddings
  - Euclidean distance for synonyms / similar words; cosine similarity for "projections" (analogies), e.g. king - man + woman ≈ queen
  - t-SNE (t-distributed stochastic neighbor embedding) for visualizing embeddings
- Vector Space Models (VSMs): learn from context; predictive vs count-based approaches
- Predictive methods (neural probabilistic language models): learn model parameters that predict a word's contexts
  - Word2vec
  - CBOW / skip-gram (CBOW predicts the center word from its context, skip-gram predicts the context from the center word; CBOW tends to suit smaller datasets, skip-gram larger ones). See the sketch after this list.
  - DNN with a softmax hypothesis function; NCE loss (noise contrastive estimation)
- Count-based methods / distributional semantics: compute statistics of how often each word co-occurs with its neighbors in a large corpus, then map those count statistics down to a small, dense vector per word
  - GloVe
  - Linear algebra (PCA, LSA, SVD)
  - Pros (?): faster, more accurate, incremental fitting. Cons (?): data hungry, more RAM. More info (http://blog.aylien.com/overview-word-embeddings-history-word2vec-cbow-glove/)
- DNN for POS tagging, NER (or RNNs)
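A toy skip-gram sketch in numpy, under stated assumptions: a made-up ten-word corpus, negative sampling standing in for the NCE loss used in the TensorFlow tutorial, and arbitrary hyperparameters. It trains center-word embeddings to score true (center, context) pairs above random noise words:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus and vocabulary (made up for illustration).
corpus = "the king rules the land the queen rules the land".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
tokens = np.array([idx[w] for w in corpus])

V, D, window, n_neg, lr = len(vocab), 16, 2, 3, 0.05
W_in = rng.normal(scale=0.1, size=(V, D))    # center-word embeddings (the ones you keep)
W_out = rng.normal(scale=0.1, size=(V, D))   # context-word ("output") embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for epoch in range(200):
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            context = tokens[j]
            # One positive pair plus a few random "noise" words: negative
            # sampling, a cheap stand-in for a full softmax / NCE over V.
            targets = np.concatenate(([context], rng.integers(0, V, size=n_neg)))
            labels = np.zeros(len(targets)); labels[0] = 1.0
            v = W_in[center]                       # (D,)
            u = W_out[targets]                     # (n_neg+1, D)
            p = sigmoid(u @ v)                     # predicted "is real context" probs
            g = p - labels                         # gradient of the logistic loss
            W_in[center] -= lr * (g @ u)           # update the center embedding
            W_out[targets] -= lr * np.outer(g, v)  # update the context embeddings

# Cosine similarity in the learned space: words used in similar contexts end up close.
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(W_in[idx["king"]], W_in[idx["queen"]]))
print(cosine(W_in[idx["king"]], W_in[idx["land"]]))
```

In practice you would use the TensorFlow tutorial code or a library such as gensim rather than this loop; the embeddings you keep are the input (center-word) matrix, and similarity queries and analogies are done with cosine distance as above.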