Podcast: Data Skeptic
Episode: Building the howto100m Video Corpus

Category: Religion & Spirituality
Duration: 00:22:38
Publish Date: 2019-08-19 15:12:43
Description:

Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. The availability of machine-transcribed explainer videos offers a unique opportunity to rapidly develop a useful, if noisy, corpus of videos that are "self-annotating", since hosts explain the actions they are taking on screen.

This episode discusses the HowTo100M dataset, a project that has assembled a corpus of 136M captioned video clips covering 23k activities.

Related Links

The paper will be presented at ICCV 2019

@antoine77340

Antoine on Github

Antoine's homepage
