Hearst: Machine Learning and Metadata Frameworks for News Media

The promise of automation.

Media companies are experimenting with software and systems that reduce the cost and improve the workflow of producing high-quality content. From news to entertainment, automation promises to give journalists and content producers automatically generated building blocks that they can synthesize into higher-value content while driving efficiencies, which makes it an area ripe for experiment. Open-source tools, ranging from natural language processing and deep learning to computer vision and automatic video annotation, make the generation, optimization and recombination of content more compelling than ever.

Hearst sought to develop partnerships with universities and startups that would expose its data science team to new ideas emerging in data science disciplines, including machine learning and artificial intelligence. This program is meant to facilitate and accelerate the creation of local news stories from raw video footage by building a tool that extracts metadata using an ontology of knowledge elements.


The program is led by Shih-Fu Chang, Senior Executive Vice Dean and the Richard Dicker Professor of The Fu Foundation School of Engineering and Applied Science at Columbia University. Chang also directs the Digital Video and Multimedia Lab, whose research focuses on multimedia information retrieval, computer vision, and machine learning.