
dc.contributor.author    Grauman, Kristen    en_US
dc.date.accessioned    2013-03-26T20:03:48Z
dc.date.available    2013-03-26T20:03:48Z
dc.date.issued    2013-03-06
dc.identifier.uri    http://hdl.handle.net/1853/46501
dc.description    Presented on March 6, 2013 from 12:00 pm - 1:00 pm in Room 1116 of the Marcus Nanotechnology building.    en_US
dc.description    Runtime: 64:26 minutes.    en_US
dc.description    Kristen Grauman is an Associate Professor in the Department of Computer Science at the University of Texas at Austin. Her research in computer vision and machine learning focuses on visual search and object recognition. Before joining UT-Austin in 2007, she received her Ph.D. in the EECS department at MIT, in the Computer Science and Artificial Intelligence Laboratory. She is an Alfred P. Sloan Research Fellow and Microsoft Research New Faculty Fellow, a recipient of NSF CAREER and ONR Young Investigator awards, and the recipient of the 2013 Computers and Thought Award from the International Joint Conference on Artificial Intelligence. She and her collaborators were recognized with the CVPR Best Student Paper Award in 2008 for their work on hashing algorithms for large-scale image retrieval, and the Marr Best Paper Prize at ICCV in 2011 for their work on modeling relative visual attributes.    en_US
dc.description.abstract    Widespread visual sensors and unprecedented connectivity have left us awash with visual data: online photo collections, home videos, news footage, medical images, and surveillance feeds. How can we efficiently browse image and video collections based on semantically meaningful criteria? How can we bring order to the data, beyond manually defined keyword tags? We are exploring these questions in our recent work on interactive visual search and summarization. I will first present a novel form of interactive feedback for visual search, in which a user helps pinpoint the content of interest by making visual comparisons between his envisioned target and reference images. The approach relies on a powerful mid-level representation of interpretable relative attributes to connect the user’s descriptions to the system’s internal features. Whereas traditional feedback limits input to coarse binary labels, the proposed “WhittleSearch” lets a user state precisely what about an image is relevant, leading to more rapid convergence to the desired content. Turning to issues in video browsing, I will then present our work on automatic summarization of egocentric videos. Given a long video captured with a wearable camera, our method produces a short storyboard summary. Whereas existing summarization methods define sampling-based objectives (e.g., to maximize diversity in the output summary), we take a “story-driven” approach that predicts the high-level importance of objects and the influence they carry between subevents. We show this leads to substantially more accurate summaries, allowing a viewer to quickly understand the gist of a long video. This is work done with Adriana Kovashka, Yong Jae Lee, Devi Parikh, and Zheng Lu.    en_US
dc.format.extent    64:26 minutes
dc.language.iso    en_US    en_US
dc.publisher    Georgia Institute of Technology    en_US
dc.relation.ispartofseries    IRIM Seminar Series    en_US
dc.subject    Robotics    en_US
dc.subject    Intelligent machines    en_US
dc.subject    Visual    en_US
dc.subject    Data    en_US
dc.subject    Visual search    en_US
dc.subject    Visual data    en_US
dc.title    Visual Search and Summarization    en_US
dc.type    Lecture    en_US
dc.type    Video    en_US
dc.contributor.corporatename    Georgia Institute of Technology. Center for Robotics and Intelligent Machines    en_US
dc.contributor.corporatename    University of Texas at Austin    en_US
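
The abstract above describes how WhittleSearch replaces coarse binary relevance labels with relative-attribute comparisons ("the target is more formal than this reference") to converge on the desired content more quickly. The Python fragment below is a minimal illustrative sketch of that whittling idea only; the names (Feedback, whittle_candidates), the precomputed per-image attribute scores, and the hard filtering rule are assumptions made for illustration and do not reproduce the authors' actual implementation.

# Hedged illustration of relative-attribute ("whittling") feedback.
# Assumes per-image attribute scores are already available (e.g., from
# pre-trained attribute ranking functions); the data, names, and filtering
# rule here are illustrative only, not taken from the WhittleSearch paper.

from dataclasses import dataclass

@dataclass
class Feedback:
    """User statement: the envisioned target is MORE/LESS <attribute> than a reference image."""
    reference_id: int
    attribute: str
    more: bool  # True -> target should score higher on the attribute than the reference

def whittle_candidates(candidates, attribute_scores, feedback_list):
    """Keep only candidates consistent with every relative-attribute comparison."""
    kept = []
    for img_id in candidates:
        consistent = True
        for fb in feedback_list:
            ref_score = attribute_scores[fb.reference_id][fb.attribute]
            cand_score = attribute_scores[img_id][fb.attribute]
            if fb.more and not cand_score > ref_score:
                consistent = False
                break
            if not fb.more and not cand_score < ref_score:
                consistent = False
                break
        if consistent:
            kept.append(img_id)
    return kept

# Toy usage: three images scored on two attributes (made-up numbers).
attribute_scores = {
    0: {"formal": 0.2, "shiny": 0.9},
    1: {"formal": 0.7, "shiny": 0.4},
    2: {"formal": 0.9, "shiny": 0.1},
}
feedback = [Feedback(reference_id=0, attribute="formal", more=True)]
print(whittle_candidates([0, 1, 2], attribute_scores, feedback))  # -> [1, 2]

A softer variant would rank candidates by how many comparisons they satisfy rather than discarding inconsistent ones outright, which is closer in spirit to how accumulated feedback is typically used.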



This item appears in the following Collection(s)

  • IRIM Seminar Series [106]
    Each semester a core seminar series is announced featuring guest speakers from around the world and from varying backgrounds in robotics.
