When Data's Deep, Dark Places Need to be Illuminated




Supersized Science show

Summary: The World Wide Web is like an iceberg, with most of its data hidden below the surface. There lies the 'deep web,' estimated to be 500 times bigger than the 'surface web' that most people see through search engines like Google. An innovative data-intensive supercomputer at TACC called Wrangler is helping researchers get meaningful answers from the hidden data of the public web. Wrangler uses 600 terabytes of flash storage that speedily reads and writes files. This lets it fly past the big data bottlenecks that can slow down even the fastest computers. Podcast host Jorge Salazar interviews graduate student Karanjeet Singh and Chris Mattmann, Chief Architect in the Instrument and Science Data Systems Section of NASA's Jet Propulsion Laboratory at the California Institute of Technology. Mattmann is also an adjunct Associate Professor of Computer Science at the University of Southern California and a member of the Board of Directors for the Apache Software Foundation.