O'Reilly Data Show Podcast

Summary: The O'Reilly Data Show Podcast explores the opportunities and techniques driving big data, data science, and AI.

Podcasts:

 Metadata services can lead to performance and organizational improvements | File Type: audio/mpeg | Duration: 00:41:36

In this episode of the O’Reilly Data Show, I spoke with one of the most popular speakers at Strata + Hadoop World: Joe Hellerstein, professor of Computer Science at UC Berkeley and co-founder/CSO of Trifacta. We talked about his past and current academic research (which spans HCI, databases, and systems), data wrangling, large-scale distributed systems, and his recent work on metadata services.

 Building a business that combines human experts and data science | File Type: audio/mpeg | Duration: 00:35:28

I spoke with Eric Colson, chief algorithms officer at Stitch Fix and former VP of data science and engineering at Netflix. We talked about building and deploying mission-critical, human-in-the-loop systems for consumer Internet companies. Knowing that many companies are grappling with incorporating data science, I also asked Colson to share his experiences building, managing, and nurturing large data science teams at both Netflix and Stitch Fix.

 Is 2016 the year you let robots manage your money? | File Type: audio/mpeg | Duration: 00:41:39

I sat down with Vasant Dhar, a professor at the Stern School of Business and Center for Data Science at NYU, founder of SCT Capital Management, and editor-in-chief of the Big Data Journal (full disclosure: I'm a member of the editorial board). We talked about the early days of AI and data mining, and recent applications of data science to financial investing and other domains.

 Investing in big data technologies | File Type: audio/mpeg | Duration: 00:18:29

In this special holiday episode of the O’Reilly Data Show, I look back at two conversations I had earlier this year at the Spark Summit in San Francisco. The first segment is an on-stage fireside chat with Ben Horowitz, co-founder of Andreessen Horowitz and author of The Hard Thing About Hard Things.

 Building a scalable platform for streaming updates and analytics | File Type: audio/mpeg | Duration: 00:38:52

In this episode of the O’Reilly Data Show, I sit down with Evan Chan, distinguished engineer at Tuplejump. We talk about the early days of Spark (particularly his contributions to Spark/Cassandra integration), his interesting new open source project (FiloDB), and recent trends in cloud computing.

 Graph databases are powering mission-critical applications | File Type: audio/mpeg | Duration: 00:50:30

In this episode of the O'Reilly Data Show, I sat down with Emil Eifrem, CEO and co-founder of Neo Technology. We talked about the early days of NoSQL, applications of graph databases, cloud computing, and company culture in the U.S. and Sweden.

 Jai Ranganathan on architecting big data applications in the cloud | File Type: audio/mpeg | Duration: 00:49:57

In this episode of the O'Reilly Data Show, I sat down with Jai Ranganathan, senior director of product management at Cloudera. We talked about trends in the Hadoop ecosystem, cloud computing, the recent surge in interest in all things real time, and hardware trends.

 Building systems for massive scale data applications | File Type: audio/mpeg | Duration: 00:39:08

In this episode of the O'Reilly Data Show, I sat down with Tyler Akidau, one of the lead engineers in Google's streaming and Dataflow technologies. He recently wrote an extremely popular article that provided a framework for how to think about bounded and unbounded data processing (a follow-up article is due out soon). We talked about the evolution of stream processing, the challenges of building systems that scale to massive data sets, and the recent surge in interest in all things real time.

 Turning big data into actionable insights | File Type: audio/mpeg | Duration: 00:51:52

Evangelos Simoudis has spent many years interacting with entrepreneurs and executives at major global corporations. Most recently, he's been advising companies interested in developing long-term strategies pertaining to big data, data science, cloud computing, and innovation. He began his career as a data mining researcher and practitioner, and is counted among the pioneers who helped data mining technologies get adopted in industry. In this episode of the O'Reilly Data Show, I sat down with Simoudis and we talked about his thoughts on investing, data applications and products, and corporate innovation.

 Resolving transactional access and analytic performance trade-offs | File Type: audio/mpeg | Duration: 00:48:34

In recent months, I've been hearing about hybrid systems designed to handle different data management needs. At Strata + Hadoop World NYC last week, Cloudera's Todd Lipcon unveiled an open source storage layer — Kudu — that's good at both table scans (analytics) and random access (updates and inserts). During the latest episode of the O'Reilly Data Show Podcast, I sat down with Lipcon to discuss his new project a few weeks before it was released.

 Building enterprise data applications with open source components | File Type: audio/mpeg | Duration: 00:48:13

For this Data Show Podcast, I spoke with O'Reilly author and Typesafe's resident big data architect Dean Wampler about Scala and other programming languages, the big data ecosystem, and his recent interest in real-time applications. Dean has years of experience helping companies with large software projects, and over the last several years, he's focused primarily on helping enterprises design and build big data applications.

 From search to distributed computing to large-scale information extraction | File Type: audio/mpeg | Duration: 00:53:53

February 2016 marks the 10th anniversary of Hadoop, at a point in time when many IT organizations actively use Hadoop and/or one of the open source big data projects that originated after it and, in some cases, depend on it. During the latest episode of the O'Reilly Data Show Podcast, I had an extended conversation with Mike Cafarella, assistant professor of computer science at the University of Michigan. Along with Strata + Hadoop World program chair Doug Cutting, Cafarella is a co-founder of both Hadoop and Nutch. In addition, Cafarella was the first contributor to HBase.
