The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Summary: Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. Hosted by Sam Charrington, a sought after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science and more.

Podcasts:

What’s Next in LLM Reasoning? with Roland Memisevic - #646 | File Type: audio/mpeg | Duration: 3540

Today we’re joined by Roland Memisevic, a senior director at Qualcomm AI Research. In our conversation with Roland, we discuss the significance of language in humanlike AI systems and the advantages and limitations of autoregressive models like Transformers in building them. We cover the current and future role of recurrence in LLM reasoning and the significance of improving grounding in AI—including the potential of developing a sense of self in agents. Along the way, we discuss Fitness Ally, a fitness coach trained on a visually grounded large language model, which has served as a platform for Roland’s research into neural reasoning, as well as recent research that explores topics like visual grounding for large language models and state-augmented architectures for AI agents. The complete show notes for this episode can be found at twimlai.com/go/646.

Is ChatGPT Getting Worse? with James Zou - #645 | File Type: audio/mpeg | Duration: 2537

Today we’re joined by James Zou, an assistant professor at Stanford University. In our conversation with James, we explore the differences in ChatGPT’s behavior over the last few months. We discuss the issues that can arise from inconsistencies in generative AI models, how he tested ChatGPT’s performance in various tasks, drawing comparisons between March 2023 and June 2023 for both GPT-3.5 and GPT-4 versions, and the possible reasons behind the declining performance of these models. James also shared his thoughts on how surgical AI editing akin to CRISPR could potentially revolutionize LLM and AI systems, and how adding monitoring tools can help in tracking behavioral changes in these models. Finally, we discuss James' recent paper on pathology image analysis using Twitter data, in which he explores the challenges of obtaining large medical datasets and data collection, as well as detailing the model’s architecture, training, and the evaluation process. The complete show notes for this episode can be found at twimlai.com/go/645.

Why Deep Networks and Brains Learn Similar Features with Sophia Sanborn - #644 | File Type: audio/mpeg | Duration: 2715

Today we’re joined by Sophia Sanborn, a postdoctoral scholar at the University of California, Santa Barbara. In our conversation with Sophia, we explore the concept of universality between neural representations and deep neural networks, and how these principles of efficiency provide an ability to find consistent features across networks and tasks. We also discuss her recent paper on Bispectral Neural Networks which focuses on Fourier transform and its relation to group theory, the implementation of bi-spectral spectrum in achieving invariance in deep neural networks, the expansion of geometric deep learning on the concept of CNNs from other domains, the similarities in the fundamental structure of artificial neural networks and biological neural networks and how applying similar constraints leads to the convergence of their solutions. The complete show notes for this episode can be found at twimlai.com/go/644.

Inverse Reinforcement Learning Without RL with Gokul Swamy - #643 | File Type: audio/mpeg | Duration: 2035

Today we’re joined by Gokul Swamy, a Ph.D. Student at the Robotics Institute at Carnegie Mellon University. In the final conversation of our ICML 2023 series, we sat down with Gokul to discuss his accepted papers at the event, leading off with “Inverse Reinforcement Learning without Reinforcement Learning.” In this paper, Gokul explores the challenges and benefits of inverse reinforcement learning, and the potential and advantages it holds for various applications. Next up, we explore the “Complementing a Policy with a Different Observation Space” paper which applies causal inference techniques to accurately estimate sampling balance and make decisions based on limited observed features. Finally, we touched on “Learning Shared Safety Constraints from Multi-task Demonstrations” which centers on learning safety constraints from demonstrations using the inverse reinforcement learning approach. The complete show notes for this episode can be found at twimlai.com/go/643.

Explainable AI for Biology and Medicine with Su-In Lee - #642 | File Type: audio/mpeg | Duration: 2294

Today we’re joined by Su-In Lee, a professor at the Paul G. Allen School of Computer Science And Engineering at the University Of Washington. In our conversation, Su-In details her talk from the ICML 2023 Workshop on Computational Biology which focuses on developing explainable AI techniques for the computational biology and clinical medicine fields. Su-In discussed the importance of explainable AI contributing to feature collaboration, the robustness of different explainability approaches, and the need for interdisciplinary collaboration between the computer science, biology, and medical fields. We also explore her recent paper on the use of drug combination therapy, challenges with handling biomedical data, and how they aim to make meaningful contributions to the healthcare industry by aiding in cause identification and treatments for Cancer and Alzheimer's diseases. The complete show notes for this episode can be found at twimlai.com/go/642.

Transformers On Large-Scale Graphs with Bayan Bruss - #641 | File Type: audio/mpeg | Duration: 2316

Today we’re joined by Bayan Bruss, Vice President of Applied ML Research at Capital One. In our conversation with Bayan, we covered a pair of papers his team presented at this year’s ICML conference. We begin with the paper Interpretable Subspaces in Image Representations, where Bayan gives us a dive deep into the interpretability framework, embedding dimensions, contrastive approaches, and how their model can accelerate image representation in deep learning. We also explore GOAT: A Global Transformer on Large-scale Graphs, a scalable global graph transformer. We talk through the computation challenges, homophilic and heterophilic principles, model sparsity, and how their research proposes methodologies to get around the computational barrier when scaling to large-scale graph models. The complete show notes for this episode can be found at twimlai.com/go/641.

The Enterprise LLM Landscape with Atul Deo - #640 | File Type: audio/mpeg | Duration: 2228

Today we’re joined by Atul Deo, General Manager of Amazon Bedrock. In our conversation with Atul, we discuss the process of training large language models in the enterprise, including the pain points of creating and training machine learning models, and the power of pre-trained models. We explore different approaches to how companies can leverage large language models, dealing with the hallucination, and the transformative process of retrieval augmented generation (RAG). Finally, Atul gives us an inside look at Bedrock, a fully managed service that simplifies the deployment of generative AI-based apps at scale. The complete show notes for this episode can be found at twimlai.com/go/640.

BloombergGPT - an LLM for Finance with David Rosenberg - #639 | File Type: audio/mpeg | Duration: 2212

Today we’re joined by David Rosenberg, head of the machine learning strategy team in the Office of the CTO at Bloomberg. In our conversation with David, we discuss the creation of BloombergGPT, a custom-built LLM focused on financial applications. We explore the model’s architecture, validation process, benchmarks, and its distinction from other language models. David also discussed the evaluation process, performance comparisons, progress, and the future directions of the model. Finally, we discuss the ethical considerations that come with building these types of models, and how they've approached dealing with these issues. The complete show notes for this episode can be found at twimlai.com/go/639

Are LLMs Good at Causal Reasoning? with Robert Osazuwa Ness - #638 | File Type: audio/mpeg | Duration: 2901

Today we’re joined by Robert Osazuwa Ness, a senior researcher at Microsoft Research, Professor at Northeastern University, and Founder of Altdeep.ai. In our conversation with Robert, we explore whether large language models, specifically GPT-3, 3.5, and 4, are good at causal reasoning. We discuss the benchmarks used to evaluate these models and the limitations they have in answering specific causal reasoning questions, while Robert highlights the need for access to weights, training data, and architecture to correctly answer these questions. The episode discusses the challenge of generalization in causal relationships and the importance of incorporating inductive biases, explores the model's ability to generalize beyond the provided benchmarks, and the importance of considering causal factors in decision-making processes. The complete show notes for this episode can be found at twimlai.com/go/638.

Privacy vs Fairness in Computer Vision with Alice Xiang - #637 | File Type: audio/mpeg | Duration: 2261

Today we’re joined by Alice Xiang, Lead Research Scientist at Sony AI, and Global Head of AI Ethics at Sony Group Corporation. In our conversation with Alice, we discuss the ongoing debate between privacy and fairness in computer vision, diving into the impact of data privacy laws on the AI space while highlighting concerns about unauthorized use and lack of transparency in data usage. We explore the potential harm of inaccurate AI model outputs and the need for legal protection against biased AI products, and Alice suggests various solutions to address these challenges, such as working through third parties for data collection and establishing closer relationships with communities. Finally, we talk through the history of unethical data collection practices in CV and the emergence of generative AI technologies that exacerbate the problem, the importance of operationalizing ethical data collection and practice, including appropriate consent, representation, diversity, and compensation, and the need for interdisciplinary collaboration in AI ethics and the growing interest in AI regulation, including the EU AI Act and regulatory activities in the US. The complete show notes for this episode can be found at twimlai.com/go/637.

Unifying Vision and Language Models with Mohit Bansal - #636 | File Type: audio/mpeg | Duration: 2888

Today we're joined by Mohit Bansal, Parker Professor, and Director of the MURGe-Lab at UNC, Chapel Hill. In our conversation with Mohit, we explore the concept of unification in AI models, highlighting the advantages of shared knowledge and efficiency. He addresses the challenges of evaluation in generative AI, including biases and spurious correlations. Mohit introduces groundbreaking models such as UDOP and VL-T5, which achieved state-of-the-art results in various vision and language tasks while using fewer parameters. Finally, we discuss the importance of data efficiency, evaluating bias in models, and the future of multimodal models and explainability. The complete show notes for this episode can be found at twimlai.com/go/636.

Data Augmentation and Optimized Architectures for Computer Vision with Fatih Porikli - #635 | File Type: audio/mpeg | Duration: 3151

Today we kick off our coverage of the 2023 CVPR conference joined by Fatih Porikli, a Senior Director of Technology at Qualcomm. In our conversation with Fatih, we covered quite a bit of ground, touching on a total of 12 papers/demos, focusing on topics like data augmentation and optimized architectures for computer vision. We explore advances in optical flow estimation networks, cross-model, and stage knowledge distillation for efficient 3D object detection, and zero-shot learning via language models for fine-grained labeling. We also discuss generative AI advancements and computer vision optimization for running large models on edge devices. Finally, we discuss objective functions, architecture design choices for neural networks, and efficiency and accuracy improvements in AI models via the techniques introduced in the papers.

Mojo: A Supercharged Python for AI with Chris Lattner - #634 | File Type: audio/mpeg | Duration: 3442

Today we’re joined by Chris Lattner, Co-Founder and CEO of Modular. In our conversation with Chris, we discuss Mojo, a new programming language for AI developers. Mojo is unique in this space and simplifies things by making the entire stack accessible and understandable to people who are not compiler engineers. It also offers Python programmers the ability to make it high-performance and capable of running accelerators, making it more accessible to more people and researchers. We discuss the relationship between the Modular Engine and Mojo, the challenge of packaging Python, particularly when incorporating C code, and how Mojo aims to solve these problems to make the AI stack more dependable. The complete show notes for this episode can be found at twimlai.com/go/634

Stable Diffusion and LLMs at the Edge with Jilei Hou - #633 | File Type: audio/mpeg | Duration: 2409

Today we’re joined by Jilei Hou, a VP of Engineering at Qualcomm Technologies. In our conversation with Jilei, we focus on the emergence of generative AI, and how they've worked towards providing these models for use on edge devices. We explore how the distribution of models on devices can help amortize large models' costs while improving reliability and performance and the challenges of running machine learning workloads on devices, including model size and inference latency. Finally, Jilei we explore how these emerging technologies fit into the existing AI Model Efficiency Toolkit (AIMET) framework. The complete show notes for this episode can be found at twimlai.com/go/633

Modeling Human Behavior with Generative Agents with Joon Sung Park - #632 | File Type: audio/mpeg | Duration: 2798

Today we’re joined by Joon Sung Park, a PhD Student at Stanford University. Joon shares his passion for creating AI systems that can solve human problems and his work on the recent paper Generative Agents: Interactive Simulacra of Human Behavior, which showcases generative agents that exhibit believable human behavior. We discuss using empirical methods to study these systems and the conflicting papers on whether AI models have a worldview and common sense. Joon talks about the importance of context and environment in creating believable agent behavior and shares his team's work on scaling emerging community behaviors. He also dives into the importance of a long-term memory module in agents and the use of knowledge graphs in retrieving associative information. The goal, Joon explains, is to create something that people can enjoy and empower people, solving existing problems and challenges in the traditional HCI and AI field.

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Podcasts:

Comments

Directory

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Podcasts:

Comments

Directory

Click for all Categories