The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Summary: Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers, and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator, and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science, and more.

Podcasts:

 Edutainment for AI and AWS PartyRock with Mike Miller - #661 | File Type: audio/mpeg | Duration: 1786

Today we’re joined by Mike Miller, director of product at AWS responsible for the company’s “edutainment” products. In our conversation with Mike, we explore AWS PartyRock, a no-code generative AI app builder that allows users to easily create fun and shareable AI applications by selecting a model, chaining prompts together, and linking text, image, and chatbot widgets. Additionally, we discuss some of the previous tools Mike’s team has delivered at the intersection of developer education and entertainment, including DeepLens, a computer vision hardware device; DeepRacer, a programmable vehicle that uses reinforcement learning to navigate a track; and DeepComposer, a generative AI-enabled musical keyboard that transforms musical inputs into accompanying compositions. The complete show notes for this episode can be found at twimlai.com/go/661.

 Data, Systems and ML for Visual Understanding with Cody Coleman - #660 | File Type: audio/mpeg | Duration: 2307

Today we’re joined by Cody Coleman, co-founder and CEO of Coactive AI. In our conversation with Cody, we discuss how Coactive has leveraged modern data, systems, and machine learning techniques to deliver its multimodal asset platform and visual search tools. Cody shares his expertise in data-centric AI, and we dig into techniques like active learning and coreset selection and how they can drive greater efficiency throughout the machine learning lifecycle. We explore the various ways Coactive uses multimodal embeddings to enable its core visual search experience, and we cover the infrastructure optimizations the team has implemented to scale its systems. We conclude with Cody’s advice for entrepreneurs and engineers building companies around generative AI technologies. The complete show notes for this episode can be found at twimlai.com/go/660.
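
For readers curious what embedding-based visual search looks like in practice, here is a minimal sketch of the pattern (illustrative only, not Coactive's implementation; the random vectors stand in for embeddings produced by a multimodal encoder such as CLIP):

```python
import numpy as np

def top_k_matches(query_emb, asset_embs, k=5):
    """Return indices and scores of the k assets whose embeddings
    are most similar (by cosine similarity) to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    a = asset_embs / np.linalg.norm(asset_embs, axis=1, keepdims=True)
    sims = a @ q                              # one similarity per asset
    idx = np.argsort(-sims)[:k]
    return idx, sims[idx]

# Random stand-ins for embeddings from a multimodal encoder that maps
# images and text queries into the same vector space.
rng = np.random.default_rng(0)
asset_embs = rng.normal(size=(1_000, 512))    # one row per image asset
query_emb = rng.normal(size=512)              # embedded text query

indices, scores = top_k_matches(query_emb, asset_embs)
print(indices, scores)
```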

 Patterns and Middleware for LLM Applications with Kyle Roche - #659 | File Type: audio/mpeg | Duration: 2158

Today we’re joined by Kyle Roche, founder and CEO of Griptape, to discuss patterns and middleware for LLM applications. We dive into emerging patterns for developing LLM applications, such as off-prompt data, which allows data retrieval without compromising the chain of thought within language models, and pipelines, sequences of tasks handed to LLMs in which each task or step can use a different model. We also explore Griptape, an open-source, Python-based middleware stack that aims to securely connect LLM applications to an organization’s internal and external data systems. We discuss the abstractions it offers, including drivers, memory management, rule sets, DAG-based workflows, and a prompt stack. Additionally, we touch on common customer concerns such as privacy, retraining, and sovereignty issues, and several use cases that leverage role-based retrieval methods to optimize human augmentation tasks. The complete show notes for this episode can be found at twimlai.com/go/659.
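
To make the pipeline pattern concrete, here is a minimal framework-agnostic sketch in plain Python (this is not the Griptape API; `call_llm` and the model names are hypothetical stand-ins):

```python
from typing import Callable

Step = Callable[[str], str]

def call_llm(model: str, prompt: str) -> str:
    """Hypothetical stand-in for an LLM client; each pipeline step
    may route to a different model."""
    return f"[{model} output for: {prompt[:40]}...]"

def make_step(model: str, template: str) -> Step:
    # Each step formats the previous step's output into its own prompt.
    return lambda prev: call_llm(model, template.format(input=prev))

def run_pipeline(steps: list[Step], initial_input: str) -> str:
    result = initial_input
    for step in steps:        # sequential: output of one feeds the next
        result = step(result)
    return result

pipeline = [
    make_step("small-fast-model", "Summarize: {input}"),
    make_step("large-capable-model", "Draft a reply based on: {input}"),
]
print(run_pipeline(pipeline, "Customer asks about a refund..."))
```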

 AI Access and Inclusivity as a Technical Challenge with Prem Natarajan - #658 | File Type: audio/mpeg | Duration: 2506

Today we’re joined by Prem Natarajan, chief scientist and head of enterprise AI at Capital One. In our conversation, we discuss AI access and inclusivity as technical challenges and explore some of the multidisciplinary approaches Prem and his team take to tackling these complexities. We dive into the issues of bias, dealing with class imbalances, and the integration of various research initiatives to achieve additive results. Prem also shares his team’s work on foundation models for financial data curation, highlighting the importance of data quality and the use of federated learning, and emphasizing the impact these factors have on model performance and reliability in critical applications like fraud detection. Lastly, Prem shares his overall approach to AI research in the context of a banking enterprise: prioritizing mission-inspired research that aims to deliver tangible benefits to customers and the broader community, investing in diverse talent and the best infrastructure, and forging strategic partnerships with a variety of academic labs. The complete show notes for this episode can be found at twimlai.com/go/658.
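
As a small illustration of the class-imbalance problem mentioned above (a generic sketch, not Capital One's approach): fraud labels are typically rare, so a common first step is reweighting the loss so the minority class isn't ignored.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))
y = (rng.random(10_000) < 0.01).astype(int)   # ~1% "fraud": heavy imbalance

weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))  # minority class gets a much larger weight

# class_weight="balanced" applies the same reweighting inside the loss
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```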

 Building LLM-Based Applications with Azure OpenAI with Jay Emery - #657 | File Type: audio/mpeg | Duration: 2603

Today we’re joined by Jay Emery, director of technical sales & architecture at Microsoft Azure. In our conversation with Jay, we discuss the challenges organizations face when building LLM-based applications, and we explore some of the techniques they are using to overcome them. We dive into the concerns around security, data privacy, cost management, and performance, as well as the effectiveness of prompting versus fine-tuning in achieving desired results and when each approach should be applied. We cover methods such as prompt tuning, prompt chaining, prompt variance, fine-tuning, and RAG to enhance LLM output, along with ways to speed up inference performance such as choosing the right model, parallelization, and provisioned throughput units (PTUs). Jay also shares several intriguing use cases describing how businesses use tools like Azure Machine Learning prompt flow and Azure ML AI Studio to tailor LLMs to their unique needs and processes. The complete show notes for this episode can be found at twimlai.com/go/657.
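
Here is a minimal sketch of the RAG pattern discussed above, in plain Python; the `embed` function is a hypothetical placeholder for an embeddings API call (e.g., an Azure OpenAI embeddings endpoint), and a real system would send the final prompt to a chat model:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder for an embeddings API call."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

docs = ["Refund policy: ...", "Shipping times: ...", "Warranty terms: ..."]
doc_vecs = np.stack([embed(d) for d in docs])

def answer(question: str, k: int = 2) -> str:
    sims = doc_vecs @ embed(question)          # cosine similarity (unit vectors)
    context = "\n".join(docs[i] for i in np.argsort(-sims)[:k])
    # Grounding the prompt in retrieved context is the core of RAG.
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    return prompt  # a real system would send this to a chat model

print(answer("How do refunds work?"))
```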

 Visual Generative AI Ecosystem Challenges with Richard Zhang - #656 | File Type: audio/mpeg | Duration: 2440

Today we’re joined by Richard Zhang, senior research scientist at Adobe Research. In our conversation with Richard, we explore the research challenges that arise when viewing visual generative AI from an ecosystem perspective, considering the disparate needs of creators, consumers, and contributors. We start with his work on perceptual metrics and the LPIPS paper, which allows us to better align human perception and computer vision and remains in use in contemporary generative AI applications such as Stable Diffusion, GANs, and latent diffusion models. We look at his work creating detection tools for fake visual content, highlighting the importance of generalizing these detection methods to new, unseen models. Lastly, we dig into his work on data attribution and concept ablation, which aims to address the challenging open problem of allowing artists and others to manage their contributions to generative AI training data sets. The complete show notes for this episode can be found at twimlai.com/go/656.
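
The LPIPS metric Richard describes is available as an open-source Python package; a minimal usage sketch, assuming PyTorch and the `lpips` package are installed:

```python
import torch
import lpips  # pip install lpips

# LPIPS compares deep features of a pretrained network (here AlexNet)
# rather than raw pixels, which correlates better with human judgments.
loss_fn = lpips.LPIPS(net="alex")

# Two random "images" in [-1, 1], shaped (batch, 3, H, W)
img0 = torch.rand(1, 3, 64, 64) * 2 - 1
img1 = torch.rand(1, 3, 64, 64) * 2 - 1

distance = loss_fn(img0, img1)   # higher = more perceptually different
print(distance.item())
```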

 Deploying Edge and Embedded AI Systems with Heather Gorr - #655 | File Type: audio/mpeg | Duration: 2316

Today we’re joined by Heather Gorr, principal MATLAB product marketing manager at MathWorks. In our conversation with Heather, we discuss the deployment of AI models to hardware devices and embedded AI systems. We explore the factors to consider during data preparation, model development, and deployment to ensure a successful project: device constraints and latency requirements that dictate how much data flows onto the device and how often; modeling needs such as explainability, robustness, and quantization; the use of simulation throughout the modeling process; the need for robust verification and validation methodologies to ensure safety and reliability; and the need to adapt and apply MLOps techniques for speed and consistency. Heather also shares noteworthy anecdotes about embedded AI deployments in industries including automotive and oil & gas. The complete show notes for this episode can be found at twimlai.com/go/655.
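
As one concrete example of the quantization step Heather mentions (a generic PyTorch sketch, not MathWorks' MATLAB-based workflow), post-training dynamic quantization converts a trained model's weights to 8-bit integers to shrink its footprint for constrained devices:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization: weights stored as int8,
# activations quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # same interface, smaller memory footprint
```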

 AI Sentience, Agency and Catastrophic Risk with Yoshua Bengio - #654 | File Type: audio/mpeg | Duration: 2880

Today we’re joined by Yoshua Bengio, professor at Université de Montréal. In our conversation with Yoshua, we discuss AI safety and the potentially catastrophic risks of AI misuse. Yoshua highlights various risks and the dangers of AI being used to manipulate people, spread disinformation, cause harm, and further concentrate power in society. We dive deep into the risks that arise as AI achieves human-level competence in a growing number of areas, and tackle the challenges of defining and understanding concepts like agency and sentience. Additionally, our conversation touches on solutions to AI safety, such as the need for robust safety guardrails, investments in national security protections and countermeasures, bans on systems with uncertain safety, and the development of governance-driven AI systems. The complete show notes for this episode can be found at twimlai.com/go/654.

 Delivering AI Systems in Highly Regulated Environments with Miriam Friedel - #653 | File Type: audio/mpeg | Duration: 2645

Today we’re joined by Miriam Friedel, senior director of ML engineering at Capital One. In our conversation with Miriam, we discuss some of the challenges of delivering machine learning tools and systems in highly regulated enterprise environments, and some of the practices her teams have adopted to help them operate with greater speed and agility. We also explore how to create a culture of collaboration, the value of standardized tooling and processes, leveraging open source, and incentivizing model reuse. Miriam also shares her thoughts on building a ‘unicorn’ team and what this means for the team she’s built at Capital One, as well as her take on build vs. buy decisions for MLOps and the future of MLOps and enterprise AI more broadly. Throughout, Miriam shares examples of these ideas at work in some of the tools her team has built, such as Rubicon, an open-source experiment management tool, and Kubeflow pipeline components that enable Capital One data scientists to efficiently leverage and scale models. The complete show notes for this episode can be found at twimlai.com/go/653.

 Mental Models for Advanced ChatGPT Prompting with Riley Goodside - #652 | File Type: audio/mpeg | Duration: 2398

Today we’re joined by Riley Goodside, staff prompt engineer at Scale AI. In our conversation with Riley, we explore LLM capabilities and limitations, prompt engineering, and the mental models required to apply advanced prompting techniques. We dive deep into understanding LLM behavior, discussing the mechanism of autoregressive inference, comparing k-shot and zero-shot prompting, and dissecting the impact of RLHF. We also discuss the idea that prompting is a scaffolding structure that leverages the model’s context to elicit the desired behavior and response, rather than an exercise that depends solely on writing ability. The complete show notes for this episode can be found at twimlai.com/go/652.
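
To make the k-shot vs. zero-shot distinction concrete, here is a small sketch of how the two prompt styles are constructed (the task and demonstrations are made-up examples):

```python
def zero_shot(task: str, query: str) -> str:
    # Zero-shot: instructions only, no demonstrations.
    return f"{task}\n\nInput: {query}\nOutput:"

def k_shot(task: str, examples: list[tuple[str, str]], query: str) -> str:
    # k-shot: prepend k worked examples so the autoregressive model
    # can pattern-match the desired input -> output mapping.
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{demos}\n\nInput: {query}\nOutput:"

task = "Classify the sentiment of each input as positive or negative."
examples = [("I loved it", "positive"), ("Total waste of time", "negative")]
print(k_shot(task, examples, "The plot dragged, but the acting was great"))
```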

 Multilingual LLMs and the Values Divide in AI with Sara Hooker - #651 | File Type: audio/mpeg | Duration: 4719

Today we’re joined by Sara Hooker, director at Cohere and head of Cohere For AI, Cohere’s research lab. In our conversation with Sara, we explore some of the challenges of multilingual models, such as poor data quality and tokenization, and how teams rely on data augmentation and preference training to address these bottlenecks. We also discuss the motivating factors behind, and disadvantages of, the Mixture of Experts technique, and the importance of a common language between ML researchers and hardware architects for addressing pain points in frameworks and creating better cohesion between these distinct communities. Sara also highlights the impact language models have had on society and the emotional connection they have created, the benefits and current safety concerns of universal models, and the significance of grounded conversations for characterizing and mitigating the risks of AI development. Along the way, we also dive deep into Cohere and Cohere For AI, along with their Aya project, an open science project that aims to build a state-of-the-art multilingual generative language model, as well as some of their recent research papers. The complete show notes for this episode can be found at twimlai.com/go/651.
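
For readers unfamiliar with the Mixture of Experts technique discussed here, the toy PyTorch sketch below shows the core mechanic: a learned gate routes each input to a weighted combination of expert networks. (Production MoE LLMs use sparse top-k routing inside transformer layers; this dense version is purely illustrative.)

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, dim=32, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)  # learned router

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)              # (batch, E)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)   # (batch, dim, E)
        return (outs * weights.unsqueeze(1)).sum(dim=-1)           # weighted mix

x = torch.randn(8, 32)
print(ToyMoE()(x).shape)  # torch.Size([8, 32])
```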

 Scaling Multi-Modal Generative AI with Luke Zettlemoyer - #650 | File Type: audio/mpeg | Duration: 2324

Today we’re joined by Luke Zettlemoyer, professor at the University of Washington and a research manager at Meta. In our conversation with Luke, we cover multimodal generative AI, the effect of data on models, and the significance of open source and open science. We explore the grounding problem, the need for visual grounding and embodiment in text-based models, and the advantages of discrete tokenization in image generation, as well as his paper Scaling Laws for Generative Mixed-Modal Language Models, which focuses on simultaneously training LLMs on various modalities. Additionally, we cover his papers on Self-Alignment with Instruction Backtranslation and LIMA: Less Is More for Alignment. The complete show notes for this episode can be found at twimlai.com/go/650.
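
A tiny NumPy sketch of the discrete tokenization idea mentioned above: continuous image-patch embeddings are snapped to the nearest entry in a codebook, turning an image into a sequence of discrete tokens a language model can handle (VQ-VAE-style; the codebook here is random for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(512, 64))   # 512 discrete "visual tokens"
patches = rng.normal(size=(196, 64))    # e.g. 14x14 patch embeddings

# Each patch becomes the index of its nearest codebook vector,
# turning an image into a sequence of discrete tokens an LM can model.
dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
tokens = dists.argmin(axis=1)           # shape (196,), ints in [0, 512)
print(tokens[:10])
```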

 Pushing Back on AI Hype with Alex Hanna - #649 | File Type: audio/mpeg | Duration: 2966

Today we’re joined by Alex Hanna, director of research at the Distributed AI Research Institute (DAIR). In our conversation with Alex, we discuss AI hype and the importance of tackling its impacts on society. Alex highlights how the hype cycle started, concerning use cases, the incentives driving people toward the rapid commercialization of AI tools, and the need for robust evaluation tools and frameworks to assess and mitigate the risks of these technologies. We also talk about DAIR and how it has crafted its research agenda, and we discuss current research projects like DAIR Fellow Asmelash Teka Hadgu’s work supporting machine translation and speech recognition tools for the low-resource Amharic and Tigrinya languages of Ethiopia and Eritrea, in partnership with his startup Lesan.AI. We also explore the “Do Datasets Have Politics?” paper, which codes a range of variables and conducts a qualitative analysis of computer vision datasets to uncover their inherent politics and the challenges of dataset creation. The complete show notes for this episode can be found at twimlai.com/go/649.

 Personalization for Text-to-Image Generative AI with Nataniel Ruiz - #648 | File Type: audio/mpeg | Duration: 2662

Today we’re joined by Nataniel Ruiz, a research scientist at Google. In our conversation with Nataniel, we discuss his recent work on personalization for text-to-image AI models. Specifically, we dig into DreamBooth, an algorithm that enables “subject-driven generation,” that is, the creation of personalized generative models using a small set of user-provided images of a subject. The personalized models can then be used to generate the subject in various contexts using a text prompt. Nataniel gives us a deep dive into the fine-tuning approach used in DreamBooth, the potential reasons behind the algorithm’s effectiveness, the challenges of fine-tuning diffusion models in this way, such as language drift, and how the prior preservation loss technique avoids this setback, as well as the evaluation challenges and metrics used in DreamBooth. We also touch on his other recent papers, including SuTI, StyleDrop, HyperDreamBooth, and lastly, Platypus. The complete show notes for this episode can be found at twimlai.com/go/648.
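
A schematic sketch of the prior preservation idea (pseudocode-level PyTorch, not the DreamBooth reference implementation; the denoiser and batches are trivial stand-ins):

```python
import torch

def denoise_loss(model, images, prompts):
    """Stand-in for the standard diffusion noise-prediction loss."""
    noise = torch.randn_like(images)
    pred = model(images + noise, prompts)   # hypothetical denoiser call
    return torch.mean((pred - noise) ** 2)

def dreambooth_step(model, subject_batch, prior_batch, lam=1.0):
    # Subject loss: fit the few user-provided photos of the subject
    # ("a [V] dog") so the model learns the new identifier.
    subject_loss = denoise_loss(model, *subject_batch)
    # Prior preservation loss: also fit images the original model
    # generated for the generic class ("a dog"), so fine-tuning doesn't
    # drift the meaning of the class word (language drift).
    prior_loss = denoise_loss(model, *prior_batch)
    return subject_loss + lam * prior_loss

model = lambda x, p: x  # trivial stand-in so the sketch runs end to end
batch = (torch.randn(2, 3, 64, 64), ["a [V] dog"] * 2)
print(dreambooth_step(model, batch, batch).item())
```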

 Ensuring LLM Safety for Production Applications with Shreya Rajpal - #647 | File Type: audio/mpeg | Duration: 2452

Today we’re joined by Shreya Rajpal, founder and CEO of Guardrails AI. In our conversation with Shreya, we discuss ensuring the safety and reliability of language models for production applications. We explore the risks and challenges associated with these models, including different types of hallucinations and other LLM failure modes. We also talk about the susceptibility of the popular retrieval augmented generation (RAG) technique to closed-domain hallucination, and how this challenge can be addressed, and we cover the need for robust evaluation metrics and tooling for building with large language models. Lastly, we explore Guardrails, an open-source project that provides a catalog of validators that run on top of language models to efficiently enforce correctness and reliability. The complete show notes for this episode can be found at twimlai.com/go/647.
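
Here is a minimal sketch of the validator pattern that tools like Guardrails implement, checks that run over model output and pass or reject it (illustrative plain Python, not the actual Guardrails API):

```python
import json
from typing import Callable

Validator = Callable[[str], tuple[bool, str]]

def valid_json(output: str) -> tuple[bool, str]:
    """Pass only if the model's output parses as JSON."""
    try:
        json.loads(output)
        return True, ""
    except json.JSONDecodeError:
        return False, "output is not valid JSON"

def max_length(limit: int) -> Validator:
    return lambda out: (len(out) <= limit, f"output exceeds {limit} chars")

def guard(output: str, validators: list[Validator]) -> str:
    for check in validators:
        ok, detail = check(output)
        if not ok:
            # A production system might re-prompt the model or apply
            # a programmatic fix here instead of raising.
            raise ValueError(f"validation failed: {detail}")
    return output

print(guard('{"answer": 42}', [valid_json, max_length(200)]))
```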
