Screaming in the Cloud show

Screaming in the Cloud

Summary: Screaming in the Cloud with Corey Quinn features conversations with domain experts in the world of Cloud Computing. Topics discussed include AWS, GCP, Azure, Oracle Cloud, and the "why" behind how businesses are coming to think about the Cloud.

Podcasts:

 Episode 22: The Chaos Engineering experiment that is us-east-1 | File Type: audio/mp3 | Duration: 00:32:19

Trying to convince a company to embrace the theory and idea of Chaos Engineering is an uphill battle. When a site keeps breaking, Gremlin’s plan involves breaking things intentionally. How do you introduce chaos as a step toward making things better? Today, we’re talking to Ho Ming Li, lead solutions architect at Gremlin. He takes a strategic approach to deliver holistic solutions, often diving into the intersection of people, process, business, and technology. His goal is to enable everyone to build more resilient software by means of Chaos Engineering practices.

Some of the highlights of the show include:
- Ho Ming Li previously worked as a technical account manager (TAM) at Amazon Web Services (AWS), offering guidance on architectural and operational best practices
- Differences between the solutions architect and TAM roles at AWS, and the transition between them
- Role of the TAM as the voice and face of AWS for customers
- Ultimate goal is to bring services back up and make sure customers are happy
- Amazon Leadership Principles: It is mutually beneficial to have the customer get what they want, be happy with the service, and achieve success with the customer
- Chaos Engineering isn’t about breaking things to prove a point
- Chaos Engineering takes a scientific approach
- Other than during carefully staged disaster recovery (DR) exercises, DR plans usually don’t work
- Availability Theater: A passive data center is not enough; exercise the DR plan
- Chaos Engineering brings those exercises down to a level where you run them regularly to build resiliency
- Start small when dealing with availability
- Chaos Engineering is a journey of verifying, validating, and catching surprises in a safe environment
- Get started with Chaos Engineering by asking: What could go wrong?
- Embrace failure and prepare for it; business process resilience
- Gremlin’s GameDay and Chaos Conf allow people to share experiences

Links: Ho Ming Li on Twitter; Gremlin; Gremlin on Twitter; Gremlin on Facebook; Gremlin on Instagram; Gremlin: It’s GameDay; Chaos Engineering Slack; Chaos Conf; Amazon Leadership Principles; Adrian Cockcroft and Availability Theater; DigitalOcean

 Episode 21: Remember when RealNetworks used to-- BUFFERING | File Type: audio/mp3 | Duration: 00:31:54

Are you about to head off to college? Interested in DevOps and the Cloud? Is there a good way for someone like you who is starting out in the world of technology to absorb the necessary skills? The Open Source Lab (OSL) at Oregon State University (OSU) is one program that helps students and serves as a career accelerator. OSL is a unicorn because OSU is willing to invest in open source. Today, we’re talking to Lance Albertson, director of OSL at OSU. OSL does a variety of projects to provide private Clouds that are neutrally hosted on its premises. The lab also gives undergraduate students hands-on experience with DevOps skills, including dealing with configuration management, deploying applications, learning how applications deploy, working with projects, and troubleshooting issues. OSL is for any student who has a general interest in or passion for it, and a willingness to learn.

Some of the highlights of the show include:
- Workflow focuses on what students need to learn about Linux and on giving them access to various repos; then they experience the lab’s configuration management suite
- Interview Process: Put out a posting, student submits an application online, each candidate is reviewed, and the student is given a screening quiz; a student who passes the screening process is brought in for an in-person interview with personality and technical questions
- Students tend to initially have the least experience and the most difficulty with a repository that has multiple people committing to it and with dealing with PRs
- Spinning up VMs and understanding how configuration management is connected, how services communicate, and how to set up an application
- Round-Robins and System Sprint Meetings: Focus on discussing and documenting processes, issues, suggestions, comments, and other information
- Younger students are mentored by Lance and the older students; every generation has to evolve because the environment and industry evolve
- OSL made OpenStack work on POWER8, PowerPC, and PowerPC little-endian; gateway into Cloud - having an OpenStack instance to offer services
- Vast majority of OSL’s revenue comes from donations; no direct support from the university; finding companies to serve as sponsors is beneficial to all
- Future of OSL: Providing more Cloud-like services; creating a more internal, private Cloud; and containerized ways of running or deploying applications

Links: Apache Software Foundation; BusyBox; Buildroot; Chef; Ruby; Freenode; OpenStack; Sphinx; Docker; Neutron; Seth; Rackspace; CoreOS; Kubernetes; DigitalOcean

 Episode 20: The Wizard of AWS | File Type: audio/mp3 | Duration: 00:51:16

Today, we’re talking to Jeff Barr, vice president and chief evangelist at Amazon Web Services (AWS). He founded the AWS Blog in 2004 and has written more than 2,900 posts for it and another 1,100 for his personal blog. As chief evangelist, Jeff strives to explain the benefits of Cloud computing and Web services to anyone who will listen. Jeff is the voice of AWS. He does what he does best - exploits his superpower of explaining technology in ways that people can understand. Jeff tries to be the same person all the time. He loves to meet people and goes out of his way to say “Hello.” So, if you see him at re:Invent, say “Cheese” and take a selfie with him!

Some of the highlights of the show include:
- Jeff uses AWS Workspaces for his blog; one of Jeff’s blogging principles is to not take anybody else's word for anything, to the absolute best of his technical ability
- Zero Client: Jeff has no rotating hardware or disk drives, just a zero client; wherever he is, it's the same workspace
- AWS has something for everyone; it builds things in response to customers’ questions, requests, and feedback
- Naming Services and Products: Is it helpful? Is it descriptive? Does it have any hidden meanings?
- Amazonian DNA and Dog Friendly Workspace: Jeff went from super fearful to accepting, to now thinking of dogs as incredible creations because they add fun and excitement to the office
- As part of hiring, each interviewer is assigned Amazon leadership principles (LPs) and asks questions that measure a candidate against those LPs
- What is the secret to getting hired at Amazon? Study the LPs to understand what they're about and be able to express your philosophies and history with LPs
- re:Invent makes sure customers understand services - What is it? What does it do? How do they put it to work? What are the best use cases for it?
- Things can never be too simple; you start from zero, put a lot of different things in there, and then you need the feedback to build in simplicity
- AWS is following a more on-demand approach than traditional Reserved Instances; it opens the door to being used in a lot of ways
- AWS does a lot of work before a launch to make sure it’s got infrastructure, scaling, monitoring, and capacity in place
- If you are a customer, talk to AWS and let them know what they're doing right or wrong; write a blog post, tweet about it, share it with them in some way
- Is the breadth of product offerings from AWS too vast? Is it offering too many things?
- AWS was not explicit about where it was going with Cloud computing, nor did it do analyses or projections about it; it simply launched SQS and let it speak for itself
- Customer feedback shapes what Amazon works on; customers share and then AWS re-prioritizes to make sure it’s delivering the right thing at the right time
- Remember: It's not just bits and bytes, it's about the organic life form

Links: Jeff Barr on Twitter; Jeff Barr on LinkedIn; AWS; AWS Blog; Jeff Barr’s Blog; Amazon Machine Images; Zero Client; AWS Workspaces; AWS Lambda; Amazon Leadership Principles; re:Invent; The Robot Uprising Will Have Very Clean Floors; Serverlessly Storing My Dad Jokes in a Dadabase; Days Until re:Invent

 Episode 19: I want to build a world spanning search engine on top of GCP | File Type: audio/mp3 | Duration: 00:39:26

Some companies that offer services expect you to do things their way or take the highway. However, Google expects people to simply adapt the tech company’s suggestions and best practices for their specific context. This is how things are done at Google, but this may not work in your environment. Today, we’re talking to Liz Fong-Jones, a Senior Staff Site Reliability Engineer (SRE) at Google. Liz works on the Google Cloud Customer Reliability Engineering (CRE) team and enjoys helping people adapt reliability practices in a way that makes sense for their companies.

Some of the highlights of the show include:
- Liz figures out an appropriate level of reliability for a service and how a service is engineered to meet that target
- Staff SRE involves implementation, and then identifying and solving problems
- Google’s CRE team makes sure Google Cloud customers can build seamless services on the Google Cloud Platform (GCP)
- Service Level Objectives (SLOs) include error budgets, service level indicators, and key metrics to resolve issues when technology fails
- Learn from failures through incident reports and shared post-mortems; be transparent with customers and yourself
- GCP: Is it part of Google or not? It’s not a division between old and new.
- Perceptions and misunderstandings of how Google does things and how it’s a different environment
- Google’s efforts toward customer service and responsiveness to needs
- Migrating between different Cloud providers vs. higher level services
- How to use Cloud machine learning-based products
- GCP needs to focus on usability to maintain a phase of growth
- Offer sensible APIs; tear up, turn down, and update in a programmatic fashion
- Promotion vs. Different Job: When you’ve learned as much as you can, look for another team to teach something new
- What is Cloud and what isn’t? Cloud deployments require SRE to be successful, but SREs can work on systems that do not necessarily run in the Cloud.

Links: Cloud Spanner; Kubernetes; Cloud Bigtable; Google Cloud Platform blog - CRE Life Lessons; Google SRE on YouTube

 Episode 18: Sitting on the curb clapping as serverless superheroes go by | File Type: audio/mp3 | Duration: 00:36:23

What’s serverless? Are you serverless now? Is going from enterprise to serverless a natural evolution? Or, is it a “that was fun, now let’s go ride our bikes” moment? Is serverless “just a toy”? Is it a wide and varied ecosystem, or is it Lambda plus some other randos? What's up with serverless vs. containers? Today, Forrest Brazeal is here to answer those questions and discuss pros and cons of serverless. He was a senior Cloud architect prior to joining Trek10. Forrest spent several years leading AWS and serverless engineering projects at Infor. He understands the challenges faced by enterprises moving to the Cloud and enjoys building solutions that provide maximum business value at a minimal cost.

Some of the highlights of the show include:
- Bimodality: Backend development going away and being replaced by managed services; undifferentiated items are being moved to the Cloud
- Serverless is application designs with “Backend as a Service” (BaaS) and/or “Functions as a Service” (FaaS) platforms; everything is managed for you
- AWS Lambda: Is it today’s trend, or is it a bias that everyone is using it? Lambda makes up 80% of current FaaS adoption
- Serverless Ecosystem: You can build it however you want, and you’re doing it right; but don’t take that at face value; no two Lambda environments are alike
- Cloud services at this scale have not been knitted together to form applications that are serving major workloads; best practices need to be established
- Native Cloud providers will consolidate, and individual frameworks will be created with components of application stacks tied together to build systems
- Serverless vs. Containers: No need for disparity - we can learn to get along; people use containers because it is easier than going serverless
- Serverless Heroes series features people thinking out-of-the-box and helps identify emerging trends; serverless is growing, and it’s not just about startups
- Went from working with a Sharpie to Procreate for the FaaS and Furious cartoon series; serverless component of process is for invoicing
- Changes? Packaging to handle sharing; more knobs on console; unified process needed because too many people are building their own workflow and tooling
- Certification: Is it proof-positive that you know what you’re talking about, or is it of questionable value if it is not backed up by expertise in the real world?

Links: Forrest Brazeal on Twitter; Invoiceless; Summon the vast power of certification - Dilbert cartoon; Trek10 blog; A Cloud Guru; ThinkfaaS podcast; A Cloud Guru - Serverless Superheroes; Why We’re Excited About AWS AppSync; Serverless Architectures with Mike Roberts; AWS Lambda; AWS Serverless Application Model (SAM); Procreate; AWS Certified Cloud Practitioner; Serverlessconf; DigitalOcean

 Episode 17: Pouring Kubernetes on things with reckless abandon | File Type: audio/mp3 | Duration: 00:49:18

DevOps as a service describes what Reactive Ops is trying to do, who it’s trying to help, and what problems it’s trying to solve. Its passion to deliver service where human beings help other human beings is carried out by a group of engineers who are extremely good at solving problems. Sarah Zelechoski is the vice president of engineering at Reactive Ops, which defines the world’s problems and solves them by pouring Kubernetes on top of them. The team focuses on providing expert-level guidance and a curated framework using Kubernetes and other open source tools. Sarah's greatest passion is helping others, which encompasses advocating for engineers and rekindling interest in the lost art of service in the tech space.

Some of the highlights of the show include:
- Kubernetes is changing the way people work; it offers a way to release a product, provide access to it, and define behaviors when you deploy it
- Any person/business can use Kubernetes to mold their workflow
- Kubernetes is complex and has sharp edges; it has only recently become productive because of its community finding and reporting issues
- Business value of deploying Kubernetes to a new environment: Flexibility and a uniform system of management; and it can provide a context shift
- Implementation Challenges with Workshops/Tutorials: Valuable entry level strategy for people learning Kubernetes; but the translation is not easy
- About 85% of the work Reactive Ops does to help its customers get onto Kubernetes is spent on application architecture
- If thinking about moving to Kubernetes, how well will your current applications translate? Do you want to start over from scratch?
- Value in paying someone to do something for you
- Using Defaults: Try them initially until you realize what you need; Kubernetes gives you options, but it’s a challenging path to go from defaults to advanced
- Deploying a workload between all major Cloud providers is possible, but there are challenges in managing multiple regions or locations
- Cluster Ops: Managed Kubernetes clusters where Reactive Ops stays on the map, watches them, and puts them on pager, so you can continue your work without having to worry

Links: Sarah Zelechoski on Twitter; Reactive Ops; Kubernetes; GKE from GCP; AKS from Azure; EKS from AWS; Kops; Terraform; Slack

 Episode 16: There are Still Servers, but We Don't Care About Them | File Type: audio/mp3 | Duration: 00:33:25

Are you interested in going beyond basic monitoring and visibility? Need tools to build and operate serverless applications and extract business intelligence? IOpipe provides extended visibility and metrics around AWS Lambda, including profiling, core dumps, and incoming input events. Today, we’re talking to Erica Windisch, who is the founder and CTO of IOpipe. She brings her experience in building developer and operational tooling to serverless applications. Erica also has more than 17 years of experience designing and building Cloud infrastructure management solutions. She was an early and longtime contributor to OpenStack and maintainer of the Docker project.

Some of the highlights of the show include:
- Nomenclature Battle: Serverless vs. stateless
- Building a window of visibility into Lambda: Talking to users and assessing needs/pain points
- Observability of the infrastructure: Necessary evil to get to automated healing
- Using Lambda at significant levels of scale; some companies grow usage, others go all in right away
- Current state of Lambda ecosystem
- Is Lambda stable? Indications and no formal SLA
- How issues manifest and are exposed
- Trends include cold starts, hours-long failures, and multiple function invocations
- Infrastructure powering IOpipe: Lambda issues may impact performance of monitoring system, but IOpipe is not necessarily dependent on Lambda
- Future of Lambda: Builds applications a specific way, but there are limitations
- What would Erica change about Lambda? Run function and define handlers
- Lambda functions can be difficult to understand; some developers do not have familiarity and create bottlenecks
- Capacity limits around Lambda can be difficult to establish

Links: Erica Windisch on Twitter; Erica Windisch on Twitch; IOpipe; 12-Factor App; Cloud Custodian in Lambda; Velocity London; ServerlessConf London; re:Invent; AWS Glue

 Episode 15: Nagios was the Original Call of Duty | File Type: audio/mp3 | Duration: 00:27:38

Let’s chat about the Cloud and everything in between. The people in this world are pretty comfortable with not running physical servers on their own, but trusting someone else to run them. Yet, people suffer from the psychological barrier of thinking they need to build, design, and run their own monitoring system. Fortunately, more companies are turning to Datadog. Today, we’re talking to Ilan Rabinovitch, Datadog’s vice president of product and community. He spends his days diving into container monitoring metrics, collaborating with Datadog’s open source community, and evangelizing observability best practices. Previously, Ilan led infrastructure and reliability engineering teams at various organizations, including Ooyala and Edmunds.com. He’s active in the open source and DevOps communities, where he is a co-organizer of events, such as SCALE and Texas Linux Fest. Some of the highlights of the show include: Datadog is well-known, especially because it is a frequent sponsor More organizations know their core competency is not monitoring or managing servers Monitoring/metrics is a big data problem; Datadog takes monitoring off your plate Alternate ways, other than using Nagios, to monitor instances and regenerate configurations Datadog is first to identify patterns when there is a widespread underlying infrastructure issue Trends of moving from on-premises to Cloud; serverless is on the horizon How trends affect evolution of Datadog; adjusting tools to monitor customers’ environments Datadog’s scope is enormous; the company tries to present relevant information as the scale of what it’s watching continues to grow Datadog’s pricing is straightforward and simple to understand; how much Cloud providers charge to use Datadog is less clear Single Pane of Glass: Too much data to gather in small areas (dashboards) Why didn’t monitoring catch this? Alerts need to be actionable and relevant How to use Datadog’s workflow for setting alerts and work metrics Datadog’s first Dash user conference will be held in July in New York; addresses how to solve real business problems, how to scale/speed up your organization Links: Ilan Rabinovitch on Twitter Datadog Docker Adoption Survey Results Rubric for Setting Alerts/Work Metrics Dash Conference re:Invent Nagios
