Data Crunch | Big Data | Data Analytics | Data Science show

Summary: Whether you like it or not, your world is shaped by data. We explore how it impacts people, society, and llamas perched high on Peruvian mountain peaks—through interviews, inquest, and inference. Buckle up.

Podcasts:

 No PhD Necessary | File Type: audio/mpeg | Duration: 13:45

The ubiquity of and demand for data has increased the need for better data tools, and as the tools get better and better, they ease the entry into data work. In turn, as more people enjoy the ease of use, data literacy becomes the norm.   Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” “We have a gift for you this holiday season. We’re giving you, our listeners, a website . . . it’s a website of all the AI applications we come across or hear about in our daily research. We post bite-size snippets about the interesting applications we are finding that we can’t feature on the podcast so that you can stay informed and see how AI is changing the world right now. There are so many interesting ways that AI is being used to change the way people are doing things. For example, did you know that there is an AI application for translating chicken chatter? Or using drones to detect and prevent shark attacks on coastal waters? To experience your holiday gift, go to datacrunchpodcast.com/ai.” Curtis: “If you’ve listened to our History of Data Science series, you know about the amazing advances in technology behind the leaps we’ve seen in data science over the past several years, and how AI and machine learning are changing the way people work and live. “But there is another trend that’s also been happening that isn’t talked about as much, and it’s playing an increasingly important role in the story of how data science is changing the world. “To introduce the topic, we talked with someone who is part of this trend, Nick Goodhartz.” Nick Goodhartz: “So I went to school at Baylor University, and I studied finance and entrepreneurship and a minor in music. I ended up taking a job with a start-up as a data analyst essentially. 
So it was an ad technology company that was a broker between websites and advertisers, and so I analyzed all the transactions between those and tried to find out what we are missing. “We were building out these reports in Excel, but there was a breaking point when we had this report that we all worked off of, but it got too big to even email to each other. It was this massive monolith of an Excel report, and we figured there's got to be a better way, and someone else on our team had heard of Tableau, and so we got a trial of it. In 14 days we—actually less than 14 days—we were able to get our data into Tableau, take a look at some things we were curious about, and pinpointed a possible customer who had popped their head out and then disappeared. We approached them and signed a half million dollar deal, and that paid for Tableau a hundred times over, so it was one of those moments where you really realize, ‘man, there’s something to this.’ “That's what got me into Tableau and what changed my mind about data analysis because at school analyzing finance it was nothing but Excel and mindless tables of stock capitalization and all this stuff and what made it fascinating was finding a way to look at it and answer questions on the fly, and then it actually changed the way I look at things around me. I find myself now watching a television show and thinking ‘well this episode wasn't as interesting. I wonder what the trends of the ratings look like.’ It really has changed the way I think about data because of how easy it's been to access it.” Ginette: “Nick is a member of a growing portion of people who didn’t think they’d end up doing analytics. He didn’t have the specific training for it, he doesn’t have a computer science or statistics degree, and he doesn’t spend nights and weekends writing code. And yet, he was able to produce extremely useful insights from his company’s data...

 How to Succeed at IoT—Amid Increasing Complexity | File Type: audio/mpeg | Duration: 17:43

Episode Summary The growth of the Internet of Things, or IoT, is often compared with the industrial revolution: a completely new phase of existence. But what does it take to be part of this revolution by building an IoT product? It's complex, and Daniel Elizalde gives us a peek into what the successful process looks like. For the full episode, listen by selecting the Play button above or by selecting this link, or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Donate 15 Seconds If you liked this episode, please consider giving us a review on iTunes! It helps other people find the show and lets us know how we’re doing. Partial Transcript (for the full episode, select play above or go here) Ginette: “So, today, we’re defining an IoT product, or an Internet of Things product, as ‘a product that has a combination of hardware and software. It acquires signals from the real world, sends that information to the cloud through the Internet, and it provides some value to your customers.’ “Okay, so before we introduce you to our guest, consider this: The IoT market is infernally hot. In 2016, we had 6.4 billion connected ‘things’ in use worldwide, and Gartner research firm projects that number will nearly double to 11.2 billion in 2018, and then nearly double again to 20.4 billion IoT products in 2020. For context, this last number is about 2 and a half times the number of people on earth. “Let’s look at an example of IoT at work. Let’s say you’re an oyster farmer, and you need to keep your oysters under a certain temperature because harmful bacteria might grow if you don’t—which would result in people getting very sick after eating your product. If that happened, the FDA could shut your operation down. “This is where IoT products can help you. You can track water temperature with sensors. Those sensors can send that data to the cloud, where you can access it. 
The system will even send you an alert if the temperature ranges outside your chosen temperature criteria. You can use cameras that show when the oysters are harvested and how long the oysters are out of cold water before they’re put on ice. By using these sensors and cameras to record harvest date, time, location, and temperature at all stages of harvest, you have recorded evidence that you’ve properly handled the harvest. “So, for the purposes of today’s episode, let’s now switch to the other perspective—to the perspective of someone who wants to make and sell an IoT product. Imagine you and two of your friends recently launched an IoT startup—you’re able to secure funding to build your IoT product, and you’ve hired some team members to help you get your beta version off the ground. But you’re new to building products like this, and the rest of your team is also pretty new to it as well. So you decide to talk with someone who is an expert in the IoT space who can give you and your team pointers—and you’re lucky enough to find this man.” Daniel: “My name is Daniel Elizalde. I am the founder of Tech Product Management. My company focuses on providing training for companies building IoT products, specifically I focus on training product managers. I've been doing IoT really for over 18 years,

 After Disaster Strikes: Data in Disaster Recovery | File Type: audio/mpeg | Duration: 26:29

Episode Summary We’ve seen photos of disasters depicting fearful and fleeing victims, ravaged properties, and despondent survivors. In this episode, we explore two ways data can help survivors heal and how data also tells their stories. For the full episode, listen by selecting the Play button above or by selecting this link, or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Donate 15 Seconds If you liked this episode, please consider giving us a review on iTunes! It helps other people find the show and lets us know how we're doing! Partial Transcript (for the full episode, select play above or go here) Aaron Titus: “I almost disbelieved my own numbers, even though I chose the most conservative ones. It's just outrageous. I'm like, ‘Really? A 233x ROI?’ That's insane.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” “Today’s episode is brought to you by Lightpost Analytics. Data skills are in intense demand and are key for organizations to remain competitive; in fact, Forbes listed the industry’s leading data visualization software, Tableau, as the number three skill with the most explosive growth in demand, so investing in yourself to stay relevant in today's hyper-competitive, data-rich, but insights-hungry world is extremely important. Lightpost Analytics is a trusted training partner to help you develop the Tableau skills you need to stay relevant. Check them out at lightpostanalytics.com and let them know that Data Crunch sent you.” “Today, we look at what it takes to understand a larger story—when many disparate voices come together to tell you something much more powerful, and specifically how it can help people deal with the large scale devastation of natural disasters. 
Let’s jump into how one man did something about his pet peeve, and it produced $300,000,000 in savings. And then we’ll pop over to New Zealand to explore how a disaster situation affected Christchurch and what people did about it.” Aaron: “I was a disaster relief volunteer in New Jersey during hurricanes Irma (Ginette: Here Aaron actually means Irene) and Sandy, and my area got very hard hit by Irma, and I started off as a relief volunteer and ended up directing a lot of those relief efforts for my church, and while I was there, I remember standing in very long lines, and a thousand of us would gather together at a field command center and spend an hour and a half waiting to get checked in, which is lightning speed for 1,000 people, but it's still an hour and a half. “And while everybody was waiting, they’d pull out their phones and would start playing Angry Birds, and the technologist in me would just scream inside, ‘I could have you all checked in with your work orders in 30 seconds, not an hour and a half!’ “And I abhor inefficiency—to a fault—like it's almost a little bit of a sickness. I really ought to be better,

 The Complex World of Data Scientists and Black-Box Algorithms | File Type: audio/mpeg | Duration: 25:12

Hilary Mason is a huge name in the data science space, and she has an extensive understanding of what's happening in this space. Today, she answers these questions for us: * What are the backgrounds of your typical data scientists? * What are key differences between software engineering and data science that most companies get wrong? * How should you measure the effectiveness of your work or your team's work as a data scientist for the best results? * What is a good approach for creating a successful data product? * How can we peek behind the curtain of black-box deep learning algorithms? Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link, or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Curtis: Today we hear from one of the biggest thinkers in the data science space, someone who DJ Patil endorses on LinkedIn for data science skills. She worked at bit.ly, the URL shortener, and is a data scientist in residence at venture capital firm Accel Partners, a firm that helped fund some companies you may know, like Facebook, Slack, Etsy, Venmo, Vox Media, Lynda.com, Cloudera, Trifacta—and you get the picture. Ginette: The partner of this VC firm said that Accel wouldn’t have brought on just any data scientist. This position was specifically created because this particular data scientist might be able to join their team. Curtis: But beyond her position as data scientist in residence with Accel, she founded a company that’s doing very interesting research, and today, she shares with us some of her experiences and perspective on where AI is headed. Ginette: I’m Ginette. Curtis: And I’m Curtis. Ginette: And you are listening to Data Crunch. Curtis: A podcast about how data and prediction shape our world. Ginette: A Vault Analytics production. 
Hilary: I'm Hilary Mason, and I'm the founder and CEO of Fast Forward Labs (Please note that Hilary is now the VP of Research at Cloudera). In addition to that, I'm a data scientist in residence for Accel Partners. And I've been working in what we now call data science, or even now call AI, for about twenty years at this point. Started my career in academic machine learning and decided startups were more fun and have been doing that for about 10, 12 years depending on how you count now, and it's a lot of fun! Ginette: Something I’d like to note here is there’s been a very recent change: Hilary’s company, Fast Forward Labs, and Cloudera recently joined forces, and Hilary’s new position is Vice President of Research at Cloudera. Now, one thing that Hilary speaks to is where the data scientists she works with come from, which is a great example of the different paths people take to get into this field. Hilary: I am a computer scientist, and I have studied computer science. It's funny because now at Fast Forward, our team has only two computer scientists on it, and one of them is our general counsel, and one is me, and I'm running the business, so most of the people doing data science here come from very different backgrounds. We have a bunch of physicists, mathematicians, a neuroscientist, a person who does brilliant machine learning design who was an English major, and so data science is one of those fields where one of the things I really l...

 Deep Learning—A Powerful Tool, with a Name that Means Nothing | File Type: audio/mpeg | Duration: 16:55

Tesla isn’t the only car brand in the world producing or aiming to produce self-driving cars. Every single car brand is working on developing self-driving cars. But what does this mean for our future? We talk about this and other interesting deep learning projects and history with Ran Levi, science and technology observer and podcaster, who explains in thought-provoking ways what we have to look forward to. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link, or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Ran Levi: “I actually had the pleasure of being invited to Google's Mountain View headquarters, and they took me for a drive in one of their autonomous vehicles, and it was, to tell you about that drive because it was boring—boring in a good way. Nothing happened! We were just driving around. The car was driving itself all around Mountain View. And it worked. “The first time I entered such a car, I didn't know what to expect. I mean, I didn't know how reliable are those kinds of cars. So I had the idea that maybe I should sit somewhere where I can maybe jump and grab the wheel if necessary. You know, I was a bit dumb. They don't need me, really. And probably if I touch the steering wheel, I would probably make some mistake and ruin the car. It drives better without me.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Ginette: “We have a great live show planned that we hope to give at SXSW 2018. It's a really awesome show about the power of niche artificial intelligence, and we’re going to share details from our research into what amazing things AI is doing right now on the fringe and in mainstream AI projects. 
We're really excited to share it, so if you’re going to SXSW, or you just want to be good hearted and help us out, please vote on our dual panel by going to panelpicker.sxsw.com, signing in, and liking our topic, which you can find by searching for ‘The Power of Niche AI: From Cucumbers to Cancer.’ “Today we get to talk to Ran Levi, who’s been researching and reporting on science and technology for the past 10 years. He’s a hugely successful science and tech podcaster in Israel, producing a Hebrew-language show called Making History, and he’s also producing two English podcasts right now for an international audience, so since he’s steeped in the subject, he has a lot of very interesting insights for us.” Ran: “I'm actually an electronics engineer by trade. I was an engineer for 15 years. I was both a hardware and software developer for several companies in Israel. And during my day job as an engineer, I wrote some books about the history of science and technology, which was always a big hobby of mine. And actually, I started a podcast about this very subject about 10 years ago, and it became quite a hit in Israel I’m happy to say. So about four years ago, I quit my day job, and I actually started my own podcasting company, and now we are podcasting both in Israel and in the U.S. for international audience and actually launched my brand new podcast last week. It's called Malicious Life about the history of malware and cybersecurity,

 When Song Lyrics and British Lit Meet Tidy Text | File Type: audio/mpeg | Duration: 17:48

When Julia Silge's personal interests meet her professional proficiencies, she discovers new meaning in Jane Austen's literature, and she gauges the cultural influence of locations in pop songs. Even more impressive than these finds, though, is that she and her collaborator, Dave Robinson, have developed some new, efficient ways to mine text data. Check out the book they've written called Tidy Text Mining with R. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link, or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Transcript Julia Silge: “One that I worked on that was really fun was about song lyrics. The last 50 years or so of pop songs, we have all these lyrics, so all this text data, and I wanted to ask the question, what places are mentioned more or less often in these pop songs.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “Brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. Whether you’re already a frequent dataset contributor or totally new to data.world, there are several resources you can use to stay in the loop on the latest features, learn new skills, and get support. Check out docs.data.world for up-to-date API documentation, tutorials on SQL, and other query techniques, and much more!” Ginette: “We hope you’re enjoying some vacation time this summer. We just did, and now Data Crunch is back! To hear the latest from us, add us on Twitter, @datacrunchpod. 
Today we hear from an exciting guest—someone who is on the cutting edge of data science tool creation, someone exploring and developing new ways to slice and dice difficult data.” Julia: “My name is Julia Silge, and I'm a data scientist at Stack Overflow. My academic background is in physics and astronomy, but I’ve worked in academia, teaching and doing research, I worked at an ed tech start up, and I've made a transition now into data science.” Ginette: “Stack Overflow, where Julia works, is the largest online community for programmers to learn, share knowledge, and build their careers. It's a great resource when you need to solve a coding problem or develop new skills.” Curtis: “Now there are basically two main camps in data science: people who program with R, a statistical programming language, and people who program with Python, a high-level, general purpose language. Both languages have devoted followers, and both do excellent work. Today, we’re looking at R, and Julia is a big name in this space, as is her collaborator Dave Robinson.” Julia: “Text is increasingly a really important part of our work as people who are involved in data. Text is being generated all the time, at ever faster rates. This unstructured data is becoming a really important part of things that we do. I also am somebody that—my academic background is not in text or literature or natural language processing or anything like that, but I am somebody who's always been a reader and always been interested in language,

 How Data Is Eradicating Malaria in Zambia | File Type: audio/mpeg | Duration: 17:16

According to the CDC, people have been writing descriptions of malaria—or a disease strikingly similar to it—for over 4,000 years. How is data helping Zambian officials eradicate these parasites? Tableau Foundation's Neal Myrick opens the story to us. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Neal: “When somebody walks from their village to their clinic because they're sick, health officials can see that person now as the canary in a coal mine.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “This episode is brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. Looking for a lightweight way to deliver a collection of tables in a machine-readable format? Now you can easily convert any tabular dataset into a Tabular Data Package on data.world. Just upload the file to your dataset, select 'Tabular Data Package' from the 'Download' drop-down, and now your data can be effortlessly loaded into analytics environments. Get full details at meta.data.world.” Ginette: “Today we’re talking about something that can hijack different cells in your body for what we’ve deemed nefarious purposes. It enters your bloodstream when a mosquito transfers it from someone else who has it, to you. Once it’s in your body, it makes a beeline for your liver, and when safely inside your liver, it starts creating more of itself. “Sometimes, this parasite stays dormant for a long time, but usually it only takes a few days for it to get to work. 
It starts replicating, and there are suddenly thousands of new babies that burst into your bloodstream from your liver. When this happens, you might get a fever because of this parasite surge. As these new baby parasites invade your bloodstream, they hunt down and hijack red blood cells. They use these blood cells to make more of themselves, and once they’ve used the red blood cells, they leave them for dead and spread out to find more. Every time a wave of new parasites leaves the cells, it spikes the number of parasites in your blood, which may cause you to have waves of fever since it happens every few days. “This parasite can cause very dangerous side effects, even death. It can cause liver, spleen, or kidney failure, and it can also cause brain damage and a coma. To avoid detection, the parasites cause a sticky surface to develop on the red blood cell so the cell gets stuck in one spot so that it doesn’t head to the spleen where it’d probably get cleaned out. When the cells stick like this, they can clog small blood vessels, which are important passageways in your body. You may have guessed it, we’re describing malaria. “It plagues little children, pregnant women, and other vulnerable people. Children in particular are incredibly vulnerable, something that’s reflected in the statistics: one child dies every two minutes from malaria. “But often outbreaks are treatable, trackable,

 How Artificial Intelligence Might Change Your World | File Type: audio/mpeg | Duration: 20:17

What does the creation of new artificial intelligence products look like today, and what do experts in this field foresee realistically happening in the near future? One thing's for sure, the way we work and function in life will change as a result of growth in this field. Listen and find out more. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link or you can also listen to the podcast through Apple Podcasts, Google Play, Stitcher, and Overcast. Transcript Irmak Sirer: “It’s kind of like a Where’s Waldo of finding an expert in this entire giant ocean of people.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “Brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. A complex dataset with a ton of files can quickly become scary and unwieldy, but you need not fear! Now you can use file labels and descriptions to manage and organize your many files on data.world. With file labels and descriptions, you can quickly see what type of file it is, view a short description, and also filter down by file type. Wanna see an example of how data.world users are using file labels and descriptions to keep their dataset organized? Search ‘data4democracy/drug-spending’ on data.world.” Ginette: “Today we’re taking a closer look at something that is starting to seep into our daily lives. In one of its forms, it’s something Stephen Hawking, Bill Gates, and Elon Musk are concerned will eventually be a threat to mankind. In another form, though, you’re probably already using it, and it’s becoming a major game changer, kind of like the early days of the desktop computer. 
We’re talking about artificial intelligence. You use AI when you talk to Siri or your in-home assistant, Alexa or Echo, and some people are using it in the form of a self-driving car. “So daily applications of artificial intelligence are on the rise, becoming much more of a staple in our society, but AI’s definition shifts according to the source. Popular movies depict AI as having a consciousness, emotions, and exhibiting human-like characteristics. Usually it’s involved in some sort of world-domination plot to kill all the humans. Although most experts agree that artificial intelligence will never actually think and feel like a human, the existential threat still exists. This kind of apocalyptic AI is known as ‘general AI.’ But that’s a topic for another episode. Today, we’re focusing on the kind of AI that currently exists, otherwise known as narrow AI.” Curtis: “A narrow AI is called narrow because it’s usually focused on one specific task, whereas a general AI would be able to be good at pretty much any task thrown its way. The Google search bar is probably the most ubiquitous example of a narrow AI that most people use on a daily basis. The process usually goes like this: you give it an input like ‘How to own a llama as a pet.’ It does its processing. It gives you an output in the form of the 10 most relevant web pages to answer your questions (along, of course,

 Preventing a Honeybee Fallout | File Type: audio/mpeg | Duration: 17:48

What would the world look like without honeybees? In theory, if there were no honeybees, it could drastically change our lives. Bjorn Lagerman, though, never wants to know the actual answer to that question. But the honeybee's current worst foe, Varroa destructor, is killing off honeybee hives at intense rates. Bjorn's in the middle of a machine learning project to save the bees from the vampirish Varroa. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link or you can also listen to the podcast through iTunes, Google Play, Stitcher, and Overcast. Bjorn Lagerman: “My name is Bjorn Lagerman. I live in the middle of Sweden. When I look back in my younger days, I remember, I sat in school, looked outside the window and decided I wanted to be outside. You know, I was raised in a stone desert in the middle of Stockholm in the old town; that's a medieval town. And inside the blocks, there were sort of an oasis of water and fountains and green in this stone desert, but the streets were very old streets. And then the contrast was that in the summertime, I spent that in the countryside, and that was total freedom—you know, lakes, rivers, forests, and my parents let us do what we wished during all the days, just come home for dinner. So when I was 22, I thought bees might be a reason to spend more time in nature. So I went to the nearest beekeeper, . . . and he sold me my first colony, and from there on, I was really hooked.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “This episode is brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. 
A complex dataset with a ton of files can quickly become scary and unwieldy, but you need not fear! Now you can use file labels and descriptions to manage and organize your many files on data.world. With file labels and descriptions, you can quickly see what type of file it is, view a short description, and also filter down by file type. Wanna see an example of how data.world users are using file labels and descriptions to keep their dataset organized? Search ‘data4democracy/drug-spending’ on data.world.” Ginette: “Imagine for a minute what the world would look like without bees. The image is potentially pretty bleak: we’d have much less guacamole, fruit smoothies, chocolate everything, various vegetables, pumpkin pie, peach cobbler, almond butter, cashews, watermelons, coconuts, lemons, limes, and many more food products. Let’s not forget the obvious—we wouldn’t have honey, which man can’t replicate well. “But fruits, vegetables, and chocolate aren’t the only foodstuffs that would be affected. Bees support other animal life. They pollinate alfalfa, which helps feed dairy cows and boost their milk production, and on a more limited basis, alfalfa helps feed beef cows, sheep, and goats. Statistics vary, but bee pollination affects somewhere between one and two thirds of the food on Americans’ plates. Beyond food, bees help grow cotton, so without bees, we’d have to rely more on synthetics for our cloth.

 When a Picture Is Worth a Life | File Type: audio/mpeg | Duration: 25:11

What if you found out your infant had eye cancer? That news would rock anyone’s world. But what if you had a tool that helped you catch it early enough that your baby didn’t have to lose his or her eye and didn’t have to go through chemo? You’d probably do almost anything to get it. Bryan Shaw has dedicated his time to helping parents detect this cancer sooner so their children don't have to go through what his son went through—and he’s doing it for free. With computer scientists from Baylor University, he's harnessed the power of a machine learning algorithm to detect cancer that no human eye can detect. Below is a partial transcript. For the full interview, listen to the podcast episode by selecting the Play button above or by selecting this link or you can also listen to the podcast through iTunes, Google Play, Stitcher, and Overcast.  Bryan Shaw: “The very first person who ever contacted me because our app helped them was a gentleman in Washington State, and his little girl had myelin retinal nerve fiber layer, which is an abnormal myelination of the retina, and it can cause blindness, but it presents with white eye. And his little girl was five years old, and he kept seeing white-eye pics. He heard our story. He downloaded our app. Our app detected the white-eye pics. That emboldened him enough to grill the child's doctor. You know, 'My camera's telling me this. Look, this app. I heard this story . . .’ The doctor takes a close look. The girl had been 75 percent blind in one of her eyes for years, and nobody had ever caught it.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “Data Crunch is again brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. 
Did you know that you can add files via URL to your data sets on data.world? Data.world APIs allow you to pull live survey data into your data set, enable automatic file updates, and more. Get the full details on data.world APIs at docs.data.world, or search ‘Austin Cycling Survey’ on data.world to see live survey sync in action in Rafael Pereira's data set!” Ginette: “One quick reminder that our data competition is currently up on data.world. Be sure to post your submissions by May 5. “Okay, now back to the story. If you know someone who’s about to have a child, has a child five or under, or plans to have children, you need to send them this episode, and you’re about to find out why from this man, Bryan Shaw.” Bryan: “When Noah was three months old, we started noticing that a lot of his pictures had white pupillary reflections, what doctors call leukocoria, white core, white pupil, and that can be a symptom of a lot of different eye diseases.” Ginette: “You probably put this together, but Noah is Bryan’s son. And to add in Noah’s mom’s perspective here, when she started noticing this strange white reflection in Noah’s eyes, like most moms today, she aggressively searched the Internet for answers. Like Bryan said, leukocoria could indicate a disease, or it could indicate nothing, but the Shaws decided they needed to tell their pediatrician about what they’d found.”

 How Many Slaves Work for You? | File Type: audio/mpeg | Duration: 20:15

If someone came up to you and randomly asked you, "How many slaves work for you?" maybe you'd think, "Slavery ended a long time ago, Bro." Or maybe you would take the question seriously. With 20 million to 46 million people enslaved in the world, it is a serious question, and while we don't see it daily, some of these enslaved people make things for us. Even if we're judicious about what we buy, we would be surprised at just how much global slavery goes into producing the goods we do buy. But how can we quantify it? How can we solve this? Justin Dillon, who has worked with the U.S. State Department and hundreds of businesses, thinks he has the answer. Transcript: Ginette: “Our world today is an extremely vast, complicated, and interconnected web of 7.5 billion people. We’re directly connected to some, and it’s really easy to see those connections on Facebook, Instagram, Twitter, LinkedIn. But there’s a whole other group of people we are much more subtly connected to—people who are essentially working for us and who are basically invisible to us, 20 to 46 million of them. “Our guest today deals with this invisible web every day.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production . . .” Ginette: “Today’s episode is brought to you by data.world, the social network for data people. Discover and share cool data, connect with interesting people, and work together to solve problems faster at data.world. Quickly locating data, understanding it, and combining it with other sources can be difficult. The data.world Python library allows you to bring data.world datasets straight into your workflow. Easily work with data and metadata in your Python scripts and Jupyter notebooks. Ready to dive in? Learn how to use data.world’s Python library at meta.data.world.”
Curtis: “Before we get going, one other note about data.world—starting today until May 5th, we are hosting a data competition on their site, and we’d love your participation. Donald Trump’s tweets have been the source of a lot of media attention recently—many high profile news outlets have asserted his tweets show signs of authoritarianism, some say he’s using his Twitter account to shape the news cycle, and some have even built algorithms to make stock market decisions based on his tweets. Whatever your stance is on the subject, we’ve uploaded a dataset of every single one of his tweets to data.world, and we want to see what you can make of the data. This is a creative competition by nature—submissions can be of any format, but the point is we want to see what you can learn, assert, or create with this data set. It’s easy to participate—just go to data.world/datacrunch, and you’ll find the dataset and all of the details. Submit by May 5, and we’re going to take the submissions that tell the most compelling stories and feature them on a future podcast episode.” Ginette: “Now back to the story. A few months ago, I ran across a website. It sucked me in. It asked me a provocative question, which we’ll get to in just a second, but first, we’ll introduce you to the man who’ll situate the story for you—the main person behind the website.” Justin: “My name’s Justin Dillon. I’m the founder and CEO of Made in a Free World. We started off years ago. I would say probably the genesis for us was me getting a call from the State Department in about 2010. I’d already been doing some projects, a few websites and films that I was producing, around human trafficking and modern-day slavery.” Curtis: “Justin directed a documentary he released in 2008 called ‘Call + Response,’ which ranked as one of the top documentaries in 2011.

 Predicting the Unpredictable | File Type: audio/mpeg | Duration: 21:16

We now know black swans exist, but Europeans once believed that spying one of their kind would be like stumbling across a unicorn in the woods—impossible. Then, Willem de Vlamingh spotted black swans in Australia, and this black bird, which once represented the impossible to Europeans, shifted to represent the unpredictable. One company now dons the name "Black Swan." Find out how it aims to predict what we currently consider to be unpredictable. Transcript Ginette: “Submerse yourself in early 1600s London culture for a minute. Shakespeare’s alive and in his late career. The first permanent English settlement in the Americas just happened. Oxygen hasn’t been discovered yet. But a lesser known cultural idiosyncrasy has to do with a large white bird, the swan. In Europe, the only swans anyone had seen or heard about were white, so of course, in their minds, a swan couldn’t be any other color. From this concept, a popular saying develops, originally stemming from a poem. You use it when you want to make a point that something either doesn't exist or couldn’t happen. You’d say something like this: ‘you’re not going to find out because it’s about as likely as seeing a black swan,’ meaning that, that thing or event was impossible. “But then a discovery blows everyone’s minds. Dutch explorer Willem de Vlamingh is sent on a highly important rescue mission. A lost ship with 325 people on it probably ran aground near Australia, and they needed him to go rescue these people and the goods on board. While Willem and the three ships under his command go and search Australia for this lost ship, they find lots of fish; unique trees; quokka, a cat-sized kangaroo-like creature; and . . . black swans. This last discovery inevitably permanently shifts the meaning of this saying. After this, people start using it more to say when something’s highly unlikely or an unpredictable moment. 
“Now this concept of an unpredictable moment is why Steve King named his company Black Swan, because they predict the seemingly unpredictable.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Steve King: “I am Steve King; I’m the CEO of Black Swan. Black Swan is 250 people who focus on trying to predict consumer behavior using data science, artificial intelligence, and big data. We have lots of large clients. We mostly work with big companies that have big problems to solve. Our work sort of splits across the US and the UK. Black Swan is absolutely full of stories. A lot of the work we really do is finding a hard problem that no one’s really solved before and then using data science to crack it, but they’re always quite interesting stories because, you know, they’re stories of a little bit of adventure, luck, and skill.” Ginette: “The UK’s Sunday Times has consistently placed Black Swan on its lists: in 2014, it was on the ‘Ones to Watch’ list in its Tech Track. In 2015, it was ranked number one on the Start-Up Track. And in 2016, it was ranked number one in the Export Track 100, because it had the fastest growing international sales for the UK’s small to medium enterprises. “So what’s the secret sauce to the rapid growth and success of Black Swan, a company that solves problems for large companies in many different industries? It turns out, they aim to be better than anyone else at accessing and crunching a specific data source.” Steve: “The reason we’re quite broad is it actually sits on one simple idea, and the simple idea really is that the Internet is really the world’s biggest data source, and we call, we call the Internet the world’s biggest focus group.

 The Golden Age of Data Science | File Type: audio/mpeg | Duration: 25:07

How did one boy's stuffed yellow elephant permanently intertwine itself in history? What is a data scientist? Why is right now the golden age for data science? We take a crack at all three of these questions—the second two, with the help of Gregory Piatetsky-Shapiro and Ryan Henning. Transcript Ginette: “Over the past few years, we’ve seen these news flashes: “An article in Harvard Business Review in 2014, titled: Data Scientist: the Sexiest Job of the 21st Century “Mashable’s article in 2015: So You Wanna Be a Data Scientist? A Guide to 2015’s Hottest Profession “Business Insider, 2016: Data Science was the #1 Profession as Rated by Glassdoor “A data science industry observer, KDnuggets, 2017: Data Scientist: Best Job in America, Again, which cites the most recent Glassdoor report outlining the very top jobs in America: “It turns out, four of the five top US jobs deal with data. In descending order, we find data scientist, devops engineer, data engineer, and analytics manager.” Curtis: “With four out of five of these top jobs orbiting data, clearly something’s going on here.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Ginette: “Today is a culmination of everything we’ve talked about in our series on the history of data science. This is where all the contributions of Florence Nightingale, William Playfair, Ronald Fisher, Ada Lovelace, and many others come together in one place. We’ll add a couple more people to this list to answer these two questions: ‘What is a data scientist? And why is right now the golden age of data science?’” Curtis: “According to IBM, ‘everyday, we create 2.5 quintillion bytes of data.’ But what does a quintillion actually look like? 
“Well, if you take one quintillion pennies, you could actually place them face up, end to end, and blanket the entire surface of the earth 1.5 times over. Or think about one quintillion ants. That would be like taking all of the ants that exist today on planet earth according to some estimates, and then you have to take that number and multiply it by 100. So, that ant pile in your front yard becomes 100 ant piles in your front yard. Basically ants take over the earth. And we make 2.5 quintillion bytes every single day! “The next question is, how much information does that actually represent? It’s 250,000 times the amount of information that all the printed material in the Library of Congress contains. And we make that every single day.” Ginette: “In 2013, SINTEF published this stat, quote: ‘90% of the world’s data has been created in the preceding two years.’ According to one Ph.D. technologist, this has been true for the last 30 years because every two years, we produce 10 times as much data.” Curtis: “This exponential growth is insane. Just as an example of this type of growth rate, if you take a hypothetical scenario, and you take the world’s population, and say it starts growing as rapidly as data is growing now, it would look like this: Currently, the world’s population, 7 billion people, could fit in the size of Texas if they were living as densely as they do in New York City. Now, in two years’ time with this growth rate,

 The Curated History of Data Science, Part 3 | File Type: audio/mpeg | Duration: 19:06

From a small building in Pennsylvania to widespread usage across the world, we track the compelling story of one of the greatest technological innovations in history, setting the stage for the age of data science. Transcript: Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Ginette: “Today our story starts at a business building.” Curtis: “The building is in Philadelphia, Pennsylvania, on Broad and Spring Garden Streets to be precise. Envision the late 1940s.” Ginette: “You see a man absorbed in thought entering the building, and you decide to follow him in.” Curtis: “When you walk through his office, you find some bright engineering minds working on a fairly new startup in town: the Eckert-Mauchly Computer Corporation, or EMCC. It turns out, this is the very first large-scale computer business in the United States.” Ginette: “While this business environment on the surface is vibrant and innovative, behind the scenes, it’s a pressure cooker full of confusion.” Curtis: “The owners, John Mauchly, who you followed into the office, and his business partner, J. Presper Eckert, are talking about something strange that’s been happening: most of their clients had been from the government, and now they’re quietly pulling away from doing business with EMCC without any explanation, which is both alarming and confusing to the business owners. It’d be one thing if the government gave a reason each time it pulled out of a contract, but without one, they have no idea what’s wrong or how to try and fix the situation. It’s like going through several breakups where the only explanation offered is, ‘it’s not you; it’s me.’ “So what’s actually going on here?” Ginette: “The answer is woven into John’s backstory, a backstory that also includes the story of the ENIAC, the very first fully electric general purpose computer. 
“In John’s earlier career, he was involved with scientific clubs and academia. He started as an engineer and eventually became a professor at the prestigious Moore School of Engineering at UPENN. At one point, he got lucky. He asked essentially this question to the right military person on campus: what if I could build a machine that would significantly reduce your trajectory calculation time for projectiles?” Curtis: “So the military ends up formally accepting his proposal, and John and Presper team up for three years on this top-secret military project to build the ENIAC.  “At the time, the ENIAC is really impressive in both size and ability. It weighs about the same as nine adult elephants, which is 27 tons, and it has about 17,500 vacuum tubes, each about the size of your average household light bulb. It has 5,000,000 hand-melted joints. And it’s the size of a small house—about 1,800 square feet. And in today’s dollars, it costs about $7 million. “It’s the very first of its kind. It’s both completely electric and a general purpose machine, meaning you can use it to calculate almost anything as long as you give it the right parameters. The bottom line is that it’s a lot faster than anything before it. It’s 2,400 times faster than human computers, and 1,000 times faster than any other type of machine computer at the time. For example, it took the calculation of a 60-second projectile down from 20 hours to just 30 seconds. To understand the magnitude of this, it's like moving from an average snail’s pace to the average speed of a car on a highway.” Ginette: “Here’s another way to look at this: if you drive your car (the ENIAC) across the country from L.A.

 The Curated History of Data Science, Part 2 | File Type: audio/mpeg | Duration: 22:38

She isn’t your typical English girl from the early 1800s. She’s a girl who, because of her fortunate and unfortunate family circumstances, ends up perfectly situated to become part of something that will revolutionize the world. Transcript: Ginette: “For many reasons, she isn’t your typical English girl from the early 1800s. She’s a girl who at one point examines birds to discover their body-to-wing ratio so she can invent a flying machine and write a book about it. These are goals that show mathematical skill, creativity, and initiative. She’s also a girl who, because of her fortunate and unfortunate family circumstances, ends up perfectly situated to become part of something that will revolutionize the world.” Ginette: “I’m Ginette.” Curtis: “And I’m Curtis.” Ginette: “And you are listening to Data Crunch.” Curtis: “A podcast about how data and prediction shape our world.” Ginette: “A Vault Analytics production.” Curtis: “In our last episode on the history of data science, we talked about the origins of charts and data visualization, which are important to data science, but in today’s story, we’re going to start a new thread that’s absolutely essential to the fabric of this history. We’re going to talk about some brilliant inventors that gave rise to an idea that would change the course of history—arguably one of the most powerful ideas that has shaped our modern world. It’s a story of triumph and innovation, but also of tragedy, because even though the ideas they moved forward had a dramatic effect on all of us in the long run, in the short term, many of these people saw their dreams fall apart before their eyes. So today and in our next episode, we pay homage to some key people who started the wave that gave us technology that makes our modern lives possible. And we’re going to do that first by getting back to the story of the girl we mentioned in the intro.” Ginette: “Interestingly enough, this episode ties into our last episode in an unexpected way.
The little girl we introduced to you earlier is born about the same time as Florence Nightingale. She’s about five years older. “We have to understand a little bit about her parents, Annabella and George, to have a better insight into her, so here’s a peek into their lives: They’re both highly intelligent, capable, and well-educated, and they’re from high society. George is more verbal and artistic, and Annabella is more logical and mathematical. “From the start, the pair is not a good match. Annabella sees George’s flaws, but she also sees George’s potential. Beyond that, Annabella is probably attracted to his very handsome (as a lot of people describe him), bad-boy, wild-and-wooly type. One good example of his rebellious nature and disdain for authority is how he exploits a loophole in college to flout what he considers an absolutely outrageous school rule: since the university won’t let him bring his cherished pet dog with him, he defiantly keeps in his Cambridge University apartments a tame pet bear. Essentially, as loopholes work, the rule doesn’t explicitly say no pet bears, so the university in his mind can’t immediately do anything about it—this may be partly why he only lasts there a term. Anyway, these are the types of things Annabella thinks she can change about George. “On George’s side of things, he notices Annabella’s sharp intellect. She’s incredibly smart. From early childhood, her parents recognize her natural brilliance and essentially give her what most women can’t get in those days—the equivalent of a Cambridge University education. Something else George likes about Annabella is that she’s down to earth. So eventually, he proposes to her, and probably against her better judgement, she says ‘yes’, and they get married, but within a year, things get messy.
