Distributed Data Show

Summary: The Distributed Data Podcast is your weekly source for the latest news and technical expertise to help you succeed in building large-scale distributed systems. Brought to you by the Developer Advocate team, we go in-depth with DataStax engineers and special guests from the broader data community. New episodes each Tuesday.

Podcasts:

 Distributed Data Show Episode 54: Graph Processing Trends With Jonathan Lacefield | File Type: audio/mpeg | Duration: 00:35:46

We sit down with Jonathan Lacefield to discuss the latest trends in graph processing from the DataStax perspective. We talk through how the challenges associated with graph are evolving, current tips and tricks, the tools being seen in the field, and more.

 Distributed Data Show Episode 53: Disruptive Innovation with Matthias Brocheler | File Type: audio/mpeg | Duration: 00:20:39

There’s more to the lean product strategy than just building skateboards. Matthias Brocheler joins guest host Kathryn Erickson to discuss Disruptive Innovation. We’ll discuss and provide practical examples of how to explore an idea you have or a problem you want to solve, how to size your potential market, and when to start writing code.

 Distributed Data Show Episode 52: Benchmarking with Nitsan Wakart | File Type: audio/mpeg | Duration: 00:31:35

Are all benchmarks lies? Nitsan Wakart joins the show to explain the discipline of performance engineering, the ingredients of an effective benchmark, why you should always create custom benchmarks based on your expected workload, and the benchmarking effort we undertook for DSE 6.

 Distributed Data Show Episode 51: Graph Tips and Tricks with Ted Wilmes | File Type: audio/mpeg | Duration: 00:22:49

David and Jeff talk with Ted Wilmes from Expero about best practices regarding DSE Graph and the importance of proper data modeling.

 Distributed Data Show Episode 50: Think Like A Support Engineer With Sequoyha Pelletier | File Type: audio/mpeg | Duration: 00:19:26

We talk with support engineer Sequoyha Pelletier about the support team and their training gauntlet, and get a bunch of tips and tricks for troubleshooting clusters and submitting support tickets.

Highlights!
0:16 - David introduces Sequoyha Pelletier to the Distributed Data Show
0:42 - Sequoyha gives an overview of the support team experience
1:35 - Common issues support handles regularly
4:02 - Data modeling challenges and the largest partition contest
5:49 - DSE clients should engage the services team when they first dive into DSE
6:36 - How to become a support engineer, the training gauntlet, and Sequoyha playing chaos monkey with engineer clusters
11:07 - Best tips on useful information to provide in support tickets
13:03 - When issues occur, be sure to look at the logs and submit them with tickets
15:47 - Good tools to use when troubleshooting a cluster, and why you should definitely learn about nodetool
18:46 - A possible Modern Family reference, because David gets stopped in the airport all of the time

 Distributed Data Show Episode 49: Bulk Loading with Brian Hess | File Type: audio/mpeg | Duration: 00:25:22

Brian Hess joins the show to explain why the bulk loader is a vital tool for a distributed database, the history of bulk loaders for Apache Cassandra, and the virtues of the new DSBulk.

 Distributed Data Show Episode 48: DataStax Drivers with Chris Splinter | File Type: audio/mpeg | Duration: 00:37:03

We talk with DataStax product manager for developer solutions, Chris Splinter, about new DSE 6 driver features and peer into the bright future of driver development.

 Distributed Data Show Episode 47: NodeSync with Sylvain Lebresne | File Type: audio/mpeg | Duration: 00:13:09

Sylvain Lebresne shares what’s new and awesome with NodeSync in DSE 6, including the simplicity it brings to operations, improvements in repair performance, and how well it is integrated with DSE through CQL and OpsCenter.

 Distributed Data Show Episode 46: DSE 6 Analytics with Brian Hess | File Type: audio/mpeg | Duration: 00:31:25

Distributed Data Show Episode 46: DSE 6 Analytics with Brian Hess by DataStax Developers

 Distributed Data Show Episode 45: Search in DSE 6 with Nick Panahi | File Type: audio/mpeg | Duration: 00:25:00

Nick Panahi shares what’s new and awesome with Search in DSE 6, including what you can do with search-enabled CQL queries, the performance enhancements you can expect to see, and why configuring search just got easier.

 Distributed Data Show Episode 44: Thread Per Core with Jake Luciani | File Type: audio/mpeg | Duration: 00:28:57

Jake Luciani takes us behind the scenes to explain how the principle of mechanical sympathy was applied to DataStax Enterprise 6 in the new Thread Per Core feature. DSE 6 is demonstrating 2x improvements in read/write latency compared to DSE 5.1 / open source Apache Cassandra.

 Distributed Data Show Episode 43: Introducing DSE 6 with Robin Schumacher | File Type: audio/mpeg | Duration: 00:35:35

Robin Schumacher joins the show to take us behind the scenes of the brand new DataStax Enterprise 6 release, sharing how a focus on customer value, operational simplicity and building a unified platform led to new features like Advanced Performance, NodeSync and many others.

 Distributed Data Show Episode 42: Updating KillrVideo for DSE Search and Docker | File Type: audio/mpeg | Duration: 00:23:59

Cedrick Lunven interviews David Gilardi and Jeff Carpenter about their recent additions to KillrVideo, a reference application for Apache Cassandra and DataStax Enterprise. David upgraded the existing search feature based on CQL to use DSE Search, while Jeff configured the desktop deployment of KillrVideo to use the official DSE Docker images.

 Distributed Data Show Episode 41: Graph-based Genealogy with Dave Bechberger | File Type: audio/mpeg | Duration: 00:21:02

Migrating from a relational application to a graph-based application is an undertaking that takes forethought, planning, and the right use case. There are many challenges in taking a team used to working in a relational world and transitioning them to a distributed, eventually consistent system based on a graph. In this episode we talk to Dave Bechberger, who is the Chief Software for Gene by Gene, a bioinformatics company specializing in genetic genealogy. Dave shares his experience and lessons learned from migrating their relational application to a graph-based one.

Highlights!
0:15 - We welcome Dave Bechberger to the show and learn how he got into big data technologies like Apache Cassandra and DSE Graph
1:38 - Dave introduces his current work with Gene by Gene applying graph technology to genealogy applications
3:43 - The performance of legacy systems was database bound, so they began to look at using non-relational databases, especially graph, starting with a family tree application
6:19 - Dave describes how his team identified a graph datastore as the best approach for the family tree functionality. Family tree queries can be very recursive, for example: how are these two people related?
7:43 - The biggest challenge in adopting graph technology was training, since Gremlin queries require a different way of thinking. At the same time, they were also migrating to a microservice architecture style based on asynchronous event passing
10:04 - The benefits of all this change outweighed the cost. The biggest “do different” they identified was to invest in upfront training on pragmatic approaches to graph
11:24 - Why specialization can be beneficial to a team learning multiple new technologies
13:06 - The graph schema evolved over time. While their initial schema was based on an industry standard, they ended up adding additional vertex and edge types, as well as indexes to help optimize queries for both analytical and transactional use cases
15:31 - Dave’s team iterated on both their data model and their graph traversals. Denormalization is a key technique to work around vertices with many edges
16:42 - Dave’s advice for scaling up on a graph database is similar to familiar guidance for Cassandra: know your data and how you’re going to query it
18:13 - Tooling is a major area of growth for graph databases that will help spur adoption
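
To make the "how are these two people related?" question from the 6:19 highlight concrete, here is a minimal sketch in plain Python. It is not code from the episode or from Gene by Gene: the function name `relationship_path`, the sample family, and the use of a breadth-first search are all assumptions made for illustration. A graph database would express this as a traversal rather than application code.

```python
from collections import deque


def relationship_path(edges, start, goal):
    """Return the shortest chain of people linking start to goal, or None.

    `edges` is a list of (parent, child) pairs. This is a hypothetical,
    in-memory stand-in for a family tree stored in a graph database.
    """
    # A parent-child link can be walked in either direction, so build an
    # undirected adjacency map.
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    # Breadth-first search: explore paths in order of increasing length,
    # so the first path that reaches the goal is a shortest one.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == goal:
            return path
        for nxt in sorted(neighbors.get(person, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Illustrative data only: Ann is the parent of Bob and Cara, and so on.
family = [("Ann", "Bob"), ("Ann", "Cara"), ("Bob", "Dan"), ("Cara", "Eve")]
print(relationship_path(family, "Dan", "Eve"))
```

The path from Dan to Eve climbs to their shared grandparent and back down, which is the kind of variable-depth walk that is awkward to express as a fixed number of relational joins.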

 Distributed Data Show Episode 40: Feature flags with Cedrick Lunven | File Type: audio/mpeg | Duration: 00:23:43

Feature flags, also known as feature toggles, are a software development pattern that lets you enable and disable features within your applications at runtime. Cedrick has been working on an implementation in Java for nearly five years now. He shares his experience implementing this pattern successfully, especially in large architectures and distributed systems. We cover expected use cases, the underlying data model, and architecture concerns.
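
As a rough illustration of the pattern described above, the sketch below toggles a code path at runtime without redeploying. It is written in Python for brevity and is not Cedrick's Java implementation: the `FeatureStore` class, the `new-search` flag name, and the `search_videos` function are hypothetical names chosen for this example.

```python
class FeatureStore:
    """Hypothetical in-memory flag store.

    In a distributed system the flags would more likely live in a shared
    database so that every application node sees the same state.
    """

    def __init__(self):
        self._flags = {}

    def create(self, name, enabled=False):
        # Register a flag, disabled by default.
        self._flags[name] = enabled

    def enable(self, name):
        self._flags[name] = True

    def disable(self, name):
        self._flags[name] = False

    def is_enabled(self, name):
        # Unknown flags are treated as disabled.
        return self._flags.get(name, False)


def search_videos(store, query):
    # The flag is checked on every call, so flipping it takes effect
    # immediately, without a restart.
    if store.is_enabled("new-search"):
        return f"new search: {query}"
    return f"legacy search: {query}"


store = FeatureStore()
store.create("new-search")
print(search_videos(store, "cassandra"))  # flag off: legacy path
store.enable("new-search")
print(search_videos(store, "cassandra"))  # flag on: new path
```

Treating unknown flags as disabled is one possible design choice: it keeps a missing or misspelled flag from accidentally exposing an unfinished feature.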
