Episode: 17

Monte Zweben - Machine learning and its value in business and society

Posted on: 11 Mar 2021

Monte Zweben is the CEO of Splice Machine, a data and machine learning company from California, and the host of their ML Minutes podcast.

In this episode, we talk about the benefits of machine learning, with Monte giving some great practical examples of ML solving real-world problems across different industries. We also talk about the culture of innovation and experimentation in this field, focusing a little bit more on one of the latest and biggest recent trends in machine learning. We finish by addressing the changes in the digital industry due to the Covid pandemic and what potential impact they could have on machine learning.

Transcript

"Digital transformation now is not a nice-to-have, it is a must-have for enterprises. They have been forced to interact with their suppliers and customers in a digital manner."

Intro:
Welcome to the Agile Digital Transformation podcast, where we explore different aspects of digital transformation and digital experience with your host Tim Butara, content and community manager at Agiledrop.

Tim Butara: Hello everyone. Thank you for tuning in. I'm joined today by Monte Zweben, CEO of the data and machine learning company Splice Machine and host of the ML Minutes podcast. In this episode Monte and I will be talking about machine learning, specifically how it can help businesses streamline their productivity.

Welcome Monte, it's really great to have you on the show. Is there anything you would like to add to the introduction?

Monte Zweben: No Tim. I'm very, very pleased to be here. I'm very excited to talk about machine learning and its connection to data, and how it helps business and society in general really.

Tim Butara: Yeah, I'm also really excited for the talk. I think it's a very relevant, very topical topic so to speak. So I suggest we just dive right into it.

Monte Zweben: Excellent!

Tim Butara: So yeah let's, let's begin with something more basic, you know, maybe some of our listeners aren't familiar specifically with machine learning, so can you first give us like a brief overview of ML - maybe, maybe in the style of one of your ML Minutes?

Monte Zweben: Excellent! Thank you for putting me to the one-minute test as well. Machine learning is the practice of using computers in a new way. We usually program computers, where we specify each step that it needs to do, like a recipe. But with machine learning what we're trying to do is get a dynamic behavior out of a program, where it can get better and better and better as it sees more examples, and really what machine learning is generally about - there are many different methods, but generally, it's about pattern matching. So if I see a group of examples, like I have a bunch of credit card transactions that are fraudulent, and a bunch of credit card transactions that are not fraudulent, how can I determine the most general conditions that describe the difference between those two different sets? And as I get more and more examples, the system gets better at pattern matching, it learns that concept.

Tim Butara: Yeah, that was a really great straightforward example. So, basically it's like enabling the system to make optimizations to itself without any external supervision, just basically you know it can enhance itself on the fly basically, because of the way it's been programmed.

Monte Zweben: Yes, it's really both the dynamic behavior of it and the fact that you haven't really told it the underlying rules. You haven't told it what it means to be fraudulent. And if you're using machine learning for a medical application, for example, like trying to predict a particular disease: in a normal program, you might have principles of medicine in it, but in a machine learning program, you're just learning from the people who are sick and the people who are not sick and trying to determine those patterns.

Tim Butara: Nice! So, okay now I hope we all have a better understanding of ML, but what's its real value to business like in the practical sense, can you give us some very basic examples of this?

Monte Zweben: Sure! One of the most basic examples that we see every day is when we're on e-commerce sites that are providing recommendations. If you're on Amazon, if you're on Netflix, if you're on Airbnb, you're always being provided with recommendations of the products that are most likely to fit your tastes, and that's all done with machine learning. Another kind of machine learning is the example I mentioned earlier: if you're applying for a loan or applying for an insurance policy, determining whether you're good for one particular type of policy versus another one, or whether you're trying to game a claim system and claim some sort of incident that didn't really happen, machine learning can be used to detect that. And those are the kinds of things that you can do in business today.

Tim Butara: So, if I understand it correctly, machine learning is actually already a huge part of basically everybody's daily lives, we just, we just, most of us just aren't aware of it yet.

Monte Zweben: I think that's very true, and we see it from everything that happens on the web to even what happens in some of our vehicles; especially the self-driving capabilities of vehicles today. That's all machine learning: pattern matching, learning, and improving.

Tim Butara: Nice! And you know judging from the relative youth of the technologies and the industry we probably have a lot of progress still to see in the very near future.

Monte Zweben: Yes, we do. There are a number of different threads of research in machine learning, but there are also a lot of really practical improvements that are necessary to, I would say, operationalize machine learning, and those may be even more insurmountable than some of the research. What I mean by operationalizing is this: it's very easy today for a data scientist to be given a set of data examples, either in a file or a spreadsheet or in a database, to open the environment where they perform their statistical or mathematical modeling, and to come up with a model that describes the data. But putting that model into everyday operation, where it kind of seamlessly fits into the enterprise workflow ... that's proven to be very difficult.

Tim Butara: Yeah, yeah, I can imagine that as the technology progresses there are a lot of challenges, a lot of new challenges, and kind of obstacles popping up that maybe haven't been thought of initially, yeah. 

Okay, so what about some more concrete, more specific examples of machine learning's business value? I'm sure that you've had some clients at Splice Machine with some really interesting challenges, like can you talk to us a bit, about those, about those that really stuck with you?

Monte Zweben: Absolutely. One of my favorite examples of machine learning that has both a business side as well as a human side is a medical example. One of our clients is the Innovative Precision Health Network, and this is a network of neurology clinics led by neurologists who are pooling their data together to form what I'll call a database of population data. And what they do is they grab all of the de-identified patient data so it's completely anonymous, and bring it together and train machine learning models to predict the trajectory of disease. In our first set of examples, we began with multiple sclerosis, a very significant disease that has very deleterious effects on the patients and it actually affects a quite broad amount of people in the population. And the machine learning can help the doctors predict how the patients may feel given a particular treatment, it may be able to predict how they may deteriorate across different biomarkers, and it automatically uses data from cognitive tests, from walking tests, and also from imagery of the brain. 

And the reason why this is also a business opportunity is that, in the medical field, especially in the United States given our insurance circumstance, it's still very difficult for doctors to be able to financially succeed given the big squeeze that the payers or insurance companies have. And so now by taking these objective tests that can be pulled together, like a walking test on a device with sensors, or a cognitive test, or a brain imagery test using MRI data. All of these tests are actually reimbursable by the insurance companies. So it helps the neurologists actually increase the revenues of their practices while simultaneously improving patient outcomes. So that's one really great example of machine learning. If I have time, I'd love to give another example. Would that be okay?

Tim Butara: Yeah, I'd love it if you gave another example actually.

Monte Zweben: Excellent! Yesterday, actually, on the ML Minutes podcast I had the opportunity to interview the CEO of Jobcase. And Jobcase is a company that is a social network and community for working people. And what it does is use machine learning, especially during these difficult times when it is so hard for people to find work, to match the opportunities that are out there for people to work with the people that are seeking work. And more than just the job listing type of web pages that you see out there, it truly does provide community, where you can recommend educational opportunities, you could recommend other people who may be able to answer your questions about jobs in certain locations. It truly is a social network looking to advocate for workers out there, and really being on the side of the worker. And I think it's a great opportunity to see a positive use of social networking, and a great positive use of machine learning.

Tim Butara: Yeah, yeah. Those were, both are very great examples and some very good points about the positive side of technology; because when we started talking about the real life applications of machine learning, and like in the wake of, you know, documentaries such as The Social Dilemma, and stuff like that, the question that popped into my head was, like, then obviously, you know, technology, machine learning, new and emerging technologies, obviously they can have huge and really beneficial impacts on a lot of people's lives. So these were like really on point examples. One being the medical one, where I was particularly interested in the kind of win-win situation for both the profit and the health of the patients. And this one, you know, as you mentioned, in times of Covid, in times of isolation, in times of, you know, lockdowns ... being able to be part of a community that's actually human-centered, that kind of-- and, and you know, not only that, a lot of people lost their jobs due to Covid. So you know, it provides both the opportunity to kind of get back into the, into the job, the job hunt, the job game, and it provides the sense of community.

Monte Zweben: Yes. It really does, but you know, there are extremely practical uses of machine learning. We're working with, for example, petrochemical companies to be able to predict when something might go wrong in the plant. And the reason why that's so important is that the control systems in petrochemical plants are fail-safe systems. They'll shut down the whole plant when something's about to go wrong. Of course they don't want anything to explode, or for people to get hurt; but sometimes this happens prematurely, and so AI can be used in those cases: they can connect a machine learning model up to all the sensors in the plant. And the machine learning models can start to learn to predict that something's about to go wrong, to give the operators enough time to remediate that circumstance and avoid an outage. And an outage of a petrochemical plant could cost more than a million dollars a day. So this is a great example of real-time use of machine learning, in order to predict an outcome that may happen in the future, and give humans the opportunity to remediate and avoid that very bad condition.

Tim Butara: Yeah, that was another great example. I think it's-- I love it that you're giving examples that aren't, like, super technical, I think that everybody can now see through these examples the real benefits of machine learning. 

But now I want to go maybe in somewhat of another direction. You know, because machine learning as we said machine learning and data science are still relatively young fields and I would assume that that means there's likely a big tendency, or like a drive towards a lot of innovation, experimentation. Like what would you say is the role of experimentation in machine learning?

Monte Zweben: Well, experimentation is at the heart of machine learning. A machine learning project will have a few different personas involved: data engineers, who are taking raw data and building transformational pipelines that move the data from its original source and make it more accessible for data scientists to use, and data scientists, who take that data and formulate what are called features. Features are the data elements that have been curated and manipulated that are the input of a machine learning model. They're the way that examples are described for models to do pattern matching.

And then the data scientists after they form their features, they use algorithms, and there are many different algorithms, and all of those algorithms have parameters. So a data scientist's day is all about experimentation. They take their features that they've decided to use, their algorithm they've decided to use, they tune the knobs on those algorithms, they run the training, and they test to see how accurate the pattern matching was. And they repeat that over and over and over and over again in experiments and they have to keep track of these experiments, and they have to compare and contrast these experiments and this is a way of life for a data scientist, and so that's at the heart of the actual practice. 
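As an illustration of the loop Monte describes here (not something shown in the episode), the sketch below runs a handful of experiments and keeps a log of them. Everything in it is an assumption made for the example: the transaction amounts are made up, the "algorithm" is a deliberately trivial rule that flags a transaction as fraudulent when its amount reaches a threshold, and that threshold is the single knob being tuned.

```python
from dataclasses import dataclass

# Made-up held-out examples: (transaction amount, is_fraud). Not real data.
VALIDATION = [(18.0, False), (60.0, False), (350.0, True), (520.0, True)]


@dataclass
class Experiment:
    """One logged run: the knob setting and the accuracy it achieved."""
    threshold: float
    validation_accuracy: float


def predict(amount, threshold):
    # Hypothetical rule: flag the transaction when the amount reaches the threshold.
    return amount >= threshold


def accuracy(examples, threshold):
    # Fraction of examples where the rule agrees with the known label.
    correct = sum(predict(amount, threshold) == label for amount, label in examples)
    return correct / len(examples)


def run_experiments(thresholds):
    # Try each knob setting, measure it, and keep a record of every run.
    return [Experiment(t, accuracy(VALIDATION, t)) for t in thresholds]


experiments = run_experiments([50.0, 100.0, 200.0, 400.0])

# Compare the logged runs and pick the most accurate one.
best = max(experiments, key=lambda e: e.validation_accuracy)
```

A real project would swap in actual features, a real learning algorithm, and a proper experiment tracker, but the shape stays the same: change a setting, measure, record, compare.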

And then if you look above the day-to-day life of a data scientist, there's lots of research being done on essentially extending with new algorithms, using new techniques to manage machine learning models and to automate this whole process. So there's some great work being done in all these areas and maybe it makes sense to focus on one of these areas, feature engineering. 

Feature engineering is the process of taking that raw data and formulating the data attributes that will be the input to the machine learning models that make predictions. This literally takes 85 percent or more of a data scientist's time; it's an extremely grueling process. And now we're starting to see automation in this space. In fact I'm writing a book on something called a feature store for Manning publishers, and a feature store is a way for one data scientist to do all this great work, and take it and put it into a repository to memorialize their work in a repository, and that's called a feature store. And other data scientists could search for useful features. 

So now this lets teams of data scientists reuse each other's work, and the feature store always keeps that data fresh. So for example, Tim, if you're coming to a website and someone's using a feature store, and they want to make a recommendation for you, the feature store knows each and every data element about you, even if it took tremendous computation to aggregate your browsing behavior over the last two years, it keeps that data fresh, and so that the minute you hit that website the feature store can be called and asked for all of Tim's data attributes to feed the machine learning model that makes the recommendation. And these feature stores used to be bespoke all the time. There are great papers by Airbnb, and by Uber, and others about the feature stores they needed to make their systems work. And now we're starting to see a number of companies produce feature stores for other people to buy, my company included, and I'm really intrigued with this new level of automation that's at the heart of the data science process.

Tim Butara: So, feature stores are basically something really new in machine learning, it's not like-- you know, we were talking about experimentation; you said that, you know, now automation is being introduced to machine learning, so I'm guessing that this is still, this is the hottest, latest thing in machine learning I guess?

Monte Zweben: I think it is the hottest thing in machine learning, and let me tie it to experimentation, because that's the element of feature stores I didn't mention. Feature stores also power experimentation, and what they do is they let the data scientists say: give me a training set from time point A, let's say from Monday April 1st, you know, two, three years ago, until today, and give me all of the examples that occurred in that time period, and use that on the following features to train this model. And the feature store can go back in time and take all of the historical values of features, and collect them up efficiently for the data scientist, and then feed that to the machine learning model to train. So feature stores are used to share, feature stores are used to serve features in real time, but even more importantly, and potentially the most difficult element that feature stores deliver on, they deliver on the promise of experimentation because they power the machine learning model with a training set of features.
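To make the "go back in time" idea concrete, here is a minimal in-memory sketch, assuming a feature store is simply a history of timestamped values per entity and feature. It is not Splice Machine's product or any real feature store API; the class `ToyFeatureStore`, its method names, and the feature `views_30d` are all hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime


class ToyFeatureStore:
    """Hypothetical in-memory store: keeps every historical value of each feature."""

    def __init__(self):
        # (entity, feature) -> list of (timestamp, value), kept sorted by timestamp
        self._history = defaultdict(list)

    def record(self, entity, feature, timestamp, value):
        series = self._history[(entity, feature)]
        series.append((timestamp, value))
        series.sort(key=lambda pair: pair[0])

    def value_as_of(self, entity, feature, timestamp):
        # Latest value recorded at or before `timestamp`, or None if there is none.
        series = self._history.get((entity, feature), [])
        times = [t for t, _ in series]
        idx = bisect_right(times, timestamp)
        return series[idx - 1][1] if idx else None

    def latest(self, entity, features):
        # Real-time serving: the freshest value of each requested feature.
        return {f: self.value_as_of(entity, f, datetime.max) for f in features}

    def training_set(self, labeled_events, features, start, end):
        # For each labeled example in [start, end], look up what each
        # feature's value was at the moment the example occurred.
        rows = []
        for entity, when, label in labeled_events:
            if start <= when <= end:
                row = {f: self.value_as_of(entity, f, when) for f in features}
                row["label"] = label
                rows.append(row)
        return rows


store = ToyFeatureStore()
store.record("tim", "views_30d", datetime(2021, 1, 1), 3)
store.record("tim", "views_30d", datetime(2021, 2, 1), 10)

# Made-up labeled examples: (entity, time of the example, outcome).
events = [
    ("tim", datetime(2020, 12, 1), False),  # outside the requested window
    ("tim", datetime(2021, 1, 15), False),
    ("tim", datetime(2021, 2, 10), True),
]
rows = store.training_set(events, ["views_30d"], datetime(2021, 1, 1), datetime(2021, 3, 1))
fresh = store.latest("tim", ["views_30d"])
```

The as-of lookup is the point of the sketch: each training row sees only the feature value that existed when its example happened, while `latest` returns the current value for serving.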

Tim Butara: And also, you know, one thing that I, that I started thinking about when we were talking about experimentation, and you know it to me it seems like you know working in silos and innovation you know, that's not a recipe for something good, you know. You probably if, if you innovate but you do it in silos, you know, how, how successful can that innovation really be?  But to me it seems like feature stores kind of solve exactly this problem, you know, it allows you to innovate kind of in isolation, but then after the fact it allows you to kind of share your innovations with others, so that others can make use of them for their own innovations, and that's actually how you drive the whole thing forward, right?

Monte Zweben: Tim, I couldn't have said it better. Feature stores in general democratize machine learning. They break down the barriers between the data engineers that are formulating data, the data scientists who are modeling, and even the machine learning engineers who are taking models and putting them into production in concert with yet another persona, the application developer who has to use the machine learning model in the midst of their application. And I would say that these personas have had a really hard time working together because there were no tools, and the feature store is a tool that helps glue them together.

Tim Butara: Yeah, it really enhances collaboration of everybody working on a machine learning project, yeah. These were some really, really interesting points that we mentioned now. And I only have one more big question, but it's really relevant to what's been going on in 2020, and of course we kind of already started talking about it, so I'm sure you already have some examples and, you know, we covered it already. But I'm interested in what you think about the impact that the Covid pandemic, and the whole digitalization around it, has had, or maybe will have, on the adoption of machine learning?

Monte Zweben: Yeah, I think the pandemic has really changed the face of industry. It probably primarily has accelerated the digital transformation process, and digital transformation now is not a nice-to-have, it is a must-have for enterprises. They have been forced to interact with their suppliers and customers in a digital manner. And as they do that, they realize this is not a matter of just changing the forum by which one communicates with its customers and with its suppliers, it requires a smart interaction, otherwise you can't compete. And so the digitization efforts around the globe used to be all about just e-business.

I remember in the first generation of digital transformation I had a company called Blue Martini Software, and we were focused on helping companies get online and do e-commerce, and we were evangelizing the use of machine learning to do that effectively. Today you just simply cannot have a digital interaction that is not powered by machine learning. And so you're seeing companies rush into the journey of data science, and try to find their way.

Tim Butara: Yeah. That makes a lot of sense actually, you know, as you said - now, it's not enough to just have basic digital interactions with your customers, with your visitors, but they have to have all the qualities of good digital experiences. And, I hope as we've successfully highlighted throughout this episode, basically machine learning is at the heart of good experiences, basically everywhere in the digital, you know, from the web to your vehicle that you're driving as you pointed out previously, yeah.

Well, Monte, that's all from me. Before we finish I just want to ask you, if people want to reach out to you or to learn more about you where can they do that?

Monte Zweben: Excellent. Well you can certainly reach us at splicemachine.com. My email is mzweben z-w-e-b-e-n @splicemachine.com. And you can also catch me on mlminutes.com, where we talk about machine learning all the time.

Tim Butara: I'll make sure to link all of those, Monte. Thank you so much for this really great conversation. It was awesome to have you on the show and to, to learn more about machine learning, and learn more about the really relevant use cases that we went through. And to our listeners that's all for this episode. Have a great day, everyone, and stay safe.

Outro:
Thanks for tuning in. If you'd like to check out other episodes, you can find all of them at agiledrop.com/podcast, as well as on all the most popular podcasting platforms.

Make sure to subscribe so you don't miss any new episodes, and don't forget to share the podcast with your friends and colleagues.