Data Science Leaders | Episode 13 | 26:54 | July 27, 2021
Romain Ramora, Head of Data Science & Innovation - Supply Chain
Cisco
Data science jobs outnumber data scientists by three to one. The industry is looking for ways to close that gap, including turning to the concept of the citizen data scientist.
But in today’s episode, Romain Ramora, Head of Data Science & Innovation - Supply Chain at Cisco, shares why he thinks we shouldn’t be putting critical models in the hands of people lacking the proper expertise.
Welcome to another episode of the Data Science Leaders podcast. I’m your host, Dave Cole, and today our guest is Romain Ramora. He comes to us from Cisco, where he is the Head of Data Science & Innovation within their Supply Chain division. Romain, I believe you also have 10 years of experience prior to that as a financial risk modeler. You spent some time consulting at Accenture and also some time at Charles Schwab. Is that correct?
Yeah, that’s correct.
Great.
Thank you, by the way, Dave. I appreciate the invitation.
You’re welcome. This is great. So today we have a juicy topic, which I know is somewhat controversial in the data science realm. We’re going to be talking about citizen data scientists. Romain here has some opinions that he’d like to share with our audience, and I’d love to dive into that. We’re also going to be talking about who should lead a data science project. Should it be somebody on the data science side? Should you have a portfolio manager or project manager? Should it be somebody on the business side? We’re going to be diving a bit into that and just learning from Romain on what he’s seen work and not work.
But before we dive into that, your career arc is interesting, Romain, the fact that you had 10 years in risk analytics, and then you moved into the world of Cisco and supply chain analytics. What caused you to make that leap, and do you see them as similar or is that why you wanted to change things up?
Yeah, I was interested to see another area of data science and analytics, so I took the opportunity that was presented to me to join Cisco and lead the Data Science & Innovation practice there. I found out during my year at Cisco that, in the end, the problems that we are solving are not that different from what we encounter in finance. I can give you a few examples here. I was working in risk at Charles Schwab, and in supply chain we still have the problem of minimizing risk: anticipating shortage risk, for example.
Sorry, what is shortage risk?
Shortage risk is about anticipating when products will go into shortage.
Gotcha.
So, you can build models around that to identify whether an X, Y, Z component is going to be in shortage in the next few months. It’s a little bit like modeling credit risk and default probabilities: the same type of approach. You also have the whole area around optimization, which is also similar: for example, inventory optimization versus optimizing the credit risk allowance. So you can draw interesting parallels between these two disciplines. Even if the topics are different, the analytics know-how is pretty similar.
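To make the parallel with default probabilities concrete, here is a minimal sketch of estimating a shortage probability. The conversation does not say which model was used; logistic regression is swapped in here as a common choice for default-style probabilities. The single feature (weeks of inventory cover), the synthetic data, and every name in the code are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=3000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log loss w.r.t. the score
            grad0 += err
            grad1 += err * x
        b0 -= lr * grad0 / n
        b1 -= lr * grad1 / n
    return b0, b1


# Synthetic data (an assumption for illustration): weeks of inventory cover
# per component, with shortages more likely when cover is low.
rng = random.Random(7)
cover_weeks = [rng.uniform(0.0, 8.0) for _ in range(200)]
shortage = [1 if rng.random() < sigmoid(3.0 - 1.5 * c) else 0 for c in cover_weeks]

b0, b1 = fit_logistic(cover_weeks, shortage)
for c in (1.0, 4.0):
    p = sigmoid(b0 + b1 * c)
    print(f"cover {c:.0f} weeks -> estimated shortage probability {p:.2f}")
```

Replacing "inventory cover" and "shortage" with a borrower attribute and "default" gives the same fit, which is the similarity being described.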
When we talk about the citizen data scientist, we’re going to be talking a little bit about the supply/demand shortage problem from a data science perspective. If you’re a data science leader out there looking to expand your horizons, let’s say you’re a leader in the supply chain and you need someone who has expertise, yeah, sure, you can go after somebody who has 10 years of deep supply chain data science background. That’s an obvious place to look in terms of looking at people’s resumes. But you might look at credit risk, right? You might look at somebody with a slightly different background and that could also bear fruit, if there are those similarities.
So, I think it’s important from a recruiting standpoint to have an open mind. I think you’d probably agree with that, Romain, given your career arc. Another part of your background that’s also important, and that I want to touch on before we dive into our primary topic, is the fact that you came from a consulting background. How has your time at Accenture, starting your career in consulting, helped you in your gig at Charles Schwab and now at Cisco?
I can tell you, when you are in consulting, you are, I want to say, a little bit of a jack of all trades, because you have to deliver the projects. You even have to sell the projects to the customer at some point. So, you have this customer-facing element, along with the delivery, and on the other side, you also try to push the company forward with innovation. So for me, it was a very formative way to start my career. I was directly thrown into the fire of the corporate environment and business. So I think that helped me a lot. I also learned a lot of good skills from my years at Accenture. And I’ll give you a few examples here that I apply today in my job. Project management skills, for example, are very, very useful for a data scientist.
I also got the appropriate training with the different statistical software that obviously changes all the time. But at the time I also learned how to present quantitative findings to various levels of audience, which is, for me, one of the most important pieces of data science. Because if you can’t present your findings to different levels of audience, it’s useless.
That summarizes my experience at Accenture. Something that helped me even before I joined Accenture was having a balanced quantitative background, because I had the opportunity to do my major in quantitative finance and risk management. That was also very formative for the way I look at finance from a statistical standpoint.
Yeah, I think the challenge of being able to talk to multiple levels of expertise, being able to talk about your findings in a way that can be relatable to folks whose job is not to be a data scientist, is something that I think sometimes gets overlooked. It’s just so critically important because the work of a data scientist is to enable better decision-making, either by folks on the business side or even by the customers themselves, and to be able to embed analytics in products. That ability to really understand the business side is so critical to speak their language.
I also have a consulting background, and I think that ability to guide and lead and communicate is so critically important. And I’m sure it has helped you throughout your career in being able to talk to others and make what you’ve worked on relatable. That’s a great insight.
So that’s yet another well of potential data scientists to go after. Don’t be afraid to go after folks from a recruiting standpoint who have that consultative background. You’ll find that they could be very beneficial. Alright. So, one of the big topics here that we wanted to talk about is the citizen data scientist. Before we dive in, what is a citizen data scientist in your mind, Romain?
There are two areas of citizen data science. The first one is more of a BI role: business intelligence, doing things like dashboarding and reporting. The second is more around real modeling: modeling particular business decisions and finding the appropriate frameworks to do so. For me, the citizen data scientist in the first area, BI, is useful to have in the company, for being able to extract information from a dashboard quickly and make decisions. However, for the second part, which is a little bit more technical, I don’t necessarily think that there’s a compelling ROI, and I’ll tell you why. A lot of people come to me and they think that you just feed your data to a good model, you put it in a for loop, you optimize a parameter, and bam! You have a model.
But no, it’s not true. Nothing could be further from the truth. Data science and modeling, in terms of advanced analytics, is a bit more complex than that. If you want to have a good ROI, you have to hire the right people with the statistical expertise in the model, coupled potentially with the business knowledge. But I think that, in the end, you can learn the stats and the hardcore math. That’s where I stand right now. Complex problems require complex solutions most of the time, actually.
Most of the time, but then they require expertise, I think, is what you’re saying. To summarize, your point of view is that for descriptive analytics, business intelligence, visualizations, all that good stuff, you think, “Ah, a business user.” You could take somebody whose primary role is not data science, who doesn’t have a statistical background, and teach them and give them tools to enable them. That’s okay.
But it’s another thing to then take a data set and fit a model to it. There are companies out there, there are products out there, AutoML being one of them, auto machine learning, where you give it data and it will figure out which algorithm is best to train on the data set for a particular dependent variable that you’ve identified within your data set, and then give you all the model stats in the world. But where do you think it breaks down? Why would you not trust a citizen data scientist, somebody whose primary job is not to be a data scientist, with a tool like an AutoML tool?
No, because I think first you need the knowledge of the data, and you need to understand how to define it from a business standpoint. What’s an outlier, for example, in the data set? You also need to be able to understand what quantitative assumptions underpin a certain type of model. That, unfortunately, you need to spend some time to understand. Even for simple models like linear regression, for example, when you build a model you must not violate the main assumptions. To do that, you need to know the main assumptions, first of all. There are a lot of differences. I’m giving the example of OLS, but depending on which model you use, you need to understand what assumptions are at play.
So, what assumptions are you referring to? Take the simple linear regression.
Yeah. So, for example, making sure that your data is normally distributed. Making sure that the estimator is unbiased. How do you verify that an estimator is unbiased? That’s funny, because every time I ask that during an interview, very few people know how to actually answer that question. When I follow up, I find that they don’t understand OLS, but they want to deploy very complicated neural network algorithms. So, to answer your question, you have to start simple: understand what the simple algorithms are, and after that you can make your progression all the way up the hierarchy of modeling.
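One way to approach the interview question empirically is a small simulation. This is a minimal sketch, not the answer given in the conversation. An estimator is unbiased when its expected value equals the true parameter, so we can generate many data sets from a known line, fit OLS to each, and check that the slope estimates average out to the true slope. The parameter values and function names below are assumptions chosen for illustration. Note also that in the classical linear model the normality assumption is usually stated for the error term, which is what the simulated noise represents here.

```python
import random
import statistics


def ols_fit(xs, ys):
    """Closed-form simple OLS; returns (intercept, slope)."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def simulate_slopes(true_intercept, true_slope, n_obs, n_trials, seed=0):
    """Refit OLS on many noisy samples drawn from one known line."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n_obs)]  # design held fixed
    slopes = []
    for _ in range(n_trials):
        # Errors are independent, zero-mean, and normally distributed.
        ys = [true_intercept + true_slope * x + rng.gauss(0.0, 1.0) for x in xs]
        slopes.append(ols_fit(xs, ys)[1])
    return slopes


slopes = simulate_slopes(true_intercept=2.0, true_slope=0.5, n_obs=50, n_trials=2000)
mean_slope = statistics.fmean(slopes)
print(f"true slope 0.5, mean estimated slope {mean_slope:.3f}")
```

Individual estimates scatter around the true slope, but their average should sit very close to it. A simulation like this only illustrates the property. The formal argument takes the expectation of the OLS formula under the assumption that the errors have zero mean given the regressors.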
Yeah, there’s been a lot of press out there about AutoML tools participating in Kaggle competitions and doing quite well. What I’m hearing from you, though, is that you’ve got to be very careful: most likely the people at the wheel driving the AutoML tools in those contests are data scientists. I’d love to take somebody who’s got just the basics and has gone through the normal training for an AutoML tool, allow them to build a model based on a data set, enter them into a Kaggle competition with minimal assistance, and see how they do. I’d like to see the results of that.
But to your larger point: ‘science’ is in the name of ‘data science,’ and to be a scientist, you can’t just wake up one day and be handed a tool. You need to understand the process of coming up with hypotheses, the process of making sure that the project that you’re working on and the experiments that you’re running are done in a way that doesn’t introduce bias, and all that good stuff. That takes training, in some cases advanced training beyond just getting an undergraduate degree. I certainly am sympathetic to that. That being said, Romain, what are we going to do about the fact that there are three times as many data science jobs out there as there are data scientists? How would you bridge that gap?
You have to train people and particularly make sure that their fundamentals and their basics are covered. That would be my recommendation. That way you can, for example, go from a citizen data scientist, someone who takes the data and does some visualization, to a full-blown modeler.
For you personally, would you hire somebody who maybe took an online course, a Coursera type course or Udemy type course, or would you want somebody to actually go back and get a Master’s in data science or statistics or something like that?
Well, it depends. It’s fine as long as you go a little bit beyond just the hands-on, very practical exercise that’s fed to you, because that’s basic. It’s useful to have, as I was mentioning earlier, a little more fundamental content. I always include a little bit of fundamental content in my interviews to make sure that you know the basics. So, yeah, I would consider someone with experience in Coursera, but I would also lean toward a little bit more theoretical knowledge on top of the hands-on experience.
The way you hire the best, by the way, Dave, is you hire the best because you ask the right questions. You ask the right questions and you get it.
So if you ask the right questions and if you also have a varied background…I think what you’re saying, though, is that you can come out of consulting, you can even have a background in a different industry, as long as you have that foundation and understanding of data science and how it’s done, with statistics woven in there as well. If I’m creating a model that I am making a critical decision off of, am I going to trust an AutoML type tool or a citizen data scientist at the helm of a tool to be able to put that model together, or am I going to lean on the experts? I’m probably going to lean on the experts. Are there any use cases where you think you’d be okay with a citizen data scientist, maybe some basic forecasting or a basic classifier?
Yeah, if it’s basic. I would try with classic regression. When you have someone who wants to start, I would start with that classic linear regression, trying to understand what data you’re working on and see if you can extrapolate, etc.
Would you be open at all to a citizen data scientist building a model in a particular circumstance, like with a particular use case?
Yeah, exactly, if it’s for a simple use case. Yes, not very complicated, I’m totally open to it. It’s even a good thing for people to start trying new things, but if it’s to build a more complicated framework, at some point you need a bit more expertise to be able to answer quantitative questions around economics or finance, even questions in supply chain, a lot of them are related to economics.
Oh, that’s part of the challenge too. Sometimes you see a problem and you think, “Oh, this will be easy. Oh, there’s 10 features, I know this problem very very well. I’m building a simple forecasting model, I can do this.” But then, you build the model and you might not realize that one of your independent variables is highly correlated with the dependent variable and there’s some bias and you don’t know what the right questions are to ask about your data set, and you can fall into traps.
True.
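The trap described above can be partly screened for with a simple check. This is a minimal sketch under stated assumptions: it computes the Pearson correlation between each feature and the target and flags any feature that tracks the target almost perfectly, which is often a sign that the feature leaks the answer. The feature names, the synthetic data, and the 0.95 threshold are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def flag_suspicious_features(features, target, threshold=0.95):
    """Return {name: r} for features whose |r| with the target meets the threshold."""
    flagged = {}
    for name, values in features.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Synthetic example: one unrelated feature and one near copy of the target.
rng = random.Random(42)
demand = [rng.gauss(100.0, 15.0) for _ in range(200)]
features = {
    "lead_time_days": [rng.gauss(30.0, 5.0) for _ in demand],
    "units_shipped": [d + rng.gauss(0.0, 1.0) for d in demand],
}

flagged = flag_suspicious_features(features, demand)
print(sorted(flagged))
```

A flag is only a prompt to ask how the feature is produced and whether it would be available at prediction time. Deciding what to do with it still takes knowledge of the data and the business.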
It sounds almost like you might be advocating that a data scientist first look at the problem at a high level and just basically bless it and say, “Yeah, you can go ahead and build a model using this tool. Citizen data scientist, have at it!” It probably is a great way, if you’re looking to get your business counterparts closer to understanding what you as a data scientist do, as long as somebody is interested, wants to go down that path, and at least knows when to stop: knows the boundaries of what they should and shouldn’t be working on. It could be a great way to bridge the divide for sure.
I would not say boundaries there, I would say more the pitfalls to avoid. You know that you can ask someone else with a different skillset to help you. That’s important.
Absolutely, and before you know it, who knows? Maybe they go back and they get some training, and get some more models under their belt, and then they might meet you in the middle a little bit from that citizen data scientist approach.
But I hear what you’re saying. This has been a fun discussion. I want to switch gears a little bit here, Romain. I want to talk a little bit about who should run a data science project, in your opinion? I would love to hear your thoughts.
I’m very biased, obviously, Dave. I would say what I’ve seen work, and I will keep it at that. What works is when you have a solid data science team who also have the business knowledge. I think that’s very important: the knowledge of the business, how it works, and obviously the knowledge of the data associated with this business. That’s what works best, in my opinion, for complex projects. Maybe it’s a little bit like the citizen data scientists that we talked about. I mean, maybe you can have business owners manage and lead projects when it is a little bit less complicated, depending on the ROI and the complexity of the project. So that’s what I would say. But what works best, from what I’ve seen, is to have the data scientist leading that type of project.
But to your point, this has got to be a data scientist that understands the business problem that they’re solving. It’s not somebody who’s just like, “Give me a data set, I’m going to go ahead into the basement and I’ll come back in two months and I’ll pop some model.” It’s somebody who really gets the business problem. I don’t know the right answer to this. On the Data Science Leaders podcast, we talk all the time about should data science be centralized? Should data scientists be embedded within the business units?
Everyone is seemingly in agreement that the closer the two sides are, the better: the data scientists who understand the business and can present the results and speak the same language as the business users are going to be the ones that are highly successful. Then likewise, what can we as data science leaders do so the folks on the business side are closer to understanding the science behind the data science, without them just thinking that they can buy an off-the-shelf solution and become a citizen data scientist without the help of an experienced data science team? The more we get into the middle there, the better off we’re going to be.
Let me ask you this question, Dave. Did you have any data science leaders telling you that they like the business to be in control of the data science project? Tell me, what do you think?
This topic, we’ve talked about it a bit during the Data Science Leaders podcast, but not explicitly. We haven’t specifically asked the question, “who should lead a data science project?” I’ve seen folks on the business side be owners of it in terms of the budget and whether or not they’re going to go with the eventual model that was created by the data science team, but they rarely have a say in how the model is built, what algorithm is used, and what approach is used. That usually is on the data science side.
So, what I hear the most is that strong partnership and that strong need for both sides to meet in the middle and that’s part of the challenge. I am biased too, and I think it’s easier for a data scientist to learn the business side, than it is for somebody on the business side to learn the data science. But again, I’m showing my bias there.
Hey, I didn’t say it, you said it.
Yeah. I’m learning from you quickly here. Well, this has been great! So, Romain, if people want to get in touch with you, I assume they can reach out to you via LinkedIn. Is that correct?
Absolutely, yeah feel free.
We covered the two main topics. To recap, I think Romain thinks that the citizen data scientist label needs some work. He’s not willing to trust his critical models to a citizen data scientist. We all can agree that both sides need to meet in the middle and learn about each other as best they can.
Reach out to Romain if you want to learn more about data science and his team and what they’re up to at Cisco. And then also in terms of leading a data science project, the unbiased Romain here is looking for data scientists to take more of a leadership role. That feeds into the same thought that the more business expertise you have as a data scientist, the more successful you will be.
Absolutely, absolutely. Thank you very much for having me, Dave, today. I appreciate the time.
Well, thanks for joining us. Thanks, Romain, for spending your time with us on the Data Science Leaders podcast. Take care.
Thank you, have a good day.
Data Science Leaders is a podcast for data science teams that are pushing the limits of what machine learning models can do at the world’s most impactful companies.
In each episode, host Dave Cole interviews a leader in data science. We’ll discuss how to build and enable data science teams, create scalable processes, collaborate cross-functionally, communicate with business stakeholders, and more.
Our conversations will be full of real stories, breakthrough strategies, and critical insights—all data points to build your own model for enterprise data science success.