Episode Transcript
[00:00:00] Speaker A: Welcome to another episode of the True DataOps podcast. I'm your host, Keith Belanger, Field CTO here at DataOps.live and Snowflake Data Superhero. Each episode this season, we have been exploring a topic in the world of AI-ready data.
If you have missed any of the previous episodes, visit us at DataOps.live or on our YouTube channel. Subscribe and you'll get notifications of any new episodes that we publish. My guest today is Peter Chase, Enterprise Account Manager at Catalyst. And the theme for today's conversation is "Fail to Plan, Plan to Fail: Why DataOps Best Practices Matter More Than Ever." Peter, welcome to the show.
[00:00:46] Speaker B: Thank you, Keith.
Nice to be here.
[00:00:48] Speaker A: Yeah.
So, Peter, for those who aren't aware of who you are, give people a little bit about your background, who you are. Let us into the Peter world.
[00:00:57] Speaker B: The Peter world, my God.
Well, I've been in technology all my career. I did a computing science degree in London and, yeah, I'm pleased to have stuck with it, you know, all the way through.
I guess some of the things that are worth mentioning are nine years at the British Broadcasting Corporation, the BBC, and a particularly telling part of that was managing their standard desktop and server deployment team, where you were trying to integrate a whole load of bits of software, some for users to use, but others that manage the whole thing.
We may come back to that as a theme.
I've been with software companies.
In 2009, I joined one, which was a startup and a fledgling user of DevOps. And we got into Git and then proper agile and sprints and all that good stuff. And that was a really big education for, for all of us, not just me.
And then since 2016, I've been in data. I've absolutely been hook, line, and sinker, right up to my eyeballs in data.
And that's been a journey, you know. Cloudera was the thing actually back then. Yeah, yeah.
[00:02:07] Speaker A: That was like a blip on the radar. Right.
[00:02:09] Speaker B: It came, gone.
[00:02:11] Speaker A: Right. It was like, you know...
[00:02:14] Speaker B: It's interesting to reflect on why I did that.
More of a stayer was Talend, which was also something that the company I worked for at the time used, which was Datalytyx, which has an important part in the history of DataOps. We then got into Snowflake, DataOps came along, and more recently I've got much more into AI, and also into Qlik and Qlik's platform: QlikView, Qlik Sense, or now Qlik Cloud Analytics as they call it, which is a really great rival to some other offerings. It's not something I'd come across until the last few years, and I'm actually very impressed with it.
[00:02:51] Speaker A: So, yeah, it's funny you bring up, you know, we're in the world of AI and then you're talking about software and I can remember, "insert disk one."
Right.
We've come a long way. So you've obviously worked in, and with, a lot of organizations over time.
What's the most common mistake you're seeing teams make when they first start building data solutions?
[00:03:18] Speaker B: Well, I should say, actually, that the experience I have with Snowflake specifically, and DataOps largely, does span, I've got it listed down: industrial energy, crypto, would you believe, telco, retail and government. Right. So I've been around a lot of different types of organizations with all the different sort of cultural differences that they have. But there are some common themes, and actually the first thing that people need to think about, right, is what are they going to use? Right. So they need to procure the actual data platforms that they want to work with. Right. And too often I see that go wrong because people focus too much on the core and not on the peripheral stuff around manageability, governance and so on. The other thing as well that happens with data, unfortunately, is that people tend to inherit tools by virtue of buying other things, and they sort of get it given to them as if, well, you know, we've got this bundled in with, you know, other software and you should just use this. Well, that's not a choice. That's not choosing something.
[00:04:20] Speaker A: Yep.
[00:04:21] Speaker B: But let's focus on people who have made the very sensible choice of using Snowflake. Even then, despite all of its great features, there are other things to think about which people don't, perhaps because they don't have experience. Right. So they just dive in and they want to impress people because they want to reassure people that they've made a good choice. Right.
And they start delivering straight away without necessarily, you know, all the things that we're going to talk about. Right. About how you ultimately need to look back. If only you could see with hindsight, right, what you need to know at the start. That's the thing that we're trying to help people with, right, as partners and as yourselves, as DataOps: to try and open people's eyes to what they will find out for themselves in the fullness of time. Right. But if you can help them right at the outset, then that avoids all of the stress that you might otherwise need to experience, right, before you realize, hey, yeah, we should have thought about these other things first. So yeah, there's that temptation to dive in very quickly and do some quick wins, which, you know, you do need to do.
[00:05:29] Speaker A: You know, the business, when I say the business, right, oftentimes the investment into doing... You know, I've always said we're in the business of data and analytics, and for us from an IT perspective, you know, if it wasn't for the business, we wouldn't be here. We don't just do it for the fun of it, right? So the business has this desire to get value quickly, you know, and as you said, part of, I guess, my role as an architect, and some of us in the industry, is like protecting you from what might happen, right? It's making sure we deal with those things that we know are gonna impact us. Right. So it's finding that balance of how do you get that investment in the value but at the same time protect yourself from, we know what's going to happen, right?
[00:06:21] Speaker B: When you're starting out, it's obviously a little seed or, you know, you might liken it to trying to light a fire, but you don't want the fire to become an inferno. Okay. And you also want to start out, if you can, as you mean to go on. But how do you know what that is?
[00:06:39] Speaker A: Right?
[00:06:39] Speaker B: People, people do things.
Well, somebody said to me once, you know, there's nothing more permanent than a temporary fix, right? So you do things on a tactical basis and you, and you think you'll come back to them later and get them right. Actually that's really, really hard to do.
But where stakeholders come into this, right, is that they generally come to a team that might be doing work with data with a rather cynical view. And this is an unfortunate aspect of our industry.
It is fabled for its disaster projects, right.
And there is unfortunately a prevalent view out there amongst stakeholders, amongst C-level people and so on and so forth, that it can blow up in your face and waste a lot of money. Right. Now, they would like to hear from you that you're not going to be one of those.
So you're under pressure straight off to prove them wrong and to give them confidence.
So I'm not saying you can't or shouldn't do something quick which shows value and builds confidence. But you need to choose very carefully what it's going to be, because if you start out with too big a challenge, you won't deliver something very quickly, or it'll be such a lash-up, right, that you will have real trouble building on it as a foundation.
[00:07:54] Speaker A: So a couple of questions from that, kind of two questions put together in one question here.
What other things do you see that are often sacrificed or skipped and, you know, how do they rear their ugly head later on?
[00:08:12] Speaker B: I mean everybody wants, everybody has this vision, right. That I'm gonna, I'm gonna create something great. It's going to be really helpful to the business and it's going to go into what they call production. Yeah. The live environment. Okay. Do you think about what's gonna, what's going to have to happen once it is live? Because you're then expecting it to sit there, work day in, day out.
A data product, as we've come to realize, is much more than just the raw data at the center of the data product. It's also about the confidence and the reliability that business users want to know that they can expect from a data product. So once something's live, it's not the end of the story. You've got to keep it running. You've got to make sure it's delivering the changes to the data, you know, at the frequency that is expected.
And you've got to cope with sometimes uncontrolled change that comes from the source. Yeah. Where columns get added and all the rest of it, you know, and you've got to watch out for security. You've got to manage. I mean, the more popular something becomes, the more important it is for security to work correctly. So like any product, there are non-functional requirements, NFRs, right, which relate to that product, which you need to think about as well because you want it to be successful. But unless you deal with those NFRs and you think about them fairly early, right, you will ultimately end up with a product that you can't really take to the full market potential, if you like, that it deserves. So yeah, not thinking about those NFRs, I think, is the problem that a lot of people realize that they've created.
[00:09:42] Speaker A: And you know what, I think years ago, and not too far back, when I say years ago it could have been just a few years ago, we kind of normalized it being okay, right? Good enough. Right. You talk about the non-functional requirements, and something would happen, and you're like, so what, 10 records didn't load, no big deal, right? Or, oh, we got that transformation wrong, so we're off on today's load.
No big deal. The pie chart might be off, the graph might be off, right?
I think we got okay with things being good enough, but we're in the AI world today, right? And we like to say good enough data is not good enough for AI. You know, what's your take on that?
[00:10:33] Speaker B: Well, yeah, I agree. I mean, the whole basis of DevOps, right, and therefore DataOps in the world of data, right, is to have actually a very, very good product which hopefully is spot on. There's no inaccuracy at all. But to achieve that, you need to start with something simple and then incrementally add to it whilst retaining the same level of quality. Isn't that the sort of basis of DevOps, where you get something right and you're pretty sure a professor from a computing science department at Imperial College or wherever would say, I can prove that's correct, because it's simple enough to do.
[00:11:10] Speaker A: Right.
[00:11:11] Speaker B: But as you incrementally add to it because the business wants more from it, that correctness is more and more difficult to achieve. But if you do it in small steps, so frequent changes, but ones that you've tested in a way that shows the quality is retained, right, then, yeah, it's funny how time sort of goes by, right. Weeks turn into months, and when you look back at what you've actually achieved with all these small steps, you've gone that huge distance that somebody in the past would have said, well, the waterfall approach would have got us there. And that would have been the biggest mistake of all, right, to deliver something in that sense. Right. Give me three months and I'll bring something to you, right? You want to be working with the business and doing that iterative agile thing, right?
[00:11:57] Speaker A: Yeah, yeah.
That kind of brings me to, you know, you being a partner and expert in the industry. How do you see your role in the organizations you work with, in influencing them, playing that role, protecting them? What's your part as that partner and expert in the industry?
[00:12:19] Speaker B: Well, at Catalyst we have over 300 clients and a lot of them are in the sort of mid sector. Right. They're not huge banks or big pharma, pharmaceuticals. They're honest companies without the budgets to have very large technology functions. So there might be really a handful of people.
And so we represent a chance for them to have an extension of that.
We, and other partners as well, of course, have particular specialisms in our areas that we can bring to them that they wouldn't have otherwise for themselves. So as partners, we represent a sort of repository of experience, and sometimes we're able to do a bit of experimentation as well. And I think another thing that we bring is relationships with good technology partners like yourselves. Right. So we've got an in with you guys because we know your CTO, Guy Adams, and all this sort of good stuff, right? Yeah. And we can go and talk to him. Well, you know, we do like to respect certain protocols, but nevertheless, we've got those relationships with you guys so that we can bring something to our clients which they wouldn't really have for themselves because they don't know you as well as we do. So there's that partnership, as it were, backdoor that we can offer to those clients, as well as the resource extension, if you like, and the skill extension and the experience that we are building for ourselves as we deliver solutions for all these clients.
[00:13:44] Speaker A: So how do you introduce DataOps? I'm going to say DataOps from two perspectives: DataOps as a practice and DataOps.live as a product, to customers that might not even be aware of either.
[00:14:00] Speaker B: Well, on the first point, right, people at least have pretty much recognized DevOps as a thing. Okay. So if they have any experience of application software, application development, they recognize the success that Agile and DevOps have brought to the world of software development.
And then you say to them, well, yeah, you know, now you're going to be doing some of that, but in a data warehouse environment instead, where there's actually an added dimension that your test data is potentially terabytes in size.
And it's not an area, in traditional terms, that has had the same practices and development standards, right, applied to it as DevOps. So first of all, yeah, we talk about the need for that and how good that would be, because you can see the benefits brought to software application development. Then with Snowflake, well, wow. I mean, you know, Snowflake, fantastic as it is, right, isn't a totally complete platform.
You have a choice, an enormous choice indeed, of what tooling to bring on top of it to do pipeline development. It could be dbt, could be Talend or Informatica or Python. You know, there's such an amazing spectrum that you can bring to it and it all works fine, but they're all very raw tools and you need this wrapped in some sort of developer environment. You know, what people call an SDLC, a software development life cycle. That's a bit of a posh way of putting it, isn't it? But, you know, it's true. You need some environment for developers to live in. Right. And that's, I think, where DataOps for Snowflake is the preeminent product for Snowflake. Because you got in early. You have exploited some of the amazing features of Snowflake to help deliver that SDLC. The zero-copy clone thing, where when you set up a branch it creates a whole new environment just for one developer, with a warehouse and a zero-copy clone of the data, I mean, once people get that about DataOps, they can see value almost, you know... That's the starting point, but it really then brings home what's being added.
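[Editor's sketch] The branch-per-developer idea described above can be illustrated with a few lines of Python that generate the Snowflake statements for one branch. This is only a sketch under stated assumptions, not how DataOps.live implements it: the function name `branch_environment_sql`, the source database `PROD_DB`, and the `WH_` naming prefix are all hypothetical. The only real Snowflake features relied on are `CREATE DATABASE ... CLONE` and `CREATE WAREHOUSE`.

```python
import re


def branch_environment_sql(branch: str, source_db: str = "PROD_DB") -> list[str]:
    """Build Snowflake statements for an isolated, per-branch dev environment.

    Hypothetical helper: the naming scheme and defaults are illustrative only.
    """
    # Turn a Git branch name such as "feature/new-kpi" into a safe identifier suffix.
    suffix = re.sub(r"[^A-Za-z0-9]+", "_", branch).strip("_").upper()
    if not suffix:
        raise ValueError("branch name has no usable characters")

    dev_db = f"{source_db}_{suffix}"
    dev_wh = f"WH_{suffix}"

    return [
        # Zero-copy clone: the clone initially shares storage with the source,
        # so terabytes of test data are not physically duplicated up front.
        f"CREATE DATABASE IF NOT EXISTS {dev_db} CLONE {source_db};",
        # A small dedicated warehouse that suspends itself after 60 idle seconds.
        f"CREATE WAREHOUSE IF NOT EXISTS {dev_wh} WITH WAREHOUSE_SIZE = 'XSMALL' "
        f"AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;",
    ]


if __name__ == "__main__":
    for statement in branch_environment_sql("feature/new-kpi"):
        print(statement)
```

The sketch only builds the SQL text; running it against an account, and dropping the clone and warehouse when the branch is merged, would be separate steps.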
[00:16:05] Speaker A: Yeah, it's funny, you reference SDLC, and I used to say to my team, because I used to do a little pun off of DDL, I used to call it DDLC, right? Data development life cycle.
You know, you had said something to me, not today, but in preparing for today's show, one thing that really stood out: DataOps politely makes, you know, people adopt best practices.
What did you mean by that?
[00:16:32] Speaker B: Well, you know, I mean, people like to develop. Somebody told me actually once, when, as part of a very large group of people, we were being transferred from one company to another and there was a grieving process going on, and they actually brought some people in to help us through this grieving process.
And one of the things they said to us was, when you're in a job, you're in one of three states: you're either a learner, which is great because you feel you're developing yourself, you're a passenger, or you're a prisoner. Right? You don't want to be a prisoner. That's somebody who feels so trapped that they don't even think they could get another job because they don't think they're worthy enough. Right.
Learning and developing yourself, right, and considering yourself to be becoming a better practitioner, is a very motivating thing. And when it's put on a plate for you, like DataOps gives you on top of Snowflake, you quickly realize that actually, rather than just being able to sort of waffle to people or put something a bit summary into your CV, you're actually able to really talk with confidence and, you know, some real substance about how you've learned through just the way DataOps works. Right. It's just showing the best way to do things, or certainly a very good way to do things. I'm not saying it's the only way, but I think, broadly speaking, it is really good practice. And to be able to say you've been there and done it and actually you appreciate what it really is, I think is very, very powerful.
And you know, the opposite is that people fumble about for a long time and actually their pipelines break and they've got a tremendous mess on their hands. Well, the worst thing in data, right, is that you don't just have to fix the software, you have to fix the data. The data that got screwed up by the bad software, right.
So I've seen teams, and it's very sad to say, right, where they're burning out, they're working all the hours, they're up in the middle of the night, they're working all through the weekends, right? And this is not for one particular big milestone they're trying to achieve. This is the norm, right? Well, those teams, it's just sad, right, as well as dreadful. It's not good for the company, but it's particularly not good for the people in the team.
[00:18:43] Speaker A: Yeah.
[00:18:44] Speaker B: And ultimately they vote with their feet, right? They, they walk off, right?
[00:18:47] Speaker A: Yes.
[00:18:48] Speaker B: You can't live like that. So that's the, that's the fiery hell of the opposite end of the spectrum. Right.
I just think DataOps is so helpful at simply being there for you to be part of, to be in. Right. And it just sort of looks after you in a way, but it also develops you.
[00:19:07] Speaker A: I know when I was implementing Snowflake, so before I came to DataOps.live, I was actually in the field for many years implementing Snowflake before I found DataOps. And what is the norm, and maybe you're seeing it, is people tend to take the DevOps solutions, as you talked about, and morph them, you know, try to get them to work for DataOps. What does adopting DataOps.live bring to an organization versus, well, let me just use these other tools?
[00:19:44] Speaker B: Yeah, it's a build versus buy thing, isn't it? Yeah, well, you know, where people do this a lot and where it's a very unique situation, maybe fair enough. Right. But what Gartner, ISG, Forrester are now very clearly saying, right, is that the idea of using DataOps, so DevOps in the data environment, right, has resulted in people bringing to market some products which do that. And it's all there, integrated for you. Somebody's tested all of the integrations between the different component parts for you. They've done it for many, many customers. So it's been hardened, right, productionized, whatever you want to call it.
So the idea that you then go and build all of that yourself when it's actually already available, right, I think is a bit questionable, because, like, why are you in the job you're in? Is it really to build one of those, or is it to build some solutions, you know, your business actually needs?
If you can buy in something, and with the value that DataOps provides, right, it's hard to argue that you would actually spend more by buying the product. You know, for me it's a slam dunk. It's a no-brainer, because no longer are DataOps products a really rare idea. They are going mainstream. And if you look at what Gartner are saying about DataOps as a marketplace, I think there's a market analysis report, isn't there, or two, and DataOps gets a very good showing in those things, quite rightly.
It's demonstrating that, yeah, this is, this is a mature product area now.
[00:21:18] Speaker A: You know, you brought up a good point, and I experienced this, right, when I first got into Snowflake and going towards the cloud. We were almost forced to adopt, you know, the infrastructure-as-code mentality of things going to the cloud. Right. The way we deployed things to Oracle, Teradata years ago didn't hold true.
And our team, our developers, by nature, they love to develop, right? And they're like, oh, we can build this. But yeah, we spent three months on this building a good enough solution, and I say good enough because it did the bare minimum.
And those were data engineers, not DevOps people, right? Because we had a data team, and it took away from being able to, to your point, deliver business value. Right.
[00:22:11] Speaker B: You're actually slowing yourselves down ultimately, aren't you? So the consequences ultimately are slower time to value, because you can't get good products out as fast as you used to be able to. Because, you know, you're spending time worrying about the environment and making sure that the latest version of Terraform or Git or whatever it might be, right...
[00:22:32] Speaker A: You know, just when you get it working, right, Terraform changes something, or Airflow changes something, or Git goes and adds a new feature and you can't keep up, right? And when you're stitching all these tools together, you're in just this endless upgrade and modification.
[00:22:45] Speaker B: You've got two layers going on here, because you've got all the use cases that you've been developing anyway as data products for your business.
And you've got to have a way of keeping on top of those while you still want to move forward. Right. So, nevertheless, you're going to have an overhead looking after all that stuff. So you want a tool, you want that environment in which you can basically see how those things are performing. You can check their cost and all those sort of good productiony things that we talked about earlier. And then there's another layer below that which, if you were building it yourself, right, you've then got the maintenance and integration checks and all the rest of it for the actual software development environment.
Well, you can buy your way out of that very, very cheaply and allow someone else to take that heavy lifting for you and just deliver it as a product.
[00:23:31] Speaker A: It doesn't take long to find the ROI in it, that's for sure.
Before we got on the show, we were having another conversation. I'm going to bring it back. We're in the world of AI, right.
And everybody wants to go faster. They want to use AI. If you don't have these foundations, like, how does that impact your ability to go towards AI?
[00:23:55] Speaker B: Well, AI, it's so powerful, right, that, when we were talking earlier about correctness...
[00:24:01] Speaker A: Yeah.
[00:24:02] Speaker B: What was previously good enough, right.
[00:24:04] Speaker A: Yeah.
[00:24:05] Speaker B: If good enough, but not perfect, is given to AI,
you have that thing where, you know, if you're a tiny bit out when you're navigating a ship just a short distance, that's fine, right. You'll find the port on the other side of the strait, you know, like going from England to the Isle of Wight or something. But if you're out by half a degree and you're going across the Atlantic, you end up in the wrong country. Right.
You know, and that's what AI is like. The value and the insight from the data is so powerful, and it draws on that data to such a keen degree, right, that if the data's anything other than pretty damn good... Yeah, yeah.
You're potentially going to get insights from AI which, well, hopefully you would spot and think they were mad. But that's not necessarily the case, is it? Because it's much more nuanced than that. It's, you know, recommendations around how much stock to buy and when to buy it and that sort of thing, which is the sort of thing that we're dealing with for some of our customers.
These are lines they buy in, and then if they don't sell, they have to sell them on to the sort of outlets that take ends of lines and make no money from it at all. Or else the other extreme, right: not buying enough, and then everybody wants to buy something that they have no stock for.
[00:25:25] Speaker A: So, you know, I've had conversations with a few people who are very much involved in building models, training models, and you're hearing them talk about the models being 70% accurate. Oh, 80% accurate, that's amazing. Well, then you start asking, well, for which use case is 80% accurate good enough? Right. Maybe for a call center. But to your point, if you want to do something that's making a fiscal decision by it, you know, we've got to get that number...
[00:25:56] Speaker B: I'm one of those people, I have to say, I suppose maybe because I'm a bit older, right, that I think data should advise. You shouldn't let it be the be-all and end-all. Right. There are very good people out there who've got years of experience. Their intuition, their gut is telling them one thing. AI should be challenging. Okay. Yeah.
But you shouldn't necessarily hand over the keys to the palace to the...
[00:26:21] Speaker A: You know, it's funny, you said your gut. Right. So here in the States, like in Major League Baseball, you will hear certain managers say they're very analytics driven. And then you have some that are, I've been in baseball for years, I just know. Right, right. And then they'll make a certain decision, right, to your point, and they'll sit there and say, well, the analytics told me that I should bunt or something like that. Right. And they were way off. So it's funny that you said that, you know, my gut, and maybe it's that balance, right, between AI and gut and experience.
[00:26:57] Speaker B: I think so.
[00:26:58] Speaker A: So it's.
[00:26:58] Speaker B: There's so many different factors in modeling, and they don't stay the same. There are new factors sometimes as well. So, yeah, you have to be pretty creative actually to get some of this right.
Yeah. How the world's changing.
[00:27:14] Speaker A: So before we completely end, you know, I have one last question for you: if you could give one piece of advice to teams starting on their Snowflake journey, what would that be?
[00:27:26] Speaker B: Yeah, well, I thought about this and I think basically, at the outset, you need to really try to embrace a DataOps approach. Okay. What does that mean, though? That means at the outset, when people are champing at the bit and demanding to see some results, right, that you choose something pretty simple, but which will be impactful, as your immediate first delivery, and then demonstrate the DataOps approach of actually building that out from being an MVP into something grander over time. Because you'll keep dropping more value, right, and you'll be demonstrating that a small increment retains quality and delivers new value. There's always something new coming out. You know, when I was with the software company, the startup one that did project management software, and we did a two-weekly sprint cycle, it might not have seemed like very much at the time, but every two weeks we were dripping, dripping, dripping more value into the product and more things for people to look at. And it was never anything, you know, hugely amazing and cataclysmic, right. Occasionally there was a big drop or something, but when you look back over many weeks or months, right, you realize just what an amazing journey you've come on, and you've taken people with you. You've shown them the value, you keep delivering that value, but you haven't dropped the quality. The quality of the product is still as good as it was way back when you started at the first release.
So set yourself off on that path. I think people like ourselves are very happy to help with our experience of that sort of thing and help guide people and maybe be that sort of critical friend where you bounce some ideas around and stuff, and then we obviously can help technically to deliver it as well. But yeah, fundamentally, a DataOps approach.
[00:29:14] Speaker A: Well, Peter, it's been a pleasure having you on. We ran out of time, because I'm sure you and I could talk for another half hour or more. But it's a pleasure having you on.
[00:29:21] Speaker B: Thank you very much. It's been a pleasure talking to you, Keith.
[00:29:24] Speaker A: Great. And for everybody, thanks for joining us. Thanks everybody for watching. And today, remember, good enough data is not AI ready data. Thanks for joining us. Bye, everybody.