Episode Transcript
Welcome to this episode of our show, True DataOps. I'm your host, Kent Graziano, the Data Warrior. In each episode, we try to bring you a podcast discussing the world of DataOps with the people who are making DataOps what it is today. So be sure to look up and subscribe to the DataOps Live YouTube channel, because that's where you're going to find all the recordings from our past episodes. If you missed any of the prior episodes, you always have a chance to catch up. Better yet, if you go to truedataops.org you can subscribe to the podcast and get notifications when we have new shows coming. Now, my guest today is Matt Aslett, who's the director of research for data and analytics at ISG. He's actually a returning guest of the show. He was on here probably about this time last year, I think. So welcome back to the show, Matt.
[00:00:56] Speaker B: Great, great to be here and thanks for having me back on.
[00:01:00] Speaker A: Yeah. So for folks that missed you last time, just give us a quick rundown on your background in data, data management and everything that you do.
[00:01:12] Speaker B: Sure. No, absolutely. So I've been an industry analyst covering the data and analytics sector now for almost two decades. During that time I've covered a lot of the significant trends shaping the data management, database and analytics sector, including things like NoSQL and NewSQL databases, Hadoop data lakes, data lakehouses, and more recently, obviously, DataOps, which we're talking about today, and also the emergence of data intelligence.
And so I'm currently the director of research, as you say, for analytics and data at ISG, specifically ISG Software Research, which is, as the name suggests, the software-focused research business of ISG, which is a global AI-centered technology research and advisory firm.
[00:02:06] Speaker A: Okay. So in this season we've been taking a step back and kind of thinking about how the world of True DataOps has actually evolved and what we've learned in the last, you know, basically four or five years since this all started. Now, last time you were on the show, you'd done a study about the DataOps space and the vendors that were in that space, and that was, you know, very enlightening then. And now, recently, ISG has just come out with a brand new set of buyer's guides, this time on DataOps. To say the least, there was a lot of progress, a lot of changes, and I think, you know, some surprising rankings. Actually, when we looked at the rankings, surprising to some of us anyway, probably not to you because you did the research.
Well, some of them discussed. Yeah, yeah. So give me a little rundown on the study and what you've been seeing happening out in the world of DataOps in the last year.
[00:03:05] Speaker B: Yeah, absolutely. So, for those who don't know the ISG Research buyer's guides: ISG acquired Ventana Research, which had been doing these buyer's guides based on the value index methodology for about 20 years. So there's a lot of background research that goes in there. With the ISG Research buyer's guides, essentially we try to put ourselves in the shoes of somebody who is looking to assess software providers and products in a particular space, obviously in this case DataOps. What we do is create an RFI/RFP-style process where we reach out to the providers that we think satisfy our inclusion criteria and we ask them a series of questions around the product and customer experience. Then we basically rate the responses we get against kind of a notional ideal of, you know, what we think a DataOps buyer in this case ought to be looking for. And then obviously everybody gets a grade and we end up with a two-by-two matrix, with all the providers positioned according to their scores. So yeah, as you said, we talk about the buyer's guide for DataOps, but there were actually multiple reports here. There's overall DataOps. And then we also looked specifically at data pipelines, development and production; data orchestration, obviously of those pipelines; and data observability. And then specifically new this year was data products, which is obviously a hot topic and an emerging area. So it was good to add that to the mix. So yeah, there's a lot to unpack there. But yeah, obviously it's an interesting, emerging, evolving space.
And in particular I think we've seen, you know, say a few years ago, and even when we did this research last year, there was a clear difference between the sort of emerging providers who had picked up on DataOps and were very much positioned as, you know, new products that enabled collaboration and agility and automation. And then obviously you've got the established data management providers, a lot of them, you know, with some existing traditional manual products. But those lines are blurring. I think, you know, we've definitely seen that some of the established vendors have caught up and have added the kind of capabilities that we would associate with DataOps to their products. So it's not remotely that DataOps has gone away. In fact, you could sort of say that DataOps has kind of taken over, and that's why those lines are blurring. But, you know, there's still, therefore, and I'm sure we'll discuss this, a need for potential buyers to be cautious when they're evaluating, you know, looking not just at what products are called and what they're labeled as, but what they actually do. But, you know, it's definitely an evolving space and a very interesting one.
[00:06:17] Speaker A: Yeah. And so give me your perspective on what is the definition of DataOps. Like you said, it's just like we're seeing everywhere, obviously, with AI and gen AI: vendors are notorious for just relabeling functionality that they have and going, oh, well, this is DataOps functionality, this is AI functionality, and we've had all this all along. So, you know, what do you see DataOps as being today?
What is it in your view?
[00:06:48] Speaker B: Absolutely. So, at ISG Research, we define DataOps as the application of agile development, DevOps and lean manufacturing capabilities by data engineering professionals in support of data production. So as I said, it encompasses multiple things: development, testing, deployment and orchestration of data pipelines, along with observability to improve the quality and validity of those pipelines. And then obviously, as I mentioned, data products as potentially one of the ultimate outcomes of that process.
Yeah, as you say, I think we were very aware when we put together the capabilities against which we judge the products here that there are functional capabilities that DataOps products have that overlap with some of the traditional products that may be more manual in nature. So obviously we assess them according to those, but we were very careful to make sure we were not just assessing, but putting a significant weight on things like agile development, collaborative processes, you know, that DevOps approach, and more modern data engineering approaches to data pipeline development and orchestration and observability compared to, you know, the more traditional manual approaches. So, yeah, it was definitely a big part of the way we approached this from the start. As I said, putting ourselves in the shoes of a data practitioner: what would they be looking for that would identify a product as clearly addressing their DataOps requirements, in addition to some of the underlying core data integration and transformation requirements?
[00:08:44] Speaker A: Yeah, I think as soon as you said Agile and DevOps features here, then we've stepped beyond, we'll say, the traditional ETL tools that we used to use in data warehousing, where it was, okay, yeah, we had a tool and maybe we could generate some of the code. But the version control, the CI/CD capabilities, incremental development, all that really wasn't there in the traditional tools.
You still had to do something else, Right.
It wasn't part of it. So that's, and I guess that's where some people see certainly, I'll say the, the rebranding of existing functionality and where the old, the traditional tools, as you said several times, manual, a lot of manual processes, they may be part of a Data Ops process, but in and of themselves they don't constitute data ops. They're just one.
[00:09:39] Speaker B: Not necessarily. I'd say, if you look at the old traditional products, absolutely, I'd agree with you. But I'd say some of the vendors have introduced, particularly in the last 18 months or so, newer products that do have, you know, some of those features and capabilities that we do associate with True DataOps. So yeah, as I said, I think that's reflective of the fact that, you know, a few years ago DataOps was new and emerging. Even though, you know, you had these DataOps pillars that were clear about what it meant, it still was sort of almost defined as not being the old way of doing things. And I think, you know, now we see that a lot of even the established providers in this space do have products now that would fit DataOps criteria. But yeah, as I said, buyer beware. It's definitely something to be cautious of if you're approaching this, to make sure you're looking not just at the product description, but actually at the documentation itself and the true level of functionality, as you say, particularly when it comes to things like CI/CD integration, AI capabilities, automation, things like that.
[00:11:00] Speaker A: Right, yeah, yeah. So, yeah, it sounds like the seven pillars that we've been talking about for however many years it's been now, four or five years.
That's really kind of what, I think, came into play here. That's one of the ways we certainly tend to evaluate these things: do we have all the things you just mentioned, the CI/CD, collaboration, automation, automated regression testing? That's the big one that I still see kind of hanging out there: how well are the products supporting that today? Because, yeah, five years ago there were very few. There were some testing vendors, there were a couple of vendors that had good testing suites. But getting that integrated into the whole of a DataOps process was not necessarily trivial.
[00:11:52] Speaker B: Yeah, no, and that's a fair point. So, yeah, as I said, when putting this together, we were definitely very mindful of the seven DataOps pillars and also the DataOps Manifesto, and really taking into account those documents and the community requirements as they had evolved over the years. There are definitely still some gaps when you look across.
Obviously some products are better than others. As you suggest, regression testing is one of those. The use of AI is another. Another I've seen as well, where I anticipate we'll see a lot of innovation in the next couple of years, is actually, you know, the ability to measure the impact of these projects, data about your data projects, so that the DataOps teams can go to their, you know, the business and say, look, we've delivered this level of improvement. That's always been part of the DataOps approach. A lot of products are still missing those capabilities. A lot of that's still being left to the user. You know, the metrics are there, some of the numbers will be there. But actual reports and dashboards that they can send around and collaborate on with, you know, people more on the business side of the enterprise are still missing.
[00:13:14] Speaker A: So, yes, that's the FinOps. Yeah, yeah, yeah. Because we, you know, a couple of years ago started talking about observability, and there were, you know, things like Monte Carlo and companies that were really focusing on observability. And that was really kind of observability around the data pipeline more than anything else. Right. Because that was always a question back in the day when I started with data warehousing: did the ETL process run right? Did it run last night? Did we get the data we were expecting? You know, if the numbers look off today, can we figure out why the numbers are off? And I think that observability piece being integrated into a DataOps process was very helpful. But yeah, what you're talking about now, I think, is what we call FinOps now.
[00:14:06] Speaker B: Right.
[00:14:07] Speaker A: It's like, what's the roi? You know, are we getting the value and how much is it costing us to do these things?
[00:14:14] Speaker B: Right, exactly, yeah. And that's a big part of, as I said, the value index methodology that underpins the buyer's guides, which goes back about 20 years. One of the things we assess there is the ability of the provider to help the customer measure ROI. So we're not talking here about what the ROI of the product is, but how the provider helps the customer understand and measure the value of, in this case, obviously the DataOps or data observability product they are using, and report on that back to the business. Because, you know, you as the user may know you're benefiting, and in multiple ways, but you have to be able to prove it. Obviously there's a lot of money being spent on data and AI-related projects, and a lot more focus from the CEO and board level now on what are we delivering, how are we delivering value, how is this delivering improvement? So yeah, a lot more focus on that side.
[00:15:23] Speaker A: Yeah. And I think you mentioned data products.
As more organizations are focusing on delivering the data via a data product concept, if you will, being able to have that feedback is probably even more important today than it was a couple of years ago.
[00:15:45] Speaker B: Right? Yeah, and that's definitely, I think it's interesting. Obviously the data products area is still evolving.
It was an area that was interesting to me, looking at the providers in this space. A lot of providers are offering functionality that enables organizations to create and package data products and, you know, deliver domain-based ownership and share those data products across the organization.
Less so, and a lot of it is work in progress, let's put it that way, around things like data contracts, the agreement between the data owner and the consumer of the data. And also, as you said, what we see as absolutely critical there is the ability for the data consumers to, you know, rate the data product, comment on the data product. Does it fulfill what it says it does? You know, is it a good data product? Can you trust it, you know, the trust score in relation to the data? Things like that. So that is still a work in progress.
You know, it's still relatively early stages and some providers are further ahead than others. But yeah, that was interesting. These buyer's guides give us the ability to look across. In that case, I think it was about 20 providers in the data products buyer's guide. When you're looking across and you're assessing multiple providers, you can start to see where, yes, there are some that are ahead of the others, but also where there are gaps that are commonplace across the providers because, you know, it's early-stage development.
[00:17:25] Speaker A: Yeah. So with, you know, you've been doing this research obviously for a couple of years, you know, with the, the one we talked about last year and now this year. What's the most surprising evolution that you've seen?
[00:17:41] Speaker B: In some ways, but again, maybe this was because I was expecting too much, because maybe I bought into the hype a bit, the level of delivery, I should say, of AI-based capabilities in these products is still early stages. You know, if you think, we're going back about two years now of the use of large language models. There are obviously a lot of vendors talking about that. There are a lot of providers, you know, that have capabilities in development or in preview. So one thing is important when we do these assessments: we are assessing what is generally available. We have an eye on what's...
[00:18:28] Speaker A: What can I buy today and use.
[00:18:30] Speaker B: Yeah, what can I actually use today? Not what's coming up, not what's on the slides, but what's actually in the documentation, as I said. And some of that is still relatively rudimentary. There's obviously a lot of use of AI to, you know, ask questions of the data, natural language query of the product itself. A lot of assistants, you know, and I'm not dismissing those as not being useful at all. But I think there's a lot of scope in this area for documentation of data pipelines, summarization of data pipelines and the orchestration of those. And there are some providers that are doing that, but others are still early stages. But maybe that was me, as I said, just getting caught up in it. When you've seen so many providers make announcements around AI as they have, it's easy to think, oh, everybody's doing it. And as I said, when you do an assessment like this where you look across the board, it's interesting to note that, you know, it is still relatively early stages.
[00:19:36] Speaker A: Well, and I guess that's one of the benefits of what you guys do in putting these buyer's guides together, because as an organization, you know, how many vendors did you evaluate?
[00:19:50] Speaker B: So across all the buyer's guides, it was, it was, I think it was 49.
Obviously not every, every provider was in every buyer's guide, it was, it was roughly between 20 and 30 in each one.
[00:20:03] Speaker A: So yeah, and even that is a pretty substantial number for any organization, even a large global organization, to try to sift through that themselves and have to, you know, basically host presentations from, you know, 30 or 40 vendors and do their own sort of RFP process and then try to, you know, do. There's always a question in my mind, you know, where I end up getting called in for things like this is, you know, do they even have the staff that could actually evaluate, you know, what you, what we just talked about is, is this just hype in the slides? Does the product actually do it? But then even more so, it's like, do we need the product to do that? You know, are there, you know, these features may be fantastic, but are they things we actually need as an organization? And to try to go through all of those vendors yourself with, you know, to do that would. You'd have to have pretty much a dedicated team of fairly knowledgeable people to do, which obviously, you know, you do that yourself. That's what you do in your role at isg. And you guys go do that and basically have done that for the organizations so they can read some conclusions and see what you've done.
[00:21:16] Speaker B: Exactly. And, you know, obviously each enterprise is going to be different in terms of their needs and requirements, the products they've already got and the gaps they're looking to fill. But yeah, the buyer's guide is definitely designed to at least give them a head start in doing that. You know, we see a lot of interest across the ISG client base in terms of at least getting from a long list to a short list and identifying which providers are worth spending more time with. Because one of the things that absolutely continues to be true is that there's a huge number of providers in the space. I mean, in data observability in particular, we only assessed 17 providers, because we have inclusion criteria which are based obviously on the capabilities but also on the size of the company. But there were actually another 25, what we call providers of promise, that we could have evaluated in that area. So it's just a huge, huge number.
[00:22:22] Speaker A: It's a lot.
[00:22:24] Speaker B: It is particularly in that data observability space. And of course, not all of them are going to survive. Some of them will be acquired, some of them, some of them will become bigger, some of them will evolve into more general data ops providers, and some of them will disappear. And that's the challenge, obviously, as you said, from an enterprise looking at that number of providers, how do you begin to weed those out?
[00:22:48] Speaker A: Yeah, I had no idea there were that many out there in that category in particular. Again, that's where your buyer's guides are definitely a benefit to organizations, because there are companies out there that you might not have heard of that might actually be at the top of your list. You do the evaluation and say, hey, if you want to look at a couple of companies, here are the ones that you should look at. And they may be companies that these organizations have never heard of. So, like, yeah, I think last time DataOps Live wasn't even in the mix, if I remember correctly from the last...
[00:23:23] Speaker B: Yeah, last time, the company was just below the level of the inclusion criteria. But obviously, yeah, made the cut this time.
[00:23:33] Speaker A: Yeah, yeah. So that's going to be, you know, for people that are looking at the guide this time versus the one that they looked at last time. You know, we're in there now. We weren't there before.
And I'm sure there are. There are other companies obviously that you've uncovered in the research that. Yeah, a year ago they weren't in the report. This year they're in the report.
And like you said, who knows what's going to happen with them? I guess that's part of the due diligence that a company has to do when they start looking at, you know, are we going to pick up this kind of software? You know, how big is the company? Is the company growing? Is it likely to be acquired? I mean, when I first started with Snowflake back in 2015, that was like the biggest objection anybody had to Snowflake. It's like, wow, the architecture looks great. This sounds wonderful. But you're not AWS, you're not Azure, you're not Google. How do we know? And you're not Oracle. That's what I got most of: you're not Oracle, you're not Teradata. How do we know you're going to be here in three years? Right? Is this cloud thing really going to take off? Is this little company that, you know, came out of nowhere, is it going to survive? And that was the biggest question. And obviously for the DataOps space, observability, data pipelines, that's the same question, I guess, when you're looking at some of the newer ones. Right? You know, like you said, some of the companies that have been around for a while have now added True DataOps functionality. Those companies, okay, they've been around for decades potentially in the data space. You can be fairly confident they're going to be around. Then it's a question of, do they have the features and functionality that the newer guys have that really meet your needs? Right. And that's where a lot of the comparisons in the matrices that you've done are going to help people, you know, ferret that out.
[00:25:31] Speaker B: Exactly. Yeah. And that, that's part of, you know, at the heart of the, the Buyer's Guide, as I said, the methodology behind it is, is obviously we look at the core capabilities of the product itself for Data Ops in this case, but also things like manageability, availability, reliability, some of just the core tick box items from an enterprise perspective, security obviously, but also that customer experience. So yeah, as you said, yes, this vendor is going to be around, it's established that's not a problem. But does it actually fulfill the functionality we need? And the other side, yeah, you know, here's a new vendor, great emerging innovative functionality, but are they going to be there? And so yeah, if you, if you tick both those boxes, that's how you get up in the, you know, the top right of the, the Buyer's guide matrix.
[00:26:24] Speaker A: Yeah. And I guess this year, in one of the guides at least, yeah, DataOps Live ended up in the top three, if I remember...
[00:26:33] Speaker B: Yes, believe so.
[00:26:36] Speaker A: Yeah, yeah, yeah, well that was, Yeah, I was happy to see that. Obviously, I guess we've done something right. We thought we were doing something right. So that's good. So do you ever get into the buy versus build conversation with customers?
[00:26:54] Speaker B: We do, and I'd say certainly less so in this area than, say, a few years ago. I think, you know, a couple of years ago, when a lot of the functionality and the products that were evolving in the space were less mature, you could make an argument for potentially, you know, building some of these capabilities yourself, particularly if you were complementing an existing investment.
I'd say these days, you know, given the number of providers, as we've already discussed, that are out there, the breadth and the depth of functionality, and the level of investment required to not only build it yourself but also maintain it yourself, I think you'd be hard pressed to make that case. Especially, and DataOps Live is a good example of this, but there are others obviously as well, where you've got an increasing number of providers now that can span all the capabilities that we addressed across these multiple reports. Obviously, as I said, not every provider was in every report, but there is a core set of providers now that have that breadth and depth of functionality.
[00:28:15] Speaker A: And I think they're really hitting all seven pillars.
[00:28:19] Speaker B: Yeah. And to try and replicate that, you know, it's a lot of work. It's a lot of work. Yeah. And as I say, not just upfront development, but maintaining that going forward as well.
[00:28:28] Speaker A: Yeah, yeah. Do you think if an organization isn't considering looking at some of these data ops products, are they out of their minds to try to scale these days?
[00:28:46] Speaker B: I mean, yeah, yes, I'd say, you know, unless you're a very established organization with a lot of products and, you know, investment that's been made, and you're not looking to spend any money on data and analytics.
But who? I can't think of any.
Yeah, exactly. So yeah, in short, all enterprises should be considering these products. I mean, you know, there then is a question of when do you begin to invest in addition to what you have, or replace what you've got? Yeah, there are some complex investment decisions to be made, obviously. But yeah, given the focus on data-driven approaches and particularly AI, and I haven't even talked here about the importance of data pipeline development and orchestration and observability to AI, you know, beyond just more traditional analytics. So yeah, they definitely should be part of every organization's strategic thinking, even if, you know, the actual purchasing decisions may be further out.
[00:30:03] Speaker A: Yeah, I remember early on when I was working for Snowflake, they came up with a statistic, I can't remember who it came from. They had done a survey looking at, you know, companies that were intending to move towards data-driven decision making, and there was only like 76% who said, yes, we want to do data-driven decision making. And so I kept asking the question, well, what about the other 24%? They don't want to do data-based decision making? And I think those are companies you definitely want to stay away from, because they're going to disappear in the long term. And I think that's, you know, what we're seeing today, and even more now with, like you said, the people looking towards AI, right? The data feeding your AI and your LLMs has to come from somewhere. So we are talking about data pipelines and data quality and orchestration and observability. All of that comes into play if you're going to be trying to run any part of your business off of AI, right, and have any level of confidence.
[00:31:15] Speaker B: Right, absolutely.
Those good data management practices have been there forever. But what we do see is AI is a real forcing function. You know, particularly as a lot of these decisions are coming from executive and board level, and they want to see results because they're making big investments. And so, yeah, absolutely, it's forcing people to, not rethink their data management, but absolutely...
[00:31:40] Speaker A: Take it more seriously.
Okay, so where can folks get this Buyer's guide that we've been talking about? The guides.
[00:31:50] Speaker B: So they're available from ISG Software Research. I think we've got a, I'm missing the word link.
[00:31:59] Speaker A: QR code.
[00:32:00] Speaker B: QR code, that's the one. Yeah. So you can scan that and find them there. Obviously, if you go to the ISG Software Research website as well, everything is under Buyer's Guides. You can search for a particular topic, you can search by vendor. And so yeah, they're all available to register and download.
[00:32:20] Speaker A: So anything coming up for you, you're going to be hosting any events or webinars or anything else in the near future where you're going to be.
[00:32:29] Speaker B: Yeah, so I mean, there's obviously a lot of interest in this area, and obviously conference season is beginning to kick off. So we've got some webinars in the works, some presentations around both DataOps and, you know, data integration and data intelligence, which I mentioned earlier. And we're actually just kicking off the next buyer's guide, which is kind of related, which is specifically on real-time data as well, streaming data and events. So a lot of focus on that as well. It aligns nicely with DataOps and data intelligence and data platforms. So you've got the QR code for the LinkedIn there. Obviously we'll be posting all the details up there as and when they're available.
[00:33:15] Speaker A: All right. Yeah, so everybody can connect with you on LinkedIn. And folks, if you do decide to connect with Matt on LinkedIn, do me a favor and put a little note in there that you saw him on the True DataOps podcast, so he knows why you're reaching out and what bits you actually know already.
That's always helpful. Yeah, it's better than getting just a blind connection request.
[00:33:41] Speaker B: So.
[00:33:42] Speaker A: All right, well, we're, we're at time here. Matt, thank you so much for your insights and and being on the show today.
Thanks everyone else for joining in. And those of you who weren't on live with us and are going to watch this on a replay, again, thanks, and glad you were able to join. Hope you got some value out of this. You can join me again in two weeks when my guest is going to be one of my buddies in the agile world. Author, thought leader and agile data expert Scott Ambler is going to be on, and we're going to talk about DataOps in the agile data world.
Scott's written quite a bit on this, so I think it's going to be a really exciting session. I just recently got to take a workshop with him on continuous data warehousing, so we're going to talk about that. So that'll be in two weeks.
So as always, be sure to like the replays from today's show. Tell your friends about the TrueDataOps podcast and don't forget to go to TrueDataOps.org and subscribe so you don't miss those future episodes. Until next time, this is Kent Graziano, the Data Warrior, signing off for now. Bye.