Episode 15

April 23, 2024

00:34:17

Carsten Bange - #TrueDataOps Podcast Ep.33

Hosted by

Kent Graziano

Show Notes

In this episode of the #TrueDataOps podcast, host Kent Graziano speaks with Carsten Bange, founder and CEO of BARC (Business Application Research Center), who shares insights into BARC's role in technology selection and data strategy, emphasizing its specialization in data and analytics. Carsten discusses the evolution and importance of data products, boosted by the data mesh concept, highlighting the necessity of continuous processes over traditional project-oriented approaches in data operations.

The episode also explores DataOps, focusing on the challenges of operationalizing data projects and the crucial role of automation in managing data complexities and ensuring quality. Carsten also touches on the growing importance of ethical considerations in data management and the need for a robust data culture to support effective and responsible data usage. The episode ends with a look forward to the impacts of generative AI on the industry and the ongoing need for innovation in data practices.


Episode Transcript

[00:00:05] Speaker A: Welcome to this episode of our show, #TrueDataOps. I'm your host, Kent Graziano, the Data Warrior. In each episode, we're going to bring you a podcast covering all things related to DataOps and the people that are making DataOps what it is today. If you've not yet done so, please be sure to look up and subscribe to the DataOps.live YouTube channel. That's where you're going to find all the recordings from our past episodes. So if you missed any of our prior episodes, you can catch up there. Better yet, go to truedataops.org and subscribe to this podcast. Then you'll be sure not to miss any of the future episodes. Now, my guest today is industry analyst, advisor, consultant, and the founder and CEO of BARC, Dr. Carsten Bange. I almost said it right. I got close. Welcome, Carsten.

[00:00:58] Speaker B: Kent, it's a pleasure to be here. Thanks for the invitation.

[00:01:02] Speaker A: So, for the folks who don't know you, can you give us a little bit about your background in data management and what you all do over there at BARC?

[00:01:10] Speaker B: Absolutely. Yeah. I founded BARC 25 years ago, so quite a while in the industry. I founded it as an industry analyst, meaning that we look at all the vendors in our space, but we are specialized in the data and analytics space. We are not an analyst trying to cover all software segments or even more; we only do data and analytics. And being an advisor to clients or users, we help them to make the right decisions, meaning on data strategy, on architectural choices, but also especially on technology selection, since that's the core area we look at.

[00:01:51] Speaker A: So what does BARC stand for? I assume it stands for something. And maybe in German.

[00:01:58] Speaker B: No, it's in English. It's Business Application Research Center. And it alludes to our roots. Basically, we started this as a project at a university, at a chair for information science.
We created a test lab for business intelligence software. This was back in 1998.

[00:02:22] Speaker A: Wow.

[00:02:23] Speaker B: And so we were comparing all the great solutions that were out there at that time. So I'm talking about Arbor Essbase, I'm talking about Cognos PowerPlay and Holos and all the good stuff. And so we created this test lab. And basically the idea was to try to help enterprises make an informed decision on software. And this went very well. We published this first in books and then later as studies, and then companies came and said, well, this is great. This really helps us in this non-transparent marketplace to get an idea what the software is really useful for, where the strengths and weaknesses are. And you being at a university, you are truly independent and neutral and can really help us. And you have no interest in selling us software, as most other people that tell us about software do. And this is what we definitely maintained. So we maintained this independence from the software vendors and also from the interest of selling a big consulting project to implement it. So that's what we are not doing, but we focus on this evaluation of technology and helping our clients to make the right decisions here.

[00:03:38] Speaker A: And you have a very interesting event that you do every year, right?

[00:03:44] Speaker B: Yeah. You are talking about the retreat, where you also participated once. Yeah, that's an industry event, actually. We run the largest events on data and analytics in the German-speaking countries. That's our core market for events. And the one you are talking about is a very specific one. That's like a vendor workshop. So we are bringing together analysts from all over the world and, on the other side, vendors that we brief on trends we are seeing and changes in buying behavior on the user side. We bring in investors that talk about their view on the market, and we have a big discussion, or great discussion, going: what trends do we observe, and how does the industry react, in terms of how do users and buyers of software react to changes, and how do the vendors react, or how should they react, to that? That makes it very unique, I think, as a gathering of vendor executives. And with that, it's always astonishing what a brain pool, in a way, we have in the room, with many of the attendees having 20, 30, maybe even more years of experience in our industry. So that always makes a very good discussion and a great strategy workshop, if you want to call it that.

[00:05:08] Speaker A: Yeah, and that's great. That's really kind of how BARC stays up with the trends and what's happening, knowing what vendors are thinking about and where they're going to go in the space, so that you can advise basically everybody, the rest of us, as to what's going on out there. You probably have a better insight than a lot of people do because you're getting that kind of cross section of the industry by having vendors and analysts and investors all together.

[00:05:35] Speaker B: Yeah, exactly. So we are following around 400 vendors in this space, and around, let's say, 100 to 150 more closely, in a way that we get briefed by them or we have a frequent interaction with them. So that's like the larger vendors in our space, and then we have another 250 or 300 where we at least know what they are doing and we have an idea where they're positioned in the market. So that's quite a lot to cover. But I think what's really good for us is to also have this end user exposure. So really working with them on their requirements, understanding what use cases they actually try to solve, and also getting their feedback from proofs of concept, from the actual implementation.
So what works with the software, what does not, where were they disappointed, what looked good on PowerPoint, what didn't work in practice, and so on.

[00:06:31] Speaker A: Yeah, well, along those lines, data products is one of those hot industry buzzwords that's been going around. Probably every conference, every other blog post that's out there, every other podcast is talking about it, including ours. So what's your take on that concept, what it means, and how does it really fit in our, I'll say, modern data landscape that we're dealing with for analytics?

[00:06:59] Speaker B: Yeah, I think you can take different views on that. First of all, I think we see that it became quite popular with the data mesh concept. We define data mesh as an organizational concept, and data fabric would be a term that's often used in relation to that, but we would say this is more the technical implementation. So data mesh as an organizational concept also calls for data products that are developed in a decentralized fashion in the domains. And we see that this made this whole idea of data products quite popular, but it has been around before. So we were talking about data products and the good ideas behind this concept already before the data mesh concept was out. So I think it's basically the question and the idea: how can we make data more accessible across the organization? That's one thing. The second thing is that we see many problems with running data and analytics projects, or trying to get something done, let's put it that way, in data and analytics if you have a project-oriented approach. We all know about the deficiencies here, the problems, and one, for example, is that a project has an end, which basically is not right. I mean, we are on the DataOps podcast.
So if you want to operationalize something, obviously you cannot have an end, but you have to have an ongoing process. And that's not only true for data pipelines, but that's basically true for everything.

[00:08:37] Speaker A: Yeah.

[00:08:38] Speaker B: Even if you build a dashboard, from day one you will have changes, you have new requirements, you have people asking for different things. And so that's one of the biggest problems. The second idea in data products that we see resonates very well is this idea of end-to-end responsibility. So if we do it right, we end this endless discussion on whose fault it was that something's going wrong. Is it the requirements? Is it the IT implementation? So people are arguing about who made the mistake and are ignoring that, basically, and this is my belief, in data and analytics we will never find a user that can actually express what they really want. And plus, any requirement will change extremely quickly. So these two things together mean we have to have an approach that is always incorporating iteration and changes, and that quickly. So it needs to be right.

[00:09:51] Speaker A: I was just going to say, you're talking my language here on agile, because I got into doing agile data warehousing back in the early 2000s, when people weren't even talking about that. And the concept of having a product owner and a prioritized backlog and all of those sorts of things really allowed my teams to be more successful. Right. As we thought about it that way, like you said, it was end to end. That product owner was from the business, and they bore as much responsibility for the outcome and the quality of the outcome as we did as the IT team that was building the ETL and the data warehouses and all of that sort of thing. It really became that team effort. It was much more of a team effort. And I think that with the data products concepts as they're being discussed today, more so thanks, again, like you said, to data mesh, having that product mindset is very different. You think about Ford, right? They don't just build one car and that's it, and do the same thing forever. It's obviously evolved and changed over time, and cars break down and have to be repaired, and they have to say, is there a better way we could engineer that so it doesn't have to be repaired all the time? And in the agile world we call that technical debt and refactoring. So, yeah, I agree with you. I think it's not completely new, but it's gotten, I'll say, new life and maybe some new focus as a result of all these recent discussions.

[00:11:31] Speaker B: Yeah, absolutely. I mean, there's some discussion around what a data product actually is, and there are some definitions. We prefer a broader one. So we think also, for example, data or dashboards or applications that are data- or analytics-rich, we would also say that's a data product. But I mean, in the end, it doesn't matter. It has to help us to bring data to life, to make it easier to share it across an organization. And all the good ideas in data products, like ownership, like also model, I think, is super important. And all these things really bring us forward and help us overcome problems that we have with a more traditional project-oriented approach.

[00:12:19] Speaker A: So yeah, you mentioned DataOps. Like you said, obviously this is a DataOps podcast. So what's your perspective on DataOps processes and how this really fits into all of this, data products or not?

[00:12:33] Speaker B: Exactly, no matter how you call it. But what we see also in our research... we talked about our approach to help clients directly in a more advisory, consulting fashion. But the biggest area of our business is actually the research side. So we do a lot of empirical research.
So primary research in our market, to understand better what's going on. And what we see consistently over years now is that one of the biggest problems that the professionals, or the practitioners, have is the actual operationalization, a very difficult word, of anything they do. And so getting to a pilot, or before that, getting to a proof of concept, is typically not so hard. But taking the last mile, the last step of bringing it really to life in an operational context, into the processes, that's the really hard thing. And this is consistent over the years. And we see that DataOps, next to any other ops, especially MLOps, that has been top of mind for some years now with the rise of more AI- or data-science-related projects or products, has really helped to focus on the processes and tasks at hand that you need to do to operationalize, to make that successful. And it starts, I think, with a mindset: that you understand, okay, there is something to be done, there is a problem that we need to solve. And secondly, obviously, I mean, you have your approach there, how to do it, what steps to take, what to take care of. But I think the key thing here is that operationalization is a big challenge, and we need to really have a very close look at it, how we can solve it, how we can overcome it in projects.

[00:14:39] Speaker A: Yeah. So how do you see the role of automation, maybe even AI-driven automation, for DataOps, and the importance of it?

[00:14:49] Speaker B: Well, I mean, automation is key. We have to automate, because on a general level we see data volumes rising. We see the interval of data integration, let's say, decreasing; it's going down to real time instead of having seconds, minutes, or days to integrate data. We often see complexity rising in terms of what, for example, needs to be calculated along the way, what needs to be done.
We see an increasing amount of external data to be integrated into the data landscape, which typically also increases things like, you have to have more quality checks, for example. You don't have control over the schema anymore. So these are things that are adding to complexity. So all this leads to a higher challenge, in a way, and this is where automation comes into play, because trying to do it manually will simply not work.

[00:15:49] Speaker A: Yeah. One of our key pillars, in the seven pillars of #TrueDataOps, is automated testing and monitoring. And I think that's what you were just alluding to there, especially if you don't have control of the data, if you're getting third-party data. How do you make sure that... again, we're still dealing with, and forever will deal with, garbage in, garbage out. So making sure that we're getting the right kind of data, the right quality of data, and not having things break in our operational pipelines.

[00:16:22] Speaker B: Yeah, exactly. So quality is a constant concern, and basically it always will be. I completely agree. And, I mean, I said we do a lot of this empirical research, and data quality always comes top if you ask about challenges, wherever: in machine learning, in BI, in data management, it doesn't matter. So we always ask the question, when do we actually solve this challenge? But the good news is I do see more initiatives in data governance. I do see companies taking a bit more care about it and thinking about it and also taking action, meaning actually allocating resources, actually being more serious about tackling this problem. So I have a little bit of hope that this will get better and that more and more companies understand that if they really think that data is an asset and that it really contributes to their business strategy, or positively influences their business strategy, then they also have to take care of this asset.
So if we see more and more companies being serious about that, then I have a bit of hope that we also finally tackle this data quality topic.

[00:17:45] Speaker A: Yeah. So what do you think is driving companies to actually be thinking about this more, the governance and quality aspects of data management?

[00:17:57] Speaker B: I think the strategic importance of data and analytics has increased, I think significantly. It used to be more of an afterthought: some data that comes out of our main systems, the ERP systems or others, and that needs to be reported. But if it's not working properly, it's not a big problem. Our core processes are still running. That used to be the case. Now it's different. We see more and more processes that rely on data, that cannot work in any way without data. So I think the strategic importance is higher. I think it's also a maturity topic, that we have more and more companies being more mature about how to handle data, how to think about data, and that they now really see it. As a current example, GenAI got all this hype, and what happened? No wonder, after like six or twelve months, people or companies started to realize, oh, it's again the same topic: if we feed these models with garbage, then we will have no good results. So I think it's, again, adding a little bit to the maturity here, that we need to take care of the data first.

[00:19:13] Speaker A: Yeah, I think there's a little bit of urgency too, because people are wanting to jump on the GenAI bandwagon. And like you said, they realize, well, you have to feed that. We've been talking about machine learning and AI for a number of years now, and that's still been the same thing. You have to train the model. And what are you training the model with?
You know, where is that data coming from that you're training the model with, and what's the quality of that data? That is very important if you're trying to develop one of these things and you want the results to be useful and accurate.

[00:19:47] Speaker B: Yeah, absolutely.

[00:19:50] Speaker A: So you have a podcast called the Data Culture Podcast. And in that podcast you say, data culture eats data strategy for breakfast. So could you explain a little bit of what that means and the importance of having a strong data culture for success? And what is a strong data culture from your perspective? What does that look like?

[00:20:15] Speaker B: Yeah, absolutely. So the quote you just used is obviously taken from a pretty famous quote in management literature that says culture eats strategy for breakfast, meaning any type of business strategy you want to implement in your organization, if the culture of the organization is not supporting it, then it will fail. And I think the same is true for data. So if you have a data strategy, but your data culture in the organization is not supporting it, then again, it will be to no avail. You have great thoughts about what you could achieve, but in the end, you will not be able to achieve it. So what is data culture? It's the things that you cannot really see and define clearly. So typically we say it's the values, the beliefs, the behaviors within an organization that promote the effective and ethical use of data. So that's data culture, and it's part of the company culture. So whatever you know and see and feel about a company culture is typically also reflected in its data culture. So I always compare it in a way. If you walk in, or if you're a new employee in an organization, you get a lot of things in a structured and formal way, like an employee handbook and a process description and so on. But then there's also this informal thing that's the culture of the organization. Yeah.
And you feel that very quickly. It takes maybe a few days or weeks, and then you understand what type of organization it is. Is this very hierarchical or not? How are decisions being made? How are people treating each other? How are people treating their customers? And often that's not really written down, but it's really more a set of values, of beliefs, of how things are being done. And why do I think it's important? Because we see that these more people-related aspects of how data is being used are equally important to any strategy you have, where you think about use cases and how you support the business strategy, you think about technology, what you want to use, and so on and so on. But if you forget about the people that in the end actually would have to use the technology, make it work, and use the results of any type of data processing that we do, then you will fail. That's the hard statement here. That's why data culture is so important, and why you should really focus on it as a major success factor next to your business-related data strategy and your technology-related data strategy. That's the statement here. And that's why I called my podcast the Data Culture Podcast, because we, including now close to 100 guests, think that it's super important, and we want to help people that want to address the topic see how they can actually do that.

[00:23:21] Speaker A: Yeah. I like the fact that part of your definition of data culture is effective and ethical use of data. The ethics of data usage has certainly become a much hotter topic with the advent of AI, and GenAI in particular. I was at the Data Universe conference in New York last week, and they actually had one stage that was kind of dedicated to talks about ethical use of data and putting ethics policies together and things like that.
And it's a very important conversation that in many ways has been, I think, inferred in organizations in the past, but not necessarily directly discussed other than in terms of compliance with things like GDPR and CCPA. But that's a regulatory compliance discussion. It's not really an ethics discussion. And I think the conversation is changing a little bit now, thankfully.

[00:24:24] Speaker B: Yeah, absolutely. I see the same. It started maybe five years ago with the advent of AI, which came top of mind with all the data scientist discussions and so on. So more companies were investing in more AI. And here, very quickly, we had these discussions and also these prominent examples of bias in data, and what bad results basically could come out of it. We run an event called the Data Festival in Munich each year, and we had this, like five years ago already: the first companies going on stage and saying, well, okay, we need to have a guideline for our data scientists on how to treat data and how to make sure that the outcome, for example the models they are building, is actually ethically correct. So I think that was the first time this was really starting. And now it's definitely accelerating, especially with the European Union AI Act that is now becoming real. And now we see that many organizations start to really look at it because they understand something's coming. There's regulatory pressure coming, and so they need to take care of it. So I think that even accelerated this discussion and the process here.

[00:25:50] Speaker A: Yeah. So, I'll say, our traditional data management practices of governance and privacy are kind of expanding. Right. So now we're having to include things like ethical use of the data. I mean, to me, that's a governance question. Right. It's, how are we using the data?
What kind of data are we producing? And is that in line with what, I guess, as an organization, we believe is ethical according to our organizational values?

[00:26:21] Speaker B: Exactly.

[00:26:23] Speaker A: Which means you've got to have a data culture defined. Right. Otherwise you don't know what those are. You can't answer those questions.

[00:26:28] Speaker B: Exactly. So that's where things come together.

[00:26:31] Speaker A: Ah, yeah. Oh, good. Very good. Well, let's see. We're kind of coming up on our time now. What are you looking forward to for the rest of 2024?

[00:26:44] Speaker B: Well, in the data world, I mean, there are many things happening that will have an effect. Right now, the discussion is still a bit overlaid by GenAI. I think that dominated a lot of discussions. But what I think is positive is that it now gets more down to earth in terms of the question, what can we actually do? And we do see a big gap between what is possible and what potentially could be done, and what is actually being done in organizations, in companies. So there's a very big adoption gap, and I think this year will be the year where things have to happen. So anyone involved in GenAI, in whatever fashion, will have to deliver. And that, I think, will be super interesting because, as usual, there are a lot of promises, but there will also be a lot of tears over things that will not deliver the results that people thought they would. And so I think this will be super interesting to observe this year. The second thing is, as we all know, in data and analytics there are many things that GenAI will not solve, where it will not actually advance things, if you think about calculations and, let's say, the core analytics. But I'm super interested to see what happens on the innovation side.
So, I mean, there are already announcements. I read about OpenAI saying, well, GPT-5 will have major improvements, for example, in analyzing data, or capabilities to analyze data. So I'm super interested to see what will happen there. And then, I mean, we have advances on all levels. I think of the increased automation on the one side and increased user friendliness in using any type of data management solution, BI, dashboarding, reporting, whatever. I think how this plays out is one of the most interesting things. So, how do software vendors now adopt these new capabilities? And again, most of what we have seen so far is slideware. So vendors are announcing what they want to do, and most of them promise to deliver something this year. 2024, I think, will be the year where we see how GenAI for data and analytics will actually be visible in products. In actual products.

[00:29:39] Speaker A: Yeah. Okay. Yeah, that makes sense, because I think the popularity of GenAI, and even just things like ChatGPT, is driving that question of, why can't my analytics platform be this easy to use? Right.

[00:29:51] Speaker B: Yeah.

[00:29:51] Speaker A: And so we're really, again, hopefully taking another big leap forward in our attempts to, as we've been saying for several decades, democratize data, democratize analytics. You don't have to be a highly technical individual in order to get value out of the data, so that business people who are running the business can get use from that data without having to call somebody in IT to make something happen.

[00:30:18] Speaker B: Exactly.

[00:30:19] Speaker A: And that's very exciting.

[00:30:21] Speaker B: Yeah, yeah, it is. And that, by the way, will also drive data culture, because if it's easier, then you have a higher probability to actually have it spread out in the organization.

[00:30:32] Speaker A: Yeah. There'll be more adoption.

[00:30:33] Speaker B: Yeah, yeah, exactly.
[00:30:35] Speaker A: Oh, great. So what's next on the list for you? Do you have any conferences or meetups or anything coming up in the next couple of months before you hit the summer vacation period there in Europe?

[00:30:48] Speaker B: Yes. So, first of all, what's really top of mind for us is our US expansion. We are investing a lot in the US. We now have three people on the ground, and the team is growing, and that's super exciting for us because our ambition is to be the leading global analyst company specialized in data and analytics. And that's a major step forward to do that.

[00:31:11] Speaker A: Yeah, we had one of your colleagues on here a couple of months ago.

[00:31:14] Speaker B: I know, Sean was with you. Yeah, that's great. And so that's definitely a big thing for us. Then in a few weeks, in four weeks' time, we run the biggest trade show for data and analytics in the German-speaking countries. That's the Big Data & AI World in Frankfurt. So that's definitely keeping us quite busy. And then we have some events focused on finance. And the next one that has a global impact will be the Data Festival online. We do it once in Munich and once online. That will be in November. So a bit of time, but that has a pretty big global audience.

[00:31:52] Speaker A: Awesome. So what's the best way for folks to connect with you if they want to follow up on this and follow what you're doing and see the announcements for all these events?

[00:32:00] Speaker B: Yeah, follow me on LinkedIn. That's, I think, the best source to see what's going on. And obviously we also have barc.com as a webpage where you can see what research is available and what we are doing.

[00:32:14] Speaker A: Okay, so that's barc.com, right?

[00:32:18] Speaker B: Exactly.

[00:32:19] Speaker A: And then there's your LinkedIn QR code for folks. That's your quickest way to link up with Carsten.
Make sure when you send him a connect message that you tell him you saw him here on the #TrueDataOps podcast so he'll know how you found him. Well, thank you so much for being my guest today, Carsten. It was a great talk. Thanks to everybody online for joining us, and I hope you got some value out of this conversation. Be sure to join me again in two weeks, when I'm going to be live from Stowe, Vermont, at the Worldwide Data Vault conference. I'll be talking with Cohen Verhane and Sean Johnson from VaultSpeed. They happen to be one of our DataOps.live automation partners. We were talking about automation earlier, so I'm going to talk to a couple of folks about that in two weeks. Now note that this is actually a live event, as in, in person. I mean, all our events are generally live, but I'm actually going to be in the same room with these guys. I did this last year, so it'll be the second in-person interview, I guess I should say in person rather than live, with the folks from VaultSpeed up there in Vermont. But because it's live we had to make a couple of adjustments, because we're right in the middle of a conference there, and so we're going to be one hour earlier than normal. So make sure you make a note of that on your calendar. It'll be one hour earlier when we do that event, but the announcements will be coming out here on Facebook. Yeah, on LinkedIn. And you can obviously register that way, and you'll get the notifications. So as always, be sure to like the replays from today's show. Tell your friends about the #TrueDataOps podcast, and don't forget to go to truedataops.org. Subscribe to the podcast to make sure that you don't miss any future episodes. So thanks again, everyone. Until next time, this is Kent Graziano, the Data Warrior, signing off for now.
