[00:00:00] Sebastian: You're listening to the Insightful Connections podcast. Our guest today is Roddy Knowles. Roddy is the COO at dtect, not deetect, as he told me earlier. Founded in 2022, dtect is the data quality platform that prevents survey fraud. It makes you more effective by using advanced technology to ensure your data is real. Prior to joining dtect, Roddy held leadership positions at Disco, Feedback Loop, and Dynata. Roddy, thanks for being on the show today.
[00:00:26] Roddy: I appreciate you having me. Looking forward to the conversation.
[00:00:29] Sebastian: So I always like to start with this question, and it gets at sort of the more personal aspect of where you came from and where you're going to. I'm wondering how you got into market research as a space and how that sort of origin story has accounted for the places you've gone in the years since.
[00:01:06] Roddy: It's a bit of a long story. I'll try to make it short and save the listeners' ears a little bit, but it's a bit of a torturous path to get where I got. So I never intended to be in market research, like a lot of people who started 20 years ago like I did. So I'm a social scientist by academic training. I was actually in religious studies. So I was in a PhD program for religious studies, did a lot of ethnographic work, and I had a professor who contacted me because someone contacted him and said, someone is recruiting for people who can do ethnographic market research. I said, I know what the word ethnographic means, but I don't really know what the term market research means, but they want to pay me twice as much money as I get paid being a TA. Yeah, man. So it all sort of got started there. So I did a lot of qualitative work, a lot of work that was really focused on shopper insights, then started to do project management, started to do quantitative data analysis. I never set out to be a quant guy, I guess the secret's out now. So I'm a quali at heart. That's what started me down my market research journey, and ultimately I made the decision to leave academia and do market research full time. And I think one of the things that's always stuck with me, being an academic and a quali at heart, is never forget about the people on the other side. I think a lot of people that start as a quant researcher start to think about the data and the approach and the analysis and what you're delivering, and forget about the human part of research. And that human part is absolutely critical, and that plays into everything that I've done throughout my career and what I continue to do today with the tech. We're focusing on preventing survey fraud. Why is that important? Because we need real, actual humans taking our surveys. The numbers don't matter, the data doesn't matter, if you can't actually count on it, if it's not actually human driven.
So I think that's been really the persistent theme, honestly, throughout everything that I've done in the last roughly 20 years in the market research industry, wearing a number of different hats.
[00:02:51] Sebastian: I love that focus on the human. I think we're getting really quickly into our topic area today, which is, of course, research fraud and the crisis of research fraud. And I guess just to set the table a little bit for this discussion, a lot of people are speaking about sample fraud as a crisis the industry is facing. Do you agree with that characterization?
[00:03:10] Roddy: If it gets more people to pay attention to the problem, I'm OK with crisis. But I would also say this is not something new. This is something that we've always dealt with in the industry. It's always been a problem. Has it gotten worse? Absolutely. Has the advent of technology, specifically AI, turbocharged everything? Absolutely. But some people are thinking about AI as being a watershed development for the problems that we're seeing in terms of data quality, and those existed way prior to that, too. So, yes, it means we need to attack things in new ways. But if you want to call it a crisis, I'm not going to fight you on calling it a crisis. It is a crisis that I do believe we can address. So the amount of survey fraud that we're seeing right now definitely continues to escalate. I remember back in the day, when we would get data that would come back and we'd deliver it to a client when I was doing full service work, and they would throw out seven percent of the participants that came in, we'd have to have a meeting about that, because if it got over five percent, that was a problem. Now you probably get a gold medal if it's seven percent. Right. So we're seeing problems that way exceed what they used to. And so, yeah, the term crisis gets people to pay attention. I'm fine with that. But I don't believe it's an existential one, because I do believe that as long as there's focus on it, attention on it, and companies doing the right things to prevent it, it's something that we can get through.
[00:04:26] Sebastian: You pointed out a couple of the things that people are talking about today as drivers of the escalation that I think you've had a front row seat to observing. One of the questions that I'm kind of curious to ask is, what are the things that you think holistically have been driving this problem for the industry? If it's a multitude of factors, what are those things and how do they account for what we've been seeing?
[00:04:47] Roddy: I would say it's a multitude of factors, but I'll just give you one way to think about it. I think as an industry, we've known data quality is a challenge. We've known fraud is a problem. The traditional way of solving that is to obscure the problem from your clients or whoever you're delivering the data to. So one way to "solve" the problem, air quotes, is to simply clean out the bad data when it comes over to you. Make sure your clients don't see it as best as you can. To use another metaphor, you're just putting duct tape and glue on the problem. That "works," again air quotes, for a while, but it doesn't actually solve the problem itself. So you have so many research companies and research organizations that have functions built in to clean data, and that's important: to clean data, to go back, reconcile data, replace participants and do all that stuff. And that needs to happen to some degree, but it's embedded in the way of doing things. As data quality has gotten worse and worse, those old ways of approaching clean data, cleaning everything after the fact, really haven't solved the problem. We really need to be solving the problem at the top of the funnel and keeping those bad people out. It ensures that our samples are more representative, and it ensures that you're not inserting bias in ways that you don't understand when you're just cleaning and quickly refielding, for example. And it really requires a rethinking of how we approach research. When you do think about it top down, making sure you have real people at the top of the funnel that are coming in, all those downstream processes, data cleaning and stuff like that, which are super important, become less labor intensive for the people who have to do them and allow them to focus on higher value things, which is really what we need to do as an industry.
Sebastian: Got it.
[00:06:22] Sebastian: And now would be a good time for you to talk a little bit about what dtect has been doing to respond to the crisis of data quality and what results you've been seeing from those efforts.
[00:06:32] Roddy: Sure. So I'm in an interesting position of focusing on survey fraud all day, every day. That's what we do at dtect. We stop survey fraud. That's really all that we do. So we're not trying to sell sample or clean your data or deliver insights. We're really just focused on that problem. So we do have a front row seat to really everything that we've seen in terms of data quality. The interesting thing is that the threats are always changing. So what they looked like six months ago, even three months ago, is a lot different than what they looked like three months prior. Pick the timeframe that you want to. So some of the things are the same. People trying to spoof IP addresses or use VPNs or spin up virtual machines to try to get into surveys. Those things haven't changed. But what has changed is the nature of where fraud is coming from and how people are attacking. So whereas it used to be, and it still is to a degree, survey farms, you get images in your head of a survey farm: people co-located somewhere with a bunch of machines or virtual machines that are spun up, a bunch of phones on a rack, stuff like that, trying to get into surveys to get incentives. That still happens. But it's become more and more distributed. So now when people are hitting your survey and all of a sudden you get infiltrated with fraud, it doesn't mean they're all coming from someone in a garage somewhere or a dorm somewhere. They're coming from all over the place. People are communicating on Telegram, on Discord, they're sharing information about how to get into a survey, how to scam your way into a survey. So it's become more organized, but also more distributed at the same time. And then I think I'd be kicked off the podcast if I didn't mention AI at some point. So I'll go ahead and throw that one in there. That's definitely changed the landscape. The simplest way to think about how AI has impacted what we do is open ends, and we see this all the time.
I'm sure you do. Everyone's seeing it now: this is clearly an AI-generated response. Okay, that's a problem, definitely. It becomes a lot easier for people to answer open-ended questions, but they're also using AI in more sophisticated ways, too, to spin up personas and to make their job as a producer more efficient. And also, if you just look in the last two months, AI agents. So now people are able to use AI agents to try to get into surveys. So you need new techniques to catch that type of fraud than the old way of what you were doing before. So again, in the olden days, doing IP checks, checking for VPNs and stuff like that did a pretty good job, but those old tools don't really cut it anymore. So we're really focused on staying ahead of the curve and making sure that we're attacking the new types of fraud before they start to infiltrate your surveys.
[00:08:56] Sebastian: Personally, as somebody who works on the more qualitative side of fieldwork, right, I deal with this issue on a daily basis as well, and mentally I've started categorizing it into basically three categories. And I think you could almost arrange them as a spectrum. Over here, you've got the most egregious attempts at fraud, which are people who maybe are not even in the US, right, who are attempting to spoof their IP, pretend they're somewhere they're not, pretend they're somebody they're not, and they're doing it in an organized, coordinated way. It's almost a cottage industry, right? And then you have people over here who maybe participate in a lot of studies and have a sense of what qualifies and are embellishing some of their life details. And then you have, I think, the most benign category, which is just people who legitimately make mistakes or encounter issues with survey UIs, right, and are unable to give truthful answers. Maybe the actual design of the survey itself, in some way we don't realize, prevents truthful behavior. A great example, and this is for anyone who's programming a survey while listening to this, right: if you have a multi-select question that doesn't have a collectively exhaustive set of options and you don't put a none-of-the-above option there, you're forcing everyone to lie to get through your survey.
[00:10:04] Roddy: That's one of my favorite tips, by the way, and people make that mistake all the time. I see it constantly. You forced someone to do something and now you're blaming them for it. Look in the mirror.
[00:10:14] Sebastian: Yeah, yeah. So I guess, and this is sort of my categorization. I don't know if this categorization holds for you, but if it does, you know, how would you classify percentages of what dtect sees and what it can respond to?
[00:10:26] Roddy: In terms of overall survey fraud, the short, unfulfilling answer is it varies, but let me give you some numbers. So around 20 to 30% is the average that we're seeing in terms of fraud that we can classify as: this is definitely bad. We have a very high degree of confidence, so much so that we're going to block those people, or not-people, from entering surveys altogether. Then there's a bucket of about 5% to 10% typically where we see traffic that looks a little bit suspicious. So what we do in that case is we don't block them, but we will flag them and identify them as suspicious, because they may be doing something like using a VPN. Does using a VPN mean that you are not qualified to take a survey and you're committing fraud? No. Maybe you're at work and your employer requires it. Maybe you're privacy conscious. That's fine, but it's a flag for us, and there are other flags like that. So they'll be in that sort of suspicious category. You also are going to want to look into your survey itself, look at the open ends in your survey, see if they're failing trap questions that you have, speeder checks, things like that. And it's going to give you more confidence that, yeah, they look sketchy with some of their behaviors coming in, and what they're doing in the survey looks sketchy too. Those are people you're probably going to want to ultimately kick out. So I gave you that 20 to 30% number, but there's a range, and you can really get killed by outliers sometimes too. So it's not uncommon that we see some studies where 60 to 80% is totally fraudulent, sometimes because of the sample provider that you're using. Maybe the sample provider you're using is usually reliable, but something happens in their system, people game their way in, and they get inundated by fraud. This happens. This happens to good panels. So we definitely see outliers there for sure.
Another point I'll make, just because I think it's important: when you were categorizing different types of traffic, that categorization makes sense. One of the categories that people oftentimes use is bots, and they say, oh, bots got into my survey. I got hammered by bots. Usually not true. That's like less than 5% of the bad traffic that we're seeing. So bots can be an issue, but usually it's humans. It's humans using some sort of automation, some sort of technology. And I think that's important, because when we simplify it and just say it's a bot problem, it takes the human element out. It makes us think it's just automation that's sort of unsupervised coming in and infiltrating your surveys. And that's really not the case. So I do think that's important. At the end of the day, it doesn't matter to your client if it's a bot, someone coming from a survey farm, someone who's a professional survey taker, or, to your point, maybe someone who just wasn't given the right options in the survey. If you can't count on the data, really, who cares what the source is and where it comes from? I care a lot, because I'm thinking about different techniques to use to stop them. But at the end of the day, the person getting the data doesn't really care why it's bad. They just care that it's bad.
[00:13:04] Sebastian: I feel seen when you say that the nomenclature of bots kind of rubs you the wrong way. I mean, I would say in my professional experience, having done a lot of qualitative recruitment directly, there's never been a circumstance where I've said, I think that screener submission came from a bot. Right. Like, you know, we've definitely had people trying to game screeners, but those efforts are relentlessly human, unfortunately. Right. Another question for you, Roddy. In terms of the things that are happening at an industry level to respond to the issue of research fraud, what are the things that you think are most promising right now? What are the things that you have your eye on, and what are the things that are potentially threats to the industry on the horizon as well?
[00:13:45] Roddy: First, I think some of the coordinated efforts like the Global Data Quality Initiative that industry associations, the Insights Association, SampleCon, ESOMAR, some of the other local associations, have been supporting are great. We need to be banding together to have standard nomenclature about data quality, to push things like secure end links and things like that that we need to do. And so those efforts, I think, are great, but they're not enough. I guess I'll be candid about this. Those efforts can be a little bit slow moving, and that's OK. It doesn't mean people aren't working hard, but it takes a lot to collaborate and coordinate globally as an industry to attack some of these things. So I think they're great. But just saying industry associations are going to solve this, that's never going to solve the problem. So it really takes everybody. If you're on the sample provider side, you should absolutely have data quality checks on your side to make sure what you're delivering is quality. But you also don't know as a buyer what standards different providers have. So having an objective tool, something like dtect: use us, great; use a competitor, OK; just do something. Build something internally to make sure that you have an objective way across panels and across sample sources to ensure that whatever standards you have are actually being upheld, because you can't assume that all standards are going to be the same there. So in terms of looming threats, I think one thing that came as a surprise to many within the research industry is all the news around OP4G. We tend to think about survey fraud coming from outside the industry. When it comes from inside the industry, that hurts a lot. And so that was a real eye-opener for a lot of people. And there are a lot of disreputable companies out there that understand how market research works literally because they're in it. So when you see fraud come from the inside, I think that really hurts.
And that's opened up a lot of people's eyes. The irony of the whole situation is that if the companies who were working with OP4G had had basic checks in place, they would have caught all of this stuff. So I think that was a wake-up call for sure. So threats coming from the inside of the industry, I think, is one thing that's really arisen and come more to the fore, unfortunately. Another threat that I really see, especially when you're thinking not so much about Qual but about Quant, is just, why do people want to conduct surveys anymore? If fraud is a problem, and if we don't tackle it, and I'm going to pay, let's say, a dollar per complete just to throw a round number out there, and 40% of the data is going to be bad, and I'm going to have to throw it out, and I also don't really trust that data, why conduct surveys at all? Maybe that's good for you and your business, because I think there is a stronger place for Qual to ensure that the people giving you responses are human. But we also have synthetic data and synthetic participants, and I think the term synthetic participants needs a rebrand. It's not up to me to do that, but someone should. But with synthetic participants, if you can trust the data that the models are built on, you can generate synthetic participants and synthetic responses using all the technology we have. There's definitely a place for that. But in the quantitative research space, that poses real headwinds. So I think there's a place for both. I think there's a place for real quantitative participants. I think there's a place for synthetic data. I think there's certainly a place, and a larger place, for Qual. But if we don't actually jump in and solve the data quality problem, then we're going to be in a place where a lot of people just don't want to conduct quantitative surveys anymore, and that drags the entire industry down. So if I had to pick something to really keep me up at night, I would think about that.
[00:17:07] Sebastian: On that note, Roddy, one last question for you. What keeps you motivated?
[00:17:12] Roddy: Trying to keep this industry going, to be honest. I've been in this industry a long time and I care deeply about it, and seeing things on the front lines, seeing how things are evolving, and knowing that what we're building at dtect can be not just good for our business, but good for the entire industry. I'm also a problem solver at heart. And so, you know, seeing how the problems continue to change and evolve and building technology to stay ahead of the curve is really something that keeps me and my team excited every day.
Sebastian: Roddy, thanks so much for being on the show. I appreciate the time.
Roddy: Thanks for a great conversation.
Dive into a world of insights and inspiration - Stay updated with every episode. Subscribe now and never miss a beat!