Episode 62

Why sample consistency is everything

Jack Millership, Head of Research Expertise & Tassia Henkes, Research Director at Zappi address how to tackle the data quality crisis in the insights industry, debate the opportunities and risks afforded by generative AI, and reveal why, in an increasingly AI-driven world, first-party data will be king.

Ryan Barry: Hi everybody, welcome to this episode of Inside Insights, a podcast powered by Zappi. My name is Ryan and I am not joined by my co-host Patricia Montes de Oca today because she's traveling. And we wanted to bring you this episode because it's so important that we had to run ahead.

But Kelsey's here with me. Hey Kelsey, what's up? 

Kelsey Sullivan: Hey, how's it going? I'm ready for the holidays. 

Ryan: Sweater season. 

Kelsey: Sweater weather. Have you put up your Christmas tree yet? 

Ryan: Yes, my Christmas, I'm actually looking at my Christmas tree right now. It's uh, it's dark here because we're recording this at 4:30 on a Friday everybody. That's how much we love you guys. And so it's already dark in Boston. So yeah, the Christmas tree is not only here, but I'm looking at the lights and they're very pretty. And so I don't know how many people know this that listened to this podcast, but Kelsey and I have the same birthday. Um, and it is not the day that this episode is coming to you, but it might be the day you're listening to it.

Ryan: Our birthday is December 12th, team Sagittarius in the house. Not many podcasts have the same birthday production going on. 

Kelsey: I know. We can't make that up. 

Ryan: Can't make that shit up, you know, so anyways, today we have an important conversation. Uh, I wanted to bring in two experts who I happen to have the pleasure to actually work with every day.

Ryan: Tassia Henkes and Jack Millership from our research team. We are going to talk to you folks about data quality. And before you say, sample, it's boring, you got to listen to this because what we need to be true is we need to have clean data. We need to have data that is rich, that understands consumer perspective, that is representative of the population that we can use again and again, and then add to. 

And leverage AI to get us smarter. And it will only be useful and not disruptive if the data we have is of the utmost quality. Think of a world where you can leverage all the data you've got and ask humans for things you don't know, and then make your database smarter, knowing that everything is representative.

Ryan: That is what we should be talking about. First party data is important in a world where privacy is increasingly more regulated. We need to know why people do what they do. And we have the ability to query data in a way that's never been possible, thanks to AI, but it's only going to work if the data is useful.

Ryan: And my friends, data quality is a disaster in our industry. And I don't care if you're doing business with somebody who claims they have their own panel or buy panel. Everybody's either selling you crap or fighting it like crazy. And it's something as an industry, we all need to work towards. So I implore you to listen to this episode.

Ryan: We try our best not to be geeky. And we try to share some advice with you that you can use to make sure you're buying the right data in the right way. And we're going to share, share with you some of the things we're thinking about. 

Ryan: So what do you think Kelsey, should we get into this thing? 

Kelsey: Let's get into it.

[Music transition to interview]


Ryan Barry: Thanks for tuning in, everybody. Really looking forward to this very important episode about the state of data quality in our industry. I'm joined by two of my colleagues, Tassia Henkes and Jack Millership, both of whom are on our research team. Tassia is our research director and Jack is the head of research expertise.

Ryan: Hello, Tassia. Hello, Jack.

Jack Millership: Hello. How are we doing?

Ryan: Doing great. Doing great. Before we talk about research quality, the last time I saw Jack was five days ago in London where we consumed, uh, at midnight, a kebab wrap. Um, and if you've never had a night of a few libations in London or had a kebab wrap, you're missing out in life.

Ryan: Um, you don't even need to have a few drinks. You can just go have the kebab wrap. It was wonderful. Jack, it was lovely to see you last week. Tassia, you didn't get to join us for a kebab, but it was nice to see you too. 

Tassia Henkes: I stayed for a drink, but missed the kebab sadly. 

Jack: I just want to, on the record, uh, I tried to make it home kebab-less, but I got intercepted on Camden High Street by Ryan.

Ryan: I just want to… 11pm in London, you're coming for a kebab. Uh, in America, we usually get pizza or cheeseburgers, uh, but, uh, in London there's a lovely, uh, kebab vibe. So. I wanted to talk to all of you today about the state of data quality. I think it's a topic that a lot of you folks on the brand side can quite easily say, well, this is my vendor's problem.

Ryan: And I think there's a lot of propaganda and rhetoric in the industry and quite frankly, a lot of misunderstanding. Um, but I would like to tell you, uh, corporate researcher, it is your problem because you buy millions of dollars of data and, uh, your company's making multi million dollar decisions off of it.

Ryan: And there are a lot of things that need to be done to continuously take advantage of first party data. So before we dive in, my hope for this conversation is to give you folks some, uh, things to consider, some things to challenge, but also some additional expertise. The three of us happen to be, uh, sampling geeks. Um, we will try our best to make this, uh, not dry, um, but I implore you to stay with us because it's actually, uh, for the betterment of your job that you understand what's going on.

Ryan: So I got into this industry in 2008, which was not quite the beginning, but just around the beginning of online panel, and a lot's changed, right?

Ryan: So, uh, it was not long before that, that in person and random digit dial was how people, uh, connected with human beings. But it, well, that wasn't even 20 years ago, and everything sort of changed. So, Jack, why don't you take us through kind of the flow of what's changed, uh, from the way we get access to people to give us their wonderful opinions, uh, so that we can make better decisions within our organizations.

Jack: Yeah, sure, sure. I mean, I started at a similar time to you, Ryan. So I started around 2008 as well. Um, first ever job was, uh, doing automotive, uh, satisfaction surveys. And that was at a time where online sample was quite a hot new thing. You know, this was quite a well known research agency. I mean, you can see it on my LinkedIn.

Jack: Um, that were, you know, 

Ryan: You used to work at J. D. Power. 

Jack: I guess for a long time, very traditional, right? Which makes sense because obviously they are trying to look at vehicle owners. They want that kind of verified proof. They're using their customer records. When I came into the company, it was the second time they'd ever done what was called the vehicle ownership satisfaction survey online.

Jack: And that was partnering with, you know, at the time, I think, um, e-Rewards, which is now part of Dynata, you know, used to be Research Now, uh, you know, to source all the sample that way. Um, and that was, I guess, that sort of a mid 2000s thing, you know, uh, to late 2000s. Uh, a little bit later than that, we started to see a lot more rivering.

Jack: So, um, you know, customers, or respondents, being directed to panels from third party places on the internet, or sorry, surveys, not panels, from third party places on the internet, without signing up for a panel. You know, recently, I guess, uh, we've seen exchanges, which are sort of blending together those panels, that more rivered sample, and of course now things like offer walls or, you know, in app credits for games and so on and so on.

Jack: And I think the most recent thing that we're starting to see a lot of in the industry that we've just moved to ourselves is, you know, this notion of a private marketplace where you can kind of get some of those benefits of those exchanges, you know, access to huge amounts of sample, but in a slightly more curated and controlled way for your brand.

Ryan: Yeah, so we've fundamentally shifted in the last 18 years from, uh, from a statistically random sample to a convenience sample. And I think everything you said, whether it was double opt-in panels, reward networks, river sources and the programmatic era that we find ourselves in today, was designed to give us access to more consumers.

Ryan: Yet, I remember in the early days of river sampling, doing some research, and it fell short on that promise. And I think there's a lot of reasons why, right? Like, so, we are, uh, in no shortage of demand constraints. But there are a lot of ways today for a human being to monetize their attention on the internet.

Ryan: Whereas in 2007, 2008, that very much wasn't the case. And so, why? Why do you think that we struggle to get access to real people? Why do you think that we have such a demand constraint, uh, or a supply constraint relative to the demand?

Ryan: What are some of the things with the sourcing, but also in the way we treat respondents that have led us to this place?

Jack: I think, uh, as an industry, we probably, uh, are all culpable for this to some extent. Um, you know, I think online research, really the story has just been a compression in, in CPIs, right? So the last, uh, 10, 15 years, however long it's been at this point. It's just been a race to the bottom, right?

Jack: It's been, um, the way to compete has been on price. Um, there's been a lot of downward pressure from agencies, but I'd say also, you know, obviously the customers are part of that too, putting pressure on the agencies themselves, uh, which means that, you know, over quite a long time, really, we started to pay respondents less and less for their time.

Jack: Um, I think that's one part of it, but also, the things that we are asking these people to do are out of this world, right? I mean, you know, I've seen 30, 40 minute long online surveys. You just think these people, the amount that we're paying these poor human beings to complete these surveys, uh, and then trying to get their attention for 30, 40 minutes, all these scalar measures in a row, you know, maybe, uh, five or six similar open ends.

Jack: It's, it's pretty horrible, right? And I think, um, as a result, the pool of people that are willing to engage in that behavior slowly shrinks. I think quite a lot of people probably bounce off of it after doing just one survey, having one bad experience, and then you just end up with maybe more of a cohort of people who are more happy to, to engage in those kind of somewhat taxing experiences.

Ryan: Yeah, I agree. I vividly remember in 2009, online data collection was becoming accepted by the skeptics. Uh, quite frankly, the skeptics didn't have a damn choice because there weren't enough people who were willing to get phone calls at their home or be intercepted in a mall to keep up with what was the start of a programmatic marketing revolution.

Ryan: Um, but I remember you could purchase a rep to click, which was as close to random as you could get with a convenience sample. To de-geek what I just said, the percentage of people clicking your survey would be representative of the population which you wish to understand across, at that time, really demography: age, gender, region, ethnicity, income. And then the farther east you would get, the harder that would go.

Ryan: Mistake one we made as an industry was... There was a willingness to pay of around $8 an interview, two of that would go to the respondent and, you know, the panel companies would have to make their money. Um, uh, they don't exist as a company anymore, so I'm gonna hold no punches. SSI fucked it all up because everybody was willing to pay $8 and they went down to three.

Ryan: So everybody's gonna make money in the panel industry off that. So how the hell does a respondent make money? And I think. The rise of programmatic sample and river sample started because of that. I remember, uh, and Tassia and Jack, I think you were both part of this. I remember a few years ago, Zappi doing a robust piece of research on research, comparing the quality of an opt in respondent, so somebody who's agreed to join a panel for the sole purposes of sharing their opinion versus exchange. And exchange is a mishmash of affiliate networks, rewards programs, people trying to get access to free content, and there was no difference in the quality of the data.

Ryan: And so why would you pay more if there is no difference? Um, but, but Tassia, I'd love to tap you in here. 

Tassia: On this topic, I come from a different background. So I come from a company that had their own panel. Uh, so I find exchanges, private exchanges and private marketplaces extremely exciting because the pool is much bigger. So the volume, there's only so much you can do with 10,000 people, uh, if you are tied to those 10,000 people over and over.

Tassia: Um, so the volume you can tap into, and the fact that you have a lot less bias, because if you have your own panel and it's only 10,000 people, very likely they recommended each other and they are friends of friends and there's a bit of bias there. Um, and also if you buy through the private marketplace, you can be more consistent on the quality you ask, um, for all the markets you are in.

Tassia: So, yeah, I think it's very, very exciting the volume you can reach. 

Ryan: Yeah, the promise is real, right? And I think there's a lot of companies that say, well, we have our own panel, so we don't succumb to this, but I got a little secret for everybody that's listening: the most incestuous part of marketing is the sample industry.

Ryan: I used to work for GMI for many, many years, um, and we were Research Now's biggest customer, but if you looked at our strategy deck, they were also our biggest competitor.

Ryan: Um, and so there's a lot of, of kind of overlap of respondents, uh, that we need to wade through. I think the other thing, uh, that really bothers me is we took, uh, and I promise everybody, we're not just going to vent. We're going to talk about some constructive things. We took a choice that's cost us a lot of money in revenue, if I'm honest with you.

Ryan: That we're not gonna put respondents through surveys that are longer than 15, 16 minutes. I think our longest survey median is, is maybe 17 minutes. 

Jack: Um, it's, it's lower than that. It's, it's 12. 

Ryan: Is it? So our stats as a company are low IR, sorry, high IR, low incident, uh, low length of interview. Uh, and that's common sense, right?

Ryan: A person in 2023 doesn't have enough attention span to sit through a three minute TikTok video about something they're interested in, let alone your whack ass discrete choice model survey that they're, that is gonna look really good on your statistics table, but actually be filled with crappy data. I remember, like, this was in going back 11 years ago, guys.

Ryan: I used to do 50, 5 0 minute conjoint surveys. And I'm telling you the data tables looked fantastic. It was so robust. We had so much data to geek out with, but I used to sit there and say, what weirdo would sit through that? So this is my bugbear. 'Cause if I haven't lost you yet, brand clients...

Ryan: This is the number one thing you can do to stop bad data happening is stop commissioning shitty research. Uh, you two are professional researchers. Tell me a little bit more about shitty research. What are some of the components? The engagement. Where does somebody go wrong with a research instrument that's actually alienating people from taking the surveys?

Tassia: Ah, there's so many things. You ask the same thing more than once. Um, I've seen, I've seen some surveys where they try to trick the respondent to answer one statement and then reverse the statement to see if they agree with what they are saying. Um, so there's that, there's the length, there's the fact that things are changing all the time.

Tassia: Sometimes you have a five point scale and then you have a weird 17 point scale right in the middle, uh, and you don't know how to rate. What else? Yeah, you try to trick the respondent, you ask multiple things, you change the, the settings so things look different. They have to click in multiple things and go back and forth, um, and it's extremely long.

Tassia: What else? Jack? 

Jack: I think, I think there's two big ecosystem things here as well, right? There's, I mean, a lot of this relates to what we were just chatting about, but if you think about how sample works now, quite often with an exchange, a BNB, whatever it is, these guys are being bounced all over the place.

Jack: You know, maybe they'll be sent an email from one panel that's inviting them into a survey, you know, and then they'll be sent from there, you know, to the exchange and then they'll be sent from the exchange to the actual, you know, people that are doing the interview. Now, in that process, I've seen times, in fact, there's a guy, uh, Arno Hummerston did a really, really good talk a couple of months ago, at one of the conferences here in London.

Jack: Um, and, uh, you know, he basically showed us a respondent journey or a couple of respondent journeys where these people are being asked their age three or four times. They're being asked their gender three or four times, you know, and all of this information should be pre-coded, should just flow through the pipes of the system.

Jack: But we get people not just having to do similarly repetitive things within the survey, but across the whole survey, having to do the same horrible tasks over and over again. Um, that was at the Association for Survey Computing conference. Uh, I think the other thing as well is a super obvious one, but it's, um, you know, a lot of the stuff that we do in interviewing, it's still from 2010, 2012, right? And the reality is in many of these markets, mobile is the default. 

Jack: And yet if you're working at an agency, maybe you test the link, you'll be sent this link to test on your desktop. You won't even think twice about that mobile experience.

Jack: It's such an obvious thing. It's been sent to that far. All surveys need to be mobile first now, um, and so when you think about, you know, things like your discrete choice models, how does that work for someone on a mobile, you know, with more than a MaxDiff, for example? Pairwise is fine, two options. As soon as you're having people scroll between five or six options, compare all these attributes, all these levels, I mean, how is that going to work for any human being on a mobile phone?

Jack: Doesn't make any kind of sense. 

Ryan: No, it's not. Go ahead, Tassia. 

Tassia: It's a very interesting experience to be a panelist. So, for everyone listening, I would say sign up to be a panelist. Experience the survey yourself and be empathetic to people that are actually going through and how they feel about going through your questionnaire. Um, I was for two years and two years of weekly surveys, I made 50 pounds. So. 

Ryan: Wow. Yeah, that's, that's, that's problematic. There's a, there's a bunch of things in here.

Ryan: I want to unpack with you folks. Uh, and then I want to talk about how we can change it. So the first thing that Tassia said I want to click into a little bit. 

Ryan: So asking people the same information over and over again. It's something that we're on a journey to solve here at Zappi. Um, so for some context, when Jack talked about private marketplace, what that allows a business to do is to bring in the exact right amount of sources, suppliers, that are needed to have as representative of a convenience sample as possible at the country category level. 

Ryan: So in the United States, that doesn't mean ignoring Hispanics, their acculturation, African Americans. Like we have been, by the way, market research industry for 30 years. Um, it actually means having the appropriate source composition, but it also means there's an ability to leverage what we already know about Tassia and not ask her again.

Ryan: Um, so, so Jack, talk to me a little bit about the design there of how we can better use what we know about people to avoid wasting their goddamn time. 

Jack: Sure. I think, um, a lot of it is just about, uh, the industry aligning around certain sets of standards. I think one of the big problems we have in quantitative research is we all have our quotas.

Jack: We all have our slightly different way of asking measures. Um, and what that can mean is, you know, if I, as, uh, you know, someone that works for, uh, I'm not going to call Zappi an agency because you'll kill me, Ryan, but a research provider, um, have a preferred way of asking a load of measures. Um, and then someone in another research provider has a preferred way of asking a load of measures.

Jack: You know. It can mean that the same respondent who, you know, doesn't really care about all this nuance and slightly different codes that are supplied has to, you know, fulfill the same information differently every time they do a survey. So I think there's a really big job for us to do as an industry, and we are getting there to align around this stuff.

Jack: So, I mean, a great example is there's an ESOMAR working group at the moment. I think they've done three reports so far of recommendations on asking measures. I think they've done working status, uh, gender and, uh, ethnicity so far, I may be wrong on ethnicity, that might still be to come. Apologies, ESOMAR, if I've missed it, but, you know, the point is there are recommendations that are coming out from our ruling bodies and we all need to align around them.

Jack: And if we do so, that means that information can be stored, you know, at the panel level, so the supplier or respondent level, um, on a respondent-level basis. And then these guys can just have that information stored, and that can flow, you know, via API into our systems, our competitors' systems. And we stop putting respondents through this hell of saying very similar information over and over again.

Jack: So I think, I think that's a big one is standardization and alignment. Um, I think the other one is, if the information is there and exists, you know, research providers, please use it. Um, I mean, there are many, many times where, you know, even if someone is, is screened, you know, on the, uh, exchange level. And they hit the systems of research providers.

Jack: Research providers are just going to ask them their age and gender again anyway. Please don't do that. If that information is available to you as a field from whoever is giving you your sample, use it. Don't ask the respondents a second time. 
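[Editor's sketch] Jack's point about reusing profile fields that arrive with the respondent can be illustrated with a few lines of Python. This is a minimal sketch, not how Zappi or any supplier actually implements it: the question IDs, the field names, and the `questions_to_ask` helper are all hypothetical, and a real integration would map whatever fields the sample supplier passes along.

```python
from typing import Dict, List

# Hypothetical screener: question id -> supplier profile field that already answers it.
SCREENER_FIELDS: Dict[str, str] = {
    "q_age": "age",
    "q_gender": "gender",
    "q_region": "region",
}


def questions_to_ask(profile: Dict[str, object]) -> List[str]:
    """Return only the screener questions the passed-in profile cannot answer."""
    return [
        question
        for question, field in SCREENER_FIELDS.items()
        if profile.get(field) in (None, "")  # missing or blank field -> still need to ask
    ]


# Example: the supplier already knows age and gender, so only region is asked.
passed_profile = {"age": 34, "gender": "female"}
print(questions_to_ask(passed_profile))  # ['q_region']
```

The design choice is simply to treat the supplier's fields as the default source of truth and fall back to asking only when a field is missing or blank.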

Ryan: Yeah, exactly. And so we recently made a decision to strategically partner with PureSpectrum.

Ryan: So shout out to the PureSpectrum team. Um, and one of the things that I'm excited about with their technology is the ability to recognize Tassia as she comes back into our environment, detect what we already know about her and then optimize her experience by putting her into an opportunity that's going to make the most sense for her while also leveraging, um, what we know.

Ryan: So that's something that excites me. So one of the other, there's two things that you said I want to unpack both. So the consistency and standardization. So there's a lot of, uh, tension with that topic. And I want to talk to you guys about category markets for a moment, but there's tension with, I can hyper buy ads.

Ryan: I can hyper place product experiences and I work in a category and my category from a representative standpoint behaves this way. So talk to me a little bit about what you mean by standardization and how you can give marketers and insights departments comfort that they're getting a category rep audience, but also the ability to understand new opportunities, uh, new cohorts, without looking for an echo chamber that tells them exactly what they want to hear.

Ryan: And I'll tell you a story about that in a minute or, uh, sizing a market. That's not going to make any goddamn money because it turns out we do all this research stuff so we can make more cash in our businesses, um, and help customers have a say in that. So talk to me a little bit about that tension and what you're doing about it.

Jack: I mean, to me, this is all about how we sample, really, what our frames are. Um, I mean, the first, the first principle is, and this goes way back into the history of Zappi, you know, a few years ago, we obviously worked with a load of people from PepsiCo to develop a load of testing, um, and PepsiCo at the time, and still are, uh, huge subscribers to Byron Sharp’s “How Brands Grow”, you know, all of that sort of thing.

Jack: And the fundamental tenet there, I think, is, um, the growth of a brand, um, does not just come from, you know, stealing share from your competitors, there's no such thing as brand loyalty. You know, these things don't really work. And really, you know, the, the, the tenets are that it's best to try and acquire new customers, grow the category as a whole.

Jack: Now that's only for established companies, very different thinking where you're trying to start a new category, but for a large broad category, it's already well established. You are much better off trying to acquire new customers than you are trying to convert people to your brand because conversion and loyalty, all these things are not real.

Jack: Um, and I think the, the DNA of that thinking really made its way into how we, uh, approach sampling as a whole within Zappi. So our number one thing is we always, wherever we can, we really try hard to encourage our customers to go as broad as possible in, uh, their audience designs. And in fact, we have a set of audiences that are super broad that we try to align people to.

Jack: Um, there's a number of reasons for that. Um, the most obvious one is the sample is hugely influential in driving survey scores. So if you think about a normative database, you know, that's basically a database full of benchmarks that we have for our key survey measures. It's really important to us that all of the cells that contribute towards those norms, those benchmarks, are sampled the same.

Jack: Um, there can be slight differences. Maybe you've added a filter question here or there, but when it comes to things like quotas, screening, and so on and so on, we need as much consistency in that as possible because we know that sample is really more of a driver of survey scores in some instances than stimuli.

Jack: I mean, our mantra here is, um, you know, all results must be driven by differences in stimuli, not differences in sample. How do we do that? We align sample, right? Um, I think that's, that's a huge part of it. And then I think the other part of it is, is that, that Byron Sharp level thinking, you know, trying to go broad, don't miss people by only sampling category users or particularly don't sample brand users, uh, and just try to, um, you know, have, have that breadth of data.

Jack: Which has one more key benefit, which is around stability of scores. So, not to go too wonky into it, um, but, uh, Amplify is our leading ad test. That was validated by comparing the results of Amplify cells to marketing mix modeling data. Uh, we did that by sampling in a certain way, those, those tests that we did, you know, very broad samples, um, because we know that reflects the real world reality.

Jack: And so if our validation data is from one set of samples, we need to be using those samples everywhere else that we, you know, we conduct that research to make that validation valid. Um, I think just one final thought before I start ranting on this subject is around stability of data. So, we know from a lot of ROR that we've done in house that, broadly speaking, the higher the incidence rate of your sample, the more repeatable it is, right?

Jack: So, if you test the exact same stimuli twice with a broad sample, you're going to get the same results twice. If you test the same stimuli twice with a niche sample of heavy brand users, for example, there is going to be way more volatility in those results. Now, what's important is that we're giving you accurate reads.

Jack: Um, that are repeatable, right. This should be a science and therefore you need to go to those broad audiences to get as much accuracy as possible, as opposed to that volatility and unrepresentativeness that you get from kind of a niche and narrow audience. 

Tassia: Yeah, and I think to your point, Jack, it's important to also highlight that we are saying we recommend a broad audience, but it doesn't mean that you cannot filter it down.

Tassia: So that's what we recommend, you know, go as broad as you can to everyone that could potentially be interested. But if you want to filter it down to male 35 to 50, you do that. So you filter down in the analysis. Um, and I think it's just about being open-minded about what could be your audience and going a bit broad.

Ryan: Yeah, that's right. I mean, and obviously weighting can be your friend. But before you weight, you can just boost up the base sizes. I mean, with all the technology available to rapidly launch a survey, collect data, model, analyze and report data, it doesn't take much time. So our ad test has a base of 400. Why?

Ryan: Because you want to be able to actually cut the data by different subgroups. 
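[Editor's sketch] To put rough numbers on why base size matters when cutting by subgroup, here is a small worked example. It is an illustration only: it uses the textbook 95% margin of error for a proportion under simple random sampling, which a convenience sample is not, and the subgroup size of 100 (one of four equal cuts of 400) is an assumption chosen for the arithmetic, not a figure from the conversation.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p with n respondents.

    Assumes simple random sampling; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Total base of 400 versus a hypothetical subgroup of 100 within it.
for label, n in [("total sample", 400), ("one of four equal subgroups", 100)]:
    print(f"{label}: n={n}, +/-{margin_of_error(n) * 100:.1f} points")
```

Under these assumptions the total sample reads to within about 5 points and the subgroup to within about 10, which is why a subgroup cut of a small base gets noisy quickly.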

Ryan: I'll tell you a funny story. This is going back, uh, many years ago. Uh, but it was in my time at Zappi. We had a, uh, electronics company and their CMO was like, Oh my God, that's the best ad we've ever had. And we were really, really small.

Ryan: And so the head of insights called me and was like, Hey, can you check this? And any time in my career, the data looks weird with, I think, zero exceptions. You two tell me if I'm wrong, because you have different careers, but anytime in my career where the data looks wonky, it's a hundred percent of the time been a sampling issue.

Ryan: A hundred percent of the damn time. It's been a sampling issue, not a methodology issue, not an analysis issue, not an analyst issue, et cetera. So you can save yourself a lot of stress. Just go look at your sample frame, go look at what you, the bias you've introduced, and then you'll find your smoking gun.

Jack: It's why we're so obsessed with these broad audiences, right? We need to remove any levers, you know, any externalities that might be contributing to survey scores as much as possible. You know, you're testing an ad on our platform.

Jack: The score that you get must be a reflection of that ad. So where there are differences in sampling, and those differences feed into a normative database, it just creates a nightmare of, of, you just don't know what could be driving this score, right? When, when everything is the same, when all the samples are identical, it just removes that massive lever that, that could be responsible for, for, you know, the resulting scores.

Jack: And that's why consistency is so important to us, um, for the quality reasons as well, of course, but yeah. 

Ryan: Also to make sure you're, so there's two reasons: the consistency to have the data get smarter, but also the consistency, or the thought, to not talk to yourself. So in this example, this manufacturing business, they make cell phones, um, and I'm not going to call them out because they're still a customer of ours, uh, but this is a long time ago.

Ryan: They talk to people who buy new technology right when it comes out, are a loyal super fan of the brand, and are techie, defined by some typing tool. No shit the ad was good, because you littered it with technological benefits and features and functionalities. And so I remember getting the phone call and being like, I can save you the time, but it's a sampling issue.

Ryan: We went and then just fielded a rep category audience. And it was one of the worst ads we ever tested, except for the subgroup of people that, um, that were going to tell them whatever they want to hear. And that's always the risk to me of like, in your last company Tass, we had a smaller panel or market research, online communities, or super, super niche sampling.

Ryan: You're going to get somebody who tells you what you want to hear because they're your super fan. Um, but the test retest point that Jack makes, I want to talk about that for a second. What we've seen in the last 10 years is big late stage gate testing has shrunk as iterative learning has risen from a market share perspective, right?

Ryan: So, uh, and everybody's got some platform, but everybody's platform business is up and everybody's late stage incumbent testing business is down. Why? Because it turns out when you develop ideas with your customers, uh, you're going to sell more shit when you launch them to them because they're going to have had a say through the duration of the process, not the Friday night before you launch your ad.

Ryan: But think of how much data variance that that puts in. Case study, creative territory, we assess it. The next week, storyboard, we assess it. The next week, we do an animatic, we optimize it. Then we go to finished film, we got two end cards, and we're trying to see if the celebrity makes sense. If we don't have consistency in the sample frame, don't be surprised when in market something gets completely messed up.

Jack: And it's also compared to what, right? So in your earlier example, uh, with the, uh, smartphone manufacturer, you know, it's like, they say it's the best ad we've ever tested, but you've got to think about the composition of the audience used for that ad, sure. But what about all the, what are you comparing it to in that normative database?

Jack: If all of those other ads aren't tested in the same way, you know, we all know that scores go up, the more niche you get, the, the consistency across the piece has to be there. Otherwise, the benchmarks don't make sense. The survey itself doesn't make sense. Like what are you doing? Unless you've got all of these things lined up perfectly, none of it is going to actually be reflective of any kind of reality out there.

Jack: Um, you know, in my mind, consistency is everything for this reason, both in terms of audience, but also in terms of data collection. You know, we're not trying to treat everything as a tracker here exactly for that reason. 
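
The arithmetic behind Ryan's smartphone story and Jack's point about benchmarks can be sketched in a few lines. This is only an illustration: the group names and positive-response rates below are made-up assumptions, not figures from the episode. It shows how the same ad can produce very different top-line scores purely because the sample mix changed.

```python
# Hypothetical numbers for illustration only; not data from the episode.
# Share of each group that rates the ad positively.
positive_rate = {"brand_super_fans": 0.80, "other_category_buyers": 0.30}


def expected_score(sample_mix):
    """Expected top-line score for a sample with the given group shares."""
    assert abs(sum(sample_mix.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * positive_rate[group] for group, share in sample_mix.items())


# Same ad, two different sample frames.
niche_sample = {"brand_super_fans": 0.90, "other_category_buyers": 0.10}
category_rep_sample = {"brand_super_fans": 0.10, "other_category_buyers": 0.90}

print(f"niche sample:        {expected_score(niche_sample):.2f}")
print(f"category rep sample: {expected_score(category_rep_sample):.2f}")
```

Nothing about the ad changes between the two lines of output, only who was asked. That is why a score from one frame cannot be read against a norm built on another.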

Tassia: I have a bit of a funny story about, uh, testing something with a completely different audience.

Tassia: So I was in an agency at some point, uh, and the creative guys, they came to me and they said, look, we are launching this campaign for a cereal. And it was a cereal, like a very cereal before school type of cereal brand. Uh, and they came to me and they said, Ah, we are launching this campaign and we want to track how it's performing and do like a full evaluation at the end.

Tassia: And our population and our target is going to be all adults. And I said, excuse me, it's going to be what? Uh, because for this brand in particular, it's always like moms or parents. You're testing cereal. And they said, no, no, it's a completely different campaign. We are doing all adults because adults snack and cereal is as good as any other snack and it's actually higher in fiber. And they had all these creatives for cereal, like literally like morning cereal before school, targeting adults.

Tassia: And that was what the campaign was about. I was, okay, I'm not very comfortable, but I am with you. Let's do this and let's get it tested. Um, and it was actually a very, very good campaign. It was in many channels. It was online. There was some traditional, so there was TV, there was out of home, and it performed really well and it got a lot of, a lot of buzz.

Tassia: So it's taking something that they will usually test with parents and moms and just going a bit broader because everyone snacks. So let's go out and show how a bowl of cereal can be eaten any time of the day. So just broadening the audience a bit and testing with a different audience. And that made a big, big difference.

Ryan: I love what you say, because I think, I think so often, and let's be honest, a lot of you folks listening are insights people and your brand manager is requesting a brief. And so if you don't have companies like us challenging you or you're not an expert in this area, then you might not challenge and so on and so forth.

Ryan: The echo chamber continues. But that example is another reminder of why you must ground yourself in the orientation of your market. You might sell toothpaste, your target audience also drank whiskey last night, and also might have had McDonald's. 

Ryan: So, I think looking broadly at the occasions in which your product might be consumed is a, it's a reason why demand spaces has taken off so much in the world is because you can actually ground yourself in the jobs to be done or the usage occasion and stop looking only insularly and say, Hey, when else could they use us?

Ryan: Because let's be honest, particularly with product development, we're trying to drive incremental users, revenue, downloads, subscriptions, whatever the hell you sell, right? Incrementality is at the center of it. So I think that's a really smart, um, example that you mentioned. 

Ryan: Okay, so we've talked about survey design and how important the engagement, the efficiency, the accessibility is.

Ryan: We've talked about leveraging what we already know and how increasingly technology is enabling us to do that. We've talked about consistency of sampling frames so that you have a bigger, smarter, ever growing data asset, but also don't introduce sampling bias or talk to yourself. I want to talk to everybody about...

Ryan: how important a rich, harmonized, queryable first party data asset is to you in a world of gen AI. But before we do that, we have to make sure that the data we're collecting isn't a bot or a liar or a not real person or not representative of the population that you seek to consume. Because if we perpetuate a biased, bot-filled matrix, well, the synthetic data might be accurate of a really shitty foundation that isn't representative of what human beings will do. 

Ryan: So, Jack and Tass, you're, uh, advising somebody who's about to go and set up a survey ecosystem. What are the questions they should be asking and what are the things they should be doing to set their ecosystem up to ensure that the people are real, engaged, and as representative as possible of the audience that they are seeking to represent with their, uh, with their efforts?

Jack: So I mean, look, I think what we're all talking about here really is, is, um, respondent quality to an extent or sample quality. Uh, I think that can be articulated in a few different ways. Um, I think, uh, so there's an initiative going on at the moment. Another ESOMAR one that's kind of the UK side of it too, called Data Quality for All from ESOMAR.

Jack: And then there's a number of UK working groups involved in it. Um, and I think, I think part of the key thing too, is to really think about what you mean by sample quality in the first place. So, for example, there is a difference between, uh, unengaged, generally poor quality, but still doing the survey in good faith, uh, respondents, and then, and then bad actors.

Jack: Um, and I think that's, that's a really important distinction to make, um, because for the former, a lot of it's our fault, we've talked about it earlier, we talked about survey design, I'm not going to go into it again, but if a respondent is, is unengaged and taking a quantitative survey, to me that is okay.

Jack: There are probably, you know, a degree of people in the population that we're trying to represent who are unengaged and don't really care about that thing we're surveying for. So to me, to me, that's, that's, that's, that's kind of okay. What, what isn't okay though is the bad actor stuff. Um, and often it can be quite hard to understand the difference between those two things.

Jack: So for example, if, if someone is going through a survey, and they're answering in good faith the verbatims, you know, they're doing, they're thinking about the scales, they're doing the survey as they should, but maybe in the verbatims are only answering a handful of words, not particularly insightful or useful responses.

Jack: To me that's a valid response. It could be better, sure, we could squeeze more insight out of them, but some of that's on us, you know, are we prompting correctly in the open ends? Are we giving them a shortened up survey? All of that sort of stuff, you know. And that isn't necessarily poor sample quality.

Jack: Sure it sucks that you don't have a, you know, verbose Shakespearean open ender, you know, telling you exactly what your ad was about, what they liked or disliked. But it's not a bad response. However, when there is someone who's coming through your survey, and there are a number of flags for this, there's the traditional stuff, you know, the speeding and the straight lining, when they're whizzing through the survey, pattern responses, you know, when there is repetition, uh, particularly within a respondent, but also across respondents, that's a really big red flag.

Jack: I mean, that's something we've seen so much of in the last two years, actually, in the ecosystem in general. Uh, bad actors who are either, I don't know if they're bot farms, I don't know if they're click farms, I think it's both, if I'm honest with you, um, some human automation, some automation automation.

Jack: Um, but you know, when you see people that have given the exact same response within a survey for the open enders, or more alarmingly than that, when you see multiple respondents and multiple rows giving the exact same answers in the exact same places to something that isn't, you know, brand recall, where you might expect that, um, then clearly those are massive red flags.

Jack: Um, and honestly the sample sources at the moment, I've never seen it so bad. The industry is full of these bad faith respondents or respondents coming from bad faith sources. Um, and to me the leading indicator that any company needs to look at remains the verbatim responses, um, you know, for repetition, but also for just, um, you know, nonsense, gibberish, you know, spam, whatever you want to call it.
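
As a rough illustration of the flags Jack lists (speeding, straight-lining, and the same verbatim turning up across respondents), here is a minimal sketch. The field names (`seconds`, `grid`, `verbatim`), the cutoffs, and the sample rows are all assumptions made for the example; this is not Zappi's implementation or any vendor's API.

```python
from collections import Counter


def normalise(text):
    """Lower-case and collapse whitespace so trivial edits do not hide copies."""
    return " ".join(text.lower().split())


def flag_respondents(responses, median_seconds, speed_cutoff=0.33, min_grid=5):
    """Return {respondent_id: [flags]} for three classic warning signs."""
    # Count each normalised verbatim across all respondents.
    counts = Counter(
        normalise(r["verbatim"]) for r in responses if r["verbatim"].strip()
    )
    flags = {}
    for r in responses:
        found = []
        # Speeding: finished in a small fraction of the median completion time.
        if r["seconds"] < speed_cutoff * median_seconds:
            found.append("speeding")
        # Straight-lining: one identical answer down a reasonably long grid.
        if len(r["grid"]) >= min_grid and len(set(r["grid"])) == 1:
            found.append("straight_lining")
        # Duplicate verbatim: the same open end appears on more than one row.
        if r["verbatim"].strip() and counts[normalise(r["verbatim"])] > 1:
            found.append("duplicate_verbatim")
        flags[r["id"]] = found
    return flags


# Hypothetical rows, invented for the example.
responses = [
    {"id": "r1", "seconds": 310, "grid": [4, 5, 3, 4, 2, 4],
     "verbatim": "Liked the music, not sure what the brand was."},
    {"id": "r2", "seconds": 60, "grid": [3, 3, 3, 3, 3, 3],
     "verbatim": "good product very nice"},
    {"id": "r3", "seconds": 295, "grid": [5, 4, 4, 5, 3, 4],
     "verbatim": "Good product  very nice"},
]

flags = flag_respondents(responses, median_seconds=300)
for respondent_id, found in flags.items():
    print(respondent_id, found)
```

A short but honest open end like r1's raises no flag here, which matches Jack's distinction: brevity alone is not bad faith, while identical text across rows is.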

Tassia: Yes, there's also something I find very clever, which is to detect patterns, because it's very easy to detect straight liners. Uh, but patterns going A, B, A, B, A, B, it's something that's a bit more tricky. And we have a model for that as well. And when we talk about quality and quality checks, um, I think it's important that in your ecosystem, you have a rule of what's a no no.

Tassia: So you have a rule like, okay, this no, and it goes to the bin straight away. But you also have some metrics that are... probability based, and you keep track on them. Because, for example, if someone is saying A, B, and B, and it went A, B, A, B, five times, maybe it's not a pattern, maybe it's like an acceptable answer.

Tassia: Um, so you have some metrics that are okay, this no, and you have some metrics that you monitor, you control, you see how they are doing, and you use them as indicators. 
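
Tassia's split between hard "no" rules and metrics you only monitor can be sketched for the A, B, A, B case. The function names and thresholds below are hypothetical choices for illustration, not the model she refers to: a long grid answered almost entirely in an alternating pattern is rejected, while a short run is only marked for monitoring, since five alternating answers can be legitimate.

```python
def longest_alternating_run(answers):
    """Length of the longest stretch of the form A, B, A, B, ... (A != B)."""
    best = run = min(len(answers), 1)
    for i in range(1, len(answers)):
        if answers[i] == answers[i - 1]:
            run = 1   # same answer twice in a row: not alternating
        elif i >= 2 and answers[i] != answers[i - 2]:
            run = 2   # a new pair starts at the previous answer
        else:
            run += 1  # continues the A, B, A, B, ... pattern
        best = max(best, run)
    return best


def triage(answers, min_length=10, reject_share=0.9, monitor_share=0.5):
    """Hard rule for long patterned grids, soft indicator for everything else."""
    if not answers:
        return "ok"
    share = longest_alternating_run(answers) / len(answers)
    if len(answers) >= min_length and share >= reject_share:
        return "reject"   # the "no-no" rule: straight to the bin
    if share >= monitor_share:
        return "monitor"  # tracked as an indicator, not removed
    return "ok"


print(triage(list("ABABABABABAB")))      # long, fully patterned grid
print(triage(list("ABABA")))             # short run, could be a real answer
print(triage([4, 5, 3, 4, 2, 4, 1, 3]))  # varied answers
```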

Ryan: So we're using a couple of things, right? So we, uh, we're really excited about Research Defender. We like some of the source controls that, um, that Pure Spectrum has.

Ryan: I think part of this is also the diversification of supply. So do you have the right, uh, and sorry everybody, it's 9:45 on a Monday morning, so I'm being more ethnocentric than I would like to be. I live in America and having an audience framework in America that doesn't represent the Hispanic population is insane to me, but the traditional panels haven't actually tapped into that.

Ryan: So you need to have the appropriateness, but then all of the in-survey checks that you were talking about are also key. So bot detection, gibberish detection, pattern response detection. Um, and so these are questions, folks, you should be asking of all of your suppliers. And don't, don't fall behind the "we have our own panel," uh, rhetoric because, um, it's a, it's a, it's a lazy thing to say and a lot of companies do it and it's not because we don't have one.

Ryan: It's just everybody's buying panel. We have more demand than we have supply. Right? So, um. I think this is a really important topic. 

Tassia: We also have to be a bit careful on the language people use. Uh, so for example, if you're detecting gibberish, if you're detecting any words that are not considered English, it doesn't mean it's not a language.

Tassia: It doesn't mean that's not how people talk, because people talk in slang. People make mistakes. People use and speak online in a different way than they would in a conversation. So it's important that we don't cut because of that, you know, it's not because of accent or because of the language people are using to answer the survey.

Jack: I think, I think that's a really important point. Um, I think you need, you need to think about the populations that you're trying to represent a lot of the time, and we don't do good enough at that. Like, you know, we'll just think, oh, nat rep, this is what that means. Or online rep, whatever you want to call it, this is what that means.

Jack: But I mean, particularly when it comes to quality features, you know, we'll sometimes have customers come back to us and say, Hey, there's swearing in this verbatim. How did this respondent get through? Please remove it. And I think you need to have a little bit of nuance about that, right? Like people swear, I mean, if someone is talking about, I don't know, uh, a chicken burger and, and the respondent says, Oh, that hot sauce looks like it fucking rules.

Jack: That's a great response. That's a valid response, right? That's how people talk. There's a distinction though, isn't there? And I think, um, if something is all expletives with no insight, that's the poor quality response. And then there's some really blurry lines that I'm probably the, the least suitable person in the world to talk about when it comes to representation and community and so on and so on.

Jack: You know, when it comes to racism. I mean, racists are people that exist in a population, sadly. 

Ryan: And they sadly buy your product, customers listening. 

Jack: Yeah, yeah, yeah, yeah, absolutely. Like, is, is our job as insights people, I don't know the answer to this, but it's a question we get a lot. You know, is our job as insights people to represent all populations, no matter how reprehensible they are, if they exist?

Jack: Or is our job to whitewash some of that and hide some of that, you know, you can go both ways of it, because obviously, you know, there's, there's certain words are very triggering for certain people and who knows who's looking at the research report, but also should we be sweeping this under the carpet?

Jack: Probably not. And it's, it's a really hard one to get right. I think when you think about representation and research. 

Ryan: It really is, whether it's geopolitical, where you should even do research. I mean, we've been on the right and the wrong side of this. Like, uh, Tass and Jack and I had a series of robust discussions last year about why would we, why would we suspend field work in Russia and not other countries?

Ryan: And so I think having principles is really important here, because otherwise you, and I've been, I've been victim of this, you fall into the trap of your own beliefs dictating whether you want to hear something or not, or whether you think your company should be somewhere or not. And maybe that's okay, right? Or maybe it's not. And I think you should be really thoughtful of what that means for you. And so just because an executive doesn't want to hear something doesn't necessarily mean you should say that that's bad data. It could just be very well representative. 

Ryan: So our last topic, all of what we spoke about is the foundation of the data assets that we all should be getting together and harmonizing.

Ryan: There was a now very famous article by Mark Ritson that came out about a month ago, where Mark used synthetic data, data that was already collected, to predict the same correspondence map for an automotive brand as a representative 500 interview respondent base.

Ryan: And I think people fall into three camps when they read that report. Market research is so great. Um, Holy shit, we're screwed or what an opportunity. I think you two could guess what camp I fall in. I'm in, uh, Bucket 3. How about you two? 

Jack: Uh, it depends when you ask me. Mostly bucket three, but it is terrifying.

Jack: I'm not gonna lie. It's more terrifying, I think, for, I'm less fussed about this notion of, um, all of the insight sector going away and it all being replaced by synthetic data. That clearly isn't going to happen. There'll always be a need to keep whatever model you have fresh. You know, feed more data into it, validate results.

Jack: Maybe you predict with 80 percent accuracy, you finish off that last 20%, you know. There's things like multi armed bandits, which is the idea that you collect as much data as you need to validate a hypothesis. There's a big future for that sort of thing in insights. Um, but I am terrified in another sense, right?

Jack: Not necessarily, oh, my job as an insights professional, you know, in quantitative data is going to go away, but more in the, how on earth are we ever going to fight this from a quality perspective? Are we ever going to know who's real? I think, I think that's the part that scares me. The existential threat is around data collection, not around modeling.

Jack: I mean, modeling can only be a good thing for the industry as a whole in time. I think. 
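
Jack mentions multi-armed bandits only in passing. As a hedged illustration of the general idea, here is a minimal Thompson sampling sketch: each new respondent is shown whichever ad currently looks most promising, so the sample is spent where it is still informative. The ad names and "true" positive rates are invented for the simulation, and Thompson sampling is one common bandit strategy chosen for the example, not something the episode specifies.

```python
import random


def thompson_sampling(true_rates, n_respondents, seed=0):
    """Simulate adaptive allocation of respondents across ads.

    true_rates maps each ad to its (unknown in practice) positive-response
    rate. Returns how many respondents each ad ended up receiving.
    """
    rng = random.Random(seed)
    wins = {arm: 0 for arm in true_rates}
    losses = {arm: 0 for arm in true_rates}
    for _ in range(n_respondents):
        # Draw a plausible rate for each ad from its Beta posterior
        # (uniform prior) and show the ad with the highest draw.
        arm = max(
            true_rates,
            key=lambda a: rng.betavariate(wins[a] + 1, losses[a] + 1),
        )
        # Simulate the respondent's reaction to the chosen ad.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return {arm: wins[arm] + losses[arm] for arm in true_rates}


# Hypothetical ads and rates, invented for the simulation.
pulls = thompson_sampling({"ad_A": 0.2, "ad_B": 0.6}, n_respondents=2000)
print(pulls)
```

In a run like this the weaker ad typically receives only a small share of the interviews, which is the sense in which you collect only as much data as the decision needs.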

Tassia: I think we also need to acknowledge that synthetic data and data that doesn't exist has been around for a long time, but we've called it a different thing. So, for example, if we are replacing missings and we are creating a model to replace missing, that's creating data.

Tassia: If we are fusing two data sets and I know everything about Ryan, but I don't know Ryan's media behavior, but I have his media behavior from a different data set. So I'm going to find, uh, Ryan in my other data set and just fuse the two of them, uh, and create data in that way. Or it can be making predictions.

Tassia: So I know enough about Ryan to predict if he's going to watch the Super Bowl, yes or no. Um, so the synthetic data has always been around, but now the scale of it and the volume of it is a lot bigger, but also what we can get is a lot more, uh, but yeah, I think there's no reason to be too scared about it.

Tassia: It has been around for a long time, so it's nothing new. 
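
The data fusion Tassia describes can be sketched as simple donor matching: a respondent in one data set borrows a missing variable from the most similar respondent in another. The variables, the matching rule (count of exact matches), and the rows below are assumptions made for the example; real fusion models are considerably more careful about how similarity is defined.

```python
def fuse(recipients, donors, match_keys, donate_key):
    """Copy donate_key from the most similar donor onto each recipient."""
    fused = []
    for person in recipients:
        # Similarity here is just the number of matching attributes.
        donor = max(
            donors,
            key=lambda d: sum(d[k] == person[k] for k in match_keys),
        )
        # Mark the borrowed value so it is never mistaken for observed data.
        fused.append({**person, donate_key: donor[donate_key], "imputed": True})
    return fused


# Hypothetical rows, invented for the example.
survey = [
    {"id": "s1", "age_band": "35-44", "region": "Northeast", "has_kids": True},
]
media_panel = [
    {"age_band": "18-24", "region": "West", "has_kids": False, "weekly_tv_hours": 4},
    {"age_band": "35-44", "region": "Northeast", "has_kids": True, "weekly_tv_hours": 11},
    {"age_band": "35-44", "region": "South", "has_kids": False, "weekly_tv_hours": 7},
]

fused = fuse(
    survey,
    media_panel,
    match_keys=["age_band", "region", "has_kids"],
    donate_key="weekly_tv_hours",
)
print(fused[0])
```

The `imputed` marker is a design choice worth keeping in any version of this: created data is useful, but it should stay distinguishable from what a person actually told you.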

Ryan: Okay, so my colleagues are, uh, researchers, so they both hedged a bit. Let me tell you why I'm an, an unequivocal bucket three. This is a great opportunity. Um, I think with the rise of generative AI, the type of information that we collect doesn't need to be radio buttons anymore.

Ryan: We can just as easily codify and cut across and synthesize rich verbatim responses, video responses as we can scale questions. And I think that is such an immense opportunity for the types of dialogues we have with customers when we're learning from them. I think it puts a strategic imperative on making sure that your data set is yours, that it's harmonized, that you can access it and query it.

Ryan: I also think it puts another impetus to make sure that it actually represents the population you seek, because if you just play around with ChatGPT, it's a patriarchy in that fricking machine, right? So, so make sure that your data set isn't just propelling today, yesterday's matrix, and is actually learning new information.

Ryan: And I forget the technical phrase that Jack used, but the ability to only go ask what I need to know is such an amazing opportunity to optimize the entire business model of our industry, because we have a cost per contact incidence LOI model. That's how all the unit economics of everything works. What happens if we just need to ask two questions of Tass?

Ryan: But we already know everything else. What can we do then? We could chat with her. We could be where she's at um, and, and this is all really important to me because I think particularly given PII.

Ryan: Rich first party data is cool again, but we've just spent 54 minutes talking to you all about sample quality, not the opportunity of rich first party attitudinal data that can be queried, that can be prompted, that can continuously get smart, because if the flour we bake the bread on is not good, the bread's gonna taste like shit.

Ryan: But at a time that we're in right now, we have such an opportunity to make the voice of the consumer everywhere using simple chat as long as we get the basics right. Um, that was my TED talk. Thank you all for listening to it. What am I missing, you two? 

Jack: I think I agree with you broadly. I really do. Um, I just, I don't know.

Jack: Uh, I'm conscious of a few things. I think I'm conscious of, uh, I think you alluded to earlier, but the biases that exist within LLMs. And if we are asking an LLM to do a job on, on our data, it is still using its biases for the job it is executing. I'm also conscious of, um, this regression to the mean effect.

Jack: My, my fear I think is, um, there is a, uh, a problem within LLMs where they're not creative, right? It fundamentally, it is just auto complete, right? It's just on your phone. You touch something on your keyboard.

Jack: It's predicting the next word. That's all it is. There's no creative energy here. There's no like, um, looking at what we know from groups. And tailoring something to them. So I think, I think, for me it is, yes, it's a useful starting point. And it can maybe be used for ideation and getting you to a certain place.

Jack: But until you ingest that data from outside of the LLM's own little world, it's, it's fundamentally useless to me. So I wouldn't want anyone to ever think that LLMs could ever be a replacement for, um, optimization and, and, having good ideas fundamentally. Those are two slightly different things, I guess.

Ryan: 100%. Sorry, I'm all, I'm all fired up. There's two points I need to make here because I completely agree with you. 

Ryan: If we don't optimize the data, welcome to the era of homogenized marketing. Shoot me. The, the fucking ads on TV are already terrible enough. So homogenized marketing can rise, but if anybody, so I do disagree with one subtle thing you said.

Ryan: I've been playing a lot with prompts, and I can get that bot to sound, so I played this with the Gaza conflict, explain to me what's happening with Palestinian eyes, explain to me what's happening with British eyes, explain to me what's happening with American eyes, explain to me what's happening with Israeli eyes, and I've got, obviously, very, very different responses.

Ryan: But I think the prompting is gonna get more and more profound. Um, but the thing you said, the reason I interrupted you, is, the reason I'm firmly a bucket three is all this is, is a force multiplier of creativity, innovation, intuition, all the things that make humans the answer. This is just an enabler. 

Jack: There was only one time that an LLM could be trained off of human only data, right? And that was before the first LLM came out. Now LLMs are out there in the wild, the internet is full of responses generated by them. So I agree with you to an extent, but I think it just gets worse and worse and we get more regression to the mean.

Jack: And over time, I think actually in an increasingly AI content driven world, uh, first party data is going to be king. 

Tassia: The way I see this is as being a great baseline. So, okay, you created this ad and it has a dog and it's Christmas and a family and bread on the table, you know, there's this element of your ad, what's the baseline?

Tassia: So, predict it, what should it perform, and then you do a survey to almost calibrate it. So instead of using the survey only, you already know the information that you have and all these elements that you have in your advert and you use it to calibrate and see how you compare. Um, so yeah, that's how, how I think I would see this going as a baseline, for something that can be built on.

Ryan: Yeah, the opportunity to raise that baseline is to make sure your data set is credible. Because that is the, that is from an insights, INA standpoint, the differentiator between your soda company and your competitors from a competitive advantage standpoint with, as it relates to consumer intelligence is your first party data.

Ryan: And the robustness, the representativeness, and the, the harmonization and query ability of that. LLMs are the commodity, not the data. And that's the thing I think we oftentimes forget, right? It's like, well, the LLM is, if I, if I use LLMs, then I'm gonna win. No. Well, my son is eight years old and he can use ChatGPT.

Ryan: It's completely irrelevant. It's what's inside of it that, that I think will really help. And that's why I'm so excited about the promise of first party data. 

Jack: I agree. I mean, we've been, we've been running linear regressions on our own data, you know, in this industry for, you know, decades. Um, LLMs are just a new kind of linear regression, right?

Jack: It's still all about the fundamental underlying data. It's just a more sophisticated way of analyzing that data. Um, which comes with a whole load of baggage around biases and I'm not even going to get into that. 

Ryan: Well, Tassia, Jack, this has been riveting. Thank you. Great way to start the week. Uh, thank you everybody for listening.

Ryan: If you have any questions, get at us. Our names are our first name dot our last name at zappistore.com. We'd be happy to talk to you more. Thank you both.

[Music transition to outro]


Ryan: Alright, so this is the part where Patricia usually recaps the episode. But she's not here, so we're not going to recap the episode because it was just wonderful. Jack and Tassia, I appreciate you. Very happy to be on your team. Um, and to all of our partners that we work with around the globe to help us design better surveys, to push us to put the right stimulus in front of people, to get access to the right populations with the right representation, we appreciate you. 

Ryan: We're going to keep working together. This is a holistic supply chain issue. And our season finale is coming up, Kelsey. 

Kelsey: I can't believe it. 

Ryan: I know, it's kind of crazy. So we love doing this podcast, but we also love the season finale. I'll tell you why, because we get a little break. And this is a unique kind of break, because it's about to be the holidays, we get a few weeks off, we can chill, we can put our feet up.

Ryan: But so our season finale is with Kate Schardt from Pepsi. Kate runs all things insights back transformation at PepsiCo.

Ryan: I've had the pleasure of really transforming the way insights is done, not just at PepsiCo, but also through partnership collaboration, sharing with many other brands over the course of the last, jeez, seven years now. Um, and I'm really excited for this conversation. We're, we're at the next frontier of innovation.

Ryan: We're going to share some of the things that we've learned and, and some of the resilience required to get a benefit of really improving the ads you develop, the innovations you develop, the, the impact of the insights people you have. So make sure you don't miss the episode with Kate. We'll be back soon.

Ryan: Thanks everybody. 

Kelsey: Bye.