The general principles are applicable in any software. However, I'll show you things in Displayr. You can do almost all of this in Q as well, with one exception.
There are two approaches to segmentation. You can use judgment to create segments, which I'm not discussing today, or you can use algorithms combined with judgment. That's my focus today.
If you have any questions, please type them into the questions field in GoToWebinar, and I'll go through them at the end.
Once you've collected your data, there are four and sometimes as many as six steps for creating a good segmentation. I'll give you a moment to review the steps. As you can see, the first step is preparing your data.
How you do your data prep depends on two things. First, what your data looks like and, second, what algorithm you'll use to form your segments.
I'll give you a moment to have a quick read about data prep.
I have three satisfaction variables that I wanna use in my segmentation. Currently, these are ordered categories. I need to convert them to numeric, and how you do that depends on what program you use.
In Displayr and Q, we need to explicitly change the structure of the data. We see here that higher values are associated with higher ratings, which makes sense, but I need to set "not answered" as a missing value so it's excluded from analyses.
Now we have numbers.
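If you're doing this step outside Displayr or Q, a minimal Python sketch of the same idea looks like the following. The five satisfaction labels and the function name are assumptions made up for illustration, so use whatever labels your own data has.

```python
# Hypothetical 5-point satisfaction labels; swap in the labels from your own survey.
SATISFACTION_SCALE = {
    "Very dissatisfied": 1,
    "Dissatisfied": 2,
    "Neither": 3,
    "Satisfied": 4,
    "Very satisfied": 5,
}

def to_numeric(label):
    """Return the rating as a number, or None (missing) for anything off the scale."""
    # "Not answered" is not in the mapping, so it becomes None and can be excluded later.
    return SATISFACTION_SCALE.get(label)

responses = ["Satisfied", "Not answered", "Very dissatisfied"]
numeric = [to_numeric(r) for r in responses]
print(numeric)
```

Higher numbers still mean higher ratings, and the unanswered response ends up as a missing value rather than as a number.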
This text data here shows why people like their cell phone company. It's from an open ended question. So how do we convert this into numbers? The best answer for segmentation is usually to use a special form of principal components analysis that we invented here at Displayr just for text data. This takes a minute to compute, so I ran it beforehand.
Let's take a look at the results on the next page.
I'm just going to show you how to do it and not explain the theory. By default, Displayr will convert the text into two numeric variables. If we want more granular information, we can change the number of components. We save these two variables to the dataset. Note that a new set of variables has appeared.
The first of these variables measures whether people said they liked nothing. And the second component measures if they said service or not. Okay. So now we've converted even text data into numbers.
Having prepared the data, we now need to move on and create some segments. Let's see what our options are. As you can see, there are lots of segmentation tools in Displayr. There are specialty tools for conjoint and MaxDiff and general purpose clustering tools like latent class analysis, hierarchical cluster analysis, and k-means cluster analysis. We'll start by creating four segments using latent class analysis.
Why four?
It is by far the most common number of segments that people end up using, so it's usually a good place to start. You may also be asking, what's latent class analysis? Why not use k-means cluster analysis? Sure, you can use k-means, but it's just not as good as latent class. If you want to use k-means, there are many more technical things that you need to do, and even then, you won't get something better. It's like most people trying to fix their cars themselves. It'll take them more time, and there's a good chance they'll make a mistake.
Latent class analysis is like using an expert and trustworthy car mechanic. So, again, let's start with latent class analysis.
I'm going to choose three sets of variables: the variables from the text analysis that we just created, the satisfaction variables, and importance scores from a MaxDiff.
By default, Displayr will automatically choose the number of segments using something called a Bayesian information criterion.
This may sound fancy, but algorithms that automatically select the number of segments are often fool's gold. So we'll manually change the number of segments to four. This output you see here is admittedly not great, so we'll use something better. Note that a segment variable is automatically added to the dataset when you run a latent class analysis, and you can see that variable at the bottom of the dataset.
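For reference, the standard definition of the Bayesian information criterion is the number of estimated parameters times the log of the sample size, minus twice the log-likelihood, with lower values preferred. Here's a small Python sketch of that formula. The log-likelihoods and parameter counts are invented numbers for illustration, not output from Displayr, and this isn't meant to show how Displayr computes it internally.

```python
import math

def bic(log_likelihood, n_parameters, n_respondents):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_parameters * math.log(n_respondents) - 2.0 * log_likelihood

# Invented (log-likelihood, parameter count) pairs for 2 to 5 segments, n = 500.
candidates = {2: (-4200.0, 21), 3: (-4100.0, 32), 4: (-4060.0, 43), 5: (-4045.0, 54)}

scores = {k: bic(ll, p, 500) for k, (ll, p) in candidates.items()}

# An automatic rule simply takes the number of segments with the smallest score.
best_by_bic = min(scores, key=scores.get)
print(best_by_bic, round(scores[best_by_bic], 1))
```

The automatic choice is nothing more than the smallest score, which is why it's worth checking it against your own judgment rather than taking it on faith.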
Displayr has a special type of table just for segmentation. First, I choose the segment variable. And I'm going to compare the segments using the variables we used to create the segmentation.
Okay. Let's start evaluating the segmentation. Our segments vary from nine percent to forty nine percent. I'd rather not have half the sample in one segment, but it's not a deal breaker. The bars show us how different each segment is from the other segments. Blue means higher, and red means lower.
Let's start with segment one. Who are these people? In the text data, they were more likely to have liked nothing and didn't like service. The satisfaction data is consistent with this. This first segment is people who are just unhappy. Now the importance data should show us what they want, and segment one is basically average. Nothing is higher or lower relative to the other three segments. So we know they're unhappy, but we don't know what to offer them to make them happy. Such a segment isn't going to help us work out a strategy. So from my perspective, this segmentation is no good, and I need to tweak it. So the question is, how should we tweak it? The results seem to suggest that the satisfaction data doesn't tell the same story as the text data about what people want. We're trying to force together two bits of data that are incompatible. So we'll remove the satisfaction data and text data from the latent class analysis and rerun it. Note that our segment sizes are still not ideal but have improved a bit.
Now even though we're no longer using the text and satisfaction data to create the segments, we're still getting some differences between the segments, which is good. So who is segment one now? The thing that's most important to them is streaming speed followed by mobile hotspot data allowance. I'll call them data hogs.
Let's look at segment two and be a bit careful here. Their highest score is actually for price, but it's not significant since price is high for all segments. Coverage, voice quality, and unlimited calls in the US are the really high scores that distinguish segment two. I'll call this segment big talkers.
Now let's look at segment three. Price is their clear number one. Even though forty five percent of the market is in this segment, they've got an average score of ninety seven out of a possible one hundred for price. They also have a relatively high score for voice quality, not as high as the big talkers segment but higher than the other two segments.
I'll call segment three cheap talkers. Segment four has its highest score for unlimited calls to Canada and Mexico. However, it's not a super high score. The other segments all have high scores in the eighties and nineties. This one's highest score is just sixty eight. This segment wants it all. It's fully connected, so I'll name it accordingly.
Okay. Let's review the other evaluation criteria. The segments don't seem to differ due to response biases. I'll return to this criterion in the second case study, but we're good here. Sometimes you just can't figure out how to name segments, and that's always a bad sign. We're also good here, however. The strategic implications of these segments are pretty obvious. For example, we target cheap talkers with inexpensive plans, very inexpensive plans, and data hogs with unlimited data.
Now we need to see if the segments are related to demographics. Our big talkers are more likely to be female, while the fully connected segment is more likely to be male.
And fully connected and big talkers skew much younger. So this is making a lot of sense. Now if we had more time, we'd review other differences. The key thing is our segments are clearly and strongly related to demographics, which is good.
The last criterion that's sometimes applicable is whether we can build a useful predictive model. There are three options here, and I'll give you a moment to read them.
Like it says, you can skip this step most of the time, but let's see how to do it.
My go to predictive model is always a random forest. It's kind of idiot proof, so it's a good fit.
We'll predict our segments using demographic data as predictors. As you can see here, our predictive accuracy is forty nine percent, and we can predict segments one and four. Whether this is a good model depends a lot on context.
If we were using it for buying Facebook ads or something, it may be fine. But if we were wanting to allocate customers into segments, we'd usually want at least eighty percent accuracy. So if we needed a predictive model, we've got a problem. What we need to do is change our variables in the model until its predictive accuracy improves.
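To make the accuracy figures concrete, here's a minimal Python sketch of how overall accuracy and accuracy within each segment can be computed from actual and predicted segment labels. The function names and the six example respondents are made up for illustration; they aren't the webinar's data or Displayr's output.

```python
from collections import Counter

def overall_accuracy(actual, predicted):
    """Share of respondents whose predicted segment matches their actual segment."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)

def accuracy_by_segment(actual, predicted):
    """For each actual segment, the share of its members predicted correctly."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {seg: hits[seg] / totals[seg] for seg in sorted(totals)}

# Made-up segment labels for six respondents, purely for illustration.
actual = [1, 1, 2, 3, 4, 4]
predicted = [1, 1, 3, 2, 4, 1]

print(overall_accuracy(actual, predicted))
print(accuracy_by_segment(actual, predicted))
```

Looking at accuracy segment by segment is what tells you which segments the model can actually find, which the single overall number hides.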
Some people like to put demographics like age into the latent class analysis itself, but this is usually a terrible idea. You'll end up with segments that are easy to predict, but the segments are blah and really hard to interpret correctly.
Another approach is to predict based on the variables used to create the segments.
Not surprisingly, this improves the predictive model a lot. Now sometimes people want to implement their segmentation by asking golden questions. These are a small number of questions with which you can confidently predict segment membership. The basic idea is you leave out variables until the predictive accuracy drops a lot.
Note that the out of bag sample accuracy is eighty nine point seven percent. So we'll start by removing the least good predictor, premium entertainment, and you'll see that makes the model even better.
Now we'll remove mobile hotspot data allowance and a few more predictors, so we're left with just four. So if we wanted golden questions, we could do okay with just asking about price, coverage, friendly customer support, and streaming speed. But a little technical point: you often introduce a lot of error when you write new golden questions, as people inevitably interpret them differently to the way they were originally asked. It's just something to keep in mind.
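The leave-out-variables idea can be sketched as a simple backward elimination loop. This is a rough Python illustration under stated assumptions, not what Displayr does: `accuracy_of` is a hypothetical callable standing in for refitting the predictive model on a set of predictors and reading off its accuracy, `max_drop` is an arbitrary tolerance, and the question names and weights are invented toy numbers.

```python
def backward_eliminate(features, accuracy_of, max_drop=0.05):
    """Greedily drop predictors while accuracy stays within max_drop of the full model."""
    kept = list(features)
    baseline = accuracy_of(kept)
    while len(kept) > 1:
        # Try removing each remaining predictor; keep the removal that hurts least.
        trials = [(accuracy_of([f for f in kept if f != drop]), drop) for drop in kept]
        best_accuracy, drop = max(trials)
        if baseline - best_accuracy > max_drop:
            break  # accuracy would fall too far, so stop here
        kept.remove(drop)
    return kept

# Toy stand-in for refitting a model: invented weights, not real results.
TOY_WEIGHTS = {"q1": 0.30, "q2": 0.15, "q3": 0.12, "q4": 0.06, "q5": 0.01, "q6": 0.005}

def toy_accuracy(features):
    return 0.25 + sum(TOY_WEIGHTS[f] for f in features)

golden = backward_eliminate(list(TOY_WEIGHTS), toy_accuracy)
print(golden)
```

In practice the scoring function would refit the model each time and use a held-out or out-of-bag accuracy, and you'd still review the surviving questions by judgment before fielding them.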
We've created two segmentations so far. We created one with just the importance data and another with the satisfaction and text data added. But we can create many more segmentations and should. For example, we can change the number of segments, exploring anywhere from two to eight segments. We can also use other algorithms like k-means instead of latent class. And there are some more exotic things we can do as well.
The secret to segmentation is really that you should do all these things. Keep trying different things to generate new segmentations, and then evaluate them and choose what's best in your judgment. I suggest creating a table like you see here to keep track of all the segmentations you run. The more diligent and systematic you are, the better your odds of winning.
Remember, it's not about finding the best statistical solution. It's about finding the best strategic solution, and trial and error is the best way to get there. We've reviewed five of the six possible steps for performing a segmentation. The last step is to create a segment typing tool, which is optional.
The two main use cases of a typing tool are when recruiting people for qualitative research, where you want to know what segment somebody is in, and for sales, where you want to put customers into segments. It's easy in Displayr, but you can't do this last step in Q. I'm going to add gender and age to this model to make it a bit more interesting.
Now I just click create simulator. We then enter the data.
The initial default person is predicted to be in segment four. Let's increase their price importance to one hundred, and let's make them an older female. You can see such a person would be in segment three.
You get the idea.
We'll now do a quick second case study and start by adding another dataset. This is a fun dataset provided by the good folks at Green Book. It's about how people choose market research companies. You can see importance ratings for various attributes in this table here. Like before, I need to make these importance ratings numeric.
That's it for data prep, so let's move on to step two, which is creating four segments using latent class analysis. Now I did say I'd start with four segments, but I'm actually gonna start with just two segments to make a point.
Now let's take a look and evaluate.
Once the table generates, I'll make some quick formatting changes just to make it easier to read.
The segment sizes are good.
We can see that segment two says just about everything is important, while segment one has lower importance scores for almost everything. This could be because segment two is just more involved. But a more likely explanation is that we have a yea-saying bias, and segment two is just people who are more likely in general to select five on a five point scale. Fortunately, there's an easy fix for this. What we do is scale each respondent's data so that they have an average of zero for everything. This is easy to do. Now we'll use these scaled ratings in the latent class analysis.
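The scaling itself is simple enough to sketch in a few lines of Python: subtract each respondent's own average from each of their ratings, skipping anything that's missing. The function name and the example respondents are assumptions for illustration, and this only centers the data; it doesn't claim to reproduce every detail of what Displayr does.

```python
def center_within_respondent(ratings):
    """Subtract a respondent's own mean from each rating; None marks missing data."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        return list(ratings)  # nothing answered, nothing to scale
    mean = sum(answered) / len(answered)
    return [None if r is None else r - mean for r in ratings]

# A made-up yea-sayer (mostly fives) and a made-up respondent who uses the full scale.
yea_sayer = center_within_respondent([5, 5, 4, None, 5])
spread_out = center_within_respondent([1, 2, 3])

print(yea_sayer)
print(spread_out)
```

After centering, a respondent who ticks five for nearly everything no longer looks like they care more about everything. What's left is which attributes they rated above or below their own average.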
Just give it a second to update the table.
There we go.
Note that I'm still evaluating the segmentation using the original ratings data, not the scaled ratings data. I always recommend doing this. When we transform data, we make assumptions. By only using the original data in the reporting, we validate that the assumptions were appropriate.
As you can now see, we have fewer differences between the segments, but they're more useful. Segment one cares more about price and previous experience. By contrast, segment two cares more about sophisticated research technologies or strategies, data quality, and a few more things. Obviously, we're not finished. We should continue to carefully evaluate the segments using the remaining criteria that we discussed earlier. We're going to stop here for now, however.
Later this week, we'll email you a recording of this webinar and a link to this document. And for those of you that love detail, there's a lot in our segmentation ebook that you can download on our website.
Now let's move on to your questions and answer as many of them as possible in the remaining time. Just give me a few seconds to switch gears and review the questions.
Okay. So Kyle asked if everyone will have access to a recording of the webinar, and the answer is yes. So keep an eye out for an email in your inbox, and, yeah, that'll have the recording link and also a link to the document that you see here.
So Tadishi asked about the predictive model, so let me go there. So, again, Tadishi's asking, is this predictive model essentially how you would build a typing tool in Displayr? And the answer is yes. It is one way you can build a typing tool in Displayr. There is another way that you can do so. And, Tadishi, I will follow up and send you the help center article that explains how else you can create a typing tool.
Rebecca asks, are we able to download the allocator tool so you can give it to the client so they can use it on their own? Rebecca, you can't. The answer is you can't download it, but you definitely can share the allocator tool, the typing tool. So what you would wanna do is just, you know, publish the typing tool or simulator, as a dashboard, and then you can share the link to the dashboard with your client, and then they're free to use it themselves. Give me a second.
So, related to Tadishi's question, Matt asked, is it possible to take the simulator out of Displayr or to work offline?
Unfortunately not, Matt. I mean, there is a way. You could recreate it in Excel, but, yeah, you can't just, like, import it into Excel.
Okay. So Scott Watson asked, why do you say that latent class analysis is better than k means? Great question. Let's take a look.
Okay. And the short answer, Scott, is there are just a lot more hoops you need to jump through when doing k-means cluster analysis. There are a lot of things that you need to do correctly to do a good k-means cluster analysis that you just don't need to worry about with latent class analysis.
From my own experience too, latent class analysis is just more flexible and powerful and can handle different types of data. Whereas with k-means cluster analysis, my understanding is that it's limited in terms of the type of data that you can feed into it. It can't handle a mix of different types of data like latent class analysis can.
We keep getting some great questions.
So someone's asking how to use segments to respond to sentiment analysis. You could certainly perform a sentiment analysis and see if there are any differences across the segments in their sentiments, but, you know, I think that's largely up to you.
Okay. So Raven's asking, what is an acceptable accuracy when evaluating the model? And, yeah, we recommend that you aim for at least eighty percent accuracy.
So you can see here, you know, after adding age and gender to the predictive model, if you round up, we're at eighty percent. But, yeah, at least eighty percent sample accuracy should be your target, Raymond.
Oh, great question. So Matt's asking, if you have missing data already, should you still code don't knows as missing? Matt, yes. The answer is I would do that. But you could also try both. Leave the don't knows in the analysis and see how the segmentation looks, then exclude them by marking them as missing and see what that looks like. And, again, you know, segmentation is just as much an art as it is a science. Use your judgment to see what gives you the best segmentation.
Someone asked, if we run different scenarios, let's say, you know, we tried four segments, five segments, six segments, how to choose the best? And the answer is there's no right or wrong. It is a judgment call. That's the art of segmentation. Regardless of how many segments are in the various scenarios you've run, you need to pick the one that you think resonates. Which one has the best strategic implications?
Which has the best distribution of segment sizes? So, again, you know, there's no definitive answer when picking the best segmentation. It's a judgment call. The one thing I would say is that four segments is by far and away, again, the most common number of segments that people land on, and I do think it's important to take care that you don't have too many segments. Okay? Sometimes you do need a lot of segments. It's just necessary. That's what the data is really telling you. But from my own experience, when you have a lot of segments, you do run the risk of the segmentation being really difficult to digest. It's hard to remember what defines six plus segments, for example. So, you know, that's a judgment call as well. Okay.
Someone asked, can you weight questions in Displayr? And the answer is yes. It's very easy to do so.
Someone asked, is random forest behind the typing tool as well? And the answer is yes. So you can see here the random forest that I ran. And when I select that, once its options appear in the object inspector right here, you can see you actually have an option to create the simulator, in other words, the typing tool, directly from the random forest.
Someone says, I rarely have a lot of complete cases in survey data. Can latent class analysis handle that? The answer is yes. That's one of the benefits of latent class analysis. Returning to our segment comparison table like you see here.
Someone asked, you know, what do the red and blue shading refer to? What does it mean? Okay?
So the blue and red shading calls out what distinguishes the segments from all the other segments. So you can see here the blue shading or highlighting indicates things that are more important to a particular segment compared to the other segments.
And red points out things that are less important to a particular segment relative to the other segments. Just give me a second. Just finding, the other questions I haven't answered.
So Scott asked, can the allocation tool allocate multiple respondents at once?
The answer is no, Scott. You do have to enter, you know, each respondent individually.
Okay. Let me see.
Yeah. So someone asked right here, again, piggybacking off the you know, continuing to talk about the segment comparison table.
So, you know, what are these values that we see here? And these are the average values for each segment.
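As a rough illustration of what sits behind a table like this, here's a Python sketch that computes the average for each segment and how far it sits from everyone outside that segment. The function names and the four respondents are made up, and comparing raw differences is an assumption for illustration; it isn't the statistical test Displayr uses to decide what to shade blue or red.

```python
def segment_means(segments, values):
    """Average value per segment, ignoring missing (None) values."""
    pairs = [(s, v) for s, v in zip(segments, values) if v is not None]
    means = {}
    for seg in sorted({s for s, _ in pairs}):
        inside = [v for s, v in pairs if s == seg]
        means[seg] = sum(inside) / len(inside)
    return means

def versus_rest(segments, values):
    """Each segment's mean minus the mean of all respondents in other segments."""
    pairs = [(s, v) for s, v in zip(segments, values) if v is not None]
    gaps = {}
    for seg in sorted({s for s, _ in pairs}):
        inside = [v for s, v in pairs if s == seg]
        outside = [v for s, v in pairs if s != seg]
        if outside:
            gaps[seg] = sum(inside) / len(inside) - sum(outside) / len(outside)
    return gaps

# Made-up importance scores for four respondents in two segments.
segments = [1, 1, 2, 2]
price_importance = [90, 100, 40, 60]

print(segment_means(segments, price_importance))
print(versus_rest(segments, price_importance))
```

A positive gap corresponds to the "higher than the other segments" direction and a negative gap to the "lower" direction.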
Alright. So we got a good question here just asking about the standardization that I did, how I got the scaled ratings that I showed in the second case study, you know, how that works. So happy to show you, Mark, how that works. Let me go to that. So what's going on here is we can look at the original importance ratings in the data editor.
You can see, you know, a bunch of ratings, one to five. And right now, I have the labels applied, but we can also look at just the values.
And what we're doing with the scaled ratings is standardizing or scaling those one to five so that they have an average of zero. And we'll look at those in the data editor. So what it means is that if we go ahead and create a basic summary table. There we go.
So you can see here we do have a lot of missing data. But if you were to average out the ratings for each respondent, the average rating for each respondent would be zero.
So that's what we're doing when we're, again, scaling or standardizing the importance ratings within each respondent or each case or record.
So I see we're now at the end of our time. Just wanna thank everyone for joining today. And for those of you that asked questions that I didn't answer, just keep an eye out for an email from me, and I will follow up and answer your questions via email.