How to use cluster analysis?

I'm conducting a study on the characteristics of TB patients in terms of their treatment outcome (successful or unsuccessful). My independent variables include sociodemographic and economic factors (age, gender, income, civil status, etc.) of the patients. Is it possible to use cluster analysis for this? Can cluster analysis identify the similarities among patients with a successful outcome and among patients with an unsuccessful outcome, based on the dependent variable? I'm confused between cluster analysis and discriminant analysis. Please help. Thanks.

Specializes in Nursing Professional Development.

That sounds more like discriminant analysis to me. In discriminant analysis, your dependent variable is categorical (e.g., successful vs. unsuccessful), and you build an equation from the independent variables that predicts which patients will fall into one category or the other.
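To make "building the equation" concrete, here is a minimal sketch of a two-group linear discriminant in plain Python. The patient values, the two variables (age and monthly income), and the equal-priors cutoff are all assumptions made up for illustration; a real analysis would use a statistics package and check the method's assumptions.

```python
# Toy, made-up data: (age in years, monthly income in thousands). Not real patients.
successful = [(25, 30), (30, 35), (28, 40), (35, 38)]
unsuccessful = [(50, 12), (55, 15), (60, 10), (48, 18)]

def mean(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    """2x2 within-group scatter matrix (sums of squares and cross-products)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

m1, m0 = mean(successful), mean(unsuccessful)
s1, s0 = scatter(successful, m1), scatter(unsuccessful, m0)
sw = [[s1[i][j] + s0[i][j] for j in range(2)] for i in range(2)]  # pooled scatter

# Discriminant weights: w = inverse(pooled scatter) * (difference of group means)
det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
inv = [[sw[1][1] / det, -sw[0][1] / det],
       [-sw[1][0] / det, sw[0][0] / det]]
diff = [m1[0] - m0[0], m1[1] - m0[1]]
w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
     inv[1][0] * diff[0] + inv[1][1] * diff[1]]

# Cutoff halfway between the two group means (assumes equal prior probabilities)
cut = sum(w[j] * (m1[j] + m0[j]) / 2 for j in range(2))

def predict(patient):
    """Classify a patient from the discriminant score w[0]*age + w[1]*income."""
    score = w[0] * patient[0] + w[1] * patient[1]
    return "successful" if score > cut else "unsuccessful"

print("weights (age, income):", w)
print(predict((27, 36)), predict((58, 11)))
```

The key point is that the known outcome is used to build the equation: the two groups are supplied up front, and the weights show which way each variable pushes the prediction. The weights here are unstandardized, so their sizes are not directly comparable across variables measured in different units.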

In cluster analysis, you are trying to discover groups (clusters) of similar cases from the variables themselves. The groups are not pre-defined by a categorical dependent variable.

I hope that helps clarify the distinction in your mind.

So in order to use cluster analysis, the dependent variable should not be categorical? My research problem is to determine the similarities and characteristics of MDR-TB patients in terms of treatment outcomes. Isn't it possible to cluster the characteristics of people with a successful outcome and those with an unsuccessful outcome?

Correct me if I'm wrong, but this is what I understand about cluster analysis. A cluster analysis has no known groups or classification; in my case, the successful/unsuccessful outcome would be treated as unknown. So to perform cluster analysis on my research problem, I would enter all the patients' data (independent variables) into the software, regardless of whether their treatment outcome was successful or unsuccessful, and once the clusters have been formed, it is at the researcher's discretion to decide which clusters correspond to successful outcomes and which to unsuccessful ones?
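The workflow described above can be sketched roughly as follows, in plain Python: cluster the patients on the independent variables only, then cross-tabulate the resulting clusters against the outcome that was held back. The records, the choice of two clusters, and the fixed starting centers are assumptions made up for illustration, and the simple k-means routine stands in for whatever clustering method the software would actually use.

```python
from collections import Counter

# Toy, made-up records: (age, monthly income in thousands, outcome). Not real data.
patients = [
    (25, 30, "successful"), (30, 35, "successful"), (28, 40, "successful"),
    (35, 38, "successful"), (50, 12, "unsuccessful"), (55, 15, "unsuccessful"),
    (60, 10, "unsuccessful"), (48, 18, "unsuccessful"),
]

# The outcome is deliberately left out of what gets clustered.
features = [(age, income) for age, income, _ in patients]

def standardize(rows):
    """Convert each variable to z-scores so no variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5
           for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(rows, centers, iters=20):
    """Basic k-means: assign each row to its nearest center, then update centers."""
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda k: dist2(r, centers[k]))
                  for r in rows]
        new_centers = []
        for k in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

z = standardize(features)
labels = kmeans(z, centers=[z[0], z[-1]])  # fixed starting centers for repeatability

# Only now bring the outcome back in, to see how clusters line up with it.
crosstab = Counter((lab, outcome) for lab, (_, _, outcome) in zip(labels, patients))
for (cluster, outcome), count in sorted(crosstab.items()):
    print(f"cluster {cluster}: {count} {outcome}")
```

This toy data was constructed so the two clusters match the outcomes exactly. With real patients, nothing guarantees that: the clusters reflect whatever structure exists in the sociodemographic variables, which may have little to do with treatment outcome.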

Specializes in Nursing Professional Development.

It seems to me you need to sit down with your advisor (and/or the stats or methodology person on your dissertation committee) to sort this out. Or, if it is just a hypothetical study for a single class, sit down with the professor. In many situations, there is no one right way to approach an issue, and it may be that you need to reconceptualize your question to align it with a statistic that will make sense to you. It sounds like you are in the stage of "playing around with possibilities" ... and that's OK. But to help you sort it out and make a firm decision, you should be talking with your faculty.

Cluster analysis is advanced stuff, and I have very limited experience with it. You need to be talking with someone who uses it (and other multivariate techniques) on a regular basis. Based on my understanding of cluster analysis, it does not seem to be a good match for you, as your fundamental question is to compare the characteristics of one known group (successful) with the characteristics of another known group (unsuccessful). To me, discriminant analysis seems a better fit because it would allow you to identify and weigh the characteristics that predict success or failure. But someone with more expertise might be able to help you shape your study so that cluster analysis would be a good fit.
