--- title: "oneclick" author: "Tingying He, Petra Isenberg" output: html_document date: '2022-07-07' --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` # Summary ## EFA #Preparation ## Loading Required Packages ```{r} library(png) library(psych) library(EFA.dimensions) library(imager) #install XQuartz library(corrplot) library(knitr) library(kableExtra) #choose “no” when installing library(xtable) library(dplyr) library(tibble) library(ggplot2) ``` ## Loading Required Files ```{r} participantResponseFiles <- list.files(path= "./03_EFA/data",pattern = "\\.csv$") #names correspond to images, one participant per row, one word per column imageFiles <- list.files(path= "./03_EFA/images",pattern = "\\.png$") ``` # Helper Methods ```{r} #Clean the column names cleanColnames <- function(data){ newNames <- gsub("^.+?\\.(.+?)\\..*$", "\\1", colnames(data)) return(newNames) } ``` # Methods Used in the Analysis ## Methods to Test the Appropriateness of the Data for EFA ### Correlation "A subjective method is to examine the correlation matrix. A sizable number of correlations should exceed ±.30 or EFA may be inappropriate" ```{r} correlation <- function(num,data){ return(cor(data)) } ``` ### Bartlett’s test of sphericity An objective test of the factorability of the correlation matrix is Bartlett’s (1954) test of sphericity, which statistically tests the hypothesis that the correlation matrix contains ones on the diagonal and zeros on the off-diagonals. Hence, that it was generated by random data. This test should produce a statistically significant chi-square value to justify the application of EFA. If the p-value from Bartlett’s Test of Sphericity is lower than our chosen significance level (common choices are 0.10, 0.05, and 0.01), then our dataset is suitable for a data reduction technique. (https://www.statology.org/bartletts-test-of-sphericity/) ```{r} bartlettTest <- function(num,data){ bart <- cortest.bartlett(correlation(num,data), n = nrow(data)) if(bart[2]>0.05) cat("WARNING the p value is above 0.05") else cat("The p value is below 0.05. We are good to continue.") cat("\n\n") print(bart) return(bart) } ``` ### KMO Large sample sizes make the Bartlett test sensitive to even trivial deviations from randomness, so its results should be supplemented with a measure of sampling adequacy. The Kaiser-Meyer-Olkin (KMO; Kaiser, 1974) measure of sampling adequacy is the ratio of correlations and partial correlations that reflects the extent to which correlations are a function of the variance shared across all variables rather than the variance shared by particular pairs of variables. KMO values range from 0.00 to 1.00 and can be computed for the total correlation matrix as well as for each measured variable. * KMO values ≥.70 are desired * KMO values ≤.50 are generally considered unacceptable ```{r} KMOTest <- function(num,data){ kmo <- KMO(data) cat(paste("The overall measure of sampling adequacy is: ",kmo[1])) cat("\n\n") if(kmo[1]<.7) cat("WARNING the sampling adequacy has dropped below 0.7") else cat("The sampling adequacy is above 0.7. We are generally good.") cat("\n\n") return(kmo) } ``` ### The Number of Factors to Retain Measurement specialists have conducted simulation studies and concluded that parallel analysis and MAP are the most accurate empirical estimates of the number of factors to retain and that scree is a useful subjective adjunct to the empirical estimates. Unfortunately, no method has been found to be correct in all situations, so it is necessary to employ multiple methods and carefully judge each plausible solution to identify the most appropriate factor solution. ```{r} parallelAnalysis <- function(num,data){ cairo_pdf(paste(paste("results/ScreePlot-Image_",num,sep=""),'.pdf'), width=8, height=4) parallel <- fa.parallel(correlation(num,data), n.obs=nrow(data), fa="fa", n.iter=100, main="Scree plots with parallel analysis") nfact <- parallel$nfact dev.off() cat("\n\n") return(nfact) } ``` ### EFA #### Model of Factor Analysis * two models: PCA, common factor analysis * When the goal of research is to identify latent constructs for theory building or to create measurement instruments in which the researcher wishes to make the case that the resulting measurement instrument reflects a meaningful underlying construct, we argue that common factor analysis (EFA) procedures are usually preferable. * this distinction may make little difference when there are ≥40 measured variables #### Estimation Method * two estimation methods: ML and PA * Statistical simulations have found that PA outperforms ML when the relationships between measured variables and factors are relatively weak (≤.40), sample size is relatively small (≤300), multivariate normality is violated, or when the number of factors underlying the measured variables is misspecified ```{r} EFA <- function(num, factor, rotation,data){ efa <- fa(correlation(num,data), nfactors = factor, rotate = rotation, fm = "pa") #print(xtable(unclass(efa$Structure)),type="html") print(efa,sort=TRUE) #fa.diagram(efa,cut=.4,digits=2) #I don't fint this diagram particularly useful return(efa) } ``` # Analyzing Participant Responses Per Image ```{r,results='asis'} analyze_image <-function(num){ #First we plot the image that we are analyzing first image <- load.image(paste("03_EFA/images/",imageFiles[[num]],sep="")) plot(image) cat("\n\n") data <- read.csv(paste("03_EFA/data/",participantResponseFiles[[num]],sep=""), encoding="UTF-8") colnames(data) <- cleanColnames(data) #Then we go through the analysis steps. These are explained in detail above #1. Correlation # cat("### Correlation\n") # corr <- correlation(num,data) # pdf(paste(paste("generatedPlots-EFA/CorrelationMatrix-Image_",num,sep=""),'.pdf'), width=8, height=4) # corrplot(corr, method="square",tl.col="black",title=paste("Correlation for Image ",num),number.cex = 0.5) # dev.off() # cat("\n\n") # # cat("### Bartlett’s test of sphericity\n") # bartlettTest(num,data) # cat("\n\n") # # cat("### KMO\n") # KMOTest(num,data) # cat("\n\n") cat("## Scree Plot and Parallel Analysis\n") nfact <- parallelAnalysis(num,data) #number of factors cat("\n\n") cat("## Exploratory Factor Analysis - 1 Factor - No Rotation\n") efa_1factor <- EFA(num, 1, "none",data) cat("\n\n") #Exploratory Analyses below here cat("## Exploratory Factor Analysis - 2 Factors - Varimax Rotation(Orthogonal rotation)\n") efa_2factors_varimax <- EFA(num, 2, "varimax",data ) cat("\n\n") cat("## Exploratory Factor Analysis - 2 Factors - Promax Rotation(Pblique rotation)\n") efa_2factors_promax <- EFA(num, 2, "promax",data ) cat("\n\n") efa <- list(efa_1factor, efa_2factors_varimax, efa_2factors_promax, nfact) return(efa) } ``` ### EFA Results for all images ```{r,results='markup',fig.keep='all'} imageCount <- length(participantResponseFiles) # For debugging we can set the imageCount to whatever we want #imageCount <- 1 df <- NULL #number of factors suggested by parallel analysis list_nfactor <- NULL df_nfactor <- data.frame(matrix(ncol = 15, nrow = 0)) list_nfactor_column_name <- NULL for (i in 1:imageCount){ list_df_nfactor_column_name <- c(list_nfactor_column_name, paste("image", i)) cat(paste(paste("## Image ",i),"\n")) #efa_1factor efa_1factor <- analyze_image(i)[[1]] #we want to create a big table with all the factor loadings so we'll save the efa results here data <- NULL loadings <- as.data.frame(unclass(efa_1factor$loadings)) h2 <- efa_1factor$communality u2 <- efa_1factor$uniquenesses com <- efa_1factor$complexity data <- cbind(loadings, h2, u2, com) data <- tibble::rownames_to_column(data,"terms") data <- data %>% mutate_if(is.numeric, round, digits=2) data <- data %>% mutate_at(vars(com), funs(round(., 1))) write.table(data, paste("results/efa_1factor_image",i,".tsv",sep=""),row.names=FALSE,sep='\t') #create factor loading table #efa_2factors_varimax efa_2factors_varimax <- analyze_image(i)[[2]] data <- NULL loadings <- as.data.frame(unclass(efa_2factors_varimax$loadings)) h2 <- efa_2factors_varimax$communality u2 <- efa_2factors_varimax$uniquenesses com <- efa_2factors_varimax$complexity data <- cbind(loadings, h2, u2, com) data <- tibble::rownames_to_column(data,"terms") data <- data %>% mutate_if(is.numeric, round, digits=2) data <- data %>% mutate_at(vars(com), funs(round(., 1))) write.table(data, paste("results/efa_2factors_varimax_image",i,".tsv",sep=""),row.names=FALSE,sep='\t') #efa_2factors_promax efa_2factors_promax <- analyze_image(i)[[3]] data <- NULL loadings <- as.data.frame(unclass(efa_2factors_promax$loadings)) h2 <- efa_2factors_promax$communality u2 <- efa_2factors_promax$uniquenesses com <- efa_2factors_promax$complexity data <- cbind(loadings, h2, u2, com) data <- tibble::rownames_to_column(data,"terms") data <- data %>% mutate_if(is.numeric, round, digits=2) data <- data %>% mutate_at(vars(com), funs(round(., 1))) write.table(data, paste("results/efa_2factors_promax_image",i,".tsv",sep=""),row.names=FALSE,sep='\t') #number of factors nfact <- analyze_image(i)[[4]] list_nfactor <- c(list_nfactor, nfact) efa_1factor <- analyze_image(i)[[1]] if(i == 1){ df <- as.data.frame(unclass(efa_1factor$loadings)) colnames(df) <- c(paste("PA1 Image ",i)) df <- tibble::rownames_to_column(df,"terms") } else{ dftemp <- as.data.frame(unclass(efa_1factor$loadings)) colnames(dftemp) <- c(paste("PA1 Image ",i)) dftemp <- tibble::rownames_to_column(dftemp,"terms") df <- merge(df,dftemp,by="terms") } } ``` ### Table for number of factors for all images ```{r} colnames(df_nfactor) <- list_nfactor_column_name df_nfactor[nrow(df_nfactor) + 1,] <- list_nfactor row.names(df_nfactor) <- "Factors" write.table(df_nfactor, paste("results/factor_numbers.tsv",sep=""),row.names=FALSE,sep='\t') ``` ### Factor Loadings for all images from the chosen FA method ```{r,results='asis'} write.table(df,"results/factorLoadings_all_images.tsv",row.names=FALSE,sep='\t') print(xtable(df),type="html") ``` ### Terms for which all factor loadings are >.7 ```{r,results='asis'} remainingTerms <- df %>% filter_all(all_vars(. > 0.7)) write.table(remainingTerms,"results/factorLoadingsAbove_.7_all_images.tsv",row.names=FALSE,sep='\t') print(xtable(remainingTerms),type="html") ``` ## Reduce Items - Calculate Alpha This code calculates the average ratings each image received from participants and saves them as plots with CIs. This analysis is partly exploratory as we check what would have happened if participants had used some of our reduced scale. # Setup ```{r} source("03_EFA/CI-Functions.R") participantResponseFiles <- list.files(path= "03_EFA/data",pattern = "\\.csv$") #names correspond to images, one participant per row, one word per ``` ```{r} print(participantResponseFiles) ``` # Functions This one cleans up the column names for each image's responses ```{r} cleanColnames <- function(data){ newNames <- gsub("^.+?\\.(.+?)\\..*$", "\\1", colnames(data)) return(newNames) } ``` This functions draws a bar chart with confidence intervals ```{r} barChart <- function(resultTable, techniques, nbTechs = -1, ymin, ymax, xAxisLabel = "I am the X axis", yAxisLabel = "I am the Y Label",plotTitle){ #tr <- t(resultTable) if(nbTechs <= 0){ stop('Please give a positive number of Techniques, nbTechs'); } tr <- as.data.frame(resultTable) nbTechs <- nbTechs - 1 ; # seq will generate nb+1 #now need to calculate one number for the width of the interval tr$CI2 <- tr$upperBound_CI - tr$mean tr$CI1 <- tr$mean - tr$lowerBound_CI #add a technique column tr$technique <- factor(seq.int(0, nbTechs, 1)); breaks <- c(as.character(tr$technique)); print(tr) g <- ggplot(tr, aes(x=technique, y=mean)) + # geom_bar(stat="identity",fill = I("#CCCCCC")) + geom_errorbar(aes(ymin=mean-CI1, ymax=mean+CI2), width=0, # Width of the error bars size = 1.1 ) + #labs(title="Overall time per technique") + labs(x = xAxisLabel, y = yAxisLabel) + scale_y_continuous(limits = c(ymin,ymax),breaks=1:7) + scale_x_discrete(name="",breaks,techniques)+ coord_flip() + ggtitle(plotTitle) + theme(panel.background = element_rect(fill = 'white', colour = 'white'),axis.title=element_text(size = rel(1.2), colour = "black"),axis.text=element_text(size = rel(1.2), colour = "black"),panel.grid.major = element_line(colour = "#DDDDDD"),panel.grid.major.y = element_blank(), panel.grid.minor.y = element_blank())+ geom_point(size=2, colour="black") # dots print(g) } ``` This next function calculates the CIs of each image's responses depending on the scale items (terms) given. ```{r} calculateDrawResponseCIs <- function(scaleItems){ imageCount <- length(participantResponseFiles) pointEstimateVector = c() lowerBoundVector = c() upperBoundVector = c() imageVector = c() for (image in 1:imageCount){ data <- read.csv(paste("03_EFA/data/",participantResponseFiles[[image]],sep=""), encoding="UTF-8") terms <- cleanColnames(data) colnames(data) <- terms #exploratory trying to see what would happen if we had had a lot fewer participants #data <- data[sample(nrow(data), 24), ] data <- data[scaleItems] means <- rowMeans(data) imageVector <- append(imageVector,image) ci <- bootstrapMeanCI(means) pointEstimateVector <- append(pointEstimateVector,ci[1]) upperBoundVector <- append(upperBoundVector,ci[3]) lowerBoundVector <- append(lowerBoundVector,ci[2]) } df <- data.frame(image=imageVector,mean=pointEstimateVector,lowerBound_CI=lowerBoundVector,upperBound_CI=upperBoundVector) plotTitle <- paste(paste("Average Rating for the",length(scaleItems)),"item scale") barChart(df,df$image ,nbTechs = 15, ymin = 1, ymax = 7, "Image", "Average Ratings",plotTitle) ggsave(paste("./results/",paste(plotTitle,".pdf",sep=""),sep=""), width=8, height=4, device=cairo_pdf) print(df) return(df) } ``` As the ratings per image were done by different participant pools we don't actually want to compare the ratings of each image. ```{r} generatePerImageTables <-function(dfs,titles){ imageCounts <- length(participantResponseFiles) for(i in 1:imageCounts){ pointEstimateVector = c() lowerBoundVector = c() upperBoundVector = c() scaleVector = c() dflength <- length(dfs) for(d in 1:dflength){ scaleVector <- append(scaleVector,titles[d]) df <- dfs[[d]] pointEstimateVector <- append(pointEstimateVector,df$mean[df$image==i]) upperBoundVector <- append(upperBoundVector,df$upperBound_CI[df$image==i]) lowerBoundVector <- append(lowerBoundVector,df$lowerBound_CI[df$image==i]) } df <- data.frame(scale=scaleVector,mean=pointEstimateVector,lowerBound_CI=lowerBoundVector,upperBound_CI=upperBoundVector) plotTitle <- paste(paste("Image",i)," Ratings Per Scale") barChart(df,df$scale ,nbTechs = dflength, ymin = 1, ymax = 7, "Image", "Average Ratings",plotTitle) path<-paste("./results/",paste(plotTitle,".pdf",sep=""),sep="") print(path) ggsave(path, width=8, height=4,device=cairo_pdf) } } ``` And now we run our tests ## Average Ratings for the Test Scale ```{r} data <- read.csv(paste("03_EFA/data/",participantResponseFiles[[1]],sep=""), encoding="UTF-8") scaleItems <- cleanColnames(data) df31 <- calculateDrawResponseCIs(scaleItems) ``` ## Average Ratings for the 3-Item Scale ```{r} scaleItems = c("enjoyable","likable","pleasing") df3 <- calculateDrawResponseCIs(scaleItems) ``` ## Average Ratings for the 4-Item Scale ```{r} scaleItems = c("enjoyable","likable","pleasing","nice") df4 <- calculateDrawResponseCIs(scaleItems) ``` ## Average Ratings for the 5-Item Scale ```{r} scaleItems = c("enjoyable","likable","pleasing","nice","appealing") df5 <- calculateDrawResponseCIs(scaleItems) ``` ```{r} dfs <- list(df3,df4,df5,df31) generatePerImageTables(list(df3,df4,df5,df31),c("3-Item","4-Item","5-Item","31-Item")) ``` ## CFA ```{r} # import library library(lavaan) #CFA library(ltm) # Cronbach's alpha ``` ```{r} # import data data <- read.csv("04_CFA/validation results + demographics - valid - PID removed.csv" , encoding="UTF-8") data ``` ```{r} # split data # results for Sunburst - our scale data_sun <- data[, c('sunburst.enjoyable.', 'sunburst.likable.', 'sunburst.pleasing.', 'sunburst.nice.', 'sunburst.appealing.')] # results for Sunburst - classic aesthetic for website scale data_sun_ca <- data[, c('sunburst.aesthetic.', 'sunburst.pleasant.', 'sunburst.clear.', 'sunburst.clean.', 'sunburst.symmetric.')] # results for Beamtree - our scale data_beam <- data[, c('beamtree.enjoyable.', 'beamtree.likable.', 'beamtree.pleasing.', 'beamtree.nice.', 'beamtree.appealing.')] # results for Beamtree - classic aesthetic for website scale data_beam_ca <- data[, c('beamtree.aesthetic.', 'beamtree.pleasant.', 'beamtree.clear.', 'beamtree.clean.', 'beamtree.symmetric.')] # results for Startree - our scale data_star <- data[, c('startree.enjoyable.', 'startree.likable.', 'startree.pleasing.', 'startree.nice.', 'startree.appealing.')] # results for Startree - classic aesthetic for website scale data_star_ca <- data[, c('startree.aesthetic.', 'startree.pleasant.', 'startree.clear.', 'startree.clean.', 'startree.symmetric.')] ``` ```{r} #Clean the column name function cleanColnames <- function(data, visName){ # delete visName and . from column name of dataframe, only keep the terms as column name names(data) <- sub(paste(visName, ".", sep =""), "", names(data)) names(data) <- sub("\\.", "", names(data)) return(data) } ``` ```{r} # Clean column names for all data data_sun <- cleanColnames(data_sun, "sunburst") data_sun_ca <- cleanColnames(data_sun_ca, "sunburst") data_beam <- cleanColnames(data_beam, "beamtree") data_beam_ca <- cleanColnames(data_beam_ca, "beamtree") data_star <- cleanColnames(data_star, "startree") data_star_ca <- cleanColnames(data_star_ca, "startree") ``` ## Confirmatory Factor Analysis ```{r} CFA <- function(data){ APV.model ='aesthetic_pleasure =~ enjoyable + likable + pleasing + nice + appealing' fit <- cfa(APV.model,data = data,std.lv = TRUE) pvalue <- fitMeasures(fit, "pvalue") tli <- fitMeasures(fit, "tli") cfi <- fitMeasures(fit, "cfi") srmr <- fitMeasures(fit, "srmr") rmsea <- fitMeasures(fit, "rmsea") good_fit_list <- NULL good_fit_list <- c(pvalue, tli, cfi, srmr, rmsea) summary(fit, fit.measures=T, standardized=TRUE) return(good_fit_list) } ``` ```{r} #CFA factor numbers APV.model ='aesthetic_pleasure =~ enjoyable + likable + pleasing + nice + appealing' fit1 <- cfa(APV.model,data = data_sun,std.lv = TRUE) l1 <- lavInspect(fit1, what = "std.all", add.labels = FALSE, add.class = TRUE, list.by.group = TRUE, drop.list.single.group = TRUE)[[1]] fit2 <- cfa(APV.model,data = data_star,std.lv = TRUE) l2 <- lavInspect(fit2, what = "std.all", add.labels = FALSE, add.class = TRUE, list.by.group = TRUE, drop.list.single.group = TRUE)[[1]] fit3 <- cfa(APV.model,data = data_beam,std.lv = TRUE) l3 <- lavInspect(fit3, what = "std.all", add.labels = FALSE, add.class = TRUE, list.by.group = TRUE, drop.list.single.group = TRUE)[[1]] df_factor_loading_cfa <- cbind(l1, l2, l3) colnames(df_factor_loading_cfa) <- c("SunBurst", "StarTree", "BeamTree") rownames(df_factor_loading_cfa) <- c("enjoyable", "likable", "pleasing", "nice","appealing") df_factor_loading_cfa <- as.data.frame(df_factor_loading_cfa) df_factor_loading_cfa <- mutate(df_factor_loading_cfa, across(where(is.numeric), round, 3)) df_factor_loading_cfa write.table(df_factor_loading_cfa, paste("results/cfa_factor_loading.tsv",sep=""),row.names=TRUE,col.names=NA,sep='\t') ``` ```{r} good_fit_sun <- CFA(data_sun) good_fit_star <- CFA(data_star) good_fit_beam <- CFA(data_beam) good_fit_table <- cbind(good_fit_sun, good_fit_star, good_fit_beam) good_fit_table <- as.data.frame(good_fit_table) good_fit_table <- mutate(good_fit_table, across(where(is.numeric), round, 3)) good_fit_table write.table(good_fit_table, paste("results/cfa_good_fit.tsv",sep=""),row.names=TRUE,col.names=NA,sep='\t') ``` ## Convergent Validity ```{r} con_vali <- function(data1, data2){ # results of two scales you want to calculate correlation cor.test(rowMeans(data1), rowMeans(data2), method = "pearson") } ``` ## Discrinminant Validity ```{r} dis_vali <- function(data1){ # result of your scale cor.test(rowMeans(data1), data$age , method = "pearson") } ``` # Sunburst ```{r} CFA(data_sun) ``` ```{r} con_vali(data_sun, data_sun_ca) ``` ```{r} dis_vali(data_sun) ``` ```{r} cronbach.alpha(data_sun) ``` # Startree ```{r} CFA(data_star) ``` ```{r} con_vali(data_star, data_star_ca) ``` ```{r} dis_vali(data_star) ``` ```{r} cronbach.alpha(data_star) ``` # Beamtree ```{r} CFA(data_beam) ``` ```{r} con_vali(data_beam, data_beam_ca) ``` ```{r} dis_vali(data_beam) ``` ```{r} cronbach.alpha(data_beam) ``` ```{r} #Cronbach's Alpha for CFA df_alpha <- data.frame(matrix(ncol = 3, nrow = 0)) list_alpha <- c(cronbach.alpha(data_sun)$alpha, cronbach.alpha(data_star)$alpha,cronbach.alpha(data_beam)$alpha) colnames(df_alpha) <- c("SunBurst", "StarTree", "BeamTree") df_alpha[nrow(df_alpha) + 1,] <- list_alpha row.names(df_alpha) <- "Cronbach’s Alpha" df_alpha <- df_alpha %>% mutate_if(is.numeric, round, digits=2) df_alpha write.table(df_alpha, paste("results/cfa_alpha.tsv",sep=""),row.names=TRUE,col.names=NA,sep='\t') ``` ```{r} # Pearson correlation df_pearson <- data.frame(matrix(ncol = 3, nrow = 0)) list_pearson_con <- c(con_vali(data_sun, data_sun_ca)$estimate,con_vali(data_star, data_star_ca)$estimate,con_vali(data_beam, data_beam_ca)$estimate) list_pearson_dis <- c(dis_vali(data_sun)$estimate,dis_vali(data_star)$estimate,dis_vali(data_beam)$estimate) colnames(df_pearson) <- c("SunBurst", "StarTree", "BeamTree") df_pearson[nrow(df_pearson) + 1,] <- list_pearson_con df_pearson[nrow(df_pearson) + 1,] <- list_pearson_dis row.names(df_pearson) <- c("Classic Aesthetic", "Age") df_pearson <- df_pearson %>% mutate_if(is.numeric, round, digits=2) df_pearson write.table(df_pearson, paste("results/cfa_pearson.tsv",sep=""),row.names=TRUE,col.names=NA,sep='\t') ``` ### Rating Results This code calculates the average ratings each image received from participants and saves them as plots with CIs. # Setup ```{r} source("04_CFA/CI-Functions.R") participantResponseFiles <- list.files(path= "04_CFA/data",pattern = "\\.csv$") #names correspond to images, one participant per row, one word per ``` # Functions This one cleans up the column names for each image's responses ```{r} cleanColnames <- function(data){ newNames <- gsub("^.+?\\.(.+?)\\..*$", "\\1", colnames(data)) return(newNames) } ``` This functions draws a bar chart with confidence intervals ```{r} barChart <- function(resultTable, techniques, nbTechs = -1, ymin, ymax, xAxisLabel = "I am the X axis", yAxisLabel = "I am the Y Label",plotTitle){ #tr <- t(resultTable) if(nbTechs <= 0){ stop('Please give a positive number of Techniques, nbTechs'); } tr <- as.data.frame(resultTable) nbTechs <- nbTechs - 1 ; # seq will generate nb+1 #now need to calculate one number for the width of the interval tr$CI2 <- tr$upperBound_CI - tr$mean tr$CI1 <- tr$mean - tr$lowerBound_CI #add a technique column tr$technique <- factor(seq.int(0, nbTechs, 1)); breaks <- c(as.character(tr$technique)); print(tr) g <- ggplot(tr, aes(x=technique, y=mean)) + # geom_bar(stat="identity",fill = I("#CCCCCC")) + geom_errorbar(aes(ymin=mean-CI1, ymax=mean+CI2), width=0, # Width of the error bars size = 1.1 ) + #labs(title="Overall time per technique") + labs(x = xAxisLabel, y = yAxisLabel) + scale_y_continuous(limits = c(ymin,ymax),breaks=1:7) + scale_x_discrete(name="",breaks,techniques)+ coord_flip() + ggtitle(plotTitle) + theme(panel.background = element_rect(fill = 'white', colour = 'white'),axis.title=element_text(size = rel(1.2), colour = "black"),axis.text=element_text(size = rel(1.2), colour = "black"),panel.grid.major = element_line(colour = "#DDDDDD"),panel.grid.major.y = element_blank(), panel.grid.minor.y = element_blank())+ geom_point(size=2, colour="black") # dots print(g) } ``` This next function calculates the CIs of each image's responses depending on the scale items (terms) given. ```{r} calculateDrawResponseCIs <- function(scaleItems){ imageCount <- length(participantResponseFiles) pointEstimateVector = c() lowerBoundVector = c() upperBoundVector = c() imageVector = c() for (image in 1:imageCount){ data <- read.csv(paste("04_CFA/data/",participantResponseFiles[[image]],sep=""), encoding="UTF-8") #terms <- cleanColnames(data) #colnames(data) <- terms #exploratory trying to see what would happen if we had had a lot fewer participants #data <- data[sample(nrow(data), 24), ] data <- data[scaleItems] means <- rowMeans(data) imageVector <- append(imageVector,image) ci <- bootstrapMeanCI(means) pointEstimateVector <- append(pointEstimateVector,ci[1]) upperBoundVector <- append(upperBoundVector,ci[3]) lowerBoundVector <- append(lowerBoundVector,ci[2]) } df <- data.frame(image=imageVector,mean=pointEstimateVector,lowerBound_CI=lowerBoundVector,upperBound_CI=upperBoundVector) plotTitle <- paste(paste("Average Rating for the",length(scaleItems)),"item scale") barChart(df,df$image ,nbTechs = imageCount, ymin = 1, ymax = 7, "Image", "Average Ratings",plotTitle) ggsave(paste("results/",paste(plotTitle,".pdf",sep=""),sep=""), width=8, height=4,device=cairo_pdf) print(df) return(df) } ``` As the ratings per image were done by different participant pools we don't actually want to compare the ratings of each image. ```{r} generatePerImageTables <-function(dfs,titles){ imageCounts <- length(participantResponseFiles) for(i in 1:imageCounts){ pointEstimateVector = c() lowerBoundVector = c() upperBoundVector = c() scaleVector = c() dflength <- length(dfs) for(d in 1:dflength){ scaleVector <- append(scaleVector,titles[d]) df <- dfs[[d]] pointEstimateVector <- append(pointEstimateVector,df$mean[df$image==i]) upperBoundVector <- append(upperBoundVector,df$upperBound_CI[df$image==i]) lowerBoundVector <- append(lowerBoundVector,df$lowerBound_CI[df$image==i]) } df <- data.frame(scale=scaleVector,mean=pointEstimateVector,lowerBound_CI=lowerBoundVector,upperBound_CI=upperBoundVector) plotTitle <- paste(paste("Image",i)," Ratings Per Scale") barChart(df,df$scale ,nbTechs = dflength, ymin = 1, ymax = 7, "Image", "Average Ratings",plotTitle) path<-paste("results/",paste(plotTitle,".pdf",sep=""),sep="") print(path) ggsave(path, width=8, height=4,device=cairo_pdf) } } ``` And now we run our tests ## Average Ratings for the Test Scale ```{r} data <- read.csv(paste("04_CFA/data/",participantResponseFiles[[1]],sep=""), encoding="UTF-8") scaleItems <- colnames(data) scaleItems df31 <- calculateDrawResponseCIs(scaleItems) ```