Use the “students.csv” file to complete the first two questions below. The data contain gender, hair color, and eye color. Follow the same general procedures you used in the Skills exercise to answer the questions. REMEMBER: capitalization, spacing, and parentheses are important when using datasets and functions in R!
Include all code for each question, plus output from R or graphs, and answer all sub-questions.
1) Let’s work with contingency tables with two variables.
a. Create a Contingency Table using the gender and hair color data located in the “students.csv” file. Add margins and use the table to answer the following questions.
b. What is the probability of having brown hair if you are male?
c. What is the probability of having red hair if you are female?
d. If you have blond hair, what is the probability you are female?
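As a minimal sketch of the general technique (not the assignment's answer), the example below builds a two-way table from a small made-up data frame. The object name toy, its column names, and its values are assumptions for illustration only; the actual column names in “students.csv” may differ.

```r
# Hypothetical toy data: names and values are illustrative assumptions,
# not the contents of students.csv.
toy <- data.frame(
  gender = c("male", "male", "male", "female",
             "female", "female", "female", "male"),
  hair   = c("brown", "black", "brown", "red",
             "brown", "blond", "red", "blond")
)

# Two-way contingency table: rows are gender, columns are hair color.
tab <- table(toy$gender, toy$hair)

# Add row and column totals (labelled "Sum").
tab_margins <- addmargins(tab)
print(tab_margins)

# Conditional probability by hand:
# P(brown | male) = count(male and brown) / count(male).
p_brown_given_male <- tab["male", "brown"] / sum(tab["male", ])

# prop.table() with margin = 1 divides each row by its row total,
# giving P(hair | gender) for every cell at once.
row_props <- prop.table(tab, margin = 1)

# margin = 2 divides each column by its column total,
# giving P(gender | hair), e.g. P(female | blond).
col_props <- prop.table(tab, margin = 2)
```

The choice of margin decides which variable is conditioned on: use the row proportions for "given gender" questions and the column proportions for "given hair color" questions.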
2) Let’s work with contingency tables with three variables.
a. Create two Contingency Tables, one for males and one for females, each cross-tabulating hair color and eye color. Add margins and use the tables to answer the following questions. Note: order is important when entering data into the table() function; the variables should be given in the order hair, eye, gender.
b. What are the probabilities of having hazel eyes for males and for females?
c. What is the probability of having blue eyes if you are blond (for females and for males)?
d. For males, is having brown eyes independent of hair color? Why or why not? To figure this out, calculate whether the probability of having brown eyes is the same as or different from the conditional probability of having brown eyes given that you have each hair color.
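The independence check described above can be sketched on made-up data as follows. The data frame toy, its column names, and its values are assumptions for illustration; they are not taken from “students.csv”.

```r
# Hypothetical toy data: names and values are illustrative assumptions.
toy <- data.frame(
  hair   = c("brown", "brown", "blond", "blond",
             "brown", "blond", "brown", "blond"),
  eye    = c("brown", "blue", "blue", "brown",
             "brown", "blue", "blue", "blue"),
  gender = c("male", "male", "male", "male",
             "female", "female", "female", "female")
)

# Three-way table in the order hair, eye, gender: the third variable
# splits the result into one hair-by-eye table per gender.
tab3 <- table(hair = toy$hair, eye = toy$eye, gender = toy$gender)

# Pull out the hair-by-eye slice for males and add margins.
males <- tab3[, , "male"]
print(addmargins(males))

# Marginal probability of brown eyes among males.
p_brown <- sum(males[, "brown"]) / sum(males)

# Conditional probability of brown eyes given each hair color (males only).
p_brown_given_hair <- males[, "brown"] / rowSums(males)

# If eye color were independent of hair color, every conditional
# probability would match the marginal one.
print(p_brown)
print(p_brown_given_hair)
```

In this toy data the conditional probabilities for males happen to equal the marginal probability; with real data they rarely match exactly, so the comparison calls for judgment about how large a difference is meaningful.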
3) We are going to practice with data simulation.
a. Using the rbinom() function, ‘simulate’ 10,000 trials of flipping a coin 100 times (you should end up with a vector of 10,000 numbers, each between 0 and 100).
b. Make a histogram of the results from these 10,000 trials. Add a title and label the axes of your histogram.
c. Use the sample() function to select a number between 0 and 100 and repeat this 10,000 times. Make a histogram of the results (include appropriate title and axis labels).
d. Compare the two histograms. What shape should each be, and why? Should they look the same or different, and how do your actual histograms compare with what you expected?
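One possible way to set up the two simulations is sketched below. The variable names, the fixed seed, and the plot titles are assumptions chosen for illustration, and the sketch assumes a fair coin (prob = 0.5) and sampling with replacement.

```r
# Assumption: a fixed seed, used only so the sketch is reproducible.
set.seed(1)

n_trials <- 10000

# Each value is the number of heads in 100 flips of a fair coin.
heads <- rbinom(n = n_trials, size = 100, prob = 0.5)

# Each value is one integer drawn from 0..100, every value equally likely.
# replace = TRUE is required because we draw more values than exist in 0:100.
uniform_draws <- sample(0:100, size = n_trials, replace = TRUE)

# Histograms with a title and axis labels.
hist(heads,
     main = "Heads in 100 coin flips (10,000 trials)",
     xlab = "Number of heads",
     ylab = "Frequency")

hist(uniform_draws,
     main = "Numbers sampled from 0 to 100 (10,000 draws)",
     xlab = "Sampled number",
     ylab = "Frequency")
```

Note that sample() fails with an error if replace = TRUE is omitted here, since 10,000 draws without replacement cannot come from only 101 values.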
4) There are 10 teams in the Pacific Division of the MLS (Major League Soccer).
a. How many different ranking orders are possible?
b. Of the 10 teams in the Pacific Division, only 4 are guaranteed to move on to the playoffs. How many different combinations of 4 playoff teams are possible?
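The two kinds of count can be sketched in R with factorial() and choose(). The numbers below (5 teams, 2 spots) are deliberately smaller than the ones in the question and are assumptions for illustration only.

```r
# Illustrative values, intentionally different from the assignment's.
n_teams <- 5
n_spots <- 2

# Ranking orders: every ordering of all teams counts, so n!.
orderings <- factorial(n_teams)

# Combinations: which teams are chosen matters, their order does not,
# so n! / (k! * (n - k)!), computed here with choose().
groups <- choose(n_teams, n_spots)

print(orderings)  # 120
print(groups)     # 10
```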
5) A total of 20 fans are sitting in a row at the seventh game of the Stanley Cup Finals: 4 are crying because their team lost, 7 are screaming because their team won, 6 are trying to leave so they can beat the traffic, and 3 are asleep because they had a bit too much to drink. How many different seating arrangements are possible for this collection of 20 fans?
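If fans within the same group are treated as interchangeable (an assumption about how the question is meant to be read), the count is a permutation with repeated items: the factorial of the total divided by the product of the factorials of the group sizes. The sketch below uses a smaller, made-up example of 6 fans in groups of 3, 2, and 1.

```r
# Illustrative group sizes, intentionally different from the assignment's.
group_sizes <- c(3, 2, 1)

# Arrangements of n items where items within a group are indistinguishable:
# n! / (n1! * n2! * ... * nk!)
arrangements <- factorial(sum(group_sizes)) / prod(factorial(group_sizes))

print(arrangements)  # 60
```

If instead every fan is treated as a distinct individual, the count is simply the factorial of the total number of fans.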