Forensic statistics, Probability & Hypothesis testing

Part 1: Probability

(i) In 1993 a team of scientists from Johns Hopkins University and the University of Helsinki reported in Science [1993, vol. 260, p. 751] the discovery of a genetic marker for so-called familial cancer of the colon. The scientists estimated that one person in 200 carries the defective gene, that 95% of people with the gene will develop cancer, and that of those who get cancer, 60% will get cancer of the colon.

(a) From these figures, what percentage of people will develop cancer of the colon from this mechanism?

Some people are considered to be at increased risk of developing cancer of the colon because of a strong family history of the disease. It is believed that 75% of these will find that they do not have the genetic marker and that these people bear only the average risk of developing cancer of the colon, which is 1 chance in 20.

(b) What proportion of those with a strong family history will get cancer of the colon?

(c) What proportion of those who get cancer of the colon carry the defective gene?

(ii) You are suspicious about a coin but it is not in your hands; you cannot look at it. You think it may be two-headed or it may be a fair coin with a head and a tail. Suppose there is an equal chance of either of these and there is no other possibility.

(a) Calculate the odds ratio for the coin being two-headed.

(b) You watch the coin being tossed 10 times and 10 heads come up. Calculate the likelihood ratio and hence the posterior odds for this evidence.

(c) Use Tables 11.3 and 11.4 in Lucy to give a verbal interpretation of the result.

Part 2: Hypothesis testing

(i) The frequencies of three blood types A, AB and B among 151 children from parents whose blood types are both AB are shown in the following table:

Blood type          A    AB     B    Total
Number observed    39    70    42      151

A law of genetics postulates that the ratios of A:AB:B are 1:2:1. Do the observations support the law?
Carry out a hypothesis test to answer this question. Ensure that you include all the steps for hypothesis testing.

(ii) A statistical model was built for predicting reconviction based on three years of post-prison follow-up of 347 men who had been imprisoned for crimes against persons or property. It was possible to classify each prisoner as having either low risk, medium risk or high risk of re-offending. To see how useful the classification was, a further 225 prisoners were studied on leaving prison. The results are shown in the table below.

Risk group         Low   Medium   High   Total
Reconvicted         23       50     53     126
Not reconvicted     52       25     22      99
Total               75       75     75     225

Is there an association between the risk group and whether the prisoner is reconvicted? Carry out a hypothesis test to answer this question. Ensure that you include all the steps for hypothesis testing.

(iii) A small random sample of cannabis seizures in Australia over a few years is given in the assignment data in the tab labelled Cannabis. It contains the weight of the cannabis seized in grams and the seizure type, either in a mail item or by another method of entry into the country.

(a) Draw a histogram of the weight of cannabis seized by mail and by other; that is, draw two histograms.

(b) Describe any problems that you see.

(c) What one measure of location and one measure of spread would you use to describe these two data sets?

(d) Transform the weight variable to log(weight).

(e) Redraw the histograms using log(weight) and comment on any differences you see between these histograms and the ones drawn in (a).

(f) Assess the normality of weight and log(weight) for the two data sets.

(g) Perform a hypothesis test to determine whether the weight of mail item seizures differs from that of other seizures. Pay careful attention to the distributional assumptions.
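
For the two count tables in Part 2 (i) and (ii), the arithmetic behind a chi-square test could be sketched as follows. This is only a minimal illustration in plain Python, not a prescribed method: the function names are hypothetical, the choice of a chi-square test is an assumption about what the questions intend, and the p-value shortcut exp(-x/2) is valid only because both tests happen to have 2 degrees of freedom. Stating the hypotheses, the significance level and the conclusion is still left to the reader.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over matching cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(stat):
    """Upper-tail chi-square probability; this closed form holds for 2 df only."""
    return math.exp(-stat / 2.0)


def goodness_of_fit(observed, ratios):
    """Compare observed counts with counts expected under the given ratios."""
    total = sum(observed)
    expected = [total * r / sum(ratios) for r in ratios]
    return chi_square_statistic(observed, expected), expected


def independence(table):
    """Chi-square test of independence for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count in each cell = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        chi_square_statistic(obs_row, exp_row)
        for obs_row, exp_row in zip(table, expected)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Part 2 (i): blood types A, AB, B against the 1:2:1 law (df = 3 - 1 = 2).
blood_stat, blood_expected = goodness_of_fit([39, 70, 42], [1, 2, 1])

# Part 2 (ii): reconvicted / not reconvicted by low, medium, high risk group.
risk_stat, risk_df, risk_expected = independence([[23, 50, 53], [52, 25, 22]])

print("blood types:", round(blood_stat, 4), "p =", round(p_value_df2(blood_stat), 4))
print("risk groups:", round(risk_stat, 4), "df =", risk_df, "p =", p_value_df2(risk_stat))
```

Every expected count in both tables is well above 5, which is the usual rule of thumb for relying on the chi-square approximation.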