Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade - PDF Free Download

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Size: px
Start display at page:

Download "Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade"

Transcription

1 Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements of temperature in Fahrenheit, and Montreal in degrees centigrade. Boston measures pollution in particles per cubic yard of air; Montreal uses cubic meters. Both report a correlation of exactly 0.58 between temperature and pollution. Which of the following is true: A. Boston really has the higher correlation, because Fahrenheit temperatures are higher than Centigrade. B. Montreal really has the higher correlation, because cubic meters are bigger than cubic yards. C. Both cities have the same correlation, because correlation is independent of the units of measurement. D. We do not know which city has the really higher correlation. Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade 2. Which of the following is NOT a possible value of the correlation coefficient? A. negative 0.9 B. zero C. positive 0.15 D. positive 1.5 E. negative.05 Answer: D. Correlations are always between -1 (perfect negative) and +1 (perfect positive) 3. We measure heights and weights of 100 twenty-year old male college students. Which will have the higher correlation: A. corr(height, weight) will be much greater than corr(weight, height) B. corr(weight, height) will be much greater than corr(height, weight) C. Both will have the same correlation. D. Both will be about the same, but corr(weight, height) will be a little higher. E. Both will be about the same, but corr(height, weight) will be a little higher. Answer: C. Correlation is independent of the order in which the variables enter. 4. Two lists of numbers, X and Y, have a correlation of 0.3; X and Z have a correlation of We know that: A. the stronger correlation is the correlation of X and Y, since it is positive. B. the stronger correlation is the correlation of X and Z. C. the two correlations are equally strong, since = 0.3 D. We cannot tell which is stronger without more information. Answer: B. The stronger correlation is determined by the absolute value, since it measures the scatter of points about a line. Whether the line has a positive or negative slope, the less the scatter, the greater the absolute value of the correlation. 5. Suppose men always married women who were 10 percent shorter than they were. The correlation coefficient of the heights of married couples would be: A if the correlation were computed with corr(male.height, female.height) B if the correlation were computed with corr(female.height, male.height) C no matter which way the correlation were computed. D. 1.0 since the height of the man is always predictable from the height of the woman. Answer: D. All points in a graph of the correlation would line on the straight line Hf = 0.9 * Hm where Hf = female height and Hm = male height.

2 6. In one class, the correlation between the final and the midterm was 0.5, whereas the correlation between the final and the homework grades was This means that: A. the relation of the final and the midterm was twice as linear as the relation between the final and the homework. B. the relation of the final and the homework was twice as linear as the relation between the final and the midterm. C. More of the variation of the final was explained by the homework grades than by the midterm grades. D. More of the variation of the final was explained by the midterm grades than by the homework grades. Answer: D. The phrase "more linear" is meaningless; a line is either linear or curved; and correlation is always correlation around a straight line. It does indicate (when squared) the percentage of the variance explained, and the midterm grade explained square(.5) =.25 or 25 percent of the variance, where the homework explained only square(.25) =.0625 or 6.25 percent of the variance. 7. The unemployment rate is related to inflation by the Phillips curve, which is typically a negative sloped curve looking like a hyperbola -- inflation is very high at very low rates of unemployment, and it takes very high rates of unemployment to bring inflation down to zero. We compute a correlation coefficient between unemployment rates and inflation, and find it is negative 0.5. The true relation between the two is most probably: A. exactly as reported by the correlation coefficient. B. stronger than reported by the correlation coefficient, due to the non-linearity. C. weaker than reported by the correlation coefficient, due to the great scatter of points around the line. Answer: B. The correlation coefficient reports the cluster of points around a straight line; if the true relation is curvilinear, the correlation will understate the strength of the relation. 8. An investigator is studying the relation between the physical and intellectual growth of primary schoolchildren (grades 1-6). At each grade level, she notes that the correlation between the height of children and the size of their vocabulary is zero. For all students in the school, the correlation is likely to be: A. Positive B. Negative C. About zero D. Cannot tell. Answer: A. While height and vocabulary size have nothing to do with each other, both have a lot to do with age, especially for ages 6 to 12. The common cause will lead to a strong correlation of the two, which of course could not be said of (say) ages 60 to A study is done of students commuting to a large university by bicycle. The correlation between the time spent waiting at traffic lights and total cycling time was This means: A. The average rider spent half his cycling time waiting at traffic lights. B. The more time a rider spends waiting at traffic lights, the higher is total time is likely to be. C. If the rider's time at traffic lights increases by 5 minutes, he will spend an additional 10 minutes commuting, on the average. D. If the rider's time at traffic lights increases by 10 minutes, he will spend an additional 5 minutes commuting, on the average. Answer: B. You would need a regression equation to see whether or not time at lights predicted an exact time of commute. If the distances of the commute varied, using time at lights alone would lead to an omitted variable problem for a regression of total time on time at lights

3 10. The correlation between the ages of the husbands and wives in the United States was which of the following? A B C. zero D E Answer: B. Men usually marry women of about their own age, but of course there are enough exceptions that the correlation is not perfect. 11. A study is done of the impact of a drug on body temperature and blood pressure. We have three observations: Temperature(F) Pressure (mg) Temp - 97 Pressure The correlation coefficient is closest to A B C D. zero E Hint: a good guess is possible by plotting the data carefully. a better one is possible by subtracting the lowest number in each list from all the other numbers in that list. Answer: A. Transform the first column of data by subtracting 97 from each number; transform the other column by subtracting 80 from each. The result is given above; the line Pressure = 10*Temp holds perfectly for all points. 12. The correlation between the average midterm score of each of 10 classes of statistics and the average final exam score was found to be A statistics instructor concludes that if a student has a B on the midterm, he has an 85 percent chance of a B on the final. This conclusion is: A. correct B. incorrect, because it is an ecological correlation, which means there is usually much more variance of individuals than of averages C. incorrect, because it is an average correlation, and there is usually much more variance of averages than of individuals. Answer: B.

4 13. A regression tries to predict student GPA percentile (on a zero-100 scale) from their SAT-Math score. You are to make the relevant prediction for a student who has a SAT math score of 600 (note that SAT scores go from 200 to 800). He would be expected to be closest to the -- percentile of GPA on the basis of the regression output below: Coefficient t-stat Intercept SAT-Math R-squared: 0.36 A. 10th percentile B. 20th percentile C. 60 th percentile D. 80th percentile E. 100th percentile Answer: D. Write the regression equation as GPA percentile = * SAT math, so GPA percentile = * 600 = We know, on the basis of the above regression, that the strongest part of the explanation of the score is due to the coefficent of the: A. intercept term, because the coefficient is larger. B. intercept term, because the t-stat is smaller C. SAT-math term, because the coefficient is smaller. D. SAT-math term, because the t-stat is larger. Answer: D. The size of the t-statistic, not the size of the coefficient, indicates the explanatory strength of a variable. 15. We know that the correlation coefficient between SAT-math and the GPA percentile is closest to: A. zero B C D E Answer: C. The R-squared is the square of the correlation coefficent r, so r = sqrt(.36) = If we knew that the SD of the SAT math score was 100 and the SD of the GPA percentile was 25, we could calculate the standard error of the regression (the RMSE) to be closest to (look at the next question for a hint at the formula): A. 20 B. 40 C. 60 D. 80 E The formula for the standard error of the regression (RMSE) is A. (1 - R2) * SD(x) B. Covariance (x,y) / SD(x) * SD(y) C. sqrt(1- R2) * SD(y) D. sqrt(1-r2) * SD(x) E. r * SD(y) / SD(x) Answer: C. The standard error of the regression gets smaller as the R-squared gets larger; if R-sq = 1, error = The formula for the slope of the regression line is (same options as the previous question) Answer: E 19. The formula for the correlation coefficient is (same options as the previous question) Answer: B

5 20. We plot the data lying behind the question 13 regression with the R command >>> Plot(SAT-Math, GPA) To draw a regression line on the plot, use the R command: A. >>> abline(0.36) B. >>> abline(20, 0.1) C. >>> abline (0.3, 5.0) D. >>> abline (0.1, 5.0) E. We cannot draw a regression line because we do not know the correlation coefficient. Answer: B 21. The regression line is drawn so that: A. The line goes through more points than any other possible line, straight or curved B. The line goes through more points than any other possible straight line. C. The same number of points are below and above the regression line. D. The sum of the absolute errors is as small as possible. E. The sum of the squared errors is as small as possible. Answer: E 22. In a regression, the --- that the standard error of the regression is, the greater the accuracy of the prediction will be. A. smaller. B. larger C. we do not know unless we know whether the slope of the regression is positive or negative. Answer: A. 23. In order for the regression technique to give the best and minimum variance prediction, all the following conditions must be met, EXCEPT for: A. The relation is linear. B. We have not omitted any significant variable. C. Both the X and Y variables (the predictors and the response) are normally distributed. D. The residuals (errors) are normally distributed. E. The variance around the regression line is about the same for all values of the predictor. Answer: C 24. Note that the last question said that all the conditions are needed for a best and minimum variance prediction. Very few real regressions can meet all the criteria, but their predictions may still be quite good -- at least unbiased, though not perhaps minimum variance. The cases in which the predictions will almost certainly be biased are: A. conditions A and B are not met. In case C, there is no problem, so regression will give the best and minimum variance prediction. In cases D and E, regression will not give the minimum variance prediction. 25. Ecological correlations are weaker than other types of correlation because: C. They are based on averages rather than individuals 26. If a regression has the problem of heteroscedasticity, A. The predictions it makes will be wrong on average. B. The predictions it makes will be correct on average, but we will not be certain of the RMSE C. It will also have the problem of an omitted variable or variables. D. It will also be based on a non-linear equation. Answer: B. Heteroscedasticity implies that the variance will differ for different values of the regressor.

6 The following regression is based on a randomly chosen subset of the ecgrow data set, with 30 countries. Call: ols(grate ~ invest + edu + gdp60 + openness + pop) Coefficients: Estimate t-stat Conf.interval (0.95) (Intercept) (-0.96, 1.28) invest (0.05, 0.14) edu (-0.12, 0.22) gdp (-0.05,-0.02) openness (1.19, 2.62) pop (-0.24, 0.35) SE of regression(or RMSE) = R-squared = If we had to drop two variables from the regression, we would pick: A. Edu and pop, because they have the lowest t-stats. B. GDP60 and edu, because they have the lowest coefficients. C. GDP60 and pop, because they have the lowest t-stats D. GDP60 and openness, because they have the highest t-stats in absolute value. Answer: A. 28. If we wanted to make a 95 percent confidence interval prediction, we would make a point prediction with the equation, but place around that a margin of error of about: A. +/ B. +/ C *.6493 D. +/- 2 *.9155 E. +/ Answer: D Suppose we had a much simpler regression, obtained with the command >>> regress(gdp85 ~ gdp60) Coefficients: Estimate t-stat Conf.interval (0.95) (Intercept) (0.63, 7.03) gdp (0.94, 1.13) SE of regression(or RMSE) = R-squared = If GDP in 1960 were 20 percent of US GDP (so gdp60 were 20), our prediction for gdp85 would be closest to: A. 20 B. 25 C. 30 D. 40 E. 50 Answer: B, since gdp85 = * (20) = = 24.63

. 58 58 60 62 64 66 68 70 72 74 76 78 Father s height (inches)

. 58 58 60 62 64 66 68 70 72 74 76 78 Father s height (inches) PEARSON S FATHER-SON DATA The following scatter diagram shows the heights of 1,0 fathers and their full-grown sons, in England, circa 1900 There is one dot for each father-son pair Heights of fathers and

More information

Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

More information

Correlation key concepts:

Correlation key concepts: CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

More information

Scatter Plot, Correlation, and Regression on the TI-83/84

Scatter Plot, Correlation, and Regression on the TI-83/84 Scatter Plot, Correlation, and Regression on the TI-83/84 Summary: When you have a set of (x,y) data points and want to find the best equation to describe them, you are performing a regression. This page

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there

CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there CORRELATIONAL ANALYSIS: PEARSON S r Purpose of correlational analysis The purpose of performing a correlational analysis: To discover whether there is a relationship between variables, To find out the

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2

Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables 2 Lesson 4 Part 1 Relationships between two numerical variables 1 Correlation Coefficient The correlation coefficient is a summary statistic that describes the linear relationship between two numerical variables

More information

Homework 11. Part 1. Name: Score: / null

Homework 11. Part 1. Name: Score: / null Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Worksheet A5: Slope Intercept Form

Worksheet A5: Slope Intercept Form Name Date Worksheet A5: Slope Intercept Form Find the Slope of each line below 1 3 Y - - - - - - - - - - Graph the lines containing the point below, then find their slopes from counting on the graph!.

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Linear Models in STATA and ANOVA

Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

The Correlation Coefficient

The Correlation Coefficient The Correlation Coefficient Lelys Bravo de Guenni April 22nd, 2015 Outline The Correlation coefficient Positive Correlation Negative Correlation Properties of the Correlation Coefficient Non-linear association

More information

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers

Correlation. What Is Correlation? Perfect Correlation. Perfect Correlation. Greg C Elvers Correlation Greg C Elvers What Is Correlation? Correlation is a descriptive statistic that tells you if two variables are related to each other E.g. Is your related to how much you study? When two variables

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation

Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Lecture 11: Chapter 5, Section 3 Relationships between Two Quantitative Variables; Correlation Display and Summarize Correlation for Direction and Strength Properties of Correlation Regression Line Cengage

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

Determine If An Equation Represents a Function

Determine If An Equation Represents a Function Question : What is a linear function? The term linear function consists of two parts: linear and function. To understand what these terms mean together, we must first understand what a function is. The

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability).

1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Examples of Questions on Regression Analysis: 1. Suppose that a score on a final exam depends upon attendance and unobserved factors that affect exam performance (such as student ability). Then,. When

More information

Using Excel for Statistical Analysis

Using Excel for Statistical Analysis Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots Correlational Research Stephen E. Brock, Ph.D., NCSP California State University, Sacramento 1 Correlational Research A quantitative methodology used to determine whether, and to what degree, a relationship

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

The Dummy s Guide to Data Analysis Using SPSS

The Dummy s Guide to Data Analysis Using SPSS The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests

More information

Statistics 151 Practice Midterm 1 Mike Kowalski

Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-

More information

Introduction to Linear Regression

Introduction to Linear Regression 14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

More information

Statistics E100 Fall 2013 Practice Midterm I - A Solutions

Statistics E100 Fall 2013 Practice Midterm I - A Solutions STATISTICS E100 FALL 2013 PRACTICE MIDTERM I - A SOLUTIONS PAGE 1 OF 5 Statistics E100 Fall 2013 Practice Midterm I - A Solutions 1. (16 points total) Below is the histogram for the number of medals won

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Section 3 Part 1. Relationships between two numerical variables

Section 3 Part 1. Relationships between two numerical variables Section 3 Part 1 Relationships between two numerical variables 1 Relationship between two variables The summary statistics covered in the previous lessons are appropriate for describing a single variable.

More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

table to see that the probability is 0.8413. (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: 60 38 = 1.

table to see that the probability is 0.8413. (b) What is the probability that x is between 16 and 60? The z-scores for 16 and 60 are: 60 38 = 1. Review Problems for Exam 3 Math 1040 1 1. Find the probability that a standard normal random variable is less than 2.37. Looking up 2.37 on the normal table, we see that the probability is 0.9911. 2. Find

More information

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0.

Predictor Coef StDev T P Constant 970667056 616256122 1.58 0.154 X 0.00293 0.06163 0.05 0.963. S = 0.5597 R-Sq = 0.0% R-Sq(adj) = 0. Statistical analysis using Microsoft Excel Microsoft Excel spreadsheets have become somewhat of a standard for data storage, at least for smaller data sets. This, along with the program often being packaged

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph.

c. Construct a boxplot for the data. Write a one sentence interpretation of your graph. MBA/MIB 5315 Sample Test Problems Page 1 of 1 1. An English survey of 3000 medical records showed that smokers are more inclined to get depressed than non-smokers. Does this imply that smoking causes depression?

More information

Name: Date: Use the following to answer questions 2-3:

Name: Date: Use the following to answer questions 2-3: Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4

4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4 4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression

More information

Econometrics Simple Linear Regression

Econometrics Simple Linear Regression Econometrics Simple Linear Regression Burcu Eke UC3M Linear equations with one variable Recall what a linear equation is: y = b 0 + b 1 x is a linear equation with one variable, or equivalently, a straight

More information

Solución del Examen Tipo: 1

Solución del Examen Tipo: 1 Solución del Examen Tipo: 1 Universidad Carlos III de Madrid ECONOMETRICS Academic year 2009/10 FINAL EXAM May 17, 2010 DURATION: 2 HOURS 1. Assume that model (III) verifies the assumptions of the classical

More information

Homework 8 Solutions

Homework 8 Solutions Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (a-d), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

More information

Means, standard deviations and. and standard errors

Means, standard deviations and. and standard errors CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares

Linear Regression. Chapter 5. Prediction via Regression Line Number of new birds and Percent returning. Least Squares Linear Regression Chapter 5 Regression Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). We can then predict the average response for all subjects

More information

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the

More information

Formula for linear models. Prediction, extrapolation, significance test against zero slope.

Formula for linear models. Prediction, extrapolation, significance test against zero slope. Formula for linear models. Prediction, extrapolation, significance test against zero slope. Last time, we looked the linear regression formula. It s the line that fits the data best. The Pearson correlation

More information

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

Descriptive statistics; Correlation and regression

Descriptive statistics; Correlation and regression Descriptive statistics; and regression Patrick Breheny September 16 Patrick Breheny STA 580: Biostatistics I 1/59 Tables and figures Descriptive statistics Histograms Numerical summaries Percentiles Human

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

More information

College Readiness LINKING STUDY

College Readiness LINKING STUDY College Readiness LINKING STUDY A Study of the Alignment of the RIT Scales of NWEA s MAP Assessments with the College Readiness Benchmarks of EXPLORE, PLAN, and ACT December 2011 (updated January 17, 2012)

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

Introduction to Linear Regression

Introduction to Linear Regression 14. Regression A. Introduction to Simple Linear Regression B. Partitioning Sums of Squares C. Standard Error of the Estimate D. Inferential Statistics for b and r E. Influential Observations F. Regression

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1.

So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1. Joint probabilit is the probabilit that the RVs & Y take values &. like the PDF of the two events, and. We will denote a joint probabilit function as P,Y (,) = P(= Y=) Marginal probabilit of is the probabilit

More information

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but Test Bias As we have seen, psychological tests can be well-conceived and well-constructed, but none are perfect. The reliability of test scores can be compromised by random measurement error (unsystematic

More information

Experiment #1, Analyze Data using Excel, Calculator and Graphs.

Experiment #1, Analyze Data using Excel, Calculator and Graphs. Physics 182 - Fall 2014 - Experiment #1 1 Experiment #1, Analyze Data using Excel, Calculator and Graphs. 1 Purpose (5 Points, Including Title. Points apply to your lab report.) Before we start measuring

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

Stata Walkthrough 4: Regression, Prediction, and Forecasting

Stata Walkthrough 4: Regression, Prediction, and Forecasting Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can t see them getting

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics UNIVERSITY OF DUBLIN TRINITY COLLEGE Faculty of Engineering, Mathematics and Science School of Computer Science & Statistics BA (Mod) Enter Course Title Trinity Term 2013 Junior/Senior Sophister ST7002

More information

Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking

More information

Chapter 23. Inferences for Regression

Chapter 23. Inferences for Regression Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily

More information

MATH 60 NOTEBOOK CERTIFICATIONS

MATH 60 NOTEBOOK CERTIFICATIONS MATH 60 NOTEBOOK CERTIFICATIONS Chapter #1: Integers and Real Numbers 1.1a 1.1b 1.2 1.3 1.4 1.8 Chapter #2: Algebraic Expressions, Linear Equations, and Applications 2.1a 2.1b 2.1c 2.2 2.3a 2.3b 2.4 2.5

More information

Chapter 5 Estimating Demand Functions

Chapter 5 Estimating Demand Functions Chapter 5 Estimating Demand Functions 1 Why do you need statistics and regression analysis? Ability to read market research papers Analyze your own data in a simple way Assist you in pricing and marketing

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

4. Multiple Regression in Practice

4. Multiple Regression in Practice 30 Multiple Regression in Practice 4. Multiple Regression in Practice The preceding chapters have helped define the broad principles on which regression analysis is based. What features one should look

More information

Econometrics Problem Set #2

Econometrics Problem Set #2 Econometrics Problem Set #2 Nathaniel Higgins nhiggins@jhu.edu Assignment The homework assignment was to read chapter 2 and hand in answers to the following problems at the end of the chapter: 2.1 2.5

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: What do the data look like? Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

More information

Forecast. Forecast is the linear function with estimated coefficients. Compute with predict command

Forecast. Forecast is the linear function with estimated coefficients. Compute with predict command Forecast Forecast is the linear function with estimated coefficients T T + h = b0 + b1timet + h Compute with predict command Compute residuals Forecast Intervals eˆ t = = y y t+ h t+ h yˆ b t+ h 0 b Time

More information

Nonlinear Regression Functions. SW Ch 8 1/54/

Nonlinear Regression Functions. SW Ch 8 1/54/ Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

More information