Multiple regression in RStudio

    

Use the appropriate function to read the following data into R as a data frame named lab7.data: CLICK HERE
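As a rough illustration of the reading step, the sketch below assumes the download is a comma-separated file. The assignment does not state the file name or format, so the temporary file, the four columns, and every value in it are made up purely so the example runs on its own.

```r
# Hypothetical stand-in for the downloaded file: a tiny CSV written to a temp path.
# The real file name, location, and format are not given in the assignment.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "event_ID,data_type,num_people,cost_controls",
  "1,financial,2.5,10.1",
  "2,medical,0.8,4.3",
  "3,financial,5.2,18.7"
), csv_path)

# read.csv() returns a data frame; header = TRUE is its default.
lab7.data <- read.csv(csv_path)

str(lab7.data)   # check column names and types
head(lab7.data)  # preview the first rows
```

With the real file, only the path passed to the reading function would change; a different format (for example tab-delimited or Excel) would call for a different reader.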

The data set represents several attributes of data breaches across several organizations. It includes 500 observations and 10 variables. The name and description of each variable in the data set are provided below.

  • event_ID: used to label each event
  • data_type: the type of data breached
  • num_people (in millions): the number of people impacted by a data breach, expressed in millions
  • num_people_v2: coded version of the variable num_people
  • num_records (in millions): the number of records breached, expressed in millions
  • per_sensitive: percent of sensitive data breached
  • per_sensitive_v2: coded version of the variable per_sensitive
  • dys_impact: the length of the negative financial impact from the data breach
  • dys_detect: the number of days it takes to detect the breach
  • cost_controls (in millions): the amount of money spent on security controls, expressed in millions

1.  

Use the best subsets approach to determine which variable(s) would best predict the cost of controls. Please be sure to exclude categorical variables such as event_ID.

NOTE: predictors is used as a placeholder for the actual predictors in your model. In your answer below, replace all the blanks, such as [1] and [2], with the correct syntax so that the lines of code work. Also replace predictors with the variable names of your predictors.

bestsubsets = [1]([2]~ predictors, data = [3], [4])

[5]([6], [7] = "adjr2")
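To make the idea behind best subsets concrete, here is a minimal sketch that enumerates every combination of predictors by hand in base R and keeps the model with the highest adjusted R² at each size. It is not the packaged function the template above has in mind, and the data frame sim is simulated: the column names are borrowed from the assignment, but the values and the relationship between them are invented.

```r
set.seed(1)
n <- 100

# Simulated stand-in data; column names follow the assignment, values are made up.
sim <- data.frame(
  num_people    = runif(n, 0, 10),
  num_records   = runif(n, 0, 50),
  per_sensitive = runif(n, 0, 100),
  dys_detect    = rpois(n, 30)
)
sim$cost_controls <- 2 + 1.5 * sim$num_people + 0.2 * sim$num_records + rnorm(n)

candidates <- setdiff(names(sim), "cost_controls")

# For each model size k, fit every k-variable model and keep the one
# with the highest adjusted R-squared.
best_by_size <- lapply(seq_along(candidates), function(k) {
  subsets <- combn(candidates, k, simplify = FALSE)
  adj_r2 <- vapply(subsets, function(vars) {
    fit <- lm(reformulate(vars, response = "cost_controls"), data = sim)
    summary(fit)$adj.r.squared
  }, numeric(1))
  best <- which.max(adj_r2)
  data.frame(size = k,
             predictors = paste(subsets[[best]], collapse = " + "),
             adj_r2 = adj_r2[best],
             stringsAsFactors = FALSE)
})

best_table <- do.call(rbind, best_by_size)
print(best_table)
```

Each row of best_table corresponds to one row of a best subsets plot with one model per size. Exhaustive enumeration is only practical for a small number of candidate predictors, since the number of subsets doubles with each added variable.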

2.

The single best one-variable model includes which of the following variables?

Group of answer choices:
  • data_type
  • num_people
  • num_people_v2
  • num_records
  • per_sensitive
  • per_sensitive_v2
  • dys_impact
  • dys_detect

3.

The single best two-variable model includes which of the following variables?

Group of answer choices:
  • data_type
  • num_people
  • num_people_v2
  • num_records
  • per_sensitive
  • per_sensitive_v2
  • dys_impact
  • dys_detect

4.

The single best three-variable model includes which of the following variables?

Group of answer choices:
  • data_type
  • num_people
  • num_people_v2
  • num_records
  • per_sensitive
  • per_sensitive_v2
  • dys_impact
  • dys_detect

5.

The single best four-variable model includes which of the following variables?

Group of answer choices:
  • data_type
  • num_people
  • num_people_v2
  • num_records
  • per_sensitive
  • per_sensitive_v2
  • dys_impact
  • dys_detect

6.

Run five separate regression models in R that represent the five models shown in the best subsets plot. Number your models sequentially from Model 1 to Model 5 based on the number of predictors each includes. Provide the Adjusted R² values for each of your five models below.

Note: Please report the values as displayed in R. Do not round them.

Model 1:  

Model 2:  

Model 3:  

Model 4:  

Model 5:  
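The sketch below shows one way to fit nested regression models with lm() and read off the adjusted R² from summary(). The data are simulated (column names borrowed from the assignment, values invented), and only two models are fitted to keep it short. The last lines check the reported value against the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where p is the number of predictors.

```r
set.seed(2)
n <- 100

# Simulated stand-in data; values are made up for illustration.
sim <- data.frame(
  num_people  = runif(n, 0, 10),
  num_records = runif(n, 0, 50)
)
sim$cost_controls <- 2 + 1.5 * sim$num_people + 0.2 * sim$num_records + rnorm(n)

model1 <- lm(cost_controls ~ num_people, data = sim)
model2 <- lm(cost_controls ~ num_people + num_records, data = sim)

# summary() prints the coefficient table (with p-values) and both R-squared values.
summary(model2)

# The adjusted R-squared can also be extracted directly.
adj_r2 <- c(model1 = summary(model1)$adj.r.squared,
            model2 = summary(model2)$adj.r.squared)
print(adj_r2)

# Cross-check model2 against the formula, with p = 2 predictors.
r2 <- summary(model2)$r.squared
p <- 2
manual_adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(manual_adj_r2)
```

The same pattern extends to five models by adding one lm() call per model.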

7.

After examining the significance of the predictors in each model and their Adjusted R², which of the following models provides the best fit for predicting the cost of controls?

Group of answer choices:
  • Model 1
  • Model 2
  • Model 3
  • Model 4
  • Model 5
  • Model 6

 8.  

Evaluate Model 5 for multicollinearity and provide the estimates below.

The highest VIF among your predictors is:  

The lowest tolerance among your predictors is:  

Note: Please report each of these values as displayed in R. Do not round them.
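For reference, the two diagnostics are closely related: for predictor j, tolerance is 1 - R²_j, where R²_j comes from regressing that predictor on all the other predictors, and VIF is the reciprocal of tolerance. The sketch below computes both by hand in base R on simulated predictors (the names x1, x2, x3 and their values are hypothetical), which is one way to see what a packaged VIF function would report.

```r
set.seed(3)
n <- 200

# Hypothetical predictors; x2 is deliberately built to correlate with x1.
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- rnorm(n)
preds <- data.frame(x1, x2, x3)

# Tolerance for each predictor: 1 minus the R-squared from regressing
# that predictor on all of the others.
tolerance <- vapply(names(preds), function(v) {
  others <- setdiff(names(preds), v)
  fit <- lm(reformulate(others, response = v), data = preds)
  1 - summary(fit)$r.squared
}, numeric(1))

# VIF is the reciprocal of tolerance.
vif <- 1 / tolerance

print(rbind(tolerance = tolerance, vif = vif))
print(max(vif))        # highest VIF
print(min(tolerance))  # lowest tolerance
```

Because one is the reciprocal of the other, the predictor with the highest VIF is always the one with the lowest tolerance. The cutoffs used to judge concern vary by textbook and are not stated in this assignment.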

9.

Based on the results from your calculations for the tolerance, you can conclude that for Model 5 there is:

Group of answer choices:
  • a potential concern for multicollinearity.
  • a serious concern for multicollinearity.
  • no concern for multicollinearity.


10.

Based on the results from your calculations for the VIF, you can conclude that for Model 5 there is:

Group of answer choices:
  • a concern for multicollinearity.
  • no concern for multicollinearity.

11.

Use R to generate a correlation matrix for the predictors used in Model 5. Based on your results, the strongest correlation is found between which two of the following predictors?

Group of answer choices:
  • num_people
  • num_records
  • per_sensitive
  • dys_impact
  • dys_detect
  • cost_controls

12.

The value of the strongest correlation between your predictors is:
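A correlation matrix for a set of predictors can be produced with cor(), and the strongest pairwise correlation can then be located programmatically instead of by eye. The sketch below uses simulated predictors (x1, x2, x3 and their values are hypothetical), and "strongest" is taken to mean largest in absolute value.

```r
set.seed(4)
n <- 200

# Hypothetical predictors; x2 is deliberately built to correlate with x1.
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- rnorm(n)
preds <- data.frame(x1, x2, x3)

# Pearson correlation matrix of the predictors.
cor_mat <- cor(preds)
print(cor_mat)

# Blank out the diagonal (always 1) and the duplicated upper triangle,
# then find the entry with the largest absolute value.
off_diag <- cor_mat
off_diag[upper.tri(off_diag, diag = TRUE)] <- NA
idx <- which(abs(off_diag) == max(abs(off_diag), na.rm = TRUE), arr.ind = TRUE)

strongest <- data.frame(
  var1 = rownames(cor_mat)[idx[1, "row"]],
  var2 = colnames(cor_mat)[idx[1, "col"]],
  r    = cor_mat[idx[1, "row"], idx[1, "col"]],
  stringsAsFactors = FALSE
)
print(strongest)
```

With real data, preds would be replaced by the subset of columns used as predictors in the model being checked.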

 
 






