Exercises for Chapter 2.1

Exercises for Chapter 2.1#

Exercise 2.1

Define the terms population, sample, variable, data value, data, experiment, parameter, and statistic.

Exercise 2.2

Can you provide an example from the field of actuarial science or business where you can identify the population, sample, variable, data value, data, experiment, parameter, and statistic to demonstrate a clear understanding of these key components in a practical context?

Exercise 2.3 Adecco’s Salary Guide 2023

The Adecco Group Thailand, a world-leading HR solutvions agency, has revealed its Salary Guide 2023 which showcases the information of minimum and maximum salaries.

Alt Text

Image from: https://www.nationthailand.com/business/corporate/40024522

  1. What is the population?

  2. Describe the sample used for this report.

  3. Identify the variables used to collect this information.

Example 2.4 mtcars dataset

In this exercise, we use the “mtcars” dataset, which is included in the base R package. It contains information about various car models, including attributes such as miles per gallon (mpg), number of cylinders, and horsepower.

You can stratify the sampling based on the number of cylinders in this dataset.

Tasks:

  1. Explore Cylinder Subgroups in mtcars Dataset

    • Analyze mtcars to understand vehicle characteristics based on cylinder count.

  2. Introduce “CylinderGroup” Variable

    • Use ‘cut’ to create “CylinderGroup” for 3-4, 5-6, and 7-8 cylinders.

  3. Facilitate Subgroup Analysis

    • Utilize “CylinderGroup” for efficient subgroup analysis.

  4. Implement Stratified Sampling

    • Ensure sampling reflects cylinder group proportions.

  5. Enhance Analytical Approach

    • Extract insights from unique characteristics within each cylinder subgroup.

Tip

Here’s a template for using the cut function to create subgroups in R:

# Create a categorical variable for stratification
your_dataframe <- your_dataframe %>%
  mutate(NewVariable = cut(as.numeric(your_variable), breaks = c(break_points), labels = c("Label1", "Label2", "Label3")))

Replace the following placeholders:

  • your_dataframe: Replace with the name of your data frame.

  • your_variable: Replace with the variable you want to categorize.

  • NewVariable: Replace with the desired name for your new categorical variable.

  • break_points: Replace with the specific break points you want to use for subgrouping.

  • "Label1", "Label2", "Label3": Replace with the labels you want for each subgroup.

For example, using the mtcars dataset:

# Create a categorical variable for stratification
mtcars <- mtcars %>%
  mutate(CylinderGroup = cut(as.numeric(cyl), breaks = c(3, 4, 6, 8), labels = c("Low", "Medium", "High")))

Exercise 2.5 freMTPL2freq dataset from CASdatasets package

We explore the freMTPL2freq dataset from the R package CASdatasets, which represents a French motor third-party liability (MTPL) insurance portfolio with observed claim counts in a single accounting year. Use R to delve into this dataset and provide answers to the following questions:

  1. How many observations are there in the dataset?

  2. What is the structure of the dataset?

  3. What is the data type of the ‘IDpol’ variable?

  4. How many levels does the ‘IDpol’ variable have?

  5. Which variables are qualitative (categorical) in nature?

  6. How many quantitative variables are there in the dataset?

  7. What are the possible levels for the ‘VehGas’ variable?

  8. What is the range of values for the ‘VehPower’ variable?

  9. Which regions are included in the dataset?

  10. What is the distribution of values in the ‘Area’ variable?

Exercise 2.6

  1. How can one summarize a qualitative variable? Which chart types are suitable for graphing a qualitative variable?

  2. How does one summarize a quantitative variable? What chart types can be utilized for graphing a quantitative variable?