# Simulated employee satisfaction dataset
set.seed(123)
employee_data <- data.frame(
  work_life_balance = rnorm(300, mean = 3, sd = 0.5),
  monthly_income = rnorm(300, mean = 5000, sd = 1000),
  job_role = sample(c("Engineer", "Manager", "Clerk", "Sales"), 300, replace = TRUE),
  satisfaction_level = rnorm(300, mean = 0.7, sd = 0.1)
)
# Clamp satisfaction_level to the [0, 1] range
employee_data$satisfaction_level <- pmin(pmax(employee_data$satisfaction_level, 0), 1)
# View dataset
head(employee_data)
7 Deep Learning for Social Sciences
Deep learning is a subset of machine learning that leverages neural networks to analyze complex patterns in data (Chollet & Allaire, 2021; LeCun et al., 2015). While traditionally used in domains like image recognition and natural language processing, deep learning has significant potential in social sciences for uncovering insights from large and unstructured datasets.
7.1 Introduction to Deep Learning
7.1.1 What is Deep Learning?
Deep learning involves training artificial neural networks to recognize patterns in data. These networks are loosely inspired by how the human brain processes information and use multiple layers to learn hierarchical representations.
7.1.2 Key Features of Deep Learning
- Representation Learning: Learns features directly from data.
- Nonlinear Transformations: Captures complex relationships in data.
- Scalability: Handles large and unstructured datasets.
7.3 Neural Networks: The Basics
A neural network consists of:
- Input Layer: Takes raw data (e.g., age, income, survey responses).
- Hidden Layers: Process data using weights, biases, and activation functions.
- Output Layer: Produces the final prediction or classification.
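To make the three layer types concrete, here is a minimal base R sketch of a single forward pass through a tiny network (2 inputs, 3 hidden units, 1 output). The layer sizes and the random weights are assumptions chosen for illustration; nothing here is trained.

```r
# Forward pass through a tiny network: 2 inputs -> 3 hidden units -> 1 output.
# All weights are random placeholders, not trained values.
set.seed(1)

relu    <- function(z) pmax(z, 0)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Input layer: four observations with two features (e.g., scaled age and income)
x <- matrix(rnorm(8), nrow = 4, ncol = 2)

# Hidden layer: weights (2 x 3) and biases (length 3)
W1 <- matrix(rnorm(6), nrow = 2, ncol = 3)
b1 <- rep(0, 3)

# Output layer: weights (3 x 1) and a single bias
W2 <- matrix(rnorm(3), nrow = 3, ncol = 1)
b2 <- 0

hidden <- relu(sweep(x %*% W1, 2, b1, "+"))  # 4 x 3 matrix of hidden activations
output <- sigmoid(hidden %*% W2 + b2)         # 4 x 1 matrix of values in (0, 1)
output
```

Each row of `output` is the network's prediction for one observation. Training would adjust `W1`, `b1`, `W2`, and `b2` so that these predictions move closer to the observed outcomes.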
7.3.1 How Neural Networks Learn
- Weights and Biases: Adjusted during training to minimize prediction errors.
- Activation Functions: Introduce non-linearity (e.g., ReLU, sigmoid).
- Backpropagation: Updates weights by minimizing loss using gradient descent.
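The learning loop described above can be sketched for the simplest possible case: a single sigmoid neuron trained with gradient descent on simulated data. With only one layer, backpropagation reduces to one application of the chain rule. The data-generating coefficients, the learning rate, and the number of steps are assumptions made for this illustration.

```r
set.seed(42)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Simulated binary outcome driven by one standardized predictor (assumed setup)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = sigmoid(1.5 * x - 0.5))

# Binary cross-entropy loss for a given weight w and bias b
bce <- function(w, b) {
  p <- sigmoid(w * x + b)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

w <- 0
b <- 0
learning_rate <- 0.5
loss_history <- numeric(100)

for (step in 1:100) {
  p <- sigmoid(w * x + b)          # forward pass
  grad_w <- mean((p - y) * x)      # gradient of the loss with respect to w
  grad_b <- mean(p - y)            # gradient of the loss with respect to b
  w <- w - learning_rate * grad_w  # gradient descent update
  b <- b - learning_rate * grad_b
  loss_history[step] <- bce(w, b)
}

c(first = loss_history[1], last = loss_history[100])
```

The loss after the last step should be lower than after the first, which is the behaviour that keras automates across many layers and parameters.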
7.4 Implementing a Simple Neural Network
We’ll use the keras package in R to build a neural network that predicts survey responses.
7.4.1 Install and Load Keras
install.packages("keras")
library(keras)
7.4.2 Install TensorFlow backend
install_keras()
7.4.3 Build and Train the Model
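The training code in this section refers to train_data and test_data, which the chapter does not define. A compatible dataset could be simulated as in the sketch below. The variable ranges, the coefficients that generate outcome, and the 80/20 split are all assumptions made for illustration.

```r
set.seed(123)
n <- 500

# Hypothetical survey data: two numeric predictors and a binary outcome
survey <- data.frame(
  age    = round(runif(n, min = 18, max = 70)),
  income = rnorm(n, mean = 5000, sd = 1000)
)
survey$outcome <- rbinom(
  n, size = 1,
  prob = plogis(0.03 * (survey$age - 44) + 0.0005 * (survey$income - 5000))
)

# 80/20 split into training and test rows
train_idx  <- sample(seq_len(n), size = floor(0.8 * n))
train_data <- survey[train_idx, ]
test_data  <- survey[-train_idx, ]

# Standardize predictors using training-set statistics only
for (v in c("age", "income")) {
  m <- mean(train_data[[v]])
  s <- sd(train_data[[v]])
  train_data[[v]] <- (train_data[[v]] - m) / s
  test_data[[v]]  <- (test_data[[v]] - m) / s
}

str(train_data)
```

Standardizing the inputs is not strictly required, but neural networks generally train more reliably when predictors are on comparable scales.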
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(2)) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")
model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)
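# train_data is assumed to hold numeric age and income columns and a 0/1 outcome column.
# keras expects a matrix of features, so the data frame columns are converted first.
history <- model %>% fit(
  x = as.matrix(train_data[, c("age", "income")]),
  y = train_data$outcome,
  epochs = 50,
  batch_size = 10,
  validation_split = 0.2
)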
7.4.4 Evaluate the Model
model %>% evaluate(as.matrix(test_data[, c("age", "income")]), test_data$outcome)
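The evaluation step reports the loss and the accuracy metric requested at compile time. To show what accuracy means for a binary classifier, the sketch below computes it by hand from a handful of made-up predicted probabilities, assuming the usual 0.5 classification threshold. The numbers are hypothetical and are not output from the model above.

```r
# Hypothetical predicted probabilities and true labels for six respondents
predicted_prob <- c(0.91, 0.20, 0.65, 0.40, 0.08, 0.77)
actual         <- c(1,    0,    0,    1,    0,    1)

# Convert probabilities to class labels with a 0.5 threshold
predicted_class <- as.integer(predicted_prob > 0.5)

# Confusion matrix and accuracy (share of correct predictions)
confusion <- table(Predicted = predicted_class, Actual = actual)
accuracy  <- mean(predicted_class == actual)

confusion
accuracy
```

Here four of the six predictions are correct. The confusion matrix additionally shows which kind of error the model makes, which accuracy alone hides.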
7.5 Ethical Considerations in Deep Learning
1. Bias in Training Data:
- Models trained on biased data may perpetuate inequalities.
- Example: Overrepresentation of certain groups in survey data.
2. Privacy Concerns:
- Deep learning often requires large datasets, raising questions about consent and data security.
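One simple way to look for the overrepresentation problem mentioned above is to compare group shares in the sample with known population shares before training. The sketch below is only an illustration: the group labels, the sample, and the population shares are all hypothetical.

```r
# Hypothetical survey sample and hypothetical population shares
sample_groups    <- c(rep("Urban", 70), rep("Rural", 30))
population_share <- c(Rural = 0.45, Urban = 0.55)

# Share of each group in the sample
sample_share <- prop.table(table(sample_groups))

comparison <- data.frame(
  group      = names(population_share),
  sample     = as.numeric(sample_share[names(population_share)]),
  population = as.numeric(population_share)
)

# Ratio above 1 means the group is overrepresented in the sample
comparison$ratio <- comparison$sample / comparison$population
comparison
```

A ratio well above or below 1 suggests that reweighting or additional data collection may be worth considering before the data are used for training.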
7.6 Summary
In this chapter, we:
- Introduced deep learning and neural networks.
- Explored applications of deep learning in social sciences.
- Implemented a simple neural network using keras.
- Discussed ethical considerations in deep learning research.
7.7 Case Study: Predicting Employee Satisfaction with Neural Networks
7.7.1 Introduction
Employee satisfaction is a crucial metric for organizational success. Predicting satisfaction levels based on survey responses can help companies identify areas for improvement. In this case study, we demonstrate how to build and train a neural network to predict satisfaction levels using the keras package.
7.7.2 Objective
This case study demonstrates:
1. Building a neural network using the keras package.
2. Training the network on employee satisfaction survey data.
3. Evaluating model performance and interpreting results.
7.7.3 Dataset
We simulate an employee satisfaction dataset for this case study.
7.7.4 Step 1: Data Splitting
library(tidymodels)
# Split the data
set.seed(123)
employee_split <- initial_split(employee_data, prop = 0.8)
train_data <- training(employee_split)
test_data <- testing(employee_split)
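For readers who want to see what an 80/20 split does without tidymodels, the same idea can be sketched in base R. The stand-in data frame demo_data is hypothetical and exists only so the example runs on its own; the row assignment will not match the one produced by initial_split().

```r
set.seed(123)

# Stand-in data frame with 300 rows (hypothetical, for illustration only)
demo_data <- data.frame(id = 1:300, score = rnorm(300))

# Randomly assign 80% of the rows to training and the rest to testing
n          <- nrow(demo_data)
train_rows <- sample(seq_len(n), size = floor(0.8 * n))
demo_train <- demo_data[train_rows, ]
demo_test  <- demo_data[-train_rows, ]

c(train = nrow(demo_train), test = nrow(demo_test))
```

With 300 rows this yields 240 training rows and 60 test rows, and no row appears in both sets.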
7.7.5 Step 2: Data Preprocessing
# Define a recipe
employee_recipe <- recipe(satisfaction_level ~ ., data = train_data) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_numeric_predictors())
# Prepare the recipe and apply it to both splits
prepared_recipe <- prep(employee_recipe)
processed_train <- bake(prepared_recipe, new_data = NULL)
processed_test <- bake(prepared_recipe, new_data = test_data)
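The two recipe steps can be illustrated in base R on a toy example. This is a sketch of the underlying ideas (dummy coding with a reference level, and scaling with training-set statistics), not a reproduction of what the recipes package does internally. The toy data frames are hypothetical.

```r
set.seed(123)

# Toy training and test data (hypothetical values)
toy_train <- data.frame(
  monthly_income = rnorm(8, mean = 5000, sd = 1000),
  job_role = factor(c("Engineer", "Manager", "Clerk", "Sales",
                      "Clerk", "Engineer", "Sales", "Manager"))
)
toy_test <- data.frame(
  monthly_income = c(4200, 6100),
  job_role = factor(c("Sales", "Clerk"), levels = levels(toy_train$job_role))
)

# Dummy coding: one 0/1 column per level except the reference level
dummies_train <- model.matrix(~ job_role, data = toy_train)[, -1]
dummies_test  <- model.matrix(~ job_role, data = toy_test)[, -1]

# Normalization: center and scale using training-set mean and sd only
income_mean <- mean(toy_train$monthly_income)
income_sd   <- sd(toy_train$monthly_income)
income_train_scaled <- (toy_train$monthly_income - income_mean) / income_sd
income_test_scaled  <- (toy_test$monthly_income - income_mean) / income_sd

colnames(dummies_train)
round(income_test_scaled, 2)
```

Reusing the training mean and standard deviation for the test set mirrors what prep() followed by bake() achieves: the test data never influence the preprocessing parameters. One difference worth noting is that in the recipe the dummy columns are created before step_normalize() runs, so they would be rescaled as well, whereas this sketch leaves them as 0/1 for readability.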
7.7.6 Step 3: Build the Neural Network
library(keras)
# Define the model
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = ncol(processed_train) - 1) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")
# Compile the model
model %>% compile(
  optimizer = "adam",
  loss = "mse",
  metrics = c("mae")
)
# Summary of the model
summary(model)
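The parameter counts reported by the model summary can be checked by hand: a dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases. The sketch below assumes five input features (two numeric predictors plus three dummy columns for the four job roles). The helper dense_params is a hypothetical name introduced here.

```r
# Number of trainable parameters in one dense layer: weights plus biases
dense_params <- function(n_in, n_out) n_in * n_out + n_out

# Assumption: 2 numeric predictors + 3 job_role dummy columns = 5 features
n_features  <- 5
layer_sizes <- c(n_features, 16, 8, 1)

# Pair each layer's input size with its output size
params <- mapply(dense_params, head(layer_sizes, -1), tail(layer_sizes, -1))

params
sum(params)
```

Under that assumption the three layers have 96, 136, and 9 parameters, 241 in total. If the preprocessing produces a different number of columns, the first count changes accordingly.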
7.7.7 Step 4: Train the Model
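# Separate features and target. Columns are selected by name because, after
# step_dummy(), the outcome is not guaranteed to be the last column.
x_train <- as.matrix(processed_train[, setdiff(names(processed_train), "satisfaction_level")])
y_train <- as.matrix(processed_train$satisfaction_level)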
# Train the model
history <- model %>% fit(
  x = x_train,
  y = y_train,
  epochs = 50,
  batch_size = 16,
  validation_split = 0.2
)
# Plot training history
plot(history)
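The training arguments translate into concrete numbers that help when reading the history plot. The arithmetic below assumes 240 training rows (80% of the 300 simulated employees). As documented for Keras, the validation rows are held out from the end of the training data before shuffling.

```r
n_train_rows     <- 240   # assumption: 80% of 300 simulated rows
validation_split <- 0.2
batch_size       <- 16

# Rows used for fitting versus rows held out for validation
n_fit <- floor(n_train_rows * (1 - validation_split))
n_val <- n_train_rows - n_fit

# Number of weight updates (batches) per epoch
updates_per_epoch <- ceiling(n_fit / batch_size)

c(fit = n_fit, validation = n_val, updates_per_epoch = updates_per_epoch)
```

So each of the 50 epochs would perform 12 weight updates on 192 rows, and the validation curves in the plot would be computed on the remaining 48 rows.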
7.7.8 Step 5: Evaluate the Model
# Separate features and target (select by name, as for the training data)
x_test <- as.matrix(processed_test[, setdiff(names(processed_test), "satisfaction_level")])
y_test <- as.matrix(processed_test$satisfaction_level)
# Evaluate the model
evaluation <- model %>% evaluate(x_test, y_test)
evaluation
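The evaluation returns the loss (MSE) and the requested metric (MAE). Both are easy to compute by hand, and comparing them with a naive baseline that always predicts the mean gives a sense of whether the network has learned anything. The values below are made up for illustration and are not model output. The baseline mean of 0.70 is likewise an assumption.

```r
# Hypothetical actual and predicted satisfaction levels
actual    <- c(0.62, 0.75, 0.81, 0.58, 0.70)
predicted <- c(0.66, 0.71, 0.74, 0.65, 0.69)

# Mean squared error and mean absolute error
mse <- mean((actual - predicted)^2)
mae <- mean(abs(actual - predicted))

# Naive baseline: always predict an assumed training-set mean of 0.70
baseline_mae <- mean(abs(actual - 0.70))

c(mse = mse, mae = mae, baseline_mae = baseline_mae)
```

A model whose MAE is not clearly below the baseline MAE adds little beyond predicting the average.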
7.7.9 Step 6: Make Predictions
# Predict satisfaction levels
predictions <- model %>% predict(x_test)
# Combine predictions with actual values
results <- data.frame(
  Actual = y_test,
  Predicted = predictions
)
# View results
head(results)
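When interpreting these predictions, it helps to remember how the data were generated: in the simulated dataset, satisfaction_level is drawn independently of the predictors, so one would expect any model to predict values close to the overall mean. A quick linear regression on the same simulated data, sketched below as a sanity check rather than as part of the original workflow, makes this visible.

```r
# Recreate the simulated dataset used in this case study
set.seed(123)
employee_data <- data.frame(
  work_life_balance = rnorm(300, mean = 3, sd = 0.5),
  monthly_income = rnorm(300, mean = 5000, sd = 1000),
  job_role = sample(c("Engineer", "Manager", "Clerk", "Sales"), 300, replace = TRUE),
  satisfaction_level = rnorm(300, mean = 0.7, sd = 0.1)
)
employee_data$satisfaction_level <- pmin(pmax(employee_data$satisfaction_level, 0), 1)

# Linear regression baseline using all predictors
baseline_fit <- lm(satisfaction_level ~ ., data = employee_data)
r_squared    <- summary(baseline_fit)$r.squared
r_squared
```

An R-squared close to zero indicates that the predictors carry essentially no information about the outcome here. With real survey data, where such relationships may exist, the same comparison shows how much the neural network gains over a simple linear model.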