Financial Consumer Complaints

Consumer complaints on financial products & services for Bank of America from 2017 to 2023

CASE

In the 2017-2023 period there were many complaints from customers regarding Bank of America's financial products and services covering many aspects. From this case we will analyze and look for answers to the following questions:

Do consumer complaints show any seasonal patterns?
Which products present the most complaints? What are its most common issues?
How are complaints typically resolved?
What can be learned from the complaints with untimely responses?

To solve the problem, Dataset created in Microsoft Excel from Maven Analytics will be used to provide concrete answers. Here Pandas will also be used for data search and data visualization.

INSTALLING NECESSARY RECOURSES

Let's get started. The first thing we need to do is we have to install the resources we need such as Pandas for analyzing data & MatPlotLib and Seaborn for data visualization

											
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

EXPORTING DATASET

after that we have to download the dataset from Maven Analytics and export it to Jupyter Notebook

											
file = pd.read_csv(Data/complaints.csv')

DATA CLEANING

Then we have to clean the data. Here i have dropped duplicates and delete every data whose complaints does not have sub-product in it (sub-product is NULL), because every product should have sub-products

											
file = file.drop_duplicates() # dropping duplicates
file = file.dropna(subset=['Sub-product']) # deleting data without sub-product

STANDARDIZING DATA

Its aims are to make the data uniform and prevent us from errors that might occur due to the diversity of the data. For example, I have set the date data to the same format

											
file['Date submitted'] = pd.to_datetime(file['Date submitted'])
file['Date received'] = pd.to_datetime(file['Date received'])

QUESTION 1 ~ Do consumer complaints show any seasonal patterns?

Here we need to look at the number of complaints each year and search the average in order to analyze this question

											
file['Year'] = file['Date submitted'].dt.year # Extracting Years from column 'Date submitted'
yearly_complaints = file.groupby('Year')['Complaint ID'].count() # Count the number of complaints each year
avg_complaints = yearly_complaints.mean() # Searching average

# Visualizing data
plt.figure(figsize=(15, 8))
yearly_complaints.plot(marker='o')
plt.title('Seasonal Complaint Trends')
plt.xlabel('Year')
plt.ylabel('Number of Complaints')
plt.grid(True)
plt.show()

The pattern we can see here is that the highest number of complaints is in 2022, the lowest in 2017 and average on 8929 complaints each year. Furthermore, There is a consistent increase from 2019-2022. This indicates that there is a serious problem on the company side. Most likely it lies in the service and products offered. It is true that in 2023 the number of complaints have been decreasing, but compared to previous years, the number of complaints is still relatively high. Even so, this indicates that there have been improvements to the existing system, perhaps starting from the service system, products or policies. Consumer behavior may also be very influential in reducing the number of complaints. It should be noted that the strategy currently used by the company should be maintained and always reviewed so that the number of complaints continues to decrease and customers are always satisfied with the products offered.

Then we also need to look at complaint trends based on the products offered by the company to see what products need to be reviewed.

											
result = file.groupby([file['Date submitted'].dt.year, 'Product']).size() # Calculate the number of complaints for each year and product combination
result = result.reset_index(name='Total Complaints') # Reset the index to be able to use its columns in the plot

# Visualizing data
plt.figure(figsize=(18, 13))
for product, data in result.groupby('Product'):
plt.plot(data['Date submitted'], data['Total Complaints'], marker='o', label=product)

plt.title('Seasonal Complaint Trends by Product')
plt.xlabel('Year')
plt.ylabel('Total Complaints')
plt.legend()
plt.grid(True)
plt.show()

From the graph above, it can be seen that the number of complaints per product each year was quite stable and even tends to decrease in 2018-2019, then increased rapidly in 2020-2022, except for "Mortgage" and "Debt Collection" products. And in 2023, all products will show a significant decline. 2 products that need special attention in this case are "Checking or savings account" and "Credit Card or Prepaid Card". It can be seen that there are a lot of customers who use this product, so product quality and service speed need to be improved so that customer comfort is guaranteed.

QUESTION 2 ~ Which products present the most complaints? What are its most common issues?

Here we have to calculate the total complaints from each product and look for the problems that most frequently arise from each product. Here I will write the syntax to find the product with the highest total complaints and the problems that occur most frequently in each product.

											
# Product with most complaints
product_complaints = file.groupby('Product').size() # Count the number of complaints for each product 

# Visualizing data
plt.figure(figsize=(22, 8))

sorted_products = product_complaints.sort_values(ascending=False)

bars = plt.bar(range(len(sorted_products)), sorted_products.values, color=plt.cm.get_cmap('Pastel1')(range(len(sorted_products))))

plt.title('Total Complaints by Product')
plt.xlabel('Product')
plt.ylabel('Total Complaints')
plt.xticks([])
plt.grid(False)

for bar, value in zip(bars, sorted_products.values):
    plt.text(bar.get_x() + bar.get_width()/2, value + 0.1, str(value), ha='center', va='bottom')

plt.legend(bars, sorted_products.index, title='Product')

plt.show()

Here you can see that the product with the highest complaints is "Checking or savings account" while the lowest is "Student loan". For products with the highest complaints, companies need to review the product starting from the services offered, programs and services.

Then we will see what problems customers often complain about with a product

												
# Most common issues
product_issue_counts = file.groupby(['Product', 'Issue']).size().reset_index(name='Count') # Calculate the number of complaints for each combination of product and problem 
top_issue_per_product = product_issue_counts.sort_values(['Product', 'Count'], ascending=[True, False])
top_issue_per_product = top_issue_per_product.groupby('Product').head(1).reset_index(drop=True)

# Visualizing data
plt.figure(figsize=(14, 8))

unique_combinations = top_issue_per_product[['Product', 'Issue']].apply(lambda x: f"{x['Product']} <=> {x['Issue']}", axis=1)
colors = plt.cm.tab20(range(len(unique_products)))
legend_handles = []

for i, product in enumerate(unique_products):
    product_data = top_issue_per_product[top_issue_per_product['Product'] == product]
    bars = plt.bar(product_data['Issue'], product_data['Count'], color=colors[i])
    legend_handles.append(bars[0])

plt.title('Most Common Issue for Each Product')
plt.xlabel('Count of Issues')
plt.ylabel('Number of Complaints')
plt.xticks([])

plt.legend(legend_handles, unique_combinations, title='Product <=> Most Issues')

for index, value in enumerate(top_issue_per_product['Count']):
	plt.text(index, value, str(value), ha='center', va='bottom')

plt.tight_layout()
plt.show()

You can see each product with its problems. It is best to immediately find a solution that can reduce the number of complaints and improve quality. An example that can be taken is the product "Checking or savings account", a possible solution that could be offered to customers is to make instructions for managing the account properly or if the company uses a website or application that is related to this problem, then this website / application should have an easier interface.

QUESTION 3 ~ How are complaints typically resolved?

Here we will identify how problems are solved and how often these solutions are used to solve a problem

											
file['Monetary Relief'] = (file['Company response to consumer'] == 'Closed with monetary relief').astype(int)
file['Explanation'] = (file['Company response to consumer'] == 'Closed with explanation').astype(int)
file['Non-Monetary Relief'] = (file['Company response to consumer'] == 'Closed with non-monetary relief').astype(int)

# Calculate the amount for each condition
result = pd.DataFrame({
'Monetary Relief': file['Monetary Relief'].sum(),
'Explanation': file['Explanation'].sum(),
'Non-Monetary Relief': file['Non-Monetary Relief'].sum()
}, index=[0])

dfmi = pd.DataFrame(result.values.reshape(1, -1), columns=pd.MultiIndex.from_product([['Monetary Relief', 'Explanation', 'Non-Monetary Relief']]))

# Visualizing data
if 'Total' in dfmi.columns:
    ax = dfmi.drop(columns=['Total']).plot(kind='bar', figsize=(10, 6))
else:
    ax = dfmi.plot(kind='bar', figsize=(10, 6))

for p in ax.patches:
    ax.annotate(str(p.get_height()), (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center', xytext=(0, 5), textcoords='offset points')

plt.title('Complaint Resolution Types')
plt.xlabel('Category')
plt.ylabel('Number of Complaints')
plt.xticks(rotation=0)  # Rotasi label sumbu x menjadi horizontal
plt.legend(title='Resolution Type')
plt.show()

"Words trump all". that's the principle companies use to "appease" customers. With good explanations, all problems complained by customers can be resolved well.

QUESTION 4 ~ What can be learned from the complaints with untimely responses?

To answer this we need to look at complaints that are not responded to in a timely manner by also looking at aspects of the product, sub-product, problem and frequency of complaints and how long it takes for complaints to be responded to. Here I will display the top 5 from all categories

											
# Filter untimely complaints
untimely_responses = file[file['Timely response?'] == 'No']

# Descriptive analysis
product_distribution = untimely_responses['Product'].value_counts()
sub_product_distribution = untimely_responses['Sub-product'].value_counts().head()
issue_distribution = untimely_responses['Issue'].value_counts().head()
state_distribution = untimely_responses['State'].value_counts()

# Visualization with annotations
def add_annotations(barplot):
	for i in barplot.patches:
	barplot.annotate(format(i.get_width(), '.0f'),
	(i.get_width(), i.get_y() + i.get_height() / 2),
	ha='left', va='center',
	xytext=(5, 0),
	textcoords='offset points')

# Distribution of Untimely Responses by Product
plt.figure(figsize=(10, 6))
barplot = sns.countplot(data=untimely_responses, y='Product', order=product_distribution.index)
plt.title('Distribution of Untimely Responses by Product')
add_annotations(barplot)
plt.show()

# Distribution of Untimely Responses by Sub-Product
plt.figure(figsize=(10, 6))
barplot = sns.countplot(data=untimely_responses, y='Sub-product', order=sub_product_distribution.index)
plt.title('Distribution of Untimely Responses by Sub-Product')
add_annotations(barplot)
plt.show()

# Distribution of Untimely Responses by Issue
plt.figure(figsize=(10, 6))
barplot = sns.countplot(data=untimely_responses, y='Issue', order=issue_distribution.index)
plt.title('Distribution of Untimely Responses by Issue')
add_annotations(barplot)
plt.show()

# Time Analysis
untimely_responses['Date submitted'] = pd.to_datetime(untimely_responses['Date submitted'])
untimely_responses['Date received'] = pd.to_datetime(untimely_responses['Date received'])
untimely_responses['response_time'] = (untimely_responses['Date received'] - untimely_responses['Date submitted']).dt.days
		
# Visualization with annotations for response time histogram
plt.figure(figsize=(10, 6))
histplot = sns.histplot(untimely_responses['response_time'], bins=30, kde=False)
plt.title('Distribution of Response Time for Untimely Responses')
plt.xlabel('Response Time (days)')
plt.ylabel('Complain Frequency')

plt.show()

Product Analysis

Sub Product Analysis

Issues Analysis

Complain Frequency Analysis

From all the data above, it can be said that customers need a user guide or interface that makes it easier for them. Apart from that, the system and human resources owned by the company need to be improved because there are still many errors which can be seen from the issue "Incorrect information on your report" which has quite a high value. Apart from that, efforts are made to stabilize the response time at 1-2 days because according to the data above, there are still complaints that are even answered more than 30 days later.

Conclusion

From the analysis above, we can conclude that:
1. Seasonal Patterns:

The year 2022 witnessed the highest number of complaints, indicating significant issues in our products and services.
Complaints have shown a consistent increase from 2019 to 2022, suggesting persistent challenges that need urgent resolution.
While 2023 saw a decline in complaints, the numbers remain relatively high compared to previous years, signaling ongoing improvements but highlighting areas that still require enhancement.
Continued review and enhancement of our strategies are essential to sustain this positive trend and ensure customer satisfaction.

2. Most complaint Products & most common issues:

"Checking or savings account" and "Credit Card or Prepaid Card" are identified as products with notable complaint volumes.
These products require focused attention on improving service quality and operational efficiency to enhance customer experience and satisfaction.
Solutions should prioritize clear communication, streamlined processes, and user-friendly interfaces to address these concerns effectively.
Implementing customer-centric improvements, such as intuitive account management guides or enhanced digital platforms, can mitigate these issues.

3. Resolution of complaints:

Problems should also be solved with real action and not limited to explanations

4. What to do with untimely responses:

Improving the quality of services and products is very important
Training and empowering customer service team to provide comprehensive explanations and swift resolutions are crucial steps forward.
Addressing delays in response times, particularly those exceeding 30 days, requires operational refinements and resource allocation adjustments.
Establishing a standard response time of 1-2 days should be a priority to meet customer expectations consistently.

Check out my other interesting projects !!!

Great American Coffee Test

Around 4000 People with variable background were served 4 different Coffee

Learn more

CRM Sales Opportunities

Analysing sales opportunities from CRM

Learn more

Coming Soon

Will be out soon !!!