Pandas is a powerful data manipulation and analysis library for Python. It provides data structures like DataFrames and Series that make it easy to work with structured data. This article presents three practical examples of data analysis using Pandas in Python, demonstrating its capabilities in various scenarios.
In this example, we will analyze sales data from a retail company to understand revenue trends over time. This use case is common for businesses looking to optimize their sales strategies.
import pandas as pd
# Sample sales data
sales_data = {
    'Date': ['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04'],
    'Revenue': [200, 220, 250, 300],
}
# Create a DataFrame
sales_df = pd.DataFrame(sales_data)
# Convert 'Date' to datetime
sales_df['Date'] = pd.to_datetime(sales_df['Date'])
# Set 'Date' as index
sales_df.set_index('Date', inplace=True)
# Calculate the rolling average of revenue over 2 days
sales_df['Rolling_Avg'] = sales_df['Revenue'].rolling(window=2).mean()
print(sales_df)
Notes: This example calculates a rolling average of revenue, which smooths day-to-day fluctuations and makes trends easier to see. The first row of Rolling_Avg is NaN because a 2-day window needs two observations. Adjust the window parameter of the rolling function to change the averaging period.
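As a rough illustration of how the window setting changes the result, the sketch below reuses the same revenue figures with a 3-day window. The column names Avg_3d and Avg_3d_partial are hypothetical, chosen only for this example, and min_periods is an optional argument of rolling that the original example does not use.

```python
import pandas as pd

# Same sample revenue figures as in the sales example
sales_df = pd.DataFrame(
    {'Revenue': [200, 220, 250, 300]},
    index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04']),
)

# A wider window smooths more but leaves more leading NaN values
sales_df['Avg_3d'] = sales_df['Revenue'].rolling(window=3).mean()

# min_periods=1 averages whatever data is available so far,
# so the first rows are filled instead of NaN
sales_df['Avg_3d_partial'] = sales_df['Revenue'].rolling(window=3, min_periods=1).mean()

print(sales_df)
```

Whether partial windows are acceptable depends on the analysis: they avoid gaps at the start of the series, but the first values are averages over fewer days than the rest.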
In this example, we will segment customers based on their purchasing behavior. Segmentation of this kind is a common starting point for targeted marketing strategies.
import pandas as pd
# Sample customer data
customer_data = {
    'CustomerID': [1, 2, 3, 4, 5],
    'Annual_Spend': [500, 1500, 300, 700, 1200],
    'Loyalty_Score': [5, 10, 4, 6, 9]
}
# Create a DataFrame
customer_df = pd.DataFrame(customer_data)
# Define a function to categorize customers
def categorize_customer(spend):
    if spend < 700:
        return 'Low'
    elif spend < 1000:
        return 'Medium'
    else:
        return 'High'
# Apply the function to create a new column
customer_df['Segment'] = customer_df['Annual_Spend'].apply(categorize_customer)
print(customer_df)
Notes: This example categorizes customers into segments based on their annual spending: below 700 is Low, from 700 up to (but not including) 1000 is Medium, and 1000 or more is High. Modify the thresholds in the categorize_customer function to suit your needs.
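If the thresholds change often, one possible alternative to a hand-written function is pd.cut, which assigns values to labeled bins. This is a different technique from the apply-based approach in the example, sketched here under the assumption that the same three segments and cut-offs are wanted; the bins and labels variables are introduced only for illustration.

```python
import pandas as pd

customer_df = pd.DataFrame({
    'CustomerID': [1, 2, 3, 4, 5],
    'Annual_Spend': [500, 1500, 300, 700, 1200],
})

# Bin edges mirror the thresholds in categorize_customer.
# right=False makes each bin include its lower edge: [low, high)
bins = [float('-inf'), 700, 1000, float('inf')]
labels = ['Low', 'Medium', 'High']
customer_df['Segment'] = pd.cut(
    customer_df['Annual_Spend'], bins=bins, labels=labels, right=False
)

# Count how many customers fall into each segment
print(customer_df['Segment'].value_counts())
```

With this layout, changing a threshold or adding a segment means editing the two lists rather than the branching logic.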
In this example, we will analyze website traffic data to identify patterns over time. This is useful for digital marketing and website optimization.
import pandas as pd
import numpy as np
# Generate sample website traffic data
np.random.seed(0) # For reproducibility
traffic_data = {
    'Date': pd.date_range(start='2023-01-01', periods=30, freq='D'),
    'Visitors': np.random.randint(100, 500, size=30)
}
# Create a DataFrame
traffic_df = pd.DataFrame(traffic_data)
# Set 'Date' as index
traffic_df.set_index('Date', inplace=True)
# Calculate the daily percentage change in visitors
traffic_df['Pct_Change'] = traffic_df['Visitors'].pct_change()
print(traffic_df)
Notes: This example calculates the daily percentage change in website visitors, which helps identify trends and anomalies in traffic. The first row of Pct_Change is NaN because there is no previous day to compare against. You can also visualize this data using libraries like Matplotlib or Seaborn.
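One simple way to act on the percentage change is to flag days whose change exceeds a chosen threshold. The sketch below uses a small hand-made series instead of random data so the result is easy to check; the 50% threshold and the Anomaly column name are assumptions for illustration, not values from the example above.

```python
import pandas as pd

# Small hand-made series so the result is easy to verify (illustrative data)
traffic_df = pd.DataFrame(
    {'Visitors': [100, 110, 220, 200]},
    index=pd.date_range(start='2023-01-01', periods=4, freq='D'),
)
traffic_df['Pct_Change'] = traffic_df['Visitors'].pct_change()

# Hypothetical threshold: flag any day-over-day move larger than 50%
THRESHOLD = 0.5
traffic_df['Anomaly'] = traffic_df['Pct_Change'].abs() > THRESHOLD

# Show only the flagged days
print(traffic_df[traffic_df['Anomaly']])
```

Taking the absolute value flags sharp drops as well as sharp rises. The first row is never flagged, because a comparison against NaN evaluates to False.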
These three examples (rolling averages, rule-based segmentation, and percentage change) illustrate the versatility of Pandas in performing data analysis tasks. The same techniques apply directly to your own time series and tabular data.