Implementation Overview
Two Python scripts cover data generation and analysis.
Both scripts are documented and built on standard scientific Python libraries (pandas, NumPy, SciPy; the analysis script also uses scikit-learn, matplotlib, and seaborn).
📜 Script 1: generate_survey_responses.py
Purpose
Generates statistically valid synthetic survey responses based on persona distributions and consistency rules.
Key Features
- Persona-based generation: 5 distinct personas with realistic distributions
- Consistency rules: 20+ rules ensure realistic response patterns
- Conditional logic: Properly handles branching survey questions
- Edge cases: Includes contradictions, minimal engagement, power users, and fatigue patterns (18-27%)
- Budget calculation: Feature multipliers and consistency adjustments
- Reproducibility: Seed-based random generation
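The persona-based generation and seed-based reproducibility listed above can be illustrated with a minimal sketch. The persona names and weights below are hypothetical placeholders, and the script's own `_select_persona()` may work differently. The sketch only shows that weighted draws from a seeded generator produce the same sequence on every run.

```python
import random
from collections import Counter
from typing import List

# Hypothetical persona names and weights (assumption, not taken from the script).
PERSONA_WEIGHTS = {
    "persona_a": 0.30,
    "persona_b": 0.25,
    "persona_c": 0.20,
    "persona_d": 0.15,
    "persona_e": 0.10,
}


def select_personas(seed: int, count: int) -> List[str]:
    """Draw `count` personas by weight using a private, seeded generator."""
    rng = random.Random(seed)  # isolated RNG, so global random state is untouched
    names = list(PERSONA_WEIGHTS)
    weights = list(PERSONA_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=count)


first = select_personas(seed=42, count=500)
second = select_personas(seed=42, count=500)
print(first == second)               # the same seed reproduces the same draws
print(Counter(first).most_common())  # observed counts roughly follow the weights
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the draws reproducible even if other code also uses `random`.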
Statistics
- Lines of Code: 642
- Functions: 15+ specialized generation methods
- Classes: 2 (Persona, SurveyResponseGenerator)
Usage
python3 generate_survey_responses.py [SEED] [COUNT] [OUTPUT_FILE]
# Example: Generate 500 responses
python3 generate_survey_responses.py 42 500 persona_web_hosting_data.csv
Parameters
| Parameter | Required | Description | Default |
|---|---|---|---|
| SEED | No | Random seed for reproducibility | 42 |
| COUNT | No | Number of valid responses to generate | 500 |
| OUTPUT_FILE | No | Output CSV filename | persona_web_hosting_data.csv |
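As a rough illustration of how three optional positional parameters with these defaults might be read, here is a small sketch. The function name `parse_args` and the positive-count check are assumptions for illustration; the actual script may parse its arguments differently.

```python
from typing import List, Tuple

# Defaults taken from the parameter table above.
DEFAULT_SEED = 42
DEFAULT_COUNT = 500
DEFAULT_OUTPUT = "persona_web_hosting_data.csv"


def parse_args(argv: List[str]) -> Tuple[int, int, str]:
    """Read [SEED] [COUNT] [OUTPUT_FILE] positionally, falling back to defaults."""
    seed = int(argv[0]) if len(argv) > 0 else DEFAULT_SEED
    count = int(argv[1]) if len(argv) > 1 else DEFAULT_COUNT
    output = argv[2] if len(argv) > 2 else DEFAULT_OUTPUT
    if count <= 0:
        # Hypothetical validation: a non-positive count cannot be satisfied.
        raise ValueError("COUNT must be a positive integer")
    return seed, count, output


print(parse_args([]))             # (42, 500, 'persona_web_hosting_data.csv')
print(parse_args(["7", "1000"]))  # (7, 1000, 'persona_web_hosting_data.csv')
```

In a script, the list passed in would typically be `sys.argv[1:]`.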
Dependencies
import pandas as pd
import numpy as np
from scipy import stats
import random
from dataclasses import dataclass
from typing import List, Dict, Optional, Tuple
import csv
Key Methods
- _select_persona() - Weighted persona selection
- _generate_q1() through _generate_q24() - Question-specific generation
- _calculate_budget() - Budget calculation with multipliers
- _apply_edge_cases() - Realistic variance injection
- generate_response() - Main generation orchestration
📜 Script 2: analyze_wtp.py
Purpose
Analyzes willingness to pay (WTP) following the "Monetizing Innovation" methodology, backed by statistical tests and regression models.
Key Features
- Statistical analysis: ANOVA, t-tests, correlation, regression models
- Machine learning: K-means clustering, cross-validation, multiple regression models
- Feature classification: Table Stakes, Performance Features, Delighters
- Persona profiling: Detailed monetization profiles for 5 personas
- Monetization models: Four models classification (Maximizer, Penetrator, Underdog, Champion)
- Professional visualizations: 6 publication-ready charts (300 DPI)
- Data exports: 4 CSV analysis outputs
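To make the statistical analysis item more concrete, the sketch below runs a one-way ANOVA that asks whether mean budget differs across personas. The data is synthetic and the persona names are placeholders, so this is not a reproduction of what `WTPAnalyzer` does internally, only an example of the technique with `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up monthly budgets for three hypothetical personas (not real survey data).
budgets = {
    "persona_a": rng.normal(loc=10, scale=5, size=50),
    "persona_b": rng.normal(loc=30, scale=5, size=50),
    "persona_c": rng.normal(loc=60, scale=5, size=50),
}


def anova_by_persona(groups):
    """Return (F statistic, p-value) for a one-way ANOVA across the groups."""
    result = stats.f_oneway(*groups.values())
    return float(result.statistic), float(result.pvalue)


f_stat, p_value = anova_by_persona(budgets)
print(f"F = {f_stat:.1f}, p = {p_value:.3g}")
```

A small p-value suggests that at least one persona's mean budget differs from the others. It does not say which one, so pairwise t-tests or a post-hoc test would be the usual follow-up.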
Statistics
- Lines of Code: 1,232
- Functions: 30+ analysis methods
- Classes: 1 (WTPAnalyzer)
- Analysis Steps: 6 major components
Usage
python3 analyze_wtp.py
# Analyzes persona_web_hosting_data.csv by default
# Generates outputs in analysis/ directory
Output Files Created
Visualizations (PNG, 300 DPI):
- wtp_distribution.png - WTP distribution analysis
- feature_correlations.png - Feature impact analysis
- feature_classification_matrix.png - Feature value matrix
- persona_comparison.png - Persona characteristics
- price_sensitivity.png - Price sensitivity segmentation
- monetization_models.png - Four models classification
Data Exports (CSV):
- persona_profiles.csv - Persona statistics
- feature_correlations.csv - Feature-WTP correlations
- pricing_tiers.csv - Recommended tiers
- monetization_models.csv - Model classifications
Dependencies
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeRegressor
Analysis Pipeline
- Data Preparation: Load, validate, create derived variables
- WTP Distribution Analysis: Overall stats, persona comparison, natural tiers
- Feature Correlation Analysis: Priority correlations, regression models, feature classification
- Persona Profiling: Create detailed monetization profiles
- Price Sensitivity Analysis: Segment by sensitivity, analyze value-seeking behavior
- Monetization Models: Apply four models framework, classify customers
- Pricing Recommendations: Generate tiered structure, value-based pricing
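The price sensitivity step in the pipeline above segments respondents. One common way to do that with the listed dependencies is to standardize the inputs and cluster them with K-means. The sketch below assumes two illustrative input columns (monthly budget and a price-importance rating) and synthetic values. The columns and cluster count used by `analyze_price_sensitivity()` are not documented here and may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic respondents: columns are (monthly budget, price-importance rating).
# Both the column meanings and the values are illustrative assumptions.
price_sensitive = np.column_stack([rng.normal(8, 1, 20), rng.normal(9, 0.5, 20)])
price_insensitive = np.column_stack([rng.normal(40, 3, 20), rng.normal(3, 0.5, 20)])
features = np.vstack([price_sensitive, price_insensitive])


def segment(data, n_segments=2):
    """Standardize each column, then assign every row to a K-means cluster."""
    scaled = StandardScaler().fit_transform(data)  # put columns on a common scale
    model = KMeans(n_clusters=n_segments, n_init=10, random_state=0)
    return model.fit_predict(scaled)


labels = segment(features)
print(labels)
```

Scaling first matters because K-means relies on Euclidean distance. Without it, the budget column would dominate the rating column simply because its numbers are larger.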
Key Methods
- analyze_wtp_distribution() - Complete WTP analysis
- analyze_feature_correlations() - Feature-price relationships
- create_persona_profiles() - Detailed persona profiles
- analyze_price_sensitivity() - Sensitivity segmentation
- apply_monetizing_innovation_framework() - Four models classification
- generate_pricing_recommendations() - Strategic pricing
🛠️ Installation & Setup
System Requirements
- Python: 3.8 or higher
- Operating System: macOS, Linux, or Windows
- RAM: 2GB minimum (4GB recommended)
- Disk Space: 50MB for dependencies and outputs
Install Dependencies
# Using pip
pip install pandas numpy scipy matplotlib seaborn scikit-learn
# Using conda
conda install pandas numpy scipy matplotlib seaborn scikit-learn
# Verify installation
python3 -c "import pandas, numpy, scipy, matplotlib, seaborn, sklearn; print('All dependencies installed!')"
Quick Start
# Navigate to project directory
cd /Users/dkuciel/Visual\ Studio\ Code/2025-11\ WTP\ 2.0
# Generate survey responses
python3 generate_survey_responses.py 42 500 persona_web_hosting_data.csv
# Run complete analysis
python3 analyze_wtp.py
# View results
ls -l analysis/
📚 Code Examples
Custom Analysis Example
Extend the analysis with custom code:
import pandas as pd
from analyze_wtp import WTPAnalyzer
# Load analyzer
analyzer = WTPAnalyzer('persona_web_hosting_data.csv')
analyzer.load_and_prepare_data()
# Custom analysis: Find top features by persona
for persona in analyzer.df['Persona'].unique():
    persona_data = analyzer.df[analyzer.df['Persona'] == persona]
    avg_budget = persona_data['budget_numeric'].mean()
    print(f"{persona}: ${avg_budget:.2f}/month")
# Export custom report
custom_df = analyzer.df[['Persona', 'budget_numeric', 'q14_feature_count']]
custom_df.to_csv('custom_analysis.csv', index=False)
Generate Custom Dataset
Create specialized datasets:
from generate_survey_responses import SurveyResponseGenerator
# Create generator with custom seed
generator = SurveyResponseGenerator(seed=2025)
# Generate 1000 responses
responses = generator.generate_responses(target_valid=1000)
# Export to custom file
generator.export_to_csv(responses, 'custom-responses-1000.csv')
🐛 Troubleshooting
Common Issues
Issue: ModuleNotFoundError
Solution: Install missing dependencies
pip install pandas numpy scipy matplotlib seaborn scikit-learn
Issue: File not found error
Solution: Ensure you're in the correct directory
cd /Users/dkuciel/Visual\ Studio\ Code/2025-11\ WTP\ 2.0
ls -l persona_web_hosting_data.csv
Issue: Low qualification rate
Solution: This is expected behavior (target: 73%). The script automatically generates additional responses until the target count of qualified responses is reached.
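The behavior described above, generating until enough qualified responses exist, can be sketched as a simple loop. This is a simplified stand-in: it treats 73% as a fixed per-response probability of qualifying, which is an assumption, and `generate_until_target` is a hypothetical name rather than the script's API.

```python
import random


def generate_until_target(target_valid, qualification_rate=0.73, seed=42):
    """Keep generating until `target_valid` responses have qualified.

    Returns (valid_count, total_attempts).
    """
    rng = random.Random(seed)
    valid = 0
    attempts = 0
    while valid < target_valid:
        attempts += 1
        # Stand-in for real screening logic: qualify with a fixed probability.
        if rng.random() < qualification_rate:
            valid += 1
    return valid, attempts


valid, attempts = generate_until_target(500)
print(f"{valid} qualified out of {attempts} generated ({valid / attempts:.0%})")
```

With a 73% rate, reaching 500 qualified responses takes roughly 500 / 0.73 ≈ 685 generated responses, so a total noticeably above the requested count is normal.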