🐍 Python Implementation Scripts

Survey Generation & WTP Analysis Tools

Implementation Overview

Two Python scripts provide complete data generation and analysis capabilities.

Both scripts are production-ready, fully documented, and built on industry-standard libraries.

  • Total Scripts: 2 Python files
  • Lines of Code: 1,874 (total implementation)
  • Dependencies: 6 core libraries
  • Python Version: 3.8+ (required)

📜 Script 1: generate_survey_responses.py

Purpose

Generates statistically valid synthetic survey responses based on persona distributions and consistency rules.

Key Features

  • Persona-based generation: 5 distinct personas with realistic distributions
  • Consistency rules: 20+ rules ensure realistic response patterns
  • Conditional logic: Properly handles branching survey questions
  • Edge cases: Includes contradictions, minimal engagement, power users, and fatigue patterns (18-27%)
  • Budget calculation: Feature multipliers and consistency adjustments
  • Reproducibility: Seed-based random generation
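
As a rough sketch, persona-weighted selection with a seeded RNG can look like this; the persona names and weights below are illustrative, not the script's actual distribution:

```python
import random

# Illustrative persona distribution; the real weights live in
# generate_survey_responses.py and may differ.
PERSONAS = {
    "Hobbyist": 0.30,
    "Freelancer": 0.25,
    "Small Business": 0.20,
    "Agency": 0.15,
    "Enterprise": 0.10,
}

def select_persona(rng):
    """Weighted persona selection, reproducible via the seeded RNG."""
    names = list(PERSONAS)
    weights = list(PERSONAS.values())
    return rng.choices(names, weights=weights, k=1)[0]

# The same seed always yields the same persona sequence.
rng = random.Random(42)
sample = [select_persona(rng) for _ in range(5)]
```

Because every draw goes through one seeded `random.Random` instance, rerunning with the same seed reproduces the entire dataset.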

Statistics

  • Lines of Code: 642
  • Functions: 15+ specialized generation methods
  • Classes: 2 (Persona, SurveyResponseGenerator)

Usage

python3 generate_survey_responses.py [SEED] [COUNT] [OUTPUT_FILE]

# Example: Generate 500 responses
python3 generate_survey_responses.py 42 500 persona_web_hosting_data.csv

Parameters

Parameter     Required  Description                              Default
SEED          No        Random seed for reproducibility          42
COUNT         No        Number of valid responses to generate    500
OUTPUT_FILE   No        Output CSV filename                      persona_web_hosting_data.csv
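
One way optional positional parameters with these defaults can be handled is sketched below; the script's actual argument parsing may differ:

```python
import sys

# Defaults mirror the documented parameters: SEED, COUNT, OUTPUT_FILE.
DEFAULTS = ("42", "500", "persona_web_hosting_data.csv")

def parse_args(argv):
    """Fill missing positional arguments with their documented defaults."""
    supplied = list(argv[1:4])
    args = supplied + list(DEFAULTS[len(supplied):])
    seed, count, output_file = args
    return int(seed), int(count), output_file

# In the script itself this would be called as: parse_args(sys.argv)
```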

Dependencies

import pandas as pd
import numpy as np
from scipy import stats
import random
from dataclasses import dataclass
from typing import List, Dict, Optional, Tuple
import csv

Key Methods

  • _select_persona() - Weighted persona selection
  • _generate_q1() through _generate_q24() - Question-specific generation
  • _calculate_budget() - Budget calculation with multipliers
  • _apply_edge_cases() - Realistic variance injection
  • generate_response() - Main generation orchestration
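
The multiplier-based budget calculation behind _calculate_budget() can be sketched as follows; the feature names and multiplier values here are illustrative, not the ones defined in generate_survey_responses.py:

```python
def calculate_budget(base_budget, selected_features, consistency_factor=1.0):
    """Apply per-feature multipliers and a consistency adjustment to a base budget.

    Multiplier values are illustrative placeholders.
    """
    multipliers = {
        "managed_backups": 1.10,
        "staging_env": 1.15,
        "priority_support": 1.25,
    }
    budget = base_budget
    for feature in selected_features:
        # Unknown features leave the budget unchanged.
        budget *= multipliers.get(feature, 1.0)
    return round(budget * consistency_factor, 2)
```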

📜 Script 2: analyze_wtp.py

Purpose

Comprehensive WTP analysis implementing the "Monetizing Innovation" methodology with statistical rigor.

Key Features

  • Statistical analysis: ANOVA, t-tests, correlation, regression models
  • Machine learning: K-means clustering, cross-validation, multiple regression models
  • Feature classification: Table Stakes, Performance Features, Delighters
  • Persona profiling: Detailed monetization profiles for 5 personas
  • Monetization models: Four models classification (Maximizer, Penetrator, Underdog, Champion)
  • Professional visualizations: 6 publication-ready charts (300 DPI)
  • Data exports: 4 CSV analysis outputs
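
The K-means clustering used to find natural WTP tiers can be sketched as follows; the WTP values here are illustrative, while the analyzer reads them from the survey CSV:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative monthly WTP values (in $); three natural bands are visible.
wtp = np.array([5, 7, 8, 20, 22, 25, 60, 65, 70], dtype=float).reshape(-1, 1)

# Cluster into three tiers, similar to how natural pricing bands are derived.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(wtp)

# Order tier centers so they read low -> mid -> high.
centers = sorted(float(c) for c in kmeans.cluster_centers_.ravel())
```

The sorted cluster centers then serve as candidate anchor prices for the low, mid, and high tiers.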

Statistics

  • Lines of Code: 1,232
  • Functions: 30+ analysis methods
  • Classes: 1 (WTPAnalyzer)
  • Analysis Steps: 7 major components

Usage

python3 analyze_wtp.py

# Analyzes persona_web_hosting_data.csv by default
# Generates outputs in analysis/ directory

Output Files Created

Visualizations (PNG, 300 DPI):

  • wtp_distribution.png - WTP distribution analysis
  • feature_correlations.png - Feature impact analysis
  • feature_classification_matrix.png - Feature value matrix
  • persona_comparison.png - Persona characteristics
  • price_sensitivity.png - Price sensitivity segmentation
  • monetization_models.png - Four models classification
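
A minimal sketch of producing one such 300 DPI chart with matplotlib; the filename and data here are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

# Illustrative WTP sample; the analyzer plots the real survey column.
wtp = np.random.default_rng(42).normal(loc=25, scale=8, size=500)

fig, ax = plt.subplots(figsize=(8, 5))
ax.hist(wtp, bins=30, edgecolor="black")
ax.set_xlabel("Willingness to pay ($/month)")
ax.set_ylabel("Respondents")
ax.set_title("WTP distribution")

# dpi=300 matches the publication-ready outputs listed above.
out = Path("wtp_distribution_example.png")
fig.savefig(out, dpi=300, bbox_inches="tight")
plt.close(fig)
```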

Data Exports (CSV):

  • persona_profiles.csv - Persona statistics
  • feature_correlations.csv - Feature-WTP correlations
  • pricing_tiers.csv - Recommended tiers
  • monetization_models.csv - Model classifications

Dependencies

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeRegressor

Analysis Pipeline

  1. Data Preparation: Load, validate, create derived variables
  2. WTP Distribution Analysis: Overall stats, persona comparison, natural tiers
  3. Feature Correlation Analysis: Priority correlations, regression models, feature classification
  4. Persona Profiling: Create detailed monetization profiles
  5. Price Sensitivity Analysis: Segment by sensitivity, analyze value-seeking behavior
  6. Monetization Models: Apply four models framework, classify customers
  7. Pricing Recommendations: Generate tiered structure, value-based pricing
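
The feature classification in step 3 can be sketched as a simple correlation-threshold bucketing; the feature names, correlation values, and thresholds below are all illustrative:

```python
import pandas as pd

# Illustrative feature-WTP correlations; the analyzer computes these from the data.
correlations = pd.Series({
    "uptime_sla": 0.12,
    "ssd_storage": 0.08,
    "cdn": 0.45,
    "auto_scaling": 0.52,
    "ai_site_builder": 0.78,
})

def classify_feature(corr):
    """Bucket a feature by its WTP correlation; thresholds are illustrative."""
    if corr < 0.2:
        return "Table Stakes"         # expected by everyone, low price impact
    if corr < 0.6:
        return "Performance Feature"  # more is better, moderate price impact
    return "Delighter"                # unexpected, high WTP impact

classes = correlations.map(classify_feature)
```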

Key Methods

  • analyze_wtp_distribution() - Complete WTP analysis
  • analyze_feature_correlations() - Feature-price relationships
  • create_persona_profiles() - Detailed persona profiles
  • analyze_price_sensitivity() - Sensitivity segmentation
  • apply_monetizing_innovation_framework() - Four models classification
  • generate_pricing_recommendations() - Strategic pricing

🛠️ Installation & Setup

System Requirements

  • Python: 3.8 or higher
  • Operating System: macOS, Linux, or Windows
  • RAM: 2GB minimum (4GB recommended)
  • Disk Space: 50MB for dependencies and outputs

Install Dependencies

# Using pip
pip install pandas numpy scipy matplotlib seaborn scikit-learn

# Using conda
conda install pandas numpy scipy matplotlib seaborn scikit-learn

# Verify installation
python3 -c "import pandas, numpy, scipy, matplotlib, seaborn, sklearn; print('All dependencies installed!')"

Quick Start

# Navigate to project directory
cd /Users/dkuciel/Visual\ Studio\ Code/2025-11\ WTP\ 2.0

# Generate survey responses
python3 generate_survey_responses.py 42 500 persona_web_hosting_data.csv

# Run complete analysis
python3 analyze_wtp.py

# View results
ls -l analysis/

📚 Code Examples

Custom Analysis Example

Extend the analysis with custom code:

import pandas as pd
from analyze_wtp import WTPAnalyzer

# Load analyzer
analyzer = WTPAnalyzer('persona_web_hosting_data.csv')
analyzer.load_and_prepare_data()

# Custom analysis: Find top features by persona
for persona in analyzer.df['Persona'].unique():
    persona_data = analyzer.df[analyzer.df['Persona'] == persona]
    avg_budget = persona_data['budget_numeric'].mean()
    print(f"{persona}: ${avg_budget:.2f}/month")

# Export custom report
custom_df = analyzer.df[['Persona', 'budget_numeric', 'q14_feature_count']]
custom_df.to_csv('custom_analysis.csv', index=False)

Generate Custom Dataset

Create specialized datasets:

from generate_survey_responses import SurveyResponseGenerator

# Create generator with custom seed
generator = SurveyResponseGenerator(seed=2025)

# Generate 1000 responses
responses = generator.generate_responses(target_valid=1000)

# Export to custom file
generator.export_to_csv(responses, 'custom-responses-1000.csv')

🐛 Troubleshooting

Common Issues

Issue: ModuleNotFoundError

Solution: Install missing dependencies

pip install pandas numpy scipy matplotlib seaborn scikit-learn

Issue: File not found error

Solution: Ensure you're in the correct directory

cd /Users/dkuciel/Visual\ Studio\ Code/2025-11\ WTP\ 2.0
ls -l persona_web_hosting_data.csv

Issue: Low qualification rate

Solution: This is expected behavior (target: 73%). The script automatically generates additional responses until the target count of qualified responses is reached.
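
The generate-until-target behavior can be sketched as follows; is_qualified here is a stand-in for the script's real consistency and screening checks:

```python
import random

QUALIFICATION_RATE = 0.73  # target share of responses that pass screening

def is_qualified(response):
    """Hypothetical screener; the real rules live in the generator's checks."""
    return response["screened_in"]

def generate_until_target(target_valid, rng):
    """Keep generating responses until target_valid of them are qualified."""
    qualified = []
    while len(qualified) < target_valid:
        # Stand-in for the full response-generation pipeline.
        response = {"screened_in": rng.random() < QUALIFICATION_RATE}
        if is_qualified(response):
            qualified.append(response)
    return qualified

responses = generate_until_target(100, random.Random(42))
```

At a ~73% qualification rate, reaching 500 valid responses means generating roughly 685 raw responses, which is why the script appears to "overrun" the requested count.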

📖 Further Reading