Practical Statistics for Data Science and Python
| ✅ Do | ❌ Don’t |
|------|---------|
| Always visualize before testing | Trust p-values blindly |
| Report effect size + CI, not just p | Ignore multiple comparisons |
| Check assumptions (normality, equal variance) | Remove outliers without justification |
| Use non-parametric tests if assumptions fail | Confuse statistical significance with practical importance |
| Set significance level before seeing data | Cherry-pick variables in regression |
| Use bootstrap for complex estimators | Forget to document random seeds |
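Two rows of the table ("Use bootstrap for complex estimators" and "Forget to document random seeds") can be illustrated together. The following is a minimal sketch of a percentile bootstrap confidence interval for a median. The synthetic `sample`, the seed value, and the number of resamples `n_boot` are assumptions chosen for illustration, not values from the original text.

```python
import numpy as np

# Documented seed so the analysis can be reproduced (value is arbitrary).
rng = np.random.default_rng(42)

# Synthetic, right-skewed sample standing in for real data (assumption).
sample = rng.lognormal(mean=10, sigma=0.5, size=200)

# Percentile bootstrap: resample with replacement and recompute the median.
n_boot = 2000
boot_medians = np.empty(n_boot)
for i in range(n_boot):
    resample = rng.choice(sample, size=sample.size, replace=True)
    boot_medians[i] = np.median(resample)

# The 2.5th and 97.5th percentiles give an approximate 95% interval.
ci_low, ci_high = np.percentile(boot_medians, [2.5, 97.5])
print(f"Median: {np.median(sample):.1f}")
print(f"95% bootstrap CI: [{ci_low:.1f}, {ci_high:.1f}]")
```

The percentile method is the simplest bootstrap interval; other variants (for example bias-corrected ones) may be preferable for small or heavily skewed samples.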
    import pandas as pd
    import numpy as np
    from scipy import stats
    import statsmodels.api as sm
Understanding the shape of your data determines which tools you can use.
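One rough way to look at the shape of a variable numerically is to compare its mean with its median and compute its skewness. The sketch below assumes a hypothetical `salario` column filled with synthetic log-normal values; the variable names `asimetria`, `media` and `mediana` are illustrative only.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic salaries with a long right tail (assumption for illustration).
data = pd.DataFrame({"salario": rng.lognormal(mean=10, sigma=0.6, size=500)})

asimetria = stats.skew(data["salario"])   # > 0 suggests a right-skewed shape
media = data["salario"].mean()
mediana = data["salario"].median()

print(f"Skewness: {asimetria:.2f}")
print(f"Mean: {media:.0f}, median: {mediana:.0f}")
```

When the mean sits well above the median and the skewness is clearly positive, tools that assume symmetry or normality deserve a second look. A histogram or box plot should still come first, in line with "always visualize before testing".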
    # Filter outliers: rows whose salary falls outside the two limits
    outliers = data[(data['salario'] < limite_inferior) | (data['salario'] > limite_superior)]
    print(f"Number of outliers detected: {len(outliers)}")
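The filter above relies on `limite_inferior` and `limite_superior`, whose definition is not shown. A common convention is Tukey's 1.5 × IQR rule; the sketch below assumes that rule and a small made-up `salario` column, so it may not match how the limits were originally computed.

```python
import pandas as pd

# Small made-up data set with one extreme salary (assumption).
data = pd.DataFrame(
    {"salario": [30_000, 32_000, 35_000, 36_000, 38_000,
                 40_000, 41_000, 45_000, 250_000]}
)

# Interquartile range (IQR) from the first and third quartiles.
q1 = data["salario"].quantile(0.25)
q3 = data["salario"].quantile(0.75)
iqr = q3 - q1

# Tukey fences: points beyond 1.5 * IQR from the quartiles are flagged.
limite_inferior = q1 - 1.5 * iqr
limite_superior = q3 + 1.5 * iqr

outliers = data[(data["salario"] < limite_inferior) | (data["salario"] > limite_superior)]
print(f"Number of outliers detected: {len(outliers)}")
```

Flagging a point is not the same as deleting it: whether a flagged salary is an error or a genuine extreme value still needs a justification.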
    # Shapiro-Wilk normality test
    stat, p_valor = stats.shapiro(datos_normales)
    print(f"p-valor: {p_valor:.4f}")
    # If p > 0.05, we do not reject the hypothesis of normality.
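A normality check like this is often used to decide between a parametric and a non-parametric comparison. The sketch below shows one possible workflow under stated assumptions: two synthetic groups (`grupo_a`, `grupo_b`), a significance level `alpha` fixed in advance, and Welch's t-test versus Mann-Whitney U as the two candidate tests. Choosing a test from a preliminary test is itself debated, so treat it as an illustration rather than a recommended procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic groups: one roughly normal, one strongly skewed (assumption).
grupo_a = rng.normal(loc=50, scale=10, size=80)
grupo_b = rng.exponential(scale=50, size=80)

alpha = 0.05  # significance level fixed before looking at the data

# Shapiro-Wilk on each group; a small p-value is evidence against normality.
_, p_a = stats.shapiro(grupo_a)
_, p_b = stats.shapiro(grupo_b)

if p_a > alpha and p_b > alpha:
    test_name = "Welch t-test"
    stat, p_valor = stats.ttest_ind(grupo_a, grupo_b, equal_var=False)
else:
    test_name = "Mann-Whitney U"
    stat, p_valor = stats.mannwhitneyu(grupo_a, grupo_b, alternative="two-sided")

print(f"{test_name}: statistic={stat:.3f}, p-valor={p_valor:.4f}")
```

With large samples Shapiro-Wilk flags even trivial departures from normality, so a Q-Q plot or histogram is a useful companion to the p-value.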
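    # Fit an ordinary least squares regression; sm.OLS adds no intercept unless X has a constant column
    modelo = sm.OLS(y, X).fit()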