Evaluating Distributions and Generating Experimental Crosstabs for Experimental Comparisons with T-Tests: Purpose, Assumptions, Corrections, and Implementation in Python
The purpose of the t-test is to determine whether there is a mean difference between two groups of interest. When we are interested in comparing statistical differences between more than two groups and we conduct multiple t-tests, we increase the likelihood of a false positive (type I error), in which we incorrectly reject the null hypothesis that there are no statistical differences between groups. One way to address this is the Bonferroni correction. The Bonferroni correction, named after Carlo Emilio Bonferroni, compensates for what we lose when we run many tests: the justification for taking p-values at face value. Intuitively, when we go searching for significant differences everywhere, the chance of seeing an apparently significant difference somewhere by chance alone increases. Under the Bonferroni correction, if the starting alpha/significance level is .05 and we are testing 10 hypotheses, the corrected alpha/significance level is .05 / 10 = .005. The incentive to make such an adjustment is straightforward to understand once you see how quickly false positives accumulate across tests. Another way to address this is to first use ANOVA to detect statistical differences across all groups before deciding whether to use pairwise t-tests to compare individual groups; an ANOVA sketch appears further below once the data is prepared.
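As a minimal sketch of the correction (the ten p-values below are made up for illustration, not computed from this dataset), statsmodels implements Bonferroni directly:
import numpy as np
from statsmodels.stats.multitest import multipletests
# Ten hypothetical p-values from ten pairwise tests (illustrative values only)
p_values = np.array([0.001, 0.004, 0.012, 0.03, 0.04, 0.05, 0.20, 0.45, 0.60, 0.90])
# Bonferroni compares each p-value against alpha / number_of_tests (.05 / 10 = .005)
reject, p_corrected, _, alpha_bonferroni = multipletests(p_values, alpha=0.05, method='bonferroni')
print(alpha_bonferroni)  # 0.005, the corrected per-test threshold
print(reject)            # only the tests surviving the stricter threshold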
A t-test comparison uses the means, counts, and standard deviations of a treatment and a control to calculate a p-value, which by intuition is the likelihood of seeing a mean difference between treatment and control of the same or more extreme magnitude as a result of chance alone. This is done by calculating a t-statistic and comparing it against a reference distribution. The test statistic would follow an idealized normal distribution if its scaling term (the standard deviation) were known; since the scaling term is unknown and must instead be estimated from the data, the statistic follows Student's t-distribution. This process can be thought of as trying to disentangle the signal (the mean difference and counts) from the noise (the variability). Here the mean difference gives the direction and magnitude of the signal, and the counts give the strength of the evidence.
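To make the signal-to-noise framing concrete, here is a small sketch (the summary statistics are made up, not taken from this dataset) that computes the pooled two-sample t-statistic by hand and checks it against scipy's ttest_ind_from_stats:
import numpy as np
from scipy.stats import ttest_ind_from_stats
# Hypothetical summary statistics for a treatment and a control cell
mean_t, std_t, n_t = 105.0, 15.0, 200
mean_c, std_c, n_c = 100.0, 15.0, 200
# Pooled variance (the noise), assuming equal variances across cells
pooled_var = ((n_t - 1) * std_t**2 + (n_c - 1) * std_c**2) / (n_t + n_c - 2)
# Signal (mean difference) divided by the noise scaled by the sample sizes
t_manual = (mean_t - mean_c) / np.sqrt(pooled_var * (1 / n_t + 1 / n_c))
t_scipy, p_value = ttest_ind_from_stats(mean1=mean_t, std1=std_t, nobs1=n_t,
                                        mean2=mean_c, std2=std_c, nobs2=n_c)
print(t_manual, t_scipy, p_value)  # the manual and scipy t-statistics should match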
import numpy as np
import pandas as pd
import scipy
import sqlite3
import dask.array as da
import sqlalchemy as db
from sqlalchemy import create_engine
import seaborn as sns
from statsmodels.stats.power import NormalIndPower, TTestIndPower
from scipy.stats import ttest_ind_from_stats
# Load the panel dataset and keep only the columns of interest
df = pd.read_csv('df_panel_fix.csv')
df_subset = df[["year", "reg", "province", "gdp", "fdi", 'it', "specific"]]
# Rename 'reg' to the more descriptive 'region'
df_subset.columns = ["year", "region", "province", "gdp", "fdi", 'it', "specific"]
df = df_subset
df
# Add distributions by region
import matplotlib.pyplot as plt
#fig, axes = plt.subplots(nrows=3, ncols=3)
test_cells = ['East China', 'North China']
metrics = ['gdp', 'fdi', 'it']
for test_cell in test_cells:
    for metric in metrics:
        # Histogram of each metric within each test cell
        df.loc[df["region"] == test_cell].hist(column=[metric], bins=60)
        print(test_cell)
        print(metric)
df.hist(column=['fdi'], bins=60)
# Distribution plots for each metric (histplot with a KDE overlay replaces the deprecated distplot)
sns.histplot(df['gdp'], kde=True)
sns.histplot(df['fdi'], kde=True)
sns.histplot(df['it'], kde=True)
sns.histplot(df['specific'].dropna(), kde=True)
import scipy.stats as stats
# Flag gdp outliers via z-scores
df['gdp_zscore'] = stats.zscore(df['gdp'])
# Inspect the outliers beyond three standard deviations, then filter them out
df[abs(df['gdp_zscore']) > 3].hist(column=['gdp'])
df_no_gdp_outliers = df[abs(df['gdp_zscore']) < 3]
df_no_gdp_outliers
df_no_gdp_outliers.hist(column=['gdp'], bins=60)
# Row counts per region and per province: the sample sizes behind each test cell
counts_fiscal = df.groupby('region').count()
counts_fiscal
counts_fiscal = df.groupby('province').count()
counts_fiscal
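These per-cell counts matter because sample size drives statistical power. As a sketch using the TTestIndPower import from above (the effect size of 0.2 is an assumed value for illustration, not one estimated from this data), we can ask how many observations per cell a t-test would need:
# Required n per group to detect a small standardized effect (Cohen's d = 0.2)
# at alpha = .05 with 80% power; the effect size here is an assumption
power_analysis = TTestIndPower()
required_n = power_analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(required_n)  # roughly 394 observations per test cell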
#df_no_gdp_outliers.pivot_table(index='grouping column 1', columns='grouping column 2', values='aggregating column', aggfunc='sum')
#pd.crosstab(df_no_gdp_outliers, 'year')
df_no_gdp_outliers_subset = df_no_gdp_outliers[['region', 'gdp', 'fdi', 'it']]
df_no_gdp_outliers_subset
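Per the opening paragraph, an omnibus one-way ANOVA across all regions can be run before committing to pairwise t-tests; a minimal sketch with scipy, assuming we screen on gdp:
from scipy.stats import f_oneway
# One-way ANOVA on gdp across all regions: a significant F statistic suggests at least
# one regional mean differs, which justifies the pairwise t-tests that follow
gdp_by_region = [group['gdp'].values for _, group in df_no_gdp_outliers_subset.groupby('region')]
f_statistic, p_value = f_oneway(*gdp_by_region)
print(f_statistic, p_value)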
def aggregate_and_ttest(dataset, groupby_feature='region', alpha=.05, test_cells=[0, 1]):
    #Imports
    from tqdm import tqdm
    from scipy.stats import ttest_ind_from_stats
    metrics = ['gdp', 'fdi', 'it']
    feature_size = 'size'
    feature_mean = 'mean'
    feature_std = 'std'
    for metric in tqdm(metrics):
        #print(metric)
        # Crosstab of size, mean, and std per test cell; the groupby index carries the cell labels,
        # so as_index must stay True for the treatment/control lookups below to print region names
        crosstab = dataset.groupby(groupby_feature)[metric].agg(['size', 'mean', 'std'])
        print(crosstab)
        treatment = crosstab.index[test_cells[0]]
        control = crosstab.index[test_cells[1]]
        counts_control = crosstab.loc[control, feature_size]
        counts_treatment = crosstab.loc[treatment, feature_size]
        mean_control = crosstab.loc[control, feature_mean]
        mean_treatment = crosstab.loc[treatment, feature_mean]
        standard_deviation_control = crosstab.loc[control, feature_std]
        standard_deviation_treatment = crosstab.loc[treatment, feature_std]
        # Run the t-test directly from the summary statistics
        t_statistic, p_value = ttest_ind_from_stats(
            mean1=mean_treatment, std1=standard_deviation_treatment, nobs1=counts_treatment,
            mean2=mean_control, std2=standard_deviation_control, nobs2=counts_control)
        #fstring to print the p value and t statistic
        print(f"The t statistic of the comparison of the treatment test cell of {treatment} compared to the control test cell of {control} is {t_statistic} and the p value is {p_value}.")
        #f string to say if the comparison is significant at a given alpha level
        if p_value < alpha:
            print(f'The comparison between {treatment} and {control} is statistically significant at the threshold of {alpha}')
        else:
            print(f'The comparison between {treatment} and {control} is not statistically significant at the threshold of {alpha}')
aggregate_and_ttest(df_no_gdp_outliers_subset, test_cells = [0,1])
from tqdm import tqdm
# Quick demonstration of tqdm's progress bar on a dummy loop
for i in tqdm(range(10000)):
    ...
# Note: aggregate_and_ttest prints its crosstabs and test results and returns None,
# so assigning its return value (e.g. EastvNorth = aggregate_and_ttest(...)) stores nothing
aggregate_and_ttest(df_no_gdp_outliers_subset, test_cells=[0, 1])
experimental_crosstab = df_no_gdp_outliers_subset.groupby('region').agg(['size', 'mean', 'std'])
experimental_crosstab.index
# Transpose for inspection without clobbering the original df
experimental_crosstab_transposed = experimental_crosstab.T
experimental_crosstab_transposed
#experimental_crosstab.reset_index().unstack()
experimental_crosstab.iloc[0,1]
experimental_crosstab.index
experimental_crosstab
# Flatten the (metric, statistic) MultiIndex columns into names like 'gdp_size'
experimental_crosstab.columns = ['_'.join(col) for col in experimental_crosstab.columns.values]
experimental_crosstab
experimental_crosstab.loc['East China', 'gdp_size']
experimental_crosstab.to_csv('fiscal_experimental_crosstab.csv')
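Saving the crosstab is enough to reproduce any pairwise comparison later, since ttest_ind_from_stats only needs counts, means, and standard deviations. A sketch of reloading the file written above (assuming both 'East China' and 'North China' are present in it):
# Reload the saved crosstab and rerun a comparison from the stored summary statistics
saved = pd.read_csv('fiscal_experimental_crosstab.csv', index_col=0)
t_statistic, p_value = ttest_ind_from_stats(
    mean1=saved.loc['East China', 'gdp_mean'], std1=saved.loc['East China', 'gdp_std'],
    nobs1=saved.loc['East China', 'gdp_size'],
    mean2=saved.loc['North China', 'gdp_mean'], std2=saved.loc['North China', 'gdp_std'],
    nobs2=saved.loc['North China', 'gdp_size'])
print(t_statistic, p_value)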