Causal Inference where the treatment assignment is randomised

I have mostly worked with observational data, where treatment assignment is not randomized. In the past I have used PSM and IPTW to balance the groups and then estimate the ATE. Now I am working on a problem where treatment assignment is randomized, so there should be no confounding. However, the treatment and control groups have different sizes: there is a bucket imbalance.

Should I just analyze the data as it is and run statistical significance and power tests? Or should I first balance the group sizes between treatment and control, using, say, covariate matching, and then run the significance tests?



Solution 1:[1]

In general, you don't need equal group sizes to estimate treatment effects.
Unequal groups will not bias the estimate; they only affect its variance, reducing precision. Recall that, assuming similar outcome variance in both groups, the variance of a difference in means scales with 1/n_treated + 1/n_control, so for a fixed total sample size statistical power is limited mainly by the smaller group. Unequal groups are therefore less sample-efficient, but not categorically wrong.
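
To make the precision argument concrete, here is a minimal sketch of the textbook standard error of a difference in means, assuming independent units and a common outcome standard deviation `sigma` in both groups. The helper name `diff_in_means_se` is made up for illustration; the 100/100 and 190/10 splits are the ones used in the simulation in this answer.

```python
import math

def diff_in_means_se(n_control, n_treated, sigma=1.0):
    """Standard error of (mean_treated - mean_control), assuming a common
    outcome standard deviation `sigma` and independent observations."""
    return sigma * math.sqrt(1.0 / n_control + 1.0 / n_treated)

# Same total sample size (200), different splits
se_balanced = diff_in_means_se(100, 100)
se_unbalanced = diff_in_means_se(190, 10)

print(f"balanced   100/100: SE = {se_balanced:.3f}")
print(f"unbalanced 190/10 : SE = {se_unbalanced:.3f}")
print(f"ratio: {se_unbalanced / se_balanced:.2f}x wider")
```

Neither split changes what is being estimated; the 190/10 split simply yields a wider standard error because the 1/n term of the small group dominates.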

You can further convince yourself with a simple simulation (code below). It shows that, over repeated draws, the estimate is unbiased in both cases (both distributions are centered on the true effect), but equal groups give better precision (a smaller standard error).

Figure: simulation showing that equal groups give better precision over repeated trials.

import numpy as np
import pandas as pd
import seaborn as sns
import statsmodels.api as sm

n_trials = 100
# (n_control, n_treated) for each design; both use 200 units in total
balanced = {
    True: (100, 100),
    False: (190, 10),
}
n_total = sum(balanced[True])
effect = 2.0
res = []
for i in range(n_trials):
    np.random.seed(i)
    # Draw one noise value per unit; reuse it for both designs so that
    # only the treatment/control split differs within a trial
    noise = np.random.normal(size=n_total)
    for is_balanced, ratio in balanced.items():
        t = np.array([0]*ratio[0] + [1]*ratio[1])
        y = effect * t + noise
        # Add an intercept so the slope on t is the difference in group means
        m = sm.OLS(y, sm.add_constant(t)).fit()
        # Index 0 is the intercept, index 1 is the treatment effect
        res.append((is_balanced, m.params[1], m.bse[1]))

res = pd.DataFrame(res, columns=["is_balanced", "beta", "se"])
g = sns.jointplot(
    x="se", y="beta",
    hue="is_balanced",
    data=res
)
# Annotate the true effect:
g.ax_joint.axhline(y=effect, color='grey', linestyle='--')
g.ax_joint.text(y=effect, x=res["se"].max(), s="True effect")
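
On the power part of the question, a rough calculation shows what the imbalance costs for the same 100/100 and 190/10 splits. This is a sketch only: it uses a normal approximation to a two-sided test, assumes a known common standard deviation, and the effect size of 0.5 and the helper name `approx_power` are illustrative choices, not taken from the question.

```python
import math
from statistics import NormalDist

def approx_power(effect, n_control, n_treated, sigma=1.0, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means.

    Assumes a known, common outcome standard deviation `sigma`; this is a
    normal approximation, not an exact t-test power calculation.
    """
    std_normal = NormalDist()
    se = sigma * math.sqrt(1.0 / n_control + 1.0 / n_treated)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / se
    # Probability of rejecting in either tail when the true effect is `effect`
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

power_balanced = approx_power(0.5, 100, 100)
power_unbalanced = approx_power(0.5, 190, 10)

print(f"balanced   100/100: power = {power_balanced:.2f}")
print(f"unbalanced 190/10 : power = {power_unbalanced:.2f}")
```

With these assumed inputs the balanced design has much higher power than the unbalanced one, even though both use 200 units. That is a cost in precision, not a bias, so the data can still be analyzed as collected.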

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ehudk