Abstract
Design quality guidelines typically recommend that multiple baseline designs include
at least three demonstrations of effects. Despite its widespread adoption, this recommendation does not appear to be grounded in empirical evidence. The main purpose of our
study was to address this issue by assessing Type I error rate and power in multiple
baseline designs. First, we generated 10,000 multiple baseline graphs, applied the dual-criteria method to each tier, and computed Type I error rate and power for different
numbers of tiers showing a clear change. Second, two raters categorized the tiers for 300
multiple baseline graphs to replicate our analyses using visual inspection. When
multiple baseline designs had at least three tiers and two or more of these tiers showed
a clear change, the Type I error rate remained adequate (< .05) while power also
reached acceptable levels (> .80). In contrast, requiring all tiers to show a clear change
resulted in overly stringent conclusions (i.e., unacceptably low power). Therefore, our
results suggest that researchers and practitioners should carefully consider limitations in
power when requiring all tiers of a multiple baseline design to show a clear change in
their analyses.
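
The simulation logic summarized above can be illustrated with a minimal Python sketch. It is not the study's actual code. The data-generating model (independent normal noise), the phase lengths, the effect size, and the function names binomial_threshold, dual_criteria, and detection_rate are all assumptions made for illustration. The dual-criteria rule is implemented here in its commonly described form: a tier shows a clear change when enough treatment points fall above both the baseline mean line and the projected baseline trend line, with "enough" taken from a binomial test at p = .5.

```python
from math import comb

import numpy as np


def binomial_threshold(n, alpha=0.05):
    """Smallest k with P(X >= k) <= alpha for X ~ Binomial(n, 0.5)."""
    for k in range(n + 1):
        tail = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
        if tail <= alpha:
            return k
    return n + 1  # no count of points can reach the criterion


def dual_criteria(baseline, treatment, alpha=0.05):
    """Return True if a tier shows a clear increase under the dual-criteria rule."""
    a = np.asarray(baseline, dtype=float)
    b = np.asarray(treatment, dtype=float)
    x_a = np.arange(len(a))
    x_b = np.arange(len(a), len(a) + len(b))
    # Least-squares trend line fitted to baseline, projected into treatment.
    slope, intercept = np.polyfit(x_a, a, 1)
    trend_line = slope * x_b + intercept
    mean_line = a.mean()
    # Count treatment points that fall above BOTH lines.
    n_above = int(np.sum((b > mean_line) & (b > trend_line)))
    return n_above >= binomial_threshold(len(b), alpha)


def detection_rate(effect, n_tiers=3, min_tiers=2, n_graphs=2000,
                   n_a=5, n_b=5, seed=0):
    """Proportion of simulated graphs in which at least min_tiers tiers change.

    With effect == 0 this estimates the Type I error rate; with effect > 0
    it estimates power. Data are white noise (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_graphs):
        changes = 0
        for _ in range(n_tiers):
            a = rng.normal(0.0, 1.0, n_a)
            b = rng.normal(effect, 1.0, n_b)
            changes += int(dual_criteria(a, b))
        hits += int(changes >= min_tiers)
    return hits / n_graphs


if __name__ == "__main__":
    # Hypothetical effect size of 3 standard deviations, chosen for illustration.
    print("Type I error (2 of 3 tiers):", detection_rate(0.0))
    print("Power (2 of 3 tiers):", detection_rate(3.0))
    print("Power (3 of 3 tiers):", detection_rate(3.0, min_tiers=3))
```

Varying min_tiers between 2 and n_tiers in this sketch shows the trade-off the abstract describes: requiring every tier to show a change can only lower both the false-positive rate and the power relative to a less strict rule.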