The crux of my argument about power analysis is that it's not as linear as you might think. You probably think that you develop a research question, run a power analysis, and that power analysis tells you how many people you need to test.

In practice, power analysis is much more cyclical. It involves your budget, your research questions, your analysis plan, your data structure, how big of an effect you can reasonably expect to find, and the power analysis itself, all considered simultaneously and in combination with one another. Something more like this:

When I'm planning a study, I go around and around this interconnected web for a few days before narrowing down what we will actually do for a project. It's complex, deep thinking work. But it's also really fun. Have a read of my power-point introduction to how I approach power analysis, and consider whether using it will be useful as you are planning your own.

If you want to cite the presentation, please use the following:

Logan, Jessica (2019). Introduction to Power Analysis. figshare. Presentation. https://doi.org/10.6084/m9.figshare.8236409.v1

Do I need to keep actively parenting my children? They are relatively big as far as kids who live at home go - 12 and 8. They wouldn't burn the house down if I ignored them, and probably wouldn't do anything stupid enough to warrant a hospital visit. But I find myself _unable_ to ignore them. Desperate for a sense of normalcy and a powerful and pressing need to show them I love them...just in case. I am simultaneously in constant need of connection and space; begging them to give me time and space to work, but once I get it I am unable to do anything but miss them. This time is scary and we need each other.

Under such circumstances, it is absolutely impossible to have deep thinking time. It is always interrupted by something that seems far more pressing (be it a band-aid request, a funny line in a book someone is reading, or making plans for lunch). I need to be there for them, but my patience is thinner than usual, plagued by a constant sense of guilt for not getting things accomplished. I give a hug, half listening to the story of the stubbed toe, worrying "Oh shoot, did I agree to a manuscript review? What journal was that? When is it due?".

We are living, right now, under the increasingly likely possibility of someone in my house needing to teach my kids about algebra and five paragraph essays next year. Home schooling. I am such a strong advocate of the public school system, and the thought of teaching my kids at home is counter to my identity as a scholar as well as my support for equitable access to education. Even with my largely independent children, teaching them at home will take a lot of work. So much work. Even if we decide that fourth and eighth grades can just be basically skipped, and that my children can be ignored, they still need some supervision and they still need some love.

I am at a complete loss as to how to continue to contribute to science, though I desperately want to do so. Under different circumstances, I would be providing you with data. There are so many people who have this so much worse than I do as well. My BIPOC friends who are currently being asked to serve on 100 extra committees on top of their other work, life, and home demands. My friends who are parents of young children, children who

I am thinking about you all, all the time. I will keep trying to convert my anxious energy into positive energy and send it out into the world. I may not write as many papers, put in as many grants, or do as many reviews in 2020/2021 as my tenure committee would like. But goodness gracious do I want to keep doing this work. I look forward to reading all of your papers, if my worry will give me more space to do it. And gosh I would love to exchange ideas with you at a conference sometime. Maybe we can again someday.

Please wear a mask, wash your hands, and keep staying home so we have a shot at doing things like that again.

Within the field of Education, the story is quite different. The percentage of women earning doctorates is much higher than in the rest of the sciences. Here is a figure depicting the percentages of Education Ph.D.s awarded to women from 1949 to 2012. It starts very low, around 18% in 1949, but increases fairly quickly, stabilizing at 68% around the year 2000*.

If the problem was the pipeline, then education seems to have solved it. All of these women getting Ph.D.s should translate into academic jobs.

To examine that link, I looked at the NCES datalab (https://nces.ed.gov/datalab/powerstats/output.aspx). After narrowing the results to the field of Education, we can look at the percentages of professors at each rank. Here we can see two different stories. Of all women in academic jobs within education, 24% are tenured. Of all men, 33% are. However, taking the percentages the other way tells a seemingly more positive story. Of all academics in the field of education, 62% are women. But of all *tenured* faculty, approximately half (51%) are women.

To provide some perspective, I looked up those figures for the Natural Sciences, where 25% of all faculty and 18% of tenured faculty are women. Clearly women represent more of the education sciences than they do of the natural sciences; however, there is still more work to be done.

In Education, where 68% of Ph.D.s are awarded to women, it is clear that the field is not disproportionately losing women at the initial job stage. Representation is approximately the same at the Ph.D. level as for faculty not on the tenure track (64% - 67%) or who are on the tenure track but not yet tenured (69%). It is somewhere during the tenure process that they leave the field.

This finding is one of the reasons that I have worked with the other executive board members to found Providing Opportunities for Women in Education Research (POWER). We seek to connect, support, and advocate for women who are working in this field to help them stay in the pipeline and continue to contribute to science. Read more about POWER and our mission HERE.

*Note: I pulled data from the Survey of Earned Doctorates https://www.nsf.gov/statistics/doctorates/. The datasets were called: "Doctorate recipients, by subfield of study and sex", which I pulled for each year. I harvested the number and percent of men/women for the education field from each dataset and combined them into this graph.

I was listening while washing dishes and ran across the house with wet hands to find a pen to write this quote down, because I am a power analysis believer. It's true that power analyses can sometimes be fuzzy, and maybe more effort goes into them than is necessary, but ultimately I believe they are a good tool. I've broken down Greg and Patrick's arguments against a priori power analysis into three basic parts:

They argue, essentially, that if you're running a complex model, Cohen's ideas about what an effect size represents don't even really apply. If you have a complex multi-indicator latent factor model, there are too many pathways to consider. "Take a simple growth model with five indicators... what is the power of that?? Are you interested in the intercept, the slope, the correlation between the two... "

When I'm running power analyses, I am

Yes. This is why the specificity is even more important. Do you want to know if third graders grow in their language skills less than second graders do? Then you need to fit a growth model, and you need to estimate the power you have

I do this with simulations, where the values that seed those simulations come from pilot work or from other large scale studies that have used the same or similar measures. I use the variance almanac to determine the intra-class correlation due to schools (when that is relevant). I use this delightful article to determine whether my effect sizes are meaningful or important or at all interesting.
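
As a rough illustration of the simulation approach, here is a minimal Python sketch. Every number in it (20 schools, 15 children per school, an ICC of .15, a standardized effect of .40) is a made-up placeholder rather than a value from the variance almanac or from any study, and the analysis (a t-test on school means, with schools as the unit of random assignment) is a deliberately simple stand-in for whatever model you actually plan to fit.

```python
import math
import random
import statistics


def simulate_once(rng, n_schools, n_per_school, icc, effect):
    """Simulate one school-randomized study; return |t| computed on school means."""
    # Total outcome variance is fixed at 1 and split between schools and children.
    sd_school = math.sqrt(icc)
    sd_child = math.sqrt(1 - icc)
    treated, control = [], []
    for s in range(n_schools):
        is_treated = s % 2 == 0  # alternate assignment; assumes an even number of schools
        school_effect = rng.gauss(0, sd_school)
        scores = [
            school_effect + rng.gauss(0, sd_child) + (effect if is_treated else 0.0)
            for _ in range(n_per_school)
        ]
        (treated if is_treated else control).append(statistics.fmean(scores))
    # Pooled-variance two-sample t statistic on the school means.
    n1, n2 = len(treated), len(control)
    pooled = ((n1 - 1) * statistics.variance(treated)
              + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return abs(statistics.fmean(treated) - statistics.fmean(control)) / se


def simulated_power(n_schools=20, n_per_school=15, icc=0.15, effect=0.40,
                    critical_t=2.101, n_sims=2000, seed=1):
    """Share of simulated studies whose |t| exceeds the critical value.

    2.101 is the two-tailed .05 critical t for 18 df (20 schools minus 2);
    look up a new value if you change n_schools.
    """
    rng = random.Random(seed)
    hits = sum(
        simulate_once(rng, n_schools, n_per_school, icc, effect) > critical_t
        for _ in range(n_sims)
    )
    return hits / n_sims


print(f"estimated power: {simulated_power():.2f}")
```

Re-running `simulated_power` with different `icc`, `n_schools`, or `effect` values shows how quickly power moves with the design, which is the whole point of seeding the simulation with realistic numbers.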

Agreed, and this is why I always present a power analysis under multiple scenarios. I will always estimate my power to detect the critical pathway for a given hypothesis both with and without covariates included, for different levels of attrition, for different key variables of that particular construct.
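
To make the "multiple scenarios" idea concrete, here is a small Python sketch that tabulates approximate power across attrition rates, with and without a covariate. It uses a normal approximation for a simple two-group comparison, and the starting sample (100 per group), effect size (.30), and covariate R-squared (.40) are placeholders I made up for illustration, not values from any of my studies.

```python
import math
from statistics import NormalDist


def approx_power(n_per_group, effect, r2_covariate=0.0, alpha=0.05):
    """Normal-approximation power for a two-group mean difference (two-tailed)."""
    # A covariate that explains r2 of the outcome shrinks the residual SD,
    # so the same raw difference becomes a larger standardized effect.
    adjusted_effect = effect / math.sqrt(1 - r2_covariate)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = adjusted_effect * math.sqrt(n_per_group / 2)
    return 1 - NormalDist().cdf(z_crit - noncentrality)


def scenario_table(n_start=100, effect=0.30, r2=0.40,
                   attrition_rates=(0.0, 0.10, 0.20)):
    """Rows of (attrition, n remaining per group, power without, power with covariate)."""
    rows = []
    for rate in attrition_rates:
        n_left = round(n_start * (1 - rate))
        rows.append((rate, n_left,
                     approx_power(n_left, effect),
                     approx_power(n_left, effect, r2_covariate=r2)))
    return rows


for rate, n_left, p_without, p_with in scenario_table():
    print(f"attrition {rate:.0%}: n = {n_left}/group, "
          f"power = {p_without:.2f} without covariate, {p_with:.2f} with")
```

A table like this makes it easy to see which assumption (attrition or the strength of the covariate) your power is most sensitive to.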

To sum up, I do agree that running a power analysis for an entire model is useless. But it is useless because if the only power analysis you can think to run is one for the model itself, then you probably don't have a very well defined question. For any one study, I will report two to three power analyses

I think our differences in opinion can be boiled down to differences in funding mechanisms. The NIH gives you one paragraph, while IES, who I write most of my grants for, expects something closer to a page with a table dedicated to the power analysis. It's also different because I have the luxury of frequently working with randomized control trials. Usually a primary aim of the study is to determine whether a treatment group is different from a control group.

Don't get me wrong. A lot of my time is spent on power analysis. If Patrick and Greg can convince the federal funders to drop power, or to switch to an emoji-based system, I could learn to knit or something with all of the extra time I would have on my hands. But for now, if you're in need of a power analysis, try mapping your research question on to the actual equation or latent path model you intend to run. Where in that equation or on that diagram could your hypothesis be disproven? If you don't know, try writing a better, more specific question.

The (Brightstart Impact Paper) is the culmination of a five-year randomized control trial funded by the Institute of Education Sciences (IES). Yes I said culmination and I said five-year, so the plans for how to collect this data had long been settled by the time I wanted to preregister the analysis plan. However, with IES grants, we are required to include extensive and detailed documentation of the plans not only for the data collection, but also for the sample size and analysis plans. So we did have a detailed record of the study design and analysis plan; it had just never been published. In that way, completing some of the preregistration paperwork was relatively straightforward.

After reaching out to Twitter for suggestions, I landed on the "AsPredicted" version of the preregistration form. The "nine questions" on this form are, to me, rather bizarrely ordered on the OSF and omit some key information. I can see how this might be a good option if I were planning an "in-lab" experiment with undergraduates; however, in trying to complete it for a complex study for which the data had already been collected, I ran into several road blocks. I recently published my second registration (Unique contribution of language gains to children’s kindergarten and grade 3 reading skills), and ran into the same issues again, so I wanted to document them here.

The first is a very small problem, but there was no place to include authors!?! I had to include our authors in the "other information" section. That seems completely bizarre and fixable. Second, I was surprised that this form asks for no background or motivating information. The methods and analysis plan should be completely dependent on that background, and without it I find it difficult to determine whether the study design or analysis plan is appropriate. Third, there was no dedicated space to list information about predictors and covariates in the model, so I tucked them away in the analysis section. Fourth, some of the sample ascertainment questions are tough with secondary data - the plan is to use every single data point I can get my hands on, and I have no idea if that will be 800 or 8,000 data points, and I won't know until I get into the data. For those sections, I settled on reporting a power analysis for a minimally meaningful effect size.

In all, I plan to continue pre-registering my planned papers, even with these road blocks. I think a dedicated form for secondary analysis would be very beneficial for the field of education as a whole. A group of researchers at the Center for Open Science has already developed a template, available HERE, and I look forward to using it once it is functional.

A disclaimer to the negative tone of this... review?... I really don't like fitting my ideas into superseding categories, and generally I find filling out forms to be anxiety provoking at best. Manuscript submission portals make me want to crawl under my desk and hide. I'm hoping that with continued practice I will get over that fear for this process. Time will tell.

I'm teaching regression right now and a question that comes up a lot is how an intercept can be negative when the outcome is something like a grade on a vocabulary test, scored in percent correct (0-100%). Worse, sometimes the regression suggests that the negative intercept is significantly different from zero! Can we really conclude that there are people who got a negative score on a vocabulary test?

Regression is based around the idea of model comparisons, where models with added predictors are compared to an unconditional model (a model with no predictors) to determine if better fit has been achieved. In the unconditional model, each person’s observed score on Y (the outcome) is always equal to the mean of Y (represented as B0) plus error:

Y = B0 + e

Because B0 is equal to the mean of Y, in our hypothetical example of percentage correct on a final test, the intercept will never be negative. It could be really small if your class of students absolutely bombed their test, but it won't ever be negative. But we don't really ever interpret unconditional models. In a regression, you will always have at least one predictor. We add a predictor in an attempt to make the error smaller. In other words, we attempt to explain more of the error, and better fit the observed data points.

So let's suppose that my vocabulary test is completed by 200 students, aged 10-15. I generally expect that older children will have larger vocabularies (and thus, higher scores on this vocabulary test). When we add a predictor to this equation, it is now a conditional model. In regression, the conditional model predicting vocabulary from age will look like this:

Y = B0 + B1*age + e

Where B0 = the expected vocabulary score for someone who is zero years old. Why for someone who is zero? Because if the x-value plugged in for age is zero, then the B1*age term is reduced to zero, and so the expected value of Y is just B0. The key phrase here is “someone who is zero years old” because it is what makes the expected score of Y conditional. The estimate of B0 (which is also called the Y-intercept or often just “intercept”) is always equal to the expected value of Y when all predictors are zero.

So in our hypothetical example, our age variable ranges from 10-15, and so does not include zero. If the age coefficient is significantly positive, there's a strong possibility that the intercept may be negative. Inventing some results, I graphed them at the right, with the equation Y = -5 + 2*Age.

Here I've estimated that a 10 year old has an expected vocabulary score of 15, and that every one-year increase in age corresponds to an expected 2 point increase in vocabulary score. That means we have a negative intercept (B0 = -5). The negative intercept is completely possible, because we are trying to estimate the vocabulary score of... a newborn baby. The whole shaded part of that figure represents a place on the distribution where we don't have any data.

So. Only interpret your intercept, negative or not, when the zero point lies within the observed range of your data. If it doesn't, you have two choices: 1) don't interpret it, just ignore it, or 2) re-center your predictors so that they all do contain your zero point, and now you can interpret the intercept. What's centering? We'll do that another day.
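
For anyone who wants to see this in numbers, here is a short Python sketch. The data are invented to follow Y = -5 + 2*Age exactly (no error term, purely so the arithmetic is easy to check), and `fit_line` is a small helper written for this illustration rather than a function from any statistics package. It also peeks ahead at centering: subtracting 10 from age moves the zero point to age 10, which is inside the observed range.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = fmean(x), fmean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


ages = [10, 11, 12, 13, 14, 15]
scores = [-5 + 2 * age for age in ages]  # invented, error-free scores

# Unconditional model: with no predictors, B0 is simply the mean of Y.
b0_unconditional = fmean(scores)

# Conditional model: B0 is now the expected score at age 0, far outside the data.
b0_raw, b1_raw = fit_line(ages, scores)

# Re-centered model: age 10 becomes the zero point, so B0 is the expected score at age 10.
b0_centered, b1_centered = fit_line([age - 10 for age in ages], scores)

print(f"unconditional B0 = {b0_unconditional:.1f}")
print(f"raw age:      B0 = {b0_raw:.1f}, B1 = {b1_raw:.1f}")
print(f"centered age: B0 = {b0_centered:.1f}, B1 = {b1_centered:.1f}")
```

The slope is 2 in both fits; only the intercept moves, from -5 (a prediction for a newborn) to 15 (a prediction for a 10 year old), while the unconditional B0 sits at the mean score of 20.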

An applied statistician is essentially always working outside of their area of expertise. Working outside of your area of expertise is not fun and is not easy. Imagine if to get your summer salary covered you were asked to write two papers in a completely new field. Do you study social psychology? Congratulations, this summer you’ve been assigned to collect data and write two papers about the diversity of insect species in the wetlands of Florida. It’s ludicrous. You wouldn’t even know where to start. This is similar to what you ask statisticians to do when they are involved in a project at the last minute. You essentially say: “Here’s an area of work I’ve spent the last 10 years of my life thinking about, can you tell me if my intervention works by the end of the month?” Yes, I absolutely can, but if this is the first time you’ve talked to me, I can pretty much guarantee you’re not going to like my answer. All of the underlying groundwork has been laid, and hundreds of methodological choices have been made, without my input. Making decisions at this point is the equivalent of dropping me in a Florida wetland to count bugs. I am going to do it wrong, make the wrong choices, or otherwise invalidate your best intentions.

My goal here is not to scare you. Rather my goal is to encourage you. To tell you that science can be better. That you can get closer to answering the questions you really want to answer with a little more forethought. When you want to design a study, you should reach out to someone with expertise in how to design a study. My call to content researchers is to consider the methodologist as a research partner with an area of expertise, and not as someone who can provide some last minute help to run a model. Statisticians will make the wrong choices, and will analyze your data wrong if they don’t understand the underlying theories behind what you’re trying to do, and what new knowledge you’re trying to bring into the world.

- Alpha (how often are you OK with thinking there's an effect there when there actually isn't? Typically in developmental science this is set at .05, or 5% of the time)
- Beta (How often are you OK with missing an effect that might actually be there? Usually we say 20% of the time: Power is 1- that probability, so .80)
- Sample Size (number of people)
- Effect size (how big can you expect the difference or relation you are looking for to be?)
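
These four pieces are tied together, so fixing any three gives you the fourth. As a rough sketch in Python, assuming the simplest possible case (two equal-sized groups, a two-tailed test, and a normal approximation rather than the exact t distribution), this is how sample size falls out of the other three. The function name and defaults are mine, for illustration only.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect a standardized mean difference."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Because this uses the normal approximation, the exact t-based answer from dedicated power software will be slightly larger, so treat these as ballpark figures.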

In education, we also often need to consider how the data are structured, such as how random assignment occurred (if it's happening at all) or whether kids are nested in classrooms. These factors add additional elements to these equations. In this powerpoint presentation, I go over some of the features of how to calculate a power analysis when you're planning education research.
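
One common way nesting enters the calculation is through the design effect, 1 + (m - 1) x ICC, where m is the number of children per classroom and ICC is the intra-class correlation. The Python sketch below assumes equal classroom sizes and uses made-up numbers; it is one standard approximation, not necessarily the approach taken in the presentation.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from sampling whole clusters instead of individuals."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """How many independent children the nested sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical example: 400 children in classrooms of 20, with an ICC of .10.
deff = design_effect(20, 0.10)
n_eff = effective_sample_size(400, 20, 0.10)
print(f"design effect = {deff:.2f}")
print(f"effective n = {n_eff:.0f} of 400 children")
```

Even a modest ICC can cut the effective sample size substantially, which is why the nesting structure has to be part of the power analysis from the start.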

“Measurement residual variance contains variance due to measurement error as well as variance unique to an indicator but not common to the other indicators of the factor. Because measurement error is random, by definition, the measurement error component of the variance of the measurement residual cannot be correlated with other factors. The Specific variance component on the other hand… may have auto-correlation over time”

“if not taken into account, the stability of the model may be overestimated… Although the inclusion of correlated measurement residuals… could be decided on the basis of significance tests, they are generally included in longitudinal models a priori (Wheaton, Muthén, Alwin, & Summers, 1977). Except for the cost of the degrees of freedom, there is typically no harm in including these estimates.”

- Newsom, J. T. (2015).
