Whether the NVI data is representative is a complicated question, and it is hard to give a simple yes or no. When we say "representative," we mean: do we think the answers we got would be similar if everyone had answered? In this case, the answer is mostly yes, with a few things to keep in mind. Generally speaking, the data is representative for the city as a whole and for each City Council district. That means the people who answered the survey give us a good idea of what most people in Detroit might say if we had asked everyone. At the neighborhood zone level, the data is a little less representative, but still helpful.
That said, some groups were more likely to take the survey than others, and the people who took our survey look different from Detroiters overall. If we use the data to make decisions, we should remember where it might not represent every voice equally.
A sample is a smaller group of people selected from a larger population to participate in a survey or study. Instead of asking everyone, we ask just a portion of the group and use their responses to make educated guesses about the whole population. When selected carefully, a sample can give us reliable insights—saving time, money, and effort while still helping us understand the bigger picture.
Sample size is the number of people included in a survey or study. It plays a key role in understanding if the results are reliable and accurate.
We want to know how many people need to answer a question before we can be confident that, if we asked everyone in Detroit the same question, we would get similar results. This is where sample size comes in.
When we compare the sample size to the population size, we can estimate how accurately our survey results reflect the entire population's opinions. These comparisons rely on calculations that tell us how much confidence we can have that the NVI results represent the entire City of Detroit.
Two important tools help us make this assessment: the margin of error (MOE) and the confidence interval.
In short, the sample size, or number of people who completed the survey, determines the size of the margin of error and confidence interval. Both of these measures together tell us how accurate our results are likely to be.
While it’s true that larger samples give us more precision, there’s a point where it doesn’t make sense to continue to collect surveys. After a certain size, adding more people doesn’t significantly improve accuracy, but it does increase time and cost for outreach. That’s why surveys aim for a sample that’s just large enough to provide reliable results without wasting resources. For more about these reliability measures, check out Data Driven Detroit’s blog post or this video.
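To make that point concrete, here is a minimal sketch of the standard margin-of-error formula for a proportion at 95% confidence. The sample sizes in the loop are illustrative, not NVI figures:

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Approximate margin of error for a proportion at 95% confidence (z = 1.96).
    p = 0.5 is the most conservative assumption about the true proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes: precision improves quickly at first, then levels off.
for n in [100, 200, 400, 800, 1600, 3200]:
    print(f"n = {n:>4}: margin of error = ±{margin_of_error(n):.1%}")
```

Going from 100 to 400 responses cuts the margin of error roughly in half (from about ±9.8% to ±4.9%), but doubling from 1,600 to 3,200 only moves it from about ±2.5% to ±1.7%.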
For each council district and neighborhood zone, we calculated the sample size needed to meet certain confidence intervals and margins of error, based on the overall population of each place. In the tables below, you can see which areas met the threshold (the calculation behind the "surveys needed" columns is sketched in code after the tables). Every council district collected enough surveys to meet a 95% confidence interval with a 5% margin of error. Sixteen neighborhood zones met the threshold for an 80% confidence interval with a 5% margin of error; seven zones did not.
Council District | Surveys Collected | Surveys needed to meet a 95% confidence interval with 5% margin of error | Did we meet the 95% confidence interval with a 5% margin of error threshold? |
---|---|---|---|
1 | 614 | 382.5 | Yes |
2 | 605 | 382.5 | Yes |
3 | 410 | 382.3 | Yes |
4 | 619 | 382.3 | Yes |
5 | 756 | 382.5 | Yes |
6 | 555 | 382.5 | Yes |
7 | 502 | 382.5 | Yes |
Neighborhood Zone | Surveys Collected | Surveys needed to meet an 80% confidence interval with 5% margin of error | Did we meet the 80% confidence interval with a 5% margin of error threshold? |
---|---|---|---|
1a | 212 | 163.2 | Yes |
1b | 153 | 161.8 | No |
1c | 249 | 162.3 | Yes |
2a | 248 | 162.4 | Yes |
2b | 193 | 162.0 | Yes |
2c | 164 | 162.0 | Yes |
3a | 167 | 162.3 | Yes |
3b | 150 | 162.1 | No |
3c | 93 | 161.8 | No |
4a | 194 | 162.3 | Yes |
4b | 230 | 162 | Yes |
4c | 195 | 161.9 | Yes |
5a | 248 | 162.2 | Yes |
5b | 224 | 162.0 | Yes |
5c | 284 | 162.2 | Yes |
6a | 154 | 162.2 | No |
6b | 227 | 161.8 | Yes |
6c | 120 | 162.1 | No |
6d | 54 | 159.6 | No |
7a | 209 | 162.2 | Yes |
7b | 104 | 161.9 | No |
7c | 186 | 162.2 | Yes |
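The "surveys needed" columns above can be reproduced with a standard sample-size formula that includes a finite population correction (Cochran's formula). The sketch below shows that approach; it is not necessarily the exact tool D3 used, and the population figures in the example are hypothetical, since the actual adult population of each district and zone is not listed in this post:

```python
# z-scores for the confidence levels used in the tables above
Z = {0.95: 1.96, 0.80: 1.2816}

def required_sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Sample size needed to estimate a proportion, using Cochran's formula
    with a finite population correction. p = 0.5 is the most conservative choice."""
    z = Z[confidence]
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population sample size
    return n0 / (1 + (n0 - 1) / population)     # finite population correction

# A hypothetical council district population of about 90,000 adults lands near
# the ~382.5 "surveys needed" shown in the district table above.
print(round(required_sample_size(90_000, confidence=0.95), 1))   # ~382.5
# A hypothetical zone population of about 20,000 adults lands near the
# ~162-163 "surveys needed" shown in the neighborhood zone table.
print(round(required_sample_size(20_000, confidence=0.80), 1))   # ~162.9
```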
However, we cannot say that our survey is representative based on these numbers alone. These numbers only hold if certain conditions, which statisticians call assumptions, are true. In our case, we know that we don't meet those assumptions, and we want to be upfront about those limitations.
All the calculations we’ve discussed—sample size, margin of error, confidence intervals—are based on an important assumption: that a sample is random. A truly random sample means that every person in the population had an equal chance of being selected to take the survey. When that’s true, we can feel confident that our results reflect the broader group.
But in reality, it’s hard to get a perfectly random sample—especially when doing community-based or opt-in surveys like ours. Most people who saw our survey likely had internet access and had previously signed up to receive emails from an organization, agency, or newsletter. This means our responses are not random and may over-represent certain types of people—especially those who are more digitally connected and don’t mind taking surveys.
To try and counteract that bias, we took additional steps. We promoted the survey at in-person events, made it available on paper, and sent random outreach emails to increase reach beyond our usual circles. These steps helped, but they didn’t eliminate the issue.
While we can still learn a lot from this data, it’s important to be transparent about the fact that our sample isn’t fully random. This affects how confidently we can say our findings apply to the whole population and not just those who took the survey.
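A tiny simulation can make this concrete. Every number below is made up purely for illustration; the point is just that when one group is more likely to respond, the raw survey result drifts toward that group's answers:

```python
import random

random.seed(0)

# Hypothetical population: half belong to a "digitally connected" group that
# is (a) more likely to respond and (b) answers the question differently.
POP_SIZE = 100_000
population = []
for _ in range(POP_SIZE):
    connected = random.random() < 0.5
    agrees = random.random() < (0.70 if connected else 0.40)  # made-up true rates
    population.append((connected, agrees))

true_rate = sum(a for _, a in population) / POP_SIZE  # roughly 55% agree overall

# Opt-in style sample: connected residents respond four times as often.
sample = [a for c, a in population
          if random.random() < (0.20 if c else 0.05)]
sample_rate = sum(sample) / len(sample)

print(f"True agreement rate:    {true_rate:.1%}")
print(f"Opt-in sample estimate: {sample_rate:.1%}")  # skews toward roughly 64%
```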
After looking at our sample size and potential sources of bias, the next step is to examine how well the people who took the survey match the broader population. One way we assess this is by comparing the demographic characteristics of survey respondents—such as sex, age, and race—to the known demographics of the areas where they live.
For example, we know from broader research that women are more likely than men to take surveys. As a result, our data may reflect women's experiences more than men's, simply because women were more likely to respond. This kind of overrepresentation is common in community surveys, but it's still a limitation we need to acknowledge.
This analysis doesn’t fix the imbalance, but it gives us valuable insight into the strengths and limitations of the data—and helps guide future improvements in data collection and engagement.
In the tables below, we look at how closely the makeup of our sample aligns with the population in different geographies across the city, such as Council Districts and NVI Neighborhood Zones. This comparison helps us understand where our data may be more or less representative. If certain groups are consistently over- or underrepresented in particular areas, it may suggest a need for caution in interpreting results from those places.
Council District | Total Male (Population) | Total Female (Population) |
---|---|---|
1 | 46% | 54% |
2 | 45% | 55% |
3 | 49% | 51% |
4 | 47% | 53% |
5 | 48% | 52% |
6 | 50% | 50% |
7 | 48% | 52% |
Council District | Total Male (NVI Respondents) | Total Female (NVI Respondents) | Total Non-Binary (NVI Respondents) |
---|---|---|---|
1 | 20% | 79% | 1% |
2 | 24% | 76% | 1% |
3 | 26% | 73% | 1% |
4 | 22% | 77% | 1% |
5 | 30% | 68% | 2% |
6 | 29% | 70% | 1% |
7 | 21% | 78% | 1% |
Council District (Population) | Less than $20,000 | $20,000–$39,999 | $40,000–$74,999 | More than $75,000 |
---|---|---|---|---|
1 | 25% | 22% | 28% | 25% |
2 | 24% | 20% | 27% | 29% |
3 | 29% | 24% | 27% | 21% |
4 | 29% | 21% | 26% | 23% |
5 | 34% | 18% | 21% | 26% |
6 | 30% | 21% | 24% | 25% |
7 | 32% | 25% | 23% | 20% |
Council District (NVI Respondents) | Less than $20,000 | $20,000–$39,999 | $40,000–$74,999 | More than $75,000 |
---|---|---|---|---|
1 | 26% | 20% | 31% | 22% |
2 | 22% | 23% | 31% | 24% |
3 | 38% | 29% | 23% | 9% |
4 | 31% | 27% | 21% | 21% |
5 | 25% | 19% | 27% | 29% |
6 | 35% | 19% | 26% | 21% |
7 | 39% | 26% | 24% | 11% |
Council District (Population) | 18–34 | 35–64 | 65+ |
---|---|---|---|
1 | 31% | 48% | 21% |
2 | 29% | 48% | 24% |
3 | 35% | 47% | 18% |
4 | 32% | 50% | 19% |
5 | 35% | 44% | 21% |
6 | 38% | 47% | 16% |
7 | 34% | 47% | 19% |
Council District (NVI Respondents) | 18–34 | 35–64 | 65+ |
---|---|---|---|
1 | 17% | 56% | 27% |
2 | 20% | 55% | 25% |
3 | 22% | 58% | 20% |
4 | 20% | 54% | 26% |
5 | 26% | 48% | 26% |
6 | 30% | 51% | 19% |
7 | 19% | 53% | 28% |
Council District (Population) | Black Alone | White Alone | American Indian and Alaska Native Alone | Asian Alone | Native Hawaiian and Other Pacific Islander Alone | Some Other Race Alone | Two or More Races |
---|---|---|---|---|---|---|---|
1 | 90% | 5% | 0% | 0% | 0% | 1% | 4% |
2 | 91% | 5% | 0% | 1% | 0% | 1% | 3% |
3 | 80% | 9% | 1% | 7% | 0% | 1% | 3% |
4 | 90% | 7% | 0% | 0% | 0% | 0% | 2% |
5 | 74% | 18% | 0% | 3% | 0% | 1% | 4% |
6 | 32% | 26% | 1% | 1% | 0% | 27% | 12% |
7 | 79% | 13% | 0% | 0% | 0% | 2% | 6% |
Council District (NVI Respondents) | Black Alone | White Alone | American Indian and Alaska Native | Asian Alone/Native Hawaiian and Other Pacific Islander | Arab American or Middle Eastern | Some Other Race | Prefer not to Answer |
---|---|---|---|---|---|---|---|
1 | 80% | 13% | 2% | 1% | 0% | 1% | 6% |
2 | 82% | 11% | 2% | 1% | 0% | 0% | 6% |
3 | 83% | 8% | 2% | 2% | 1% | 2% | 6% |
4 | 74% | 16% | 2% | 2% | 0% | 1% | 7% |
5 | 63% | 25% | 3% | 1% | 2% | 1% | 7% |
6 | 44% | 29% | 2% | 2% | 2% | 1% | 5% |
7 | 82% | 8% | 2% | 0% | 1% | 1% | 7% |
It is well documented that the federal guidelines for collecting race/ethnicity data are not sufficient for identifying some groups of residents, especially Arab American or Middle Eastern residents. There are new recommendations that the Census and other federal data collection efforts add this category in the future, so the NVI collects it now, which will change the responses of some people who might otherwise identify as White or Some Other Race. Due to some of our sample sizes, we were not able to report Asian or Native Hawaiian and Other Pacific Islander residents separately. However, we report them combined in the table above to give a sense of that demographic within the NVI sample.
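One simple way to read the tables above is to compute the percentage-point gap between survey respondents and the population for each group. Here is a minimal sketch using the Council District 1 figures from the two sex tables above; the helper function and data structures are just for illustration:

```python
# Percentages for Council District 1, taken from the two sex tables above.
population = {"Male": 46, "Female": 54}    # population table
respondents = {"Male": 20, "Female": 79}   # NVI respondent table

def representation_gap(respondent_pct, population_pct):
    """Positive = overrepresented among respondents, negative = underrepresented."""
    return respondent_pct - population_pct

for group in population:
    gap = representation_gap(respondents[group], population[group])
    print(f"District 1, {group}: {gap:+d} percentage points")
# District 1, Male: -26 percentage points
# District 1, Female: +25 percentage points
```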
So, back to our initial question: is this data representative? The answer is that it's complicated. For now, we are saying yes, the data is representative, but proceed with caution. Over the coming months, we are going to run some robustness checks, meaning we will take a deep dive into the data to understand how the groups of people who are overrepresented in our sample might affect our results. We will share what we find when that analysis is complete.
The more survey responses we and our partners collect from Detroiters, the more representative this data can become. Reach out if you want to help! And if you want to talk to someone at D3 about how you plan to use the data and what you might want to consider, feel free to reach out through our contact page.