## Technical Glossary

age adjustment
A form of risk adjustment that accounts for age differences by comparing the age breakdown of residents in a given place to the age breakdown of the country as a whole.
aggregate statistic
A data point that represents the sum or average of individual data points.
Read more: “What kind of data is in Metopio?”
average
The sum of all values, divided by the number of values. The average of the numbers 1, 2, 3, 5, 9 is 4 (20 divided by 5). Also known as the mean.
confidence interval
The data point plus or minus the margin of error. If the data point is \$10,000 and the margin of error is \$500, then the true value could be anywhere between \$9,500 and \$10,500.
Read more: “What is a confidence interval or margin of error?”
correlation, not causation
Without further research, a visible trend suggests only that one topic might affect another; it does not prove that it does.
Read more: “What does the scatterplot show?”
curate
To clean, standardize, and validate data to ensure its quality.
Read more: “What kind of data is in Metopio?”
data point
A value that gives us numeric information. In Metopio, all data points are aggregate: they describe a group of people or things in a defined place.
Read more: “What kind of data is in Metopio?”
GEOID
Every place in Metopio has a GEOID (geographic ID) that uniquely identifies it. The GEOID may be the place itself (for ZIP codes), an abbreviation (for states), or the FIPS code assigned by the Census Bureau.
Read more: “How do I find the GEOID for a place?”
insight
A saved visualization in Metopio, such as a chart, map, or scatterplot.
logarithmic
On a log (logarithmic) scale, each step represents a multiple of the value (such as doubling or multiplying by 10), rather than a set amount. This makes it easier to compare numbers that span a very wide range, from values that are very close together to values that are very far apart.
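To illustrate how a log scale turns multiples into equal steps, here is a minimal Python sketch. The values are made up for illustration, and base 10 is an assumption; any base gives the same evenly spaced result.

```python
import math

values = [10, 100, 1000, 10000]  # illustrative: each value is 10 times the previous one

# On a linear scale the gaps keep growing: 90, 900, 9000.
linear_gaps = [b - a for a, b in zip(values, values[1:])]

# On a log10 scale every tenfold increase is one step of the same size.
log_positions = [math.log10(v) for v in values]
log_gaps = [b - a for a, b in zip(log_positions, log_positions[1:])]

print(linear_gaps)
print(log_gaps)
```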
margin of error
Represents how much the true value might differ from the reported one. Metopio uses 90% margins of error, meaning that they describe where the true value should lie 90% of the time. A confidence interval is the reported value ± the margin of error.
Read more: “What is a confidence interval or margin of error?”
mean
Same as the average: the sum of all values divided by the number of values.
median
The middle value, when all data points are ordered from smallest to largest.
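As a quick check of the mean and median definitions, the following minimal Python sketch uses the standard `statistics` module and the same five numbers as the "average" entry.

```python
import statistics

values = [1, 2, 3, 5, 9]  # the example numbers from the "average" entry

mean_value = statistics.mean(values)      # sum (20) divided by count (5), which is 4
median_value = statistics.median(values)  # middle value of the sorted list, which is 3

print(mean_value, median_value)
```

The median (3) is lower than the mean (4) here because the single large value, 9, pulls the mean upward but does not move the middle value.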
percentage
A share or fraction (of people, for example), expressed out of 100 with a % sign, or as a decimal out of 1. For instance, "50.8% of Americans are female" means that, for every 100 Americans, 50.8 of them are female. "Percent" comes from the Latin "per centum", meaning "out of 100".
percentage point
A unit of one percent. Percentage points are used to describe change relative to a baseline in absolute, rather than relative, terms. For instance, relative to a baseline value of 20%, 21% is 1 percentage point higher (21-20), or 5% higher ((21-20)/20).
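The difference between a change in percentage points and a change in percent can be sketched in a few lines of Python, using the 20% and 21% figures from the definition. The variable names are illustrative.

```python
baseline = 20.0   # baseline value, in percent
new_value = 21.0  # new value, in percent

# Absolute change: subtract the two percentages (result is in percentage points).
point_change = new_value - baseline

# Relative change: divide the difference by the baseline (result is in percent).
relative_change = (new_value - baseline) / baseline * 100

print(f"{point_change:.0f} percentage point(s), {relative_change:.0f}% higher")
```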
percentile
A ranking on a scale of 0 to 100 that shows what percent of all values fall at or below a given value. 95th percentile means "in the top 5%", and 1st percentile means "in the bottom 1%". This is not the same as a percentage.
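One simple way to compute a percentile rank is sketched below. The function `percentile_rank` is a hypothetical helper, and it assumes the "share of values at or below" convention; other conventions exist and can give slightly different results.

```python
def percentile_rank(values, x):
    """Percent of values that are less than or equal to x (one common convention)."""
    at_or_below = sum(1 for v in values if v <= x)
    return 100 * at_or_below / len(values)

scores = list(range(1, 21))  # illustrative data: the numbers 1 through 20

# 19 of the 20 values are at or below 19, so it sits at the 95th percentile.
rank = percentile_rank(scores, 19)
print(rank)
```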
p-value
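A statistic measuring how likely it is that a relationship at least as strong as the one observed between two (or more) topics would appear by chance alone, if no real relationship existed. The lower the p-value, the stronger the evidence that the relationship is real, not a fluke. Values below 0.05 are generally considered significant (worthy of further study).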
Read more: “How do I interpret p-values?”
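One way to build intuition for a p-value is a permutation test: shuffle one topic's values many times and count how often the shuffled data looks at least as strongly related as the real data. The sketch below is illustrative only. The data is made up, the helpers `pearson_r` and `permutation_p_value` are hypothetical names, and nothing here is meant to describe how Metopio computes its p-values.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_shuffles + 1)

# Made-up data with a strong, nearly linear relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

p_value = permutation_p_value(xs, ys)
print(p_value < 0.05)
```

Because shuffling almost never reproduces a relationship this strong, the resulting p-value falls well below the 0.05 threshold.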
range
The span of all data points, from the lowest value to the highest.
rate
The number of incidents expressed relative to the population, such as "8 hospitalizations per 100 people". This differs from a percentage in that the same person could have multiple incidents, so there could be more incidents than number of people.
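A rate is a simple calculation, sketched below. The helper `rate_per` and the second example's numbers are hypothetical; the first example reuses the "8 hospitalizations per 100 people" figure from the definition.

```python
def rate_per(incidents, population, per=100):
    """Number of incidents for every `per` people in the population."""
    return incidents * per / population

# 8 hospitalizations among 100 people: 8 per 100.
print(rate_per(8, 100))

# Hypothetical: 150 clinic visits among 100 people. The same person can visit
# more than once, so the rate (150 per 100) exceeds the number of people.
print(rate_per(150, 100))

# The same idea scaled to "per 1,000 people".
print(rate_per(45, 9000, per=1000))
```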
risk adjustment
A process that makes places easier to compare by removing demographic differences: every place is compared to a common reference population, usually the United States as a whole. This keeps places from appearing less healthy just because they have a lot of elderly residents.
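The sketch below shows one common approach to this idea, direct standardization, in which each place's age-specific rates are re-weighted by the age shares of a reference population. This is an assumption for illustration, not a description of Metopio's method, and every number and name (including the reference shares) is made up.

```python
# Hypothetical age-specific rates (per 1,000 people). Both towns are
# equally healthy within each age group.
place_rates = {
    "younger town": {"0-64": 10.0, "65+": 50.0},
    "older town": {"0-64": 10.0, "65+": 50.0},
}

# Hypothetical share of each town's residents in each age group.
place_age_shares = {
    "younger town": {"0-64": 0.9, "65+": 0.1},
    "older town": {"0-64": 0.6, "65+": 0.4},
}

# Hypothetical reference population (not actual US figures).
reference_shares = {"0-64": 0.8, "65+": 0.2}

def weighted_rate(rates, shares):
    """Overall rate: age-specific rates weighted by population shares."""
    return sum(rates[group] * shares[group] for group in rates)

# Crude rates use each town's own age mix, so the older town looks worse.
crude = {p: weighted_rate(place_rates[p], place_age_shares[p]) for p in place_rates}

# Adjusted rates use the same reference age mix for every town.
adjusted = {p: weighted_rate(place_rates[p], reference_shares) for p in place_rates}

print(crude)
print(adjusted)
```

In this made-up example the crude rates differ (14 versus 26 per 1,000) only because of the towns' age mix; once both are weighted by the same reference population, the adjusted rates match (18 per 1,000).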
sample
A sample is a group of people (or businesses, households, etc.) that is selected, usually at random, from a larger population. We can survey a sample of people and use statistics to generalize about the full population. Most surveys work this way.
standard deviation
A measure of the variation or dispersion of a set of data points. A low standard deviation means the values tend to be close together (close to their mean). The standard error is a related measure: it is the standard deviation of an estimate (such as a mean) calculated from a sample.
standard error
Standard errors are similar to standard deviations and describe the precision of the data points. They are used to construct margins of error: the 90% margin of error is the standard error multiplied by 1.645.
Read more: “What is a standard error?”
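The relationship between a standard error, a 90% margin of error, and a confidence interval can be sketched as follows. The 1.645 multiplier comes from the definition above; the estimate and standard error are hypothetical numbers, and the function name is illustrative.

```python
Z_90 = 1.645  # multiplier for a 90% margin of error, as stated in the glossary

def confidence_interval_90(estimate, standard_error):
    """Return (margin_of_error, lower_bound, upper_bound) at the 90% level."""
    margin_of_error = Z_90 * standard_error
    return margin_of_error, estimate - margin_of_error, estimate + margin_of_error

# Hypothetical data point of 10,000 with a standard error of 300.
moe, lower, upper = confidence_interval_90(10_000, 300)
print(f"margin of error: {moe:.1f}, interval: {lower:.1f} to {upper:.1f}")
```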
statistical significance
In statistics, significance means that a relationship, trend, or comparison is unlikely to be a fluke: there's something real there. By convention, a result is called significant when a pattern at least this strong would arise by chance less than 5% of the time if nothing real were going on (p-value < 0.05).
statistics
A field of mathematics concerned with how we can draw conclusions from our imperfect world. More specifically, the process of using samples (such as surveys) to generalize about the population as a whole.
variance
The variability in a collection of data, or how spread out it is. “Variance” is also a statistical term meaning the square of the standard deviation (and, not coincidentally, describing variation in the data).
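The link between variance and standard deviation can be checked with Python's standard `statistics` module. This sketch treats the five numbers from the "average" entry as a complete population, which is an assumption; for a sample, `statistics.variance` and `statistics.stdev` divide by one less than the number of values and give slightly larger results.

```python
import math
import statistics

values = [1, 2, 3, 5, 9]  # the example numbers from the "average" entry; mean is 4

# Population variance: the average squared distance from the mean.
# Squared distances are 9, 4, 1, 1, 25, which sum to 40; 40 / 5 = 8.
variance = statistics.pvariance(values)

# Population standard deviation: the square root of the variance.
std_dev = statistics.pstdev(values)

print(variance, std_dev)
print(math.isclose(std_dev ** 2, variance))
```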
ZCTA
A ZIP Code Tabulation Area (ZCTA) is the geographic equivalent of a ZIP code. ZIP codes are designed for delivering mail and do not define distinct areas on a map. The Census Bureau translates ZIP codes into ZCTAs; most ZIP codes are the same as their ZCTA.
Read more: “Why can't I see my ZIP code in Metopio?”