A cohort is a group of people who share a common characteristic or experience within a defined period (e.g., are born, leave school, lose their job, are exposed to a drug or a vaccine, etc.). Thus a group of people who were born on a day or in a particular period, say 1948, form a birth cohort. The comparison group may be the general population from which the cohort is drawn, or it may be another cohort of persons thought to have had little or no exposure to the substance under investigation, but otherwise similar. Alternatively, subgroups within the cohort may be compared with each other.
In medicine, a cohort study is often undertaken to obtain evidence to try to refute the existence of a suspected association between cause and disease; failure to refute a hypothesis strengthens confidence in it. Crucially, the cohort is identified before the appearance of the disease under investigation. The study groups, so defined, are observed over a period of time to determine the incidence of the studied disease among them, that is, the frequency of new cases. The cohort therefore cannot be defined as a group of people who already have the disease. Causality cannot usually be distinguished from mere correlation using the results of a cohort study alone.
The advantage of a cohort study is that individuals are observed longitudinally and data are collected at regular intervals, which reduces recall error. However, cohort studies are expensive to conduct, are sensitive to attrition and take a long time to generate useful data.
Some cohort studies track groups of children from their birth, and record a wide range of information (exposures) about them. The value of a cohort study depends on the researchers' capacity to stay in touch with all members of the cohort. Some of these studies have continued for decades.
An example of an epidemiologic question that can be answered by the use of a cohort study is: does exposure to X (say, smoking) correlate with outcome Y (say, lung cancer)? Such a study would recruit a group of smokers (the exposed group) and a group of non-smokers (the unexposed group), follow them for a set period of time and note differences in the incidence of lung cancer between the groups at the end of this time. The groups are matched on many other variables, such as economic status and general health, so that the effect of the variable being assessed, the independent variable (in this case, smoking), on the dependent variable (in this case, lung cancer) can be isolated as far as possible.
In this example, a statistically significant increase in the incidence of lung cancer in the smoking group as compared to the non-smoking group is evidence in favor of the hypothesis. However, rare outcomes, such as lung cancer, are generally not studied with the use of a cohort study, but are rather studied with the use of a case-control study.
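The comparison of incidence between the two groups is commonly summarised as a relative risk: the cumulative incidence in the exposed group divided by that in the unexposed group. The following is a minimal sketch in Python, not a complete analysis. The function names and the counts are hypothetical and invented purely for illustration, and a real study would also report a confidence interval or significance test.

```python
def cumulative_incidence(cases: int, total: int) -> float:
    """Proportion of a group that developed the outcome during follow-up."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Ratio of cumulative incidence in the exposed vs. the unexposed group."""
    risk_exposed = cumulative_incidence(exposed_cases, exposed_total)
    risk_unexposed = cumulative_incidence(unexposed_cases, unexposed_total)
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the unexposed risk is zero")
    return risk_exposed / risk_unexposed


# Hypothetical counts (not taken from any real study):
# 30 cases among 1,000 smokers, 10 cases among 1,000 non-smokers.
rr = relative_risk(30, 1000, 10, 1000)
print(f"Relative risk: {rr:.1f}")
```

A relative risk above 1 indicates a higher incidence in the exposed group; on its own it describes an association and does not establish causation.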
Shorter-term studies are commonly used in medical research as a form of clinical trial, or as a means of testing a particular hypothesis of clinical importance. Such studies typically follow two groups of patients for a period of time and compare an endpoint or outcome measure between the two groups.
Randomized controlled trials (RCTs) rank higher in the hierarchy of evidence, because they limit the potential for bias by randomly assigning one patient pool to an intervention and another patient pool to non-intervention (or placebo). This minimizes the chance that the incidence of confounding variables will differ between the two groups.
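Random assignment can be illustrated with a short Python sketch. It assumes simple 1:1 randomization by shuffling; real trials often use more elaborate schemes such as block or stratified randomization. The participant records, the `high_income` flag standing in for a confounding variable, and the function names are all hypothetical.

```python
import random


def randomize(participants, seed=None):
    """Randomly split participants into an intervention arm and a control arm."""
    rng = random.Random(seed)      # a fixed seed makes the split reproducible
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def prevalence(group, key):
    """Fraction of a group for which the given flag is true."""
    return sum(1 for person in group if person[key]) / len(group)


# Hypothetical participants; 'high_income' stands in for a confounding variable.
participants = [{"id": i, "high_income": i % 3 == 0} for i in range(600)]
intervention, control = randomize(participants, seed=42)

print(f"high income, intervention arm: {prevalence(intervention, 'high_income'):.2f}")
print(f"high income, control arm:      {prevalence(control, 'high_income'):.2f}")
```

With groups of reasonable size, the two printed proportions will usually be close to each other, because the assignment does not depend on any characteristic of the participant.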
Nevertheless, it is sometimes not practical or ethical to perform RCTs to answer a clinical question. To take our example, if we already had reasonable evidence that smoking causes lung cancer then persuading a pool of non-smokers to take up smoking in order to test this hypothesis would generally be considered quite unethical.
An example of a cohort study that has been going on for more than 50 years is the Framingham Heart Study.
The largest cohort study in women is the Nurses' Health Study. Started in 1976, it is tracking over 120,000 nurses and has been analyzed for many different conditions and outcomes.