Physicians have long sought a way to accurately predict cancer patients’ survival outcomes by looking at biological details of the specific cancers they have. But despite concerted efforts, no such clinical crystal ball exists for the majority of cancers.
Now, researchers at the Stanford University School of Medicine have compiled a database that integrates gene expression patterns of 39 types of cancer from nearly 18,000 patients with data about how long those patients lived.
Combining the data from so many people and cancers allowed the researchers to overcome reproducibility issues inherent in smaller studies. As a result, the researchers were able to clearly see broad patterns that correlate with poor or good survival outcomes. This information could help them pinpoint potential therapeutic targets.
“We were able to identify key pathways that can dramatically stratify survival across diverse cancer types,” said Ash Alizadeh, MD, PhD, an assistant professor of medicine and a member of the Stanford Cancer Institute. “The patterns were very striking, especially because few such examples are currently available for the use of genes or immune cells for cancer prognosis.”
In particular, the researchers found that high expression of a gene called FOXM1, which is involved in cell growth, was associated with a poor prognosis across multiple cancers, while the expression of the KLRB1 gene, which modulates the body’s immune response to cancer, seemed to confer a protective effect.
A paper describing the research will be published online July 20 in Nature Medicine. Alizadeh shares senior authorship with Sylvia Plevritis, PhD, professor of radiology. Postdoctoral scholar Aaron Newman, PhD, and senior research scientist Andrew Gentles, PhD, share lead authorship of the paper.
The new database, which will be available to physicians and researchers, is called PRECOG, an abbreviation for “prediction of cancer outcomes from genomic profiles.”
In addition to identifying potentially useful gene expression patterns in cancers, the researchers also used Cibersort, a recently published technique developed by Newman in Alizadeh’s laboratory, to determine the composition of white blood cells that flock to a tumor. Cibersort assesses the relative levels of specific immune cells from a mishmash of cancer and normal cells and deduces the cell types from genes expressed in the bulk tumor — a process that Newman likens to analyzing a smoothie to identify its component fruits and berries.
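The smoothie analogy can be made concrete with a toy calculation. The sketch below is not Cibersort's own algorithm. It swaps in a simpler stand-in, non-negative least squares solved by projected gradient descent, to show the general idea of recovering cell-type proportions from a bulk measurement. The signature values, the three cell types, the function name `estimate_fractions` and the step size are all invented for illustration.

```python
# Toy deconvolution: bulk expression is modeled as a weighted mix of
# per-cell-type "signature" profiles. All numbers here are made up.

def estimate_fractions(signature, bulk, steps=2000, lr=0.1):
    """Estimate non-negative mixing fractions that sum to 1.

    signature[g][c] is the expression of gene g in pure cell type c.
    bulk[g] is the expression of gene g in the mixed sample.
    lr is a step size chosen to suit this small example, not a general default.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    weights = [1.0 / n_types] * n_types  # start from an even mix

    for _ in range(steps):
        # How far the current mix is from the observed bulk profile.
        residual = [
            sum(signature[g][c] * weights[c] for c in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient step on squared error, then clip negatives to zero.
        for c in range(n_types):
            grad = sum(signature[g][c] * residual[g] for g in range(n_genes))
            weights[c] = max(0.0, weights[c] - lr * grad)

    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical signature: 4 genes (rows) by 3 cell types (columns).
signature = [
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.1, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]
true_fractions = [0.5, 0.3, 0.2]

# Build a noise-free "bulk tumor" profile from the known mix.
bulk = [
    sum(row[c] * true_fractions[c] for c in range(3))
    for row in signature
]

estimated = estimate_fractions(signature, bulk)
print([round(f, 3) for f in estimated])
```

With noise-free input the estimate returns to the proportions used to build the mixture. Real tumor data are noisy and involve many more genes and cell types, which is why a purpose-built method is needed in practice.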
“We were able to infer which immune cells are present or absent in individual solid tumors, to estimate their prevalence and to correlate that information with patient survival,” said Newman. “We found you can even broadly distinguish cancer types just based on what kind of immune cells have infiltrated the tumor.”
Putting it all together
Researchers have tried for years to identify specific patterns of gene expression in cancerous tumors that differ from those in normal tissue. Doing so may reveal what has gone wrong in the cancer cells and suggest how best to block the cells’ destructive growth. But the extreme variability among individual patients and tumors has made the process difficult, even when focused on particular cancer types.
“There are many more genes in a cell than there are patients with any one type of cancer, and this makes discovering the important genes for cancer outcomes a tough problem,” said Gentles. “Because it’s easy to find spurious associations that don’t hold up in follow-up studies, we combined information from a vast array of cancer types to better see meaningful correlations.”
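A small calculation shows why pooling studies helps separate real signals from spurious ones. The article does not say how the team combined results, so the sketch below assumes one common approach, Stouffer's weighted z-score method, in which each study's z-score for a gene is weighted by the square root of its sample size. The z-scores and sample sizes are invented, and `pooled_z` is a hypothetical helper name.

```python
import math


def pooled_z(z_scores, sample_sizes):
    """Combine per-study z-scores with Stouffer's weighted method.

    Each study is weighted by the square root of its sample size.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Invented example: three studies of 100 patients each.
sizes = [100, 100, 100]

# A gene with a modest but consistent association in every study.
consistent = pooled_z([1.5, 1.2, 1.8], sizes)

# A gene that looks striking in one study but not in the others.
one_off = pooled_z([2.5, -0.3, 0.1], sizes)

print(f"consistent gene: pooled z = {consistent:.2f}")
print(f"one-off gene:    pooled z = {one_off:.2f}")
```

In this toy case the consistent gene falls below the conventional 1.96 threshold in each individual study but clears it once pooled, while the one-off gene clears the threshold in a single study and falls below it after pooling.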
Gentles and Alizadeh first collected publicly available data on gene expression patterns of many types of cancers. They then painstakingly matched the gene expression profiles with clinical information about the patients, including their age, disease status and how long they survived after diagnosis. Together with Newman, they combined the studies into a final database.
“We wanted to be able to connect gene expression data with patient outcome for thousands of people at once,” said Alizadeh. “Then we could ask what we could learn more broadly.”
Seeing the forest
By looking at the forest, rather than the trees, the researchers made some surprising findings. They observed that prognostic genes were often shared among distinct cancer types, suggesting that similar biological programs impact survival across cancers. They were able to identify the top 10 genes that seemed to confer adverse outcomes, and the top 10 associated with more positive outcomes. Many of these genes are involved in aspects of cell division or are associated with distinct types of white blood cells that flood a tumor.
They were also able to identify combinations of white blood cells that appear favorable. In particular, the presence of elevated numbers of plasma cells, which secrete large amounts of antibodies, and certain types of T cells correlated with better patient survival rates across many different types of solid cancers, including lung and breast cancers. Conversely, a high proportion of neutrophils, a type of granulocyte, was associated with adverse outcomes.
The researchers hope that PRECOG and Cibersort will increase our understanding of cancer biology and aid in the development of new therapies for cancer patients. In addition, the researchers are applying these tools to better predict which patients will respond to new and emerging anti-cancer therapies. This is especially important given recent advances in the development of drugs that engage immune responses in cancer patients but work well for only a subset of them, said Alizadeh.
The study was done by Stanford University Medical Center.