Researchers Found 1,700 Tiny Protein-Like Molecules Hiding in the Human Genome

The "dark proteome" is much more exciting than we thought.

by · ZME Science
3D structural model of the human heat shock protein 70. Credit: Wikimedia Commons

Scientists thought they had a working map of the human proteome. Some stretches of DNA made useful proteins, while a murkier, “dark” region seemed mostly inactive or irrelevant. It turns out that map was nowhere near complete.

In a new study published in Nature, an international team identified 1,785 tiny protein-like molecules made from overlooked regions of the human genome. The researchers call them peptideins. Most of them remain mysterious, but a few are already linked to cancer cell survival, immune recognition, and other disease-related processes.

Into the Dark Proteome

Current curated protein catalogues contain about 19,500 recognized human proteins. But the genome also contains thousands of short genetic stretches that look as if they could encode proteins. Researchers call them non-canonical open reading frames, or ncORFs. Many were too small, too odd, or too poorly understood to enter official protein catalogues.

The TransCODE Consortium—more than 60 researchers across over 30 institutions—decided to search for them at an enormous scale. The team analyzed 7,264 such sequences using public protein data from 95,520 experiments. That meant sorting through 3.7 billion molecular spectra, the fingerprints scientists use to identify fragments of proteins.

About a quarter of those overlooked sequences produced detectable molecules.

Most were tiny. About 65% were shorter than 50 amino acids, the building blocks of proteins. By comparison, fewer than 1% of recognized human proteins are that short.

“We know that the current overview of recognized proteins doesn’t capture the full picture,” Dr. Sebastiaan van Heesch of the Princess Máxima Center, a co-leader of the study, said in a statement.

Meet the Peptideins

Predicted binding between a non-canonical open reading frame (blue) and traditional protein (yellow). Credit: Leron Kok/Princess Máxima Center for Pediatric Oncology.

Calling all of these molecules full-fledged proteins would be premature. In biology, a protein usually implies a defined role in normal cells. The researchers have physically detected many of these molecules, but they still do not know what most of them do, or if they do anything at all.

So the researchers proposed a middle category: peptideins.

Like proteins, peptideins are made of amino acids and exist in cells, but their role is uncertain. With more evidence, some peptideins may eventually become recognized proteins. Others may turn out to be biological leftovers.

The new label gives these molecules a formal place in research databases, so scientists can track them, compare them and test whether they have real biological roles.

“What we’re now seeing is a vast set of protein-like molecules that were effectively invisible before,” Jonathan Mudge, a co-first author at EMBL-EBI, said in a statement. “In a sense, we’ve been looking at biology through an incomplete lens.”

Major databases, including GENCODE, UniProt and PeptideAtlas, will now begin including the peptideins. Adding peptideins to databases gives researchers a name, a sequence, and a searchable record to use in future studies.

A New Trail for Cancer Research

Once the researchers had evidence that cells were making peptideins, they started to ask the important question: do any cells need them to survive?

Cancer cells offered a way to test that at scale. Regular cells can be harder to use this way because many don’t grow indefinitely in the lab and they can be more fragile. Cancer cell lines, by contrast, are built for this kind of experiment: they grow readily, divide quickly, and can be screened across hundreds of different genetic backgrounds.

CRISPR lets researchers switch off or disrupt specific genetic sequences. Using CRISPR screening data and follow-up experiments, the team disrupted peptidein-producing sequences across many cancer cell lines, then watched what happened.

One result stood out. A peptidein made from OLMALINC appeared to be important for survival. When researchers switched it off, 85% of more than 485 cancer cell lines struggled to grow.

The findings also point toward immunotherapy. Cells routinely chop up proteins and display some of the fragments on their surface, bound to dedicated presenting proteins (HLA molecules in humans), giving immune cells a way to inspect what is happening inside. If tumor cells display peptidein fragments, scientists may be able to use them to spot cancer cells or design treatments that train immune cells to attack them.

What Comes Next

The work is still at an early stage. The OLMALINC peptidein is not close to becoming a treatment, and researchers still need to learn what it does in healthy cells. But the experiments show that researchers can test peptideins directly and that some may play a real role in disease.

The same research network had seen a similar pattern before. In an earlier study of medulloblastoma, an aggressive childhood brain cancer, scientists found that a tiny protein called ASNSD1-uORF helped MYC-driven cancer cells survive. Researchers at the Princess Máxima Center are now testing whether the same molecule plays a role in other childhood cancers, including neuroblastoma.

Perhaps most importantly, the study creates a database that other scientists can use.

“We’re just beginning to see what this ‘dark proteome’ has to offer,” Dr. John Prensner, a pediatric neuro-oncologist at the University of Michigan Medical School and co-leader of the study, said in the Princess Máxima Center release. “It’s like the trailer to a movie.”

The study was published in the journal Nature.