Nearly 20 years ago, scientific leaders in the United Kingdom began an ambitious endeavor: to create UK Biobank, a uniquely powerful biomedical database for public health research. Between 2006 and 2010, UK Biobank recruited half a million volunteers from across the UK, collecting biological information and health-related data (including genetic, biomarker, lifestyle, imaging, and environmental information) to enable scientific discoveries into the prevention, diagnosis, and treatment of disease. The result? An impactful long-term population-cohort study, available to researchers everywhere to help improve our understanding of illnesses like heart disease, cancer, or depression so that more targeted therapies can be developed to treat, cure, and even prevent them.
In the fall of 2023, there was a major milestone in these efforts, as the scientific journal Nature published a paper sharing preliminary findings from the UK Biobank Pharma Proteomics Project (UKB-PPP), a collaboration that has generated the world’s largest proteomics dataset.
What is proteomics, and how can it help us drive progress toward precision medicine? We sat down with Chris Whelan, Ph.D. from our R&D Data Science & Digital Health team at Johnson & Johnson Innovative Medicine – who served as lead of the collaboration and corresponding author of the paper – to learn more about what this means for science and for patients.
Q: First things first, what exactly is proteomics?
Chris Whelan: When people think of the drivers of disease, they usually think of genetics – if your mother or father had a certain condition, you might have a higher chance of developing it yourself because of your genetic makeup. Advancements in our understanding of the human genome in recent years – including the completion of the first complete genomic sequence by the Human Genome Project last year – have improved our understanding of disease tremendously.
But genes are only one part of a very complex puzzle. Take two people who are both genetically predisposed to develop a certain disease – one develops the disease, while the other does not. Why does this happen, biologically speaking?
[Image: A 3-D visualization of the United Kingdom composed of antigens, surrounded by antibodies that produce a protein concentration signal after binding the antigens.]
This is where proteins come in. Proteins are the end products of genes – they’re the building blocks that make up all living organisms. Simply put, proteomics is the large-scale study of those proteins, including their structure, their composition, and how they function and interact with each other in the body. With proteomics, we can measure thousands of proteins circulating in a person’s bloodstream, building out the story of how the human genome and human proteome influence disease onset and progression.
Q: Tell us more about your recent study published in Nature.
Chris: The Nature article outlines the formation and preliminary research of the UK Biobank Pharma Proteomics Project (UKB-PPP), a collaboration between 13 biopharmaceutical companies, including Janssen. UKB-PPP was formed to measure almost 3,000 blood plasma proteins, gathered from 54,219 UK Biobank participants – making it the world’s largest blood biomarker study.
As a first step – and the focus of this recent publication – all 13 companies collaboratively studied how concentrations of circulating proteins are genetically regulated. We conducted genome-wide association studies (GWAS), hypothesis-free screens of the genome from chromosomes 1 through 23, to build a library of over 14,000 gene variants influencing blood protein levels – 80 percent of which were previously unknown.
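To make the idea of a hypothesis-free genetic screen concrete, here is a minimal sketch, assuming the simplest possible setup: each variant is tested in turn by regressing a protein level on allele dosage (0, 1, or 2 copies), and variants are then ranked by the strength of the association. This is not the UKB-PPP analysis pipeline. The function names, variant names, and data below are all hypothetical, and a real GWAS would also adjust for covariates and apply a stringent genome-wide significance threshold.

```python
from math import copysign, inf, sqrt


def variant_association(dosages, protein_levels):
    """Least-squares regression of protein level on allele dosage.

    Returns (slope, t_statistic) for a single variant.
    """
    n = len(dosages)
    if n != len(protein_levels) or n < 3:
        raise ValueError("need matched samples and at least 3 participants")
    mean_x = sum(dosages) / n
    mean_y = sum(protein_levels) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant does not vary in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, protein_levels))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(dosages, protein_levels))
    std_err = sqrt(rss / (n - 2) / sxx)
    if std_err == 0:
        # Perfect fit: no residual noise to scale the slope by.
        t_stat = 0.0 if slope == 0 else copysign(inf, slope)
    else:
        t_stat = slope / std_err
    return slope, t_stat


def scan_variants(genotypes, protein_levels):
    """Test every variant and rank by absolute t-statistic, strongest first."""
    results = {
        name: variant_association(dosages, protein_levels)
        for name, dosages in genotypes.items()
    }
    return sorted(results.items(), key=lambda item: abs(item[1][1]), reverse=True)


# Made-up toy data: six participants, two hypothetical variants.
protein_levels = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
genotypes = {
    "variant_A": [0, 0, 1, 1, 2, 2],
    "variant_B": [1, 0, 2, 0, 1, 2],
}
ranked = scan_variants(genotypes, protein_levels)
for name, (slope, t_stat) in ranked:
    print(f"{name}: slope={slope:.2f}, t={t_stat:.2f}")
```

In this toy example, the variant whose dosage tracks the protein level most closely rises to the top of the ranking, which is the basic logic behind building a library of variants that influence blood protein levels.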
This library – which will be available to scientists worldwide through the UK Biobank in the coming weeks – can be used to paint a more nuanced picture of complex biological pathways, such as those of the immune system. It can also help us study causal relationships between proteins and diseases, helping us pinpoint which proteins could serve as effective new drug targets. Artificial intelligence (AI) and machine learning (ML) will play a key role here, enabling us to pick up on patterns that would be difficult to identify with the human eye.
Alongside the genetic analysis, we also directly compared protein levels in disease cases versus controls. This captured several well-established blood biomarkers used in hospital settings, like NT-proBNP for heart disease. We also found less-established disease associations that may, one day, make for powerful new biomarkers. For example, a protein called Growth Differentiation Factor 15 (GDF-15) was associated with 18 of the 20 most prevalent diseases in the UK Biobank, indicating this may be a general marker of cellular stress and, consequently, of a person’s health state. We also observed strong upregulation of inflammatory proteins in people experiencing depressive episodes.
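As a rough illustration of what comparing protein levels in cases versus controls can look like, the sketch below computes Welch's t-statistic for one protein across two groups. It is an assumption-laden toy, not the method used in the paper: the function name and the values are invented, and a real analysis would adjust for factors such as age and sex and correct for testing thousands of proteins.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t-statistic for the difference in mean protein level.

    Positive values mean the protein is higher in cases than in controls.
    """
    if len(cases) < 2 or len(controls) < 2:
        raise ValueError("each group needs at least 2 participants")
    diff = mean(cases) - mean(controls)
    # Standard error of the difference, allowing unequal group variances.
    std_err = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if std_err == 0:
        raise ValueError("no variation within either group")
    return diff / std_err


# Made-up protein measurements (arbitrary units) for a hypothetical marker.
cases = [2.0, 2.2, 2.4]
controls = [1.0, 1.2, 1.4]
t_stat = welch_t(cases, controls)
print(f"t = {t_stat:.2f}")
```

A large positive statistic here would flag the protein as elevated in people with the disease, which is the kind of signal that points to candidate biomarkers.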
Q: Why does this matter for science and for patients?
Chris: Gaining a better understanding of what’s driving complex illnesses – not only at the genetic level, but at the protein level – could have a huge impact on our efforts to fight diseases.
Most modern medicines are developed with a single target protein in mind. In most cases, however, multiple proteins interact to drive the disease. If we could develop medicines targeting collections of proteins simultaneously, preventing their disease-causing interactions, we could potentially enhance our ability to treat, cure, or even prevent diseases before they develop.
Proteomics may also help us understand fundamental biological differences between patients with the same disease. Take depression, for example – some patients experience insomnia, while others experience metabolic issues; some respond well to their first therapy, while others experience lifelong, treatment-resistant depression. Proteomics could help us uncover the biological pathways driving one ‘subtype’ of a complex disease versus another, which could, in turn, help us develop more targeted therapies and deliver the right medicines to the right patients at the right time.
Q: Beyond target identification, how else could this impact pharmaceutical R&D?
Chris: I’m most passionate about the potential to use proteomics as a prediction tool. For example, it could be hugely impactful if we could predict whether, and when, a person might develop a disease, and then intervene early. The same is true if we could predict things like safety, efficacy and toxicity before real-life clinical trials even start. That could help ensure we advance only the most promising candidates to clinical development, potentially shortening development times and bringing innovation to patients in need faster.
Q: We know this work was done in collaboration with many other partners as part of the UK Biobank Pharma Proteomics Project (UKB-PPP). What was unique about this collaboration?
Chris: In addition to the sheer scope of this effort – one of the biggest industry consortia of its kind – UKB-PPP is also a collaboration in the truest sense of the word. The fact that scientists from companies that otherwise might be deemed “competitors” all worked together on this paper is a testament to the potential of proteomics as the next frontier in biotechnology. Over the past few years, I have been fortunate to witness how other scientists are using the dataset in impressive, innovative new ways. It’s rewarding to see how this dataset and the collaboration behind it are enriching the already-vast UK Biobank dataset and already paving the way for exciting new discoveries.
This has also sparked excitement and driven collaboration within Johnson & Johnson – between the R&D Data Science & Digital Health team that I’m a part of, our Discovery teams and colleagues from across our Therapeutic Areas. Science – and R&D in particular – is truly a team sport. I feel very fortunate to work in this space at a time when advancements in biology, chemistry and technology are enabling such game-changing progress. I’m incredibly excited to see what the future holds.
Note: The paper was also co-authored by Liping Hou, Ph.D., Principal Scientist, Population Analytics & Insights, Artificial Intelligence, Machine Learning & Digital Health, and Dawn M. Waterworth, Ph.D., Senior Director, Translational Sciences, Immunology, at J&J IM R&D.
October 4, 2023