Team led by Indian compiles online protein encyclopaedia

February 5, 2008

By Killugudi Jayaraman, IANS

Bangalore: An international team led by an Indian biologist has pioneered an online encyclopaedia of human proteins that will help accelerate biomedical research and drug discovery.

The February 2008 issue of the prestigious journal Nature Biotechnology describes how the scientists’ creation, dubbed “Human Proteinpedia”, would help biologists around the globe by serving as a community portal for sharing and integration of human protein data.

“This is an encyclopaedia by the scientists, for the scientists,” said Akhilesh Pandey, a scientist at Johns Hopkins University in the US, who led this effort in collaboration with the non-profit Institute of Bioinformatics in Bangalore, which he founded six years ago.

Like Wikipedia – the online encyclopaedia that anyone can edit – the Human Proteinpedia will be an online encyclopaedia that any biologist with proteomic data can edit.

However, unlike Wikipedia, the contributors are expected to provide experimental evidence for the data.

“All the public data contributed to Human Proteinpedia can be queried, viewed and downloaded,” Pandey told IANS.

The current version of Human Proteinpedia contains data on 15,230 human proteins provided by more than 71 laboratories from various parts of the globe – the largest and most diverse collection of experimental data pertaining to human proteins.

Pathologists, molecular biologists, biochemists, pharmacologists and geneticists from both academic institutions and companies contributed the data.

Pandey said: “This database should serve as a ready reckoner for researchers in their quest for drug discovery, identification of disease markers and promote biomedical research in general.

“With this resource, biologists can quickly understand what is already known about proteins, allowing them to generate novel hypotheses that can be tested in their laboratories.”

The resource compiles accurate and complete data about each human protein from multiple experiments through manual curation, which is why it took almost five years to develop.

Pandey said: “This effort would not have been possible without the active involvement of the proteomics community, including several leaders of the field.

“Our ultimate goal is to capture all protein data being generated and to present it in the appropriate biological context in a graphic form for biomedical researchers and in computer readable form for bioinformaticians.”

“Given the emerging interest of companies like Google, IBM, SUN and Microsoft in the life sciences, we hope to establish research partnerships to establish the infrastructure that will be required to accomplish our goals,” he added.