Blog

We're not going to commit to blogging regularly, as we already know that would be difficult to achieve. However, we will blog when there is an issue that we feel passionate about, or when we just want to have a rant about something in the news relevant to proteomics.


Second Blog - about the Twitter feed material


Blog 2

A response to a tweet about stooping to scientific infomercials ;-)

January 28, 2017 by S J Osborne

CEO, CSO and Founder of Pastel BioScience

Apologies for not responding sooner - been busy. Firstly, I would like to make it clear that neither I nor Pastel have any connection whatsoever to any #proteomics companies other than Pastel BioScience (@PastelBio). I do occasionally link to other companies' information (@Thermosci @BrukerMassSpec @SCIEXOmics @WatersCorp @Shimadzussi @Illumina @BioRad @RocheAppliedSci and many others) - hopefully impartially and with no particular bias - and I certainly do not imply any recommendation. As I've responded to similar comments in the past, I don't profess to read all of the articles or links in depth and therefore don't claim to make any judgement as to their scientific validity. My main aim with the Twitter feed is simply to point people towards information that may be useful in the general field of #proteomics. Now, I haven't looked at the webinar in the link in question, but my hope is that it contains some information, whether of a general nature or demonstrating technical specifications, that may be of use to at least someone out there. Obviously, I hope that anyone who does access the information then takes a critical look at the accuracy of any statements made.



My concern about self-censoring based on association with a 'company' is that many of them produce information and blogs that actually make for useful, or at least interesting, reading - so where does one draw the line? Similarly, some other Twitter feeds and blogs that I link to regularly (e.g. @Proteomicsnews, @Proteomics_now) are either directly or indirectly linked to other companies and yet provide excellent commentary and information - should I stop quoting these? One could then argue that I should only tweet scientific articles, but even then the big-name groups have associations with various companies - and have even formed new start-ups of their own - so once again, where does one draw the line, and how much research and effort is required to uncover those ties? My preference, in part it has to be said due to lack of time, is to just provide links to 'potentially' useful information and then let others make a judgement call on whether that information is useful to them or not, and on its accuracy.

I hope this clarifies my position, but I would welcome comments from others in the #proteomics community and may, but only may, change my position if there is a consensus ;-)



First Blog - hopefully not the last


Blog 1

'Fit for Purpose' Biomarkers - why do we have so few of them?

November 22, 2013 by S J Osborne

CEO, CSO and Founder of Pastel BioScience

While the basic rules for assessing good biomarkers have been known for many years (1, 2), more recently there has been a plethora of publications in both scientific journals (see as examples 3, 4, 5) and web-based articles (6, 7, 8) covering the thorny issue of how biomarkers found in discovery programmes have then failed to fulfil their promise in validation studies and, as a consequence, the dearth of new biomarkers reaching the clinic. Even the FDA has issued a Guidance document (9) setting out the process for "qualifying" biomarkers (drug development tools, DDTs) to be used in the drug development process, in the hope that more biomarkers may be employed.



A number of said publications quite correctly not only identify the problem in its various nuances but also put forward ways by which the quality of biomarkers, and their implementation in drug discovery and the clinic, may be improved. These range from technical aspects, such as specificity, prevalence, sample size in terms of 'power' analysis, and blinding, through contributing factors such as sample collection, sample storage and inter/intra-assay variability of any instrument measurements, to more basic issues including funding constraints and the seeming lack of appreciation by healthcare systems of the real value of biomarkers and diagnostics/prognostics in reducing overall costs.



The author does not doubt that many of the aforementioned, and various combinations of them, have contributed to the current state of affairs, but believes there is an additional and more fundamental issue which appears to have been overlooked, or perhaps more likely conveniently ignored, by researchers in an overzealous rush to apply the -omics technologies to the exciting and entirely worthwhile endeavour of biomarker discovery and personalised medicine.



What might this be? Well, it appears that there is an inherent assumption by many of the researchers performing the discovery phase that, with a well-powered sample set, good analytical techniques and sound experimental design, a 'fit for purpose' (ffp) biomarker must by default exist in their own particular -omics field, be it genomics, transcriptomics, proteomics, metabolomics etc. What seems to be lacking is the appreciation, or acceptance, that while they may be able to find "good" biomarkers, a ffp-biomarker might only be present in one of the other -omics fields or, more likely still, be a combination of markers drawn from across the -omics fields. In fact, it is almost certain that for most complex multifactorial diseases the absolute gold-standard biomarker could well be a mix of physiological measurements and lifestyle factors in combination with various -omics markers. That's not to say that ffp-biomarkers may not be found in individual -omics fields, but rather that this should not be assumed a priori.



However, this is not the author's main criticism of the discovery phase, on which all subsequent biomarker development hinges. What the author notes, and what seems to have been curiously overlooked by others, is the apparently low probability of finding a biomarker, even if it is present in a single -omics field of study, when the number of markers that can be surveyed is significantly smaller than the total number present. Now, some will argue that a proportion of the potential markers in an -omics set may be related to structural components rather than those intimately involved in the dynamics of a pathway, cell or disease state. This may be true, but even taking these factors into account there appears to be a massive disconnect between what might realistically be achievable from a given sub-set of markers and what is hoped for by the researchers.



Using proteomics as the basis for the arguments that I'm trying to make, and simplifying the example to a very great extent, let's consider the human proteome of ~20,300 possible proteins (coded for by the ~20,300 corresponding genes, and conveniently forgetting the PTMs, isoforms etc.). It has been estimated (10, 11, 12) that roughly one third of these 20,300 proteins have not been formally detected by any of the current technologies (MS, 2D gel electrophoresis, microarrays and others). What are the consequences? Well, if a single marker that is the ffp-biomarker of choice for a given disease were randomly distributed in the proteome, there would be a probability of 0.66 (2/3) that we could discover it, assuming that the experimental design is sound. A 66% chance doesn't sound too bad at first; however, as we well know, for most complex multifactorial diseases it is unlikely that a single marker will have the necessary properties to act as a ffp-biomarker. So what happens to the probability of being able to detect a ffp-biomarker when the number of markers constituting it increases? Unfortunately, the probability of detection falls dramatically (see figure below), such that if 3 markers were to constitute the ffp-biomarker the probability of detection would be just 0.29, or a 29% chance, even if the experimental design were perfect.
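
For those who like to see the arithmetic, here is a minimal sketch (in Python, purely illustrative) of the simple model being used: if a fraction p of the proteome is detectable and each of the k markers making up the ffp-biomarker sits at a random, independent position in the proteome, then the chance of detecting all of them is simply p raised to the power k.

```python
# Minimal sketch of the simple model above (illustrative only): a fraction
# `detectable` of the proteome can be measured, and each of the `k` markers
# making up the ffp-biomarker is assumed to sit at a random, independent
# position in the proteome, so P(detect all k markers) = detectable ** k.

def p_detect_all(detectable: float, k: int) -> float:
    """Probability that all k randomly located markers fall in the detectable fraction."""
    return detectable ** k

detectable = 0.66  # roughly 2/3 of the ~20,300-protein proteome
print(p_detect_all(detectable, 1))  # ~0.66 for a single-marker ffp-biomarker
print(p_detect_all(detectable, 3))  # ~0.29 for a 3-marker ffp-biomarker
```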





But let's take it a step further. While one third of the proteome is not currently detectable, and therefore two thirds (~14,000 proteins) is, the majority of discovery programmes will use MS runs that cover the 'high content' phase, in which 3,000 to 5,000 proteins can be detected in the first, roughly, 5 hours. Beyond this, the 'low content' phase may deliver only an additional ~20-50 proteins/hr. So let us assume that 5,000 proteins are detectable; this is roughly one quarter of the proteome. How does this impact the probability of detecting a ffp-biomarker? Well, the short answer is: disastrously. Obviously, with a single marker as the ffp-biomarker the probability is around 0.25, but as the number of markers required increases there is a precipitous fall in probability (figure above). At just 3 markers the probability is a frightening 0.015, or a 1.5% chance of detecting the ffp-biomarker, if it were randomly distributed.



Now, many will argue that the proteins constituting a ffp-biomarker are unlikely to be totally randomly distributed amongst the proteome and are more likely to reside in those groups for which we have good detection techniques, e.g. signalling proteins, kinases, etc. This may be true, although it is certainly not proven, and it is currently not demonstrated by the existing biomarker discovery rates. However, even if, say, 2 of the markers are among those that we can definitely detect and a third is required from among those randomly distributed, the effect is still massive: with only a quarter of the proteome surveyed, the chance of recovering that third marker, and hence the complete ffp-biomarker, is still only around 25%.



In the last paragraph I tried to give a glimmer of hope: if at least some of the markers constituting a ffp-biomarker are definitely detectable, then we have more of a chance. True, but the flip side is that (i) the probabilities are still very low, and (ii) while many are now using MS as the tool of choice for discovery programmes, many other researchers are using microarrays limited to detecting hundreds to around a thousand markers at most.



If you do the maths for just 100 or 1,000 markers - well, you really don't want to!!!
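
For the morbidly curious, here is that maths anyway - a quick illustrative calculation, under the same assumption as above that the markers are randomly and independently distributed, of the chance of detecting a 3-marker ffp-biomarker when only 5,000, 1,000 or 100 of the ~20,300 proteins are surveyed.

```python
# Probability of detecting a 3-marker ffp-biomarker when only a small panel
# of the ~20,300-protein proteome is surveyed, assuming the markers are
# randomly and independently distributed (illustrative only).

PROTEOME_SIZE = 20_300
K_MARKERS = 3

for panel_size in (5_000, 1_000, 100):
    coverage = panel_size / PROTEOME_SIZE
    print(f"{panel_size:>5} proteins surveyed -> P(detect) ~ {coverage ** K_MARKERS:.1e}")

# 5,000 -> ~1.5e-02 (the 1.5% above); 1,000 -> ~1.2e-04; 100 -> ~1.2e-07
```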



And just to throw another damper on the fire: in all of the aforementioned I have referred solely to detection of the marker, but in reality what we are most likely talking about is not only detection but also quantification of the various markers making up a ffp-biomarker. The quantification, if not reproducible, will further significantly limit the potential detection of a suitable ffp-biomarker and more than likely kill it.



I have used proteomics as an example but, apart from genomics, which has a more binary aspect to it and for which just about all genes are known and detectable, I believe most of the other -omics suffer in a similar manner during the discovery phase, albeit to a greater or lesser degree. So is it all bad news? Well, no. Having recognised the problem, there are a number of options. The first is to apply the existing subsets of detectable markers, in any of the -omics fields, only to those diseases where we have categorical proof that the ffp-biomarker will exist within the subset, i.e. the biochemical pathways and secondary interaction paths have been conclusively mapped to the nth degree. However, few, if any, such pathways can truly be considered 'completely' mapped. The alternative and preferred way forward is to significantly improve the -omics detection technologies such that they are able to rapidly detect and reproducibly quantify a much larger proportion of their respective complete '-omes'. At a 90% detectable '-ome' a 3-marker ffp-biomarker has a 73% chance of being detected, while at 95% it becomes a very respectable 86% chance of detection.
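
The same back-of-the-envelope calculation at improved coverage reproduces the figures just quoted (again, assuming randomly distributed markers):

```python
# P(detect) for a 3-marker ffp-biomarker at much improved '-ome' coverage,
# under the same random-distribution assumption as above.

for coverage in (0.90, 0.95):
    print(f"{coverage:.0%} detectable -> P(detect 3-marker ffp-biomarker) ~ {coverage ** 3:.0%}")
# 90% -> 73%, 95% -> 86%
```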



Leaving aside things like poor experimental design, sample collection, storage, sample size and power analysis considerations, and the numerous other failings of many biomarker studies, the number of articles that have appeared in the last few years both attempting and supposedly showing the discovery of 'good' biomarkers from 'small', if not 'idiotically small', panels of potential markers is, to say the least, disheartening. The fact that many, if not most, have then failed to get through validation is therefore not surprising. So, until the -omics technologies have improved considerably, should we not spend significantly more time, resources and money on their further development in terms of coverage, reproducibility and ease of use, and less on their employment in biomarker discovery studies that are almost certain to fail (i.e. P(success) from the outset is far lower than it should be)?



1.      "Bias as a Threat to the Validity of Cancer Molecular-Marker Research," by David F. Ransohoff, Nature Reviews Cancer, February 2005.

2.      "So, You Want to Look for Biomarkers," by Joshua LaBaer, Journal of Proteome Research, June 2005.

3.      "Comparison of Effect Sizes Associated with Biomarkers Reported in Highly Cited Individual Articles and in Subsequent Meta-Analyses," by John P.A. Ioannidis and Orestis A. Panagiotou, JAMA, June 2011.

4.      "Implementation of proteomic biomarkers: making it work" by Harald Mischak et al., European Journal of Clinical Investigation, September 2012

5.      "Breaking a Vicious Cycle" by Daniel F. Hayes et al., Sci Transl Med, July 2013

6.      http://archive.protomag.com/assets/problem-with-biomarkers

7.      http://www.pharmaphorum.com/articles/how-can-we-reliably-discover-the-biomarkers-we-need-to-achieve-stratified-medicine

8.      http://wingibbons.wordpress.com/2013/10/21/detailed-perspectives-on-biomarker-discovery-and-development/

9.      Guidance for Industry: Qualification Process for Drug Development Tools, FDA, October 2010

10.  Secretary General of HUPO, HUPO 2010 congress | http://lifescientist.com.au/content/molecular-biology/news/feature-quest-for-the-human-proteome-64921521

11.  "The State of the Human Proteome in 2012 as Viewed through PeptideAtlas" by T. Farrah et al., J. Proteome Res., DOI: 10.1021/pr301012j

12.  "Plasma Proteomics, The Human Proteome Project, and Cancer-Associated Alternative Splice Variant Proteins" by Gilbert S. Omenn, Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, November 2013
