http://www.hgsi.com/invest/annual97/leader.htm

The link above takes you to the point in the HGSI financial report where they discuss how they have isolated 95% of the genes. Mind you, this is from the 1997 report; a lot can have changed in 2 years.

**********

Here's what HGSI's 1997 annual report says:

Gene Discovery

HGS' rapid gene discovery and sequencing activities continue to add valuable information regarding gene function in normal and diseased tissues. As part of our Human Anatomy Project, we have collected more than two million gene fragments from a large number of tissues that are expressed in a wide range of tissues. We believe that we have isolated at least one copy of more than 95 percent of all human genes. In addition, greater than 75 percent of these genes exist as full-length copies in our collection, each capable of producing a functional protein for drug discovery. Last year, the Company initiated a significant program to determine the full-length sequence of many thousands of these genes.

What this means is that they have cDNA libraries that they estimate contain 95% of the genes. This means they are colonies on a petri plate. That is very different from sequencing them, mapping them, and inferring function from the sequence, or connecting them to a disease via their chromosomal location.

It also means that there are huge numbers of duplicates in their clone collection. cDNAs are represented in the libraries in rough proportion to their abundance in the mRNA population of the cells used to make the library. That's why they isolated 2 million clones to get 95% of the genes. So if they start picking random clones to sequence, most of them will be ribosomal proteins, histones, and actins. Genes that are not expressed at reasonable levels in any of the cells they happened to use as a source for the library won't be in their collection.
They know this; they're not trying to hide anything. It's just the way cDNA libraries are.

Celera could use the same standard to say that they have collected 100% of the human genome, and plan to sequence many millions of those clones and assemble it all into a complete map of the human genome. In fact, that's what they did say when they started. One big difference is that the abundance of clones in a genomic library does not depend on expression levels: they are all roughly the same. So when Celera sequences the genome 10 times over, they probably have all of it, but when someone else keeps sequencing cDNAs, they will keep getting high proportions of stuff they've already seen, and the missing genes will take longer and longer to get.

Think of it this way. Celera is taking 10 encyclopedias that have been randomly ripped into paragraph-size chunks (but not necessarily at the boundaries between paragraphs), and using computers to find the overlaps to reassemble the whole thing. HGSI is taking encyclopedias that have all been cut neatly between articles, with no overlaps. Some articles are present in the pile of clippings only once, some are present 10,000 times, and some are not there at all. They will never be able to assemble the original encyclopedia from this stack, and it will be an increasingly daunting task to keep going through it to find articles they haven't seen yet.

The cDNA approach and the targeted clone sequencing of the HGP were the only ways to do it until Celera assembled the sequencing power and computing power to do it the genomic shotgun way. The cDNA approach is still very valuable, but it is not the same. Claims of having isolated 95% of the cDNAs should not be confused with knowing anything about the genes sitting in tubes in the refrigerator.
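
For anyone who wants to see the diminishing-returns argument in numbers, here is a rough Python sketch. The gene count, the 1/rank abundance curve, and the function name fraction_seen are my own assumptions for illustration, not figures from HGSI or Celera. It picks random clones from a library where abundance is heavily skewed (cDNA-like) and from one where every item is equally likely (genomic-like), and reports what fraction of the distinct items has turned up so far.

```python
import random


def fraction_seen(weights, n_picks, seed=0):
    """Fraction of distinct items seen after n_picks weighted random draws."""
    rng = random.Random(seed)
    picks = rng.choices(range(len(weights)), weights=weights, k=n_picks)
    return len(set(picks)) / len(weights)


# Hypothetical library size, chosen only to keep the run fast.
N_ITEMS = 10_000

# Assumed cDNA-like abundance: the item at rank r is 1/r as common as the top one.
skewed = [1.0 / rank for rank in range(1, N_ITEMS + 1)]

# Assumed genomic-like abundance: every item is equally likely to be picked.
uniform = [1.0] * N_ITEMS

for n_picks in (10_000, 30_000, 100_000):
    print(
        f"{n_picks:>7} picks: "
        f"skewed {fraction_seen(skewed, n_picks):.3f}, "
        f"uniform {fraction_seen(uniform, n_picks):.3f}"
    )
```

With the same number of picks, the uniform library should approach complete coverage much sooner, while the skewed one keeps returning the common items and fills in the rare ones slowly. Real libraries are messier than either curve, so treat this as an illustration of the sampling argument only.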