informatics institute logo Informatics Institute UMDNJ logo
Bioinformatics Workshop: Information Integration
 

Literature and Data

Growth

Scope of the Information Explosion

The Good News; The Bad News

A major reason that "regular folks" (scientists and working scholars) have been drawn to Informatics is the volume of literature and data they are faced with. Where once a scholar might have a few books or papers on the desk, now the expectation is for complete access to the literature and data of the world.

So how bad (or good) is it?

The Literature

[Figure: Literature Growth]

Source: American College of Clinical Pharmacology http://www.accp.com/pod/p3b5pre06.pdf

How Many Journals Are There?

  • The scientific literature has grown exponentially since 1750, when there were only 3 scientific journals.
  • By 2002 there were over 120,000 scientific journals.
  • Source: http://www.mco.edu/lib//education/filter.pdf (September 2002)
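
To get a feel for what those two data points imply, the following is a minimal sketch that estimates a doubling time for the number of journals. It assumes growth was smoothly exponential at a constant rate between 1750 and 2002, which is a simplification, and the function name `doubling_time` is made up for this illustration.

```python
import math

def doubling_time(start_count, end_count, years):
    """Years needed to double, ASSUMING constant exponential growth."""
    # Continuous growth rate per year implied by the two endpoints.
    rate = math.log(end_count / start_count) / years
    return math.log(2) / rate

# Figures quoted in the list above.
journals_1750 = 3
journals_2002 = 120_000
span_years = 2002 - 1750

t_double = doubling_time(journals_1750, journals_2002, span_years)
print(f"Implied doubling time: {t_double:.1f} years")
```

Under that assumption the journal count doubles roughly every 16 to 17 years. Actual growth was almost certainly uneven, so treat the result as an order-of-magnitude guide only.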

How active is the PubMed Universe?

As of this writing:

  • PubMed had reached 16 million citations. PubMed includes several sources:
    • Medline
      • contains over 14 million citations dating back to the mid-1960s.
    • OldMedline
      • contains nearly 2 million older citations (1950-1965)
    • Other (~2% of total)
      • In Process and other non-indexed citations
  • Nearly 5,000 biomedical journals are indexed for Medline.
  • Every month:
    • Nearly 50,000 journal citations are added to Medline.
    • About 76 million searches are performed.

[Figure: MedLine Stats]

Source: NLM Bibliographic Services Division
http://www.nlm.nih.gov/bsd/index_stats.html


How Many Entries Are There in the Sequence Databases?

Like many informatics databases, GenBank has seen enormous growth. At its birth in 1982, it catalogued 606 sequences; by 1992, it catalogued 78,608. Through October 2004, the number of sequences fell just short of doubling each year (http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html). As of August 2007, there were 79.5 billion bases (up from 65.4 billion a year earlier) from 76 million reported sequences, up from 61 million in August 2006 (ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt).

[Figure: GenBank growth]
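
The GenBank figures quoted above can be turned into growth rates with a little arithmetic. The sketch below uses only the numbers given in the text; the helper names `percent_growth` and `annual_growth_factor` are hypothetical, and the second helper assumes steady compound growth across the whole decade.

```python
def percent_growth(previous, current):
    """Percentage increase from one measurement to the next."""
    return (current - previous) / previous * 100.0

def annual_growth_factor(start, end, years):
    """Average yearly multiplier, ASSUMING steady compound growth."""
    return (end / start) ** (1.0 / years)

# August 2006 to August 2007, as quoted in the text.
bases_growth = percent_growth(65.4e9, 79.5e9)  # bases
seq_growth = percent_growth(61e6, 76e6)        # sequences

# 1982 to 1992: 606 sequences to 78,608 sequences.
early_factor = annual_growth_factor(606, 78_608, 1992 - 1982)

print(f"Bases, 2006 to 2007:     {bases_growth:.1f}% increase")
print(f"Sequences, 2006 to 2007: {seq_growth:.1f}% increase")
print(f"Sequences, 1982 to 1992: x{early_factor:.2f} per year on average")
```

On these numbers, the 2006 to 2007 increase is roughly 22% in bases and 25% in sequences, while the first decade averaged a multiplier of about 1.6 per year. The comparison suggests that year-over-year growth had slowed considerably by 2007, although the absolute amount of data added each year was far larger.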

Page last updated October 8, 2007
