Overview
OK, it's pretty easy just to click on a link in a database and get a journal
citation. But more often than not we need to do a more complex search. Our
experience with web searches leads us to understand the ease with which searches
can be accomplished. That same experience also shows us how little
relevance the outcomes have to what we were looking for.
Understanding search strategies and the rules of an information system can
make us better scientists. We'll use two tools, PubMed and the libraries' licensed resource, OVID, to examine details of searching the literature.
Controlled Vocabularies
Mature literature databases employ a "controlled" vocabulary, relevant
to the discipline. There are many ways to say that you are interested in a
specific disease, for example. A controlled vocabulary for searches in the biomedical
literature is defined by MeSH, "Medical Subject Headings." The primary value of a controlled vocabulary is that it accounts for synonyms
and variations in spelling. Controlled vocabularies enhance the specificity of searches.
Finding the correct term is not always easy. New terms are continuously evaluated and, sometimes, older terms require adaptation.
Examples:
- The MeSH term for influenza in humans was "influenza" until
2006, when it was changed to "influenza, human."
- The MeSH term for haemagglutinin is "hemagglutinins" (with
additional narrower terms)
- The MeSH term for influenza virus hemagglutinin glycoprotein is "hemagglutinin
glycoproteins, influenza virus."
- The MeSH term for bird flu was "influenza, avian" until
2006, when it was changed to "influenza in birds".
MeSH headings can be
- "major" or "focused" - specifically about the topic
- "minor" - including, but not necessarily about, the topic
MeSH headings are organized into a hierarchy.
- Articles are indexed by humans to the most specific, or narrow, term
- When searching a broad term, you need to "explode" the search
if you wish to also retrieve citations indexed under the narrower terms.
- PubMed explodes topics automatically for you.
- Ovid asks you to choose regarding the "explode" option.
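The effect of "exploding" a term can be sketched in a few lines of Python. This is only an illustration: the three-term hierarchy is a simplified stand-in for the real MeSH tree, and the citation IDs and the explode and search functions are hypothetical names invented for this example.

```python
# Simplified, illustrative hierarchy: each term maps to its directly
# narrower terms. Consult the MeSH browser for the real tree.
NARROWER = {
    "orthomyxoviridae infections": ["influenza, human", "influenza in birds"],
    "influenza, human": [],
    "influenza in birds": [],
}

# Hypothetical index: citation ID -> the most specific term the indexer assigned.
INDEXED = {
    "cit1": "orthomyxoviridae infections",
    "cit2": "influenza, human",
    "cit3": "influenza in birds",
}


def explode(term, narrower=NARROWER):
    """Return the term plus every narrower term beneath it."""
    terms = [term]
    for child in narrower.get(term, []):
        terms.extend(explode(child, narrower))
    return terms


def search(term, exploded, index=INDEXED):
    """Return citation IDs indexed under the term (and its narrower terms if exploded)."""
    wanted = set(explode(term)) if exploded else {term}
    return sorted(cid for cid, assigned in index.items() if assigned in wanted)


print(search("orthomyxoviridae infections", exploded=False))
print(search("orthomyxoviridae infections", exploded=True))
```

Without the explode option, only the citation indexed directly under the broad term comes back. With it, the citations indexed under the narrower terms are retrieved as well.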
Different tools provide varied ways to hunt for the MeSH terms you will need
to do good literature searches. Some of these tools are public, some require
a subscription.
Consider the issue of "personalized medicine," a phrase used to describe how medical treatments may be customized to individuals' specific genetic makeup. We'll see how the term is handled in both "public" and "subscribed" spaces.
- Public space: PubMed or MeSH Query = personalized medicine
- What's happening? PubMed or MeSH combines four searches with Boolean operators, promising to return hits with "personalized" in any field AND (as well as) a set of other properties enclosed in parentheses.
- TIAB searches the Title/Abstract of any PubMed record that is not part of MEDLINE. The query then searches the entire MEDLINE database for references indexed with the MeSH terms "pharmaceutical preparations" OR "medicine," OR, finally, "medicine" as a text word in the record.
- Note: in this example, both the PubMed and MeSH databases translate to the same query. That is not always the case.
- Subscription space for UMDNJ - Ovid Query = personalized medicine
- What's happening? Ovid offers a selection of MeSH headings based on a proprietary algorithm. In this case you would probably select "pharmacogenetics."
Keyword Searching - Uncontrolled Vocabularies
"Keywords" were a good idea gone bad. Some people thought that journal authors or web page authors could imagine, in an unstructured vocabulary, a set of terms that would be chosen by people who might be interested in their subjects. As computers got faster and databases encompassed entire texts, the need for specific keywords has shrunk.
Generally when we search for "keywords," we are looking for words in an article or book title, an abstract, any controlled vocabulary, or possibly the full text of an article, book, or web page. Sometimes free-text
or keyword searching is the right way to go. For example,
- The topic may be too new to have a controlled term assigned.
- You may want to restrict your results to include a term that is not in the
controlled vocabulary.
- You haven't found many results using the controlled vocabulary. Keyword
searching will locate references that include your terms in the title, abstract, or full text. The choice of which of these to search may be yours.
- Unlike using a controlled vocabulary, you must think of all possible synonyms
and spellings.
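Because a keyword search leaves the synonyms and spellings up to you, it can help to write the variants down and combine them mechanically. The sketch below is a minimal illustration, assuming an interface that accepts AND, OR, parentheses, and quoted phrases. The function names or_group and keyword_query are made up for this example, and the variant lists are just a starting point.

```python
def or_group(variants):
    """Join spelling variants and synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    return "(" + " OR ".join(quoted) + ")"


def keyword_query(*concepts):
    """AND together one OR-group per concept."""
    return " AND ".join(or_group(variants) for variants in concepts)


query = keyword_query(
    ["hemagglutinin", "haemagglutinin"],            # American and British spellings
    ["bird flu", "avian influenza", "avian flu"],   # synonyms for one concept
)
print(query)
# (hemagglutinin OR haemagglutinin) AND ("bird flu" OR "avian influenza" OR "avian flu")
```

Each concept gets its own list of variants, so adding a newly discovered synonym means editing one list rather than retyping the whole query.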
Field Searching
Field searching is a technique that can be used for both controlled and uncontrolled vocabularies. Most of you are now familiar with the field structure of GenBank. MEDLINE, too,
has a defined structure of fields that can be used in searching.
- GenBank queries are sometimes best made with attention to special fields
such as
  - LOCUS
  - ACCESSION
  - REFERENCE
  - etc.
- MEDLINE queries through PubMed can specify one or more fields such as
  - Author
  - Affiliation (Institutional)
  - Grant Number
  - Secondary Source ID (Molecular sequence or structure)
  - MeSH
  - Title (Article)
  - Journal Title
  - Title/Abstract
  - etc.
Earlier in the term we examined the records and field definitions for GenBank. In the same way, an exhaustive description of MEDLINE fields is available at NCBI. Depending on the database or search interface you are using, the field options may differ slightly.
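A field-restricted PubMed query can be assembled by attaching a bracketed tag to each term. The sketch below is a hedged illustration: the two-to-four letter tags are the PubMed abbreviations as we understand them, so check them against the NCBI field descriptions before relying on them. The helper names field_term and field_query are hypothetical.

```python
# Assumed PubMed field-tag abbreviations for the fields listed above.
# Verify against the NCBI field descriptions for your interface.
FIELD_TAGS = {
    "author": "au",
    "affiliation": "ad",
    "grant number": "gr",
    "secondary source id": "si",
    "mesh": "mh",
    "title": "ti",
    "journal title": "ta",
    "title/abstract": "tiab",
}


def field_term(value, field):
    """Attach a field tag to one search term, quoting multi-word values."""
    tag = FIELD_TAGS[field.lower()]
    return f'"{value}"[{tag}]' if " " in value else f"{value}[{tag}]"


def field_query(pairs):
    """AND together a list of (value, field) pairs."""
    return " AND ".join(field_term(value, field) for value, field in pairs)


print(field_query([
    ("influenza, human", "MeSH"),
    ("hemagglutinin", "Title/Abstract"),
]))
# "influenza, human"[mh] AND hemagglutinin[tiab]
```

The same term can behave very differently depending on the field: a word restricted to Title/Abstract is a keyword search, while the same word restricted to MeSH is a controlled-vocabulary search.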
Web Searching
We are all familiar with dropping terms into Google to find interesting pages
on the web, but we have only a limited idea what it is doing behind the scenes,
and search engines will mix scientific journal articles with other, less well-founded
information.
Here's what Google tells us about how it does its ranking. Web search engines use
private, or proprietary, algorithms for searching and ranking the results.
These algorithms change often in a never-ending battle with savvy webmasters who figure out how
to get their sites to the top of the list.
Google Scholar is an attempt
to help web users retrieve more scholarly literature than general search engines return.
- Google has made arrangements
(and is having other fights) with some publishers to search full text and link to it, but actual retrieval of the full text depends on
whether the articles are freely available or your institution has a subscription
to the e-journal.
- Set Google Scholar's preferences to "Show links to import citations into" your bibliographic software if you wish. More on that at the end of this workshop.
Structured Queries on the Web
Most search engines use some form of Boolean logic, allowing you to use AND/OR
or +/- to include or exclude terms, and offer some kind of advanced search that
allows you to fine-tune your results. Behind the scenes, they quietly combine
your terms and hopefully come up with a useful list of results. The advanced
search features add some form of structure to your query, but you are still
using keywords and need to allow for synonyms and variations in spelling.
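To see what Boolean combination means in practice, here is a toy model in Python. It is a sketch only: the three "documents" are invented, and real search engines add proprietary ranking on top of anything like this. Each word maps to the set of documents containing it, and AND, OR, and NOT become set intersection, union, and difference.

```python
# Three invented documents standing in for web pages or abstracts.
DOCS = {
    1: "avian influenza hemagglutinin structure",
    2: "human influenza vaccine trial",
    3: "bird flu outbreak in poultry",
}


def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def hits(word):
    """Return the set of documents containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())


both = hits("influenza") & hits("avian")     # influenza AND avian
either = hits("influenza") | hits("flu")     # influenza OR flu
without = hits("influenza") - hits("human")  # influenza NOT human

print(sorted(both), sorted(either), sorted(without))
# [1] [1, 2, 3] [1]
```

Notice that document 3 is found only when "flu" is added with OR. A search for "influenza" alone misses it, which is exactly the synonym problem that keyword searching leaves to you.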
Google Scholar, as well
as general and specialized web search engines, will retrieve citations, and
sometimes full text, from scholarly literature. Google Scholar's advanced search
feature allows you to specify an author, journal, time frame, or general subject
area as part of your query.
Though they offer some basic form of structured searching, most web-based search
engines do not offer the highly precise and structured query interface available
through bibliographic database interfaces such as Ovid MEDLINE or PubMed.
Page last updated
March 4, 2008