Queries & Search Strategies
• Describe the process of conducting a query.
• Define natural and artificial vocabulary.
• Define controlled vocabulary and provide examples.
• Define methods of searching databases including the use of keywords and descriptors.
• Define, identify and apply search strategies (e.g., keyword, Boolean operators).
• Define truncation in searching and provide examples.
• Define a proximity operator in searching and provide examples.
• Define Boolean logic in searching and provide examples.
• Define citation searching and provide examples of how it is used.
• Apply technology tools to address reference and user services needs.
When conducting a search, most library users simply type the first word that comes to mind and hope for the best. However, a successful search depends on applying search strategies.
A query is a request submitted as input in a search tool to retrieve information relevant to a user’s search. The request is entered as a search statement such as Yellowstone National Park hikes.
Natural language involves the use of everyday spoken or written language.
A query might be written as “Where can I find out about hiking in Yosemite National Park?” or “What’s the current interest rate for home loans?”
Databases that allow natural language searches are sometimes referred to as using free terms or user-assigned terms.
The use of keywords is a type of natural vocabulary. The problem with keywords is that they can mean different things depending on the context.
The word heart can be associated with romance, cooking, or a medical/health context. Users need to be careful about their selection of words. For instance, coronary, artery, blood, and arteriosclerosis would be words to include along with the word heart for a health-related search.
While some systems allow natural language, others require the user to formulate search statements using artificial language.
Most library users are familiar with doing a free-text search using keywords.
A free-text search allows natural language to be used as search terms.
A keyword is an important word likely to be found in the title, subject headings, abstract, or text of a record.
A problem with a keyword search is that it doesn’t consider that words have multiple meanings. For instance, the word weed is used to describe a wild plant growing where it’s not wanted. However, it’s also a common word for marijuana.
Artificial language is constructed using a pre-established set of rules.
Although it may use the vocabulary of natural language, it’s written in a form that the particular search software understands such as Marie Curie or su:curie.
Controlled vocabulary involves a pre-determined list of terms searchers must use for their query.
An alphabetical listing of these words or a thesaurus is usually made available to searchers.
Creating, maintaining, and updating this preferred list of terms is called vocabulary control. For instance, rather than using the word car, you might need to use the term automobiles.
Or, if you’re interested in hairstyles, you might need to use the term personal grooming.
Rather than dogs, you’d search for canines.
Controlled vocabulary is useful for a number of reasons. Within a discipline or across fields, a single term may have several different meanings, such as mercury the planet, mercury the god, mercury the element, or mercury the automobile.
Uses of Controlled Vocabulary
Controlled vocabulary addresses a number of problems that face those developing search queries. For instance, mobile phone, cellular phone, and wireless are sometimes used synonymously, but they can also have different meanings.
Controlled vocabularies like the Library of Congress Subject Headings use subdivisions to refine major concepts such as Liver--Blood Vessels--Diseases.
In databases, a layer of controlled vocabulary may exist between what the user enters as keywords and the database itself. Without this layer, users searching for the word iron may miss articles that refer to iron by its chemical symbol, Fe.
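The mapping layer described above can be sketched in a few lines of Python. This is only an illustration: the table of entry terms and the function name `to_preferred` are invented for the example and are not drawn from any real thesaurus.

```python
# Hypothetical entry-term table: each variant a user might type points to
# one preferred term. Real thesauri are far larger and are maintained by
# indexers.
ENTRY_TERMS = {
    "car": "automobiles",
    "cars": "automobiles",
    "automobile": "automobiles",
    "fe": "iron",
    "iron": "iron",
}


def to_preferred(user_term: str) -> str:
    """Return the preferred term for a user's word, or the word itself if unmapped."""
    key = user_term.strip().lower()
    return ENTRY_TERMS.get(key, key)


print(to_preferred("Fe"))    # iron
print(to_preferred("Car"))   # automobiles
print(to_preferred("bats"))  # bats (no mapping, passed through unchanged)
```

A search tool built this way would look up the preferred term first and then search the index with it, so a user who types Fe and a user who types iron reach the same records.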
Examples of Controlled Vocabulary
In addition to well-known subject headings tools such as the Library of Congress Subject Headings, many other controlled vocabulary tools are used by catalogers. Three examples include:
- MeSH: Medical Subject Headings (used by MEDLINE, PubMed, and other health science databases)
- ERIC: Thesaurus of ERIC (Education Resources Information Center) Descriptors (used by the ERIC database on EBSCO)
- PsycINFO: American Psychological Association Thesaurus of Psychological Index Terms (used by PsycINFO)
Spend some time exploring examples of controlled vocabulary such as the Thesaurus for Graphic Materials in the abolitionist example.
Searching Databases
Two options are generally provided when searching databases: keywords or descriptors.
Searching Databases: Keywords
In keyword searching, the computer searches every word in a document’s record. This type of searching is also called natural language or text word searching.
It simply means that you can use terminology from everyday or specialized language to search a topic. Although many bibliographic databases allow free-text searching, you will usually get more precise search results using the controlled vocabularies.
For example, if you are looking for information on ethical and unethical practices in companies, what natural language terms or text words would you search for? You could use ethics, ethical, moral, morality, business, corporation, corporations, corporate, company, companies, white-collar crime, and many others.
Searching Databases: Descriptors
In descriptor searching, the computer focuses on a limited number of descriptors that have been assigned by a cataloger. These descriptors or subject headings are based on controlled vocabulary. Instead of needing to enter every possible search term, searchers can rely on the controlled vocabulary of the database. It allows users to enter one or several subject headings and retrieve the relevant articles regardless of the specific words the authors used to describe the concepts.
A controlled vocabulary is a thesaurus of descriptors, or a set of standardized terms, to which data are indexed. Nearly every bibliographic database will use some type of controlled vocabulary. Searching for controlled vocabulary terms or subject headings is often the most effective way to search a database. For instance, the searcher could find and use the standardized subject heading business ethics rather than entering a long list of keywords.
A person looking for information on the spread of tuberculosis among prison inmates could use tuberculosis or TB and also prisoner, prisoners, prison, prisons, jail, inmate, inmates, incarceration, and incarcerated. But if a variation was overlooked, relevant materials could be missed. Instead of guessing at which keywords the authors may have used, apply the specific controlled vocabulary term. All of the articles that describe that concept will be indexed to that term. The effective use of controlled vocabulary or thesauri is one of the skills that distinguish a professional searcher from a novice searcher.
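The difference between the two approaches can be sketched with a handful of toy records. Everything here is invented for illustration: the titles, the descriptors, and the function names `keyword_search` and `descriptor_search`. The keyword list follows the tuberculosis example above and, like it, happens to leave out the variant jails.

```python
# Toy records: titles are the authors' own words; descriptors are assigned
# by an indexer from a controlled list. All data here is invented.
RECORDS = [
    {"title": "TB outbreaks among inmates", "descriptors": {"tuberculosis", "prisoners"}},
    {"title": "Tuberculosis screening in county jails", "descriptors": {"tuberculosis", "prisoners"}},
    {"title": "Consumption behind bars", "descriptors": {"tuberculosis", "prisoners"}},
    {"title": "Tuberculosis in dairy cattle", "descriptors": {"tuberculosis", "cattle"}},
]


def keyword_search(records, all_of_groups):
    """Keep records whose title has at least one word from every group."""
    hits = []
    for rec in records:
        words = set(rec["title"].lower().split())
        if all(words & group for group in all_of_groups):
            hits.append(rec["title"])
    return hits


def descriptor_search(records, descriptors):
    """Keep records indexed under every requested descriptor."""
    return [r["title"] for r in records if descriptors <= r["descriptors"]]


disease_words = {"tuberculosis", "tb"}
prison_words = {"prisoner", "prisoners", "prison", "prisons", "jail",
                "inmate", "inmates", "incarceration", "incarcerated"}

keyword_hits = keyword_search(RECORDS, [disease_words, prison_words])
descriptor_hits = descriptor_search(RECORDS, {"tuberculosis", "prisoners"})

print(keyword_hits)     # only the "inmates" title; "jails" was not in the list
print(descriptor_hits)  # all three prison-related records
```

The keyword search misses two relevant records because their authors chose words the searcher did not anticipate, while the descriptor search finds them because the indexer assigned the same terms to all three.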
A search strategy is a systematic plan for conducting a search. The user describes the information need, identifies the main topics, and selects finding tools for the subject.
A search statement is created based on the requirements of the particular finding tool.
After the initial search, the user may modify the search statement by incorporating broader, narrower, or related terms to expand, restrict, or add to the search.
The selection of precise keywords is critical in conducting a search. Use of a thesaurus is helpful in identifying words and phrases.
Remember to use quotation marks around phrases if there is a concern about precision.
Search Strategy: Truncation
Truncation involves replacing characters with a symbol such as an * (asterisk) at the beginning, middle, or end of a keyword to allow for variant forms of the word.
For instance, *librar* would retrieve objects containing intralibrary, interlibrary, librarian, librarianship, libraries, and library. The asterisk is sometimes called a wildcard symbol. Some systems use a question mark or pound sign instead, as in wom#n.
Truncation in search statements is sometimes used to ensure that related works are found. For instance, a searcher may wish to use both the singular and plural of a word in the same search by using a truncation such as computer*. Some search tools automatically check for plurals. Some systems have elaborate mechanisms for truncation. For instance, you may be able to add a number to indicate how many letters should be replaced.
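To see what a search tool might do with a truncated term, here is a minimal Python sketch that translates the symbols into a regular expression. The conventions are assumptions, since systems differ: here * stands for zero or more letters and # stands for exactly one letter. The function name `truncation_to_regex` is made up for this example.

```python
import re


def truncation_to_regex(term: str) -> re.Pattern:
    """Translate a truncated search term into a whole-word regular expression.

    Assumed conventions (systems differ): '*' stands for zero or more
    letters, '#' stands for exactly one letter.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "#":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


text = "Interlibrary loan lets a librarian borrow from other libraries for women and a woman."

print(truncation_to_regex("*librar*").findall(text))
# ['Interlibrary', 'librarian', 'libraries']
print(truncation_to_regex("wom#n").findall(text))
# ['women', 'woman']
print(truncation_to_regex("computer*").findall("One computer, two computers, computing"))
# ['computer', 'computers']
```

Note that computer* does not match computing, because the letters before the symbol must appear exactly as typed. Truncating earlier, as in comput*, would widen the search.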
Search Strategy: Proximity Operator
A proximity operator allows a searcher to specify how close words are to each other in the resulting documents.
In other words, you don’t just want the words online and searching to be somewhere in the article; you want them to appear together, as in online searching.
The proximity operator isn’t standardized, but it might appear like online adj1 searching to show you want the words to be adjacent (adj).
Sometimes the word NEAR is used such as search NEAR engine.
Sometimes you can add a number to the operator to show the number of words such as videogames n2 violence. It would find videogames with violence, but not videogames have been accused in recent shootings.
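A proximity check can be sketched in Python as a comparison of word positions. The rule used here is an assumption, because each system defines its operators differently: a gap of 1 means the words are adjacent, and word order is ignored. The function name `near` is invented for the example.

```python
import re


def near(text: str, first: str, second: str, max_gap: int) -> bool:
    """True if the two words occur within max_gap word positions of each other.

    Assumption: a gap of 1 means the words are adjacent; order is ignored.
    Real systems define their proximity operators differently.
    """
    words = re.findall(r"\w+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap for i in first_pos for j in second_pos)


# Two positions apart, so a gap of 2 is satisfied.
print(near("a study of videogames with violence", "videogames", "violence", 2))  # True

# Both words are present, but far apart.
print(near("violence is one theme; many other topics appear in videogames",
           "videogames", "violence", 2))  # False

# Adjacent words satisfy a gap of 1.
print(near("tips for online searching", "online", "searching", 1))  # True
```

The second sentence would be retrieved by videogames AND violence, which is why a proximity operator usually gives more precise results than AND alone.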
Search Strategy: Boolean Logic
Boolean is a type of logic that involves organizing search words or phrases into sets when conducting a search. Operators are used between words.
The AND command is used to narrow a search by requiring that all of the terms appear. For example, milk AND osteoporosis will retrieve records that contain both the word milk and the word osteoporosis. In most search tools, AND is implied automatically, so it doesn’t need to be typed.
The OR command is used to expand a search using related terms such as kayaking OR canoeing. In another example, use caffeine OR coffee.
The NOT command is used to eliminate unwanted terms from the search such as bullying NOT cyber. This would eliminate references to cyber bullying.
In some cases, parentheses can be used to indicate the sequence of the search such as Current River AND (kayaking OR canoeing).
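Because Boolean operators organize results into sets, they map directly onto set operations. The sketch below uses a toy index in which each term points to the record numbers that contain it. The record numbers are invented; only the search terms come from the examples above.

```python
# Toy inverted index: each term maps to the set of record numbers that
# contain it. The record numbers are invented for illustration.
INDEX = {
    "milk": {1, 2, 3},
    "osteoporosis": {2, 3, 4},
    "kayaking": {5, 6},
    "canoeing": {6, 7},
    "current river": {6, 7, 8},
    "bullying": {9, 10},
    "cyber": {10},
}

# AND narrows: records must contain both terms (set intersection).
and_hits = INDEX["milk"] & INDEX["osteoporosis"]

# OR expands: records may contain either term (set union).
or_hits = INDEX["kayaking"] | INDEX["canoeing"]

# NOT excludes: drop records containing the second term (set difference).
not_hits = INDEX["bullying"] - INDEX["cyber"]

# Parentheses set the order: evaluate the OR first, then apply the AND.
nested_hits = INDEX["current river"] & (INDEX["kayaking"] | INDEX["canoeing"])

print(sorted(and_hits))     # [2, 3]
print(sorted(or_hits))      # [5, 6, 7]
print(sorted(not_hits))     # [9]
print(sorted(nested_hits))  # [6, 7]
```

Notice that OR produces the largest set and AND the smallest, which matches the way the two operators expand and narrow a search.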
Online tools like Boolify help users practice conducting Boolean searches.
Search Strategies: Indiana Bat
Let’s formulate a strategy for locating articles on white-nose syndrome in Indiana bats.
- Identify the main themes and keywords.
- Identify some synonyms and related terms.
white nose syndrome, white-nosed syndrome
Myotis sodalis [the scientific name for the Indiana bat]
- Consider alternative spellings
white nose syndrome
- Consult a database’s thesaurus or glossary for subject headings.
Web of Science [just one of several possibilities]
- Identify broader and narrower terms.
Myotis [consider other bat genera too]
- Use Boolean operators and nesting to control your search. For example, you don’t want information about the character Batman.
(white nose OR white-nose OR white-nosed OR WNS) AND (Indiana bat OR Myotis sodalis)
- Use truncation or stemming to make sure you find all forms of a search term.
Use fung* to get fungal and fungus
- Use available filters, such as year of publication, format, etc., to limit your results.
- View and evaluate the results. Look at titles, abstract words, and subject headings to find clues for improving your original search strategy. You may need to use different terms or phrases that mean the same thing, and add truncation where possible. Don’t forget to eliminate any stop words.
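The nesting step in the list above can be sketched in Python: synonyms for each concept are joined with OR inside parentheses, and the concept groups are then joined with AND. The helper names `or_group` and `build_query` are invented for this example, and putting quotation marks around multi-word phrases is an assumption about the target database's syntax.

```python
def or_group(terms):
    """Join synonyms with OR inside parentheses, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*concept_groups):
    """Join each parenthesized concept group with AND."""
    return " AND ".join(or_group(group) for group in concept_groups)


# One list per concept, taken from the Indiana bat example.
syndrome = ["white nose", "white-nose", "white-nosed", "WNS"]
bat = ["Indiana bat", "Myotis sodalis"]

query = build_query(syndrome, bat)
print(query)
# ("white nose" OR white-nose OR white-nosed OR WNS) AND ("Indiana bat" OR "Myotis sodalis")
```

Keeping each concept in its own list makes it easy to revise the strategy after evaluating results: add a newly discovered synonym to the right list and rebuild the search statement.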
It’s important that scholarly researchers learn from the work that has been done in the past. One of the most effective ways to gain this knowledge is from examining the list of works cited by others. Researchers follow citations to track down hard-to-find, original works.
Citation searching involves exploring the references or bibliographies at the end of a key work such as a research article. The researcher searches for the key author, journal article, or book.
The search will yield any article containing the author, article, or book in the bibliography. It’s an effective way to locate related works that have been previously published.
Backward and Forward
There are two approaches to citation searching: backward and forward.
Backward citation searching involves examining the list of sources cited by an author. This approach provides researchers with a look at the research and thinking at the time an article was published.
Forward citation searching involves determining whether an article was cited by others after its publication to determine the impact of the work. This approach makes it possible to determine how a particular work shaped future scholarship.
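The two directions can be illustrated with a small Python sketch. The citation data is invented, and the function names `backward` and `forward` are made up for the example. Each paper is mapped to the list of works it cites, which is the information found in its bibliography.

```python
# Invented citation data: each paper maps to the list of papers it cites.
CITES = {
    "Paper A (2005)": [],
    "Paper B (2010)": ["Paper A (2005)"],
    "Paper C (2015)": ["Paper A (2005)", "Paper B (2010)"],
    "Paper D (2020)": ["Paper B (2010)"],
}


def backward(paper):
    """Backward search: the sources this paper cites."""
    return sorted(CITES.get(paper, []))


def forward(paper):
    """Forward search: later papers whose reference lists include this paper."""
    return sorted(p for p, refs in CITES.items() if paper in refs)


print(backward("Paper C (2015)"))
# ['Paper A (2005)', 'Paper B (2010)']
print(forward("Paper B (2010)"))
# ['Paper C (2015)', 'Paper D (2020)']
```

A backward search needs only the paper itself, since the bibliography is printed at the end. A forward search needs an index of many bibliographies, which is why it depends on a citation database.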
Databases: Citation Linking
Some databases offer citation linking. At the end of an article you may be able to hotlink to the items in the references or bibliography. Increasingly, databases provide a “cited by” or “times cited” feature to help with citation searching.
Google Scholar is an excellent example of an open web database focused on citation searching. The results of a title search provide a “cited by” link. Users can also enter an author’s name and check out their profile to see articles and how they are cited by others.
Citation indexes are used to track references found in bibliographies of published papers. These databases allow users to search for the literature in a way not available through keyword searches.
Spend some time using Google Scholar. Look up the names of IUPUI faculty such as Annette Lamb, Andrea Copeland, or Rachel Applegate.
Bibliometrics involves analyzing scientific and technological literature. Citation analysis allows researchers to judge the impact of articles, authors, and research activity.
Institute for Scientific Information’s Web of Science allows users to search forward in time from a known article to more recent publications that cite the work. To conduct a search, enter the cited author, work, or year(s).
Reference and User Services
To address reference and user services needs, librarians must be able to match information needs with information tools and sources. They must then apply effective search strategies.
In some cases, Google may yield the most efficient and effective results. However, in other cases, a restricted-access database may be necessary.
Librarians must be able to assist users in conducting queries.
Both natural and artificial vocabulary can be used in most searches. However, controlled vocabulary can often provide more effective and efficient searches.
Searching databases may include the use of keywords and descriptors.
Search strategies usually include the use of truncation, proximity operators, and Boolean logic.
Forward and backward citation searching is an important approach for scholarly researchers.