
Emerging AI Tools for Literature Review: Primo Research Assistant

This guide consolidates the teaching materials for the library workshop Emerging AI Tools for Literature Review.

What Is Primo RA

Primo Research Assistant (Primo RA) is an AI-powered academic search tool integrated with PowerSearch to enhance literature discovery. Type your question in natural language, and the system returns 5 suggested sources and a summary.

Like all AI tools, Primo RA has both strengths and limitations. This guide aims to help you understand the tool better so that you can use it more effectively in your research process.

Practical Tips - Do's and Don'ts

 

 ✔ Do's 
  • Use pre-filters to make your search more specific, e.g. resource type (peer-reviewed) and date (last 5 years).

  • Expand beyond the top 5 results. While Primo RA highlights key sources, important information may exist beyond this initial selection. Use the "View more results" option to access a wider range of materials and avoid missing important research.
  • Treat summaries as starting points only. Primo RA summaries provide quick topic overviews but lack the depth needed for thorough literature reviews. For rigorous research, always read and evaluate the full articles carefully.

 

 ✗ Don'ts 
  • Do not blindly trust the generated answers. Even when citations are real, they may not actually support the generated claims. Always read the source articles and verify.
  • Do not rely on it as your single source for literature search. Be aware of its content exclusions (e.g. content from Elsevier). You still need Google Scholar, scholarly databases, and other AI research tools for a comprehensive search.

 

How Primo RA Works


1. Query Conversion

When a user enters a natural language query, the LLM within Primo RA converts it into a Boolean search query, which typically contains keyword-based variations of the original query connected with “OR” operators.
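
The conversion can be pictured with a minimal Python sketch. It assumes the LLM has already proposed a few keyword variations; the function name `build_boolean_query`, the sample variations, and the exact output format are illustrative only and are not Primo RA's actual implementation.

```python
def build_boolean_query(variations):
    """Join keyword variations with OR (hypothetical helper, for illustration only)."""
    terms = []
    for v in variations:
        v = v.strip()
        if not v:
            continue  # skip empty variations
        # Quote multi-word phrases so they are matched as a unit
        terms.append(f'"{v}"' if " " in v else v)
    return " OR ".join(f"({t})" for t in terms)


# Assumed variations for the question "Does remote work improve productivity?"
variations = ["remote work productivity", "telecommuting performance", "teleworking"]
query = build_boolean_query(variations)
print(query)
# ("remote work productivity") OR ("telecommuting performance") OR (teleworking)
```

Because the variations are joined with OR, a record only needs to match one of them to be retrieved.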


2. Searching & Retrieving

The Boolean query is sent to the Central Discovery Index (CDI), a very large database containing over 5 billion records of scholarly material, to retrieve results. Note that this step is a keyword search (similar to typing keywords directly into PowerSearch), not a semantic search.
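
To see why keyword search differs from semantic search, consider a toy Python sketch. The two records and the function `keyword_search` are made up for illustration; this is not how CDI is implemented, only a simplified model of matching on words.

```python
import re


def tokenize(text):
    """Lower-case a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(or_terms, records):
    """Keep records containing every word of at least one OR term (toy model)."""
    term_sets = [tokenize(t) for t in or_terms if tokenize(t)]
    hits = []
    for rec in records:
        words = tokenize(rec["title"] + " " + rec["abstract"])
        if any(term <= words for term in term_sets):
            hits.append(rec)
    return hits


# Hypothetical records, both about the same topic
records = [
    {"title": "Teleworking and employee output", "abstract": "A survey of teleworking staff."},
    {"title": "Working from home and job performance", "abstract": "Home-based employees were studied."},
]
hits = keyword_search(["teleworking", "remote work"], records)
print([r["title"] for r in hits])
# ['Teleworking and employee output']
```

The second record is about the same topic but shares no keyword with the query, so a keyword search misses it. A semantic search would aim to match on meaning instead.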


3. Re-ranking

The top results (up to 30) are re-ranked using embeddings to improve relevance matching and to identify the 5 sources that best address the user's query.
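
Embedding-based re-ranking is commonly done by comparing vectors with cosine similarity. The sketch below assumes that approach; this guide does not state which similarity measure or embedding model Primo RA uses, and the two-dimensional vectors and the `rerank` function are toy stand-ins.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query_vec, candidates, pool_size=30, top_k=5):
    """Re-rank the first `pool_size` candidates by similarity to the query."""
    pool = candidates[:pool_size]
    ranked = sorted(pool, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:top_k]


# Toy 2-dimensional "embeddings" (real embeddings have hundreds of dimensions)
candidates = [
    {"id": "A", "vec": [0.9, 0.1]},
    {"id": "B", "vec": [0.1, 0.9]},
    {"id": "C", "vec": [0.7, 0.7]},
]
top = rerank([0.0, 1.0], candidates, top_k=2)
print([c["id"] for c in top])
# ['B', 'C']
```

Only the records retrieved by the keyword search enter the pool, so re-ranking cannot recover a relevant record that the keyword step missed.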


4. Summarizing

The top 5 results and their abstracts are sent to the LLM with instructions to create a summary ("Overview") with in-text citations. The summary and sources are then shown to the user in the response.
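
A rough Python sketch of how such a prompt might be assembled follows. The instruction wording, the `[n]` citation format, and the function `build_summary_prompt` are assumptions made for illustration; the actual prompt used by Primo RA is not given in this guide.

```python
def build_summary_prompt(question, sources):
    """Assemble a prompt from the question and numbered abstracts (hypothetical)."""
    lines = [
        "Answer the question using only the sources below.",
        "Cite sources in the text by their bracketed number.",
        "",
        f"Question: {question}",
        "",
        "Sources:",
    ]
    for i, src in enumerate(sources, start=1):
        # Only the title and abstract are included, not the full text
        lines.append(f"[{i}] {src['title']}: {src['abstract']}")
    return "\n".join(lines)


# Made-up sources for illustration
sources = [
    {"title": "Teleworking and employee output", "abstract": "A survey of teleworking staff."},
    {"title": "Remote work in small firms", "abstract": "Case studies of five companies."},
]
prompt = build_summary_prompt("Does remote work improve productivity?", sources)
print(prompt)
```

Since the LLM sees only titles and abstracts, the summary cannot reflect details that appear only in the full text.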

 

Pros and Cons

 

Content Coverage

 👍 Pros 
  • Unlike other AI-powered research tools that rely on Semantic Scholar data and OA content, Primo RA searches the extensive Central Discovery Index (CDI) database, which also covers non-journal content, including books, magazines, and even theses.

 👎 Cons / Limitations 
  • Excludes content from certain content owners that have opted out (e.g. Elsevier, APA, JSTOR, DataCite), sources with insufficient metadata, retracted works, and news content.
  • Uses only abstracts and metadata for retrieval-augmented generation (RAG), not full text (even for OA content), which limits the quality of the answers. This is also the case for most AI-powered research tools.

Search Capability

 👍 Pros 
  • Supports natural language queries.
  • Offers pre-filters for resource type (e.g. books, journal articles, peer-reviewed) and date (e.g. last 5 years). Users can also apply filters in natural language, e.g. "Suggest peer-reviewed articles about XXX in last 5 years."
  • Returns real citations (since the works come from CDI).
  • Filters out withdrawn or retracted works (though this sometimes fails).

 👎 Cons / Limitations 
  • Searching in natural language is NOT the same as semantic search. Primo RA converts the natural language query into a Boolean query and runs it in Primo (i.e. a keyword search). Some testing indicates poorer retrieval results than other similar tools.
  • Limited to 5 results. Users can expand the results through the "View more results" option, but will get a different set of results.
  • Results may be filtered out when the query contains sensitive keywords, e.g. "massacre" or "Gaza War". Certain keywords trigger the content filtering layers in Azure OpenAI, leading to an error or no results even when relevant results exist in Primo.

Quality of Generated Summary

 👍 Pros 
  • (none listed)

 👎 Cons / Limitations 
  • The summary is based only on the titles and abstracts of the 5 sources. Sometimes not all sources are cited in the summary (e.g. only 3 are cited).
  • Even when the citations are real, they may not support the claims generated in the summary. This is a common issue among AI research tools.
  • Summaries are surface-level and not comprehensive enough for in-depth literature reviews or systematic research. Current retrieval accuracy may also affect the relevance of the generated summary.

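
One quick check on a generated summary is to see which of the 5 sources it actually cites. The Python sketch below assumes citations appear as bracketed numbers such as `[1]`; the real Overview may format them differently, and the sample summary text is invented.

```python
import re


def cited_source_numbers(summary):
    """Return the sorted source numbers cited as [n] in the summary."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", summary)})


def uncited_sources(summary, n_sources=5):
    """Return the source numbers (1..n_sources) that the summary never cites."""
    cited = set(cited_source_numbers(summary))
    return [i for i in range(1, n_sources + 1) if i not in cited]


# Invented summary text for illustration
summary = "Remote work can raise output [1][3]. Effects vary by sector [4]."
print(cited_source_numbers(summary))  # [1, 3, 4]
print(uncited_sources(summary))       # [2, 5]
```

This only shows whether a source is cited. It does not show whether the source supports the claim, so reading the article is still necessary.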
LibGuide content by HKUST Library is licensed under CC BY-NC-SA 4.0, unless otherwise noted.