In autumn 2020, HKUST Library's Research Support Services conducted a small study on text and data mining (TDM) of Library-subscribed resources. The findings appeared in a Research Bridge article, Text and Data Mining: Full-text Databases.
Most publishers that support TDM offer the service free of charge. However, there are usually rules and requirements to fulfill, and the data mining and data delivery methods can differ considerably between publishers.
Commonly Seen Terms and Conditions:
The TDM terms of use for Cambridge and Sage are very similar to the ones listed above.
Elsevier allows a certain amount of TDM of subscribed content.
"...there are no hard limits on the number of items that may be downloaded via our API. Nevertheless, a reasonable and customary rate limit remains in place to ensure equal access to the API for all users, and we continue to ask users to use our service responsibly.
We understand the need to be flexible and continue to monitor usage and consult with researchers. However, we do reserve the right to deactivate any API key if we believe usage is abusive or impacting the stability of our systems." - Text and data mining FAQs
More info available here: https://www.elsevier.com/open-science/research-data/text-and-data-mining
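The FAQ quoted above mentions a "reasonable and customary rate limit" without stating a number. As a rough illustration of using an API responsibly, the following Python sketch spaces out consecutive requests on the client side. It is only a sketch under assumptions: the names `Throttle` and `fetch_all`, the `min_interval` parameter, and the `fetch` callable are hypothetical and are not part of any publisher's API, and the interval you choose should follow the publisher's own documentation.

```python
import time
from typing import Callable, Iterable, List, Optional


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive calls.

    Hypothetical helper for illustration; not part of any publisher's API.
    """

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(
    ids: Iterable[str],
    fetch: Callable[[str], str],
    throttle: Throttle,
) -> List[str]:
    """Call fetch(item_id) for each id, pausing between calls.

    `fetch` is an assumed placeholder for whatever function retrieves
    one document from the API you are licensed to use.
    """
    results = []
    for item_id in ids:
        throttle.wait()
        results.append(fetch(item_id))
    return results
```

For example, `fetch_all(article_ids, my_fetch, Throttle(min_interval=1.0))` would call the assumed `my_fetch` function no more than once per second. The clock and sleep functions are injectable so the pacing logic can be checked without waiting in real time.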
Text & Data Mining with Factiva requires a separate license.
The Library can provide a contact person for researchers who wish to request a quote.
Gale (Cengage): Data Mining FAQs
Provides text and data mining tools and teaching with JSTOR, Portico, and other ITHAKA collections.
JSTOR Dataset Services
Anyone can request a dataset through either of the two services below.
HKUST's Nexis Uni subscription is suitable for student research, but not for crawling or downloading large volumes of data.
LexisNexis has a section on its website where you can ask about using or purchasing its "Data as a Service" for larger datasets.
It also offers a personal consultation service for mining with the LexisNexis Bulk Content API.
Through ProQuest TDM Studio, researchers can pay extra to text and data mine ProQuest content that HKUST Library already owns or subscribes to.
"Downloading articles from SAGE Journals for the purposes of text and data mining is expressly permitted in our standard licence agreements and our terms of use for no extra fee. You do not need to ask permission to systematically download articles provided that:
- https://journals.sagepub.com/page/policies/text-and-data-mining
Text and Data Mining at Springer Nature
HKUST Library's license with Clarivate (owner of Web of Science) allows creating & using custom data sets:
Limitations: You may not distribute, sublicense or publicize any portion of the custom dataset or derivative databases.
See Clarivate's Product/Service terms (p. 24)
Text and Data Mining Resources - by Reese Manceaux of the Atkins Library at UNCC
Text Mining Resources - Princeton University Library