Text and Data Mining (TDM) from HKUST Licensed Material

This guide helps HKUST users find out which publishers permit text and data mining under regular subscriptions.

Practical Tips for Researchers

Before You Begin

  1. Check TDM policies first: Always review the TDM policy of the publisher or database. Many of these policies are available in this guide.
  2. Understand licenses: For open access content, make sure the CC license (e.g. CC BY, CC BY-NC) permits your intended use, and check for third-party content that may carry different terms.
  3. Contact support: When in doubt, reach out to the Library or the publisher for clarification, especially for large-scale projects.
  4. Document permissions: Keep records of any permissions granted or agreements made.

 

Technical Considerations

  1. Respect rate limits: Many APIs restrict how quickly you can download content.
  2. Use proper authentication: Follow the publisher's requirements for API keys or registration.
  3. Store data securely: Maintain appropriate security for downloaded datasets.
  4. Consider computing resources: TDM often requires significant storage and processing power. Check whether your institution provides research computing services.
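
As an illustration of the first two points, the sketch below spaces out requests by a fixed minimum interval and attaches an API key to each request. It is a minimal example under stated assumptions, not any publisher's official client: the header name `X-API-Key`, the interval value, and the helper names `RateLimiter`, `build_request`, and `fetch_all` are all hypothetical. Check the publisher's API documentation for the actual limits and authentication scheme.

```python
import time
import urllib.request


class RateLimiter:
    """Enforce a minimum interval (in seconds) between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the limiter can be tested offline
        self._sleep = sleep
        self._last = None     # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_request(url, api_key, header_name="X-API-Key"):
    """Attach the API key as a request header.

    header_name is a placeholder (assumption); use the header or query
    parameter that the publisher's documentation specifies.
    """
    return urllib.request.Request(url, headers={header_name: api_key})


def fetch_all(urls, api_key, limiter, opener=urllib.request.urlopen):
    """Download each URL in turn, pausing between requests as needed."""
    results = {}
    for url in urls:
        limiter.wait()
        with opener(build_request(url, api_key)) as response:
            results[url] = response.read()
    return results
```

For example, `fetch_all(urls, api_key, RateLimiter(2.0))` would issue at most one request every two seconds. A fixed interval is the simplest approach; some APIs instead report their limits in response headers, in which case the interval should follow what the server returns.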

 

Ethical Considerations

  1. Citation and attribution: Properly acknowledge data sources in publications.
  2. Privacy awareness: Be mindful of potential personal information in datasets.
  3. Transparency: Document your data collection and analysis methods.
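
One possible way to support attribution and transparency is to keep a small provenance record for every downloaded item. The sketch below is only an illustration: the field names and the helper `make_provenance_record` are assumptions, not a required or standard format.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_provenance_record(source, url, content, license_note, retrieved_at=None):
    """Build a JSON-serializable record describing one downloaded item.

    All field names here are illustrative (assumption); adapt them to your
    project's data management plan.
    """
    if retrieved_at is None:
        retrieved_at = datetime.now(timezone.utc)
    return {
        "source": source,                      # publisher or database name
        "url": url,                            # where the item was retrieved
        "retrieved_at": retrieved_at.isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),  # detects later changes
        "size_bytes": len(content),
        "license_note": license_note,          # e.g. "CC BY" or a permission reference
    }


def append_record(path, record):
    """Append one record per line (JSON Lines) to a log file."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Storing a checksum and retrieval time alongside the license note makes it easier to cite sources accurately and to show later which version of the data an analysis used.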
© HKUST Library, The Hong Kong University of Science and Technology. All Rights Reserved.