Web Scraping 101: Workshop PPT

This guide is designed to facilitate the teaching of the library workshop "Web Scraping 101".

About the Workshop

Web scraping is a useful technique for researchers, as it allows them to quickly and automatically extract data from websites for analysis, making their research process more efficient and repeatable. This workshop is designed to help non-technical researchers choose the right tool for scraping content from the web. Module 1 introduces the no-code approach to scraping, while Module 2 demonstrates how to use Python code to scrape content.

Learning Goals

Module 1
  • Understand the basic concepts and principles of web scraping
  • Identify available web scraping tools and their appropriate uses
  • Use Power Query and Web Scraper to extract content of interest from the web
  • Make wise decisions when collecting data from the web

Module 2
  • Understand when coding is necessary for web scraping
  • Use Python code to scrape content from the web
  • Use ChatGPT (Poe) as a learning aid to read, draft, and debug code

* Note: Module 2 is recommended for those who have completed Module 1 or have a basic understanding of web scraping. Prior knowledge of Python can be helpful, but it is not required. 

Prep for the Workshop

  • We suggest using a Windows PC or laptop for Power Query, as the "Get data from Web" function is not available on Mac. You can also use a Library PC in the classroom.
  • Register a free account on Poe.com (to access ChatGPT).

© HKUST Library, The Hong Kong University of Science and Technology. All Rights Reserved.