Data Curation & Management Toolkit: Share & Find Data outside HKUST

This guide is a brief introduction to Data Curation and the HKUST Data Management Services

DataSpace @ HKUST

DataSpace@HKUST is the data repository for the HKUST research community (faculty members and research postgraduate students) to manage their research data.

  • Store
  • Share
  • Organize
  • Preserve
  • Publish 

You can use it as a data repository or as a data workspace.

Interested?  Contact the DataSpace Team or read more.

Data Repositories Outside HKUST

These repositories are places where you can deposit your own data, or search for and use others' data.

Data Directories

General Sciences

  • Dryad
  • figshare - Accepts a wide variety of file types, such as figures, datasets, images, and videos.

Life Science

  • GenBank - The NIH genetic sequence database.
  • Protein Data Bank - A worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids.

Social Science

Earth Science

Computer Science

  • GitHub - Millions of developers use GitHub to build personal projects, support their businesses, and work together on open source technologies.
  • SourceForge - An open source community resource dedicated to helping open source projects be as successful as possible. It relies on community collaboration to provide a premier resource for open-source software development and distribution.

Google also offers Dataset Search, which lets you search across data repositories.

Open Data Worldwide

  1. Data.gov.hk - The datasets are provided by different government departments and public/private organizations.
  2. Data.gov - The home of the U.S. Government’s open data.
  3. The European Union Open Data Portal - Open data from European Union institutions.
  4. Data.gov.uk - Data from the UK government departments and agencies, public bodies, and local authorities.
  5. Canada Open Data - A pilot project with many government and geospatial datasets.
  6. The CIA World Factbook - Information on history, population, economy, government, infrastructure and military of 267 countries.
  7. Healthdata.gov - 125 years of US healthcare data including claim-level Medicare data, epidemiology and population statistics.
  8. Humanitarian Data Exchange (HDX) - From the U.N. Office for the Coordination of Humanitarian Affairs (OCHA); hosts data from the UN, Facebook public datasets, the Red Cross, and hundreds of other organizations.
  9. ILOSTAT - Data on labor, working conditions, prices, etc. from the International Labour Organization (ILO).
  10. UNICEF offers statistics on the situation of women and children worldwide.
  11. WHO | Global Health Observatory (GHO) data offers world hunger, health, and disease statistics.
  12. Amazon | AWS Public Data Sets - A huge resource of public data, including the 1000 Genomes Project (an attempt to build the most comprehensive database of human genetic information) and NASA’s database of satellite imagery of Earth.
  13. Google Public Data - Includes data from the World Development Indicators, the OECD, and the Human Development Indicators; mostly economic data from around the world.
  14. Google Trends - Statistics on search volume (as a proportion of total search) for any given term, since 2004.

Text and Data Mining (TDM) - HKUST Library Subscribed

To help HKUST students & staff who want to do text and data mining (TDM) with Library-subscribed material, we created a Library Guide that explains the TDM policies of different publishers and aggregators.

Data Sharing & Re-Use Education

  • NIH Data Sharing and Reuse Seminar Series
    • Recordings of the monthly seminar series from the National Institutes of Health (NIH) Office of Data Science Strategy, highlighting exemplars of data sharing & reuse.

Data Discovery & Citation Workshop

Library workshops on Data Discovery & Citation are offered every semester.

 

© HKUST Library, The Hong Kong University of Science and Technology. All Rights Reserved.