Data Curation & Management Toolkit: Share & Find Data outside HKUST

This guide is a brief introduction to Data Curation and the HKUST Data Management Services

DataSpace @ HKUST

DataSpace@HKUST is the data repository for the HKUST research community (faculty members and research postgraduate students) to manage their research data.

  • Store
  • Share
  • Organize
  • Preserve
  • Publish 

You can use it as a data repository or as a data workspace.

Interested?  Contact the DataSpace Team or read more.

Data Repositories Outside HKUST

These repositories are places where you can deposit your own data, or search for and use others' data.

Data Directories

General Sciences

  • Dryad
  • figshare - Accepts a wide variety of file types, such as figures, datasets, images, and videos.

Life Science

  • GenBank - The NIH genetic sequence database
  • Protein Data Bank - A worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids

Social Science

Earth Science

Computer Science

  • GitHub - Millions of developers use GitHub to build personal projects, support their businesses, and work together on open source technologies.
  • SourceForge - An open source community resource dedicated to helping open source projects be as successful as possible. It relies on community collaboration to provide a premier resource for open source software development and distribution.

Google also offers Dataset Search, which searches for datasets across repositories.

Open Data Worldwide

  1. Data.gov.hk - The datasets are provided by different government departments and public/private organizations.
  2. Common Spatial Data Infrastructure (HK Gov.)
  3. Data.gov - The home of the U.S. Government’s open data.
  4. The European Union Open Data Portal - Open data from European Union institutions.
  5. Data.gov.uk - Data from the UK government departments and agencies, public bodies, and local authorities.
  6. Canada Open Data is a pilot project with many government and geospatial datasets.
  7. Healthdata.gov - 125 years of US healthcare data including claim-level Medicare data, epidemiology and population statistics.
  8. Humanitarian Data Exchange (HDX) - From the U.N. Office for the Coordination of Humanitarian Affairs (OCHA); has data from the UN, Facebook public datasets, the Red Cross, and hundreds of other organizations.
  9. ILOStat - data on labor, working conditions, prices, etc. from the International Labour Organisation (ILO)
  10. UNICEF offers statistics on the situation of women and children worldwide.
  11. WHO | Global Health Observatory (GHO) data offers world hunger, health, and disease statistics.
  12. World Bank Open Data - Over 2,000 downloadable indicators from World Bank data sources. Browse data by countries, indicators, topics or data sources. Also has pre-formatted tables and reports.

  13. Amazon | AWS Public Data Sets - A huge resource of public data, including the 1000 Genomes Project (an attempt to build the most comprehensive database of human genetic information) and NASA’s database of satellite imagery of Earth.

  14. Google Public Data - Includes world development indicators, OECD data, and human development indicators, mostly related to economics and global development.
  15. Google Trends - Statistics on search volume (as a proportion of total search) for any given term, since 2004.
  16. Alternative Sources of US Gov. Data from Butler University

HKUST Subscribed Data & Statistics

  1. CDMNext - Millions of macroeconomic time series and statistics from 190+ countries. In-depth data is available for China & Indonesia.
  2. GlobalEDGE Database on International Business (DIBS) - Created & maintained by Michigan State University, it provides 5,000+ variables for 200+ countries from the mid-1990s to the present. Free for academic use. Registration required.
  3. National Statistical Offices -  Links to national statistical offices and bureaus across the world, maintained by the United Nations.
  4. Our World in Data - Data created and maintained by researchers at the University of Oxford (the scientific editors of the website content) and the NGO Global Change Data Lab.
  5. Passport - Volume & value sales statistics for consumer, industrial and services markets across 82 countries worldwide. It also provides a wide range of business data, including consumer market sizes, consumer lifestyles, and companies.
  6. Statista: the portal for statistics - Over a million market data series on 80,000 topics. Data can be exported in PowerPoint, Excel, PNG or PDF, and infographics can be explored by subject.

Text and Data Mining (TDM) of HKUST Subscribed Resources

To help HKUST students & staff who want to do text and data mining (TDM) with Library subscribed material, we created a Library Guide that explains the TDM policies of different publishers and aggregators:

Text and Data Mining (TDM) from HKUST Licensed Material - Guide

Gale Digital Scholarship Lab

  • A platform for text analysis, data mining, and data visualization. 
  • You can create & analyze content sets from licensed content in “Gale Primary Sources” collections.

Data Sharing & Re-Use Education

  • NIH Data Sharing and Reuse Seminar Series
    • Recordings of the monthly seminar series from the National Institutes of Health (NIH) Office of Data Science Strategy, highlighting exemplars of data sharing & reuse.

Data Discovery & Citation Workshop

Library workshops on Data Discovery & Citation are offered every semester.

 

© HKUST Library, The Hong Kong University of Science and Technology. All Rights Reserved.