Datasets: Finding Datasets

The open data movement calls for data that can be freely shared and used, and promotes greater accountability and transparency in research practice and communication.

  • Academic journals increasingly require authors to share the datasets associated with their research findings. 
  • Funding bodies may also require that grant recipients openly publish datasets from their funded studies.
  • Many governments worldwide have adopted open data practices and allow the reuse of data collected by government agencies.

Using datasets

  • Datasets as part of research projects may include quantitative data in the form of spreadsheets, tables or databases, or qualitative data, for example, notes, videos or images.
  • Datasets allow you to analyze or replicate the results of a study, or may be used to support new research questions and hypotheses.
  • Available datasets should be evaluated in terms of authority or provenance of the source (for example, government, university, or organization); comprehensiveness of metadata (description of the data); and size of data relevant to a new study.
  • Datasets should only be re-used according to the conditions set by the copyright owner. Re-use may require permission from the data owner or approval from a research body.
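As a rough illustration of the evaluation step described above, the following Python sketch reports the size and completeness of a tabular dataset before it is considered for a new study. The function name `summarize_csv` and the sample data are hypothetical and serve only as an example. Authority, provenance, and licence conditions still need to be checked by reading the dataset's documentation.

```python
import csv
import io


def summarize_csv(text):
    """Return row count, column names, and per-column missing-value counts."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            # Treat absent fields and blank strings as missing values.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": missing}


# Hypothetical sample data, for illustration only.
SAMPLE = "station,year,rainfall_mm\nA,2019,812\nB,2019,\nC,2020,640\n"

if __name__ == "__main__":
    print(summarize_csv(SAMPLE))
```

A summary like this gives a first impression of whether a dataset is large and complete enough to be relevant. It does not replace a careful reading of the accompanying metadata.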

Dataset repositories

  • Datasets are hosted in different types of repositories, including university, research organization and/or discipline-specific repositories.

Figshare allows researchers to share datasets and other research outputs. It also assigns DOIs.

Dataverse is an open source web repository application, developed at Harvard University, for archiving datasets.

Data.gov is the home of the U.S. Government’s open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, and design data visualizations.

Zenodo is a repository hosted by CERN that allows researchers from all fields to upload data in all formats for private use or public sharing.

How to cite Datasets

IEEE Format
Online

URL

G.R. Brakenridge, Global Active Archive of Large Flood Events. (September 2, 2019). Distributed by Dartmouth Flood Observatory, University of Colorado. Accessed: November 21, 2020. [Online]. Available: http://floodobservatory.colorado.edu/Archives/index.html

DOI

R. Knutti, IPCC Working Group I AR5 Snapshot: The RCP85 Experiment. (September 20, 2014). Distributed by World Data Center for Climate (WDCC) at DKRZ. doi: 10.1594/WDCC/ETHr8.
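The pattern shared by the two examples above can be sketched as a small Python helper. This is a minimal sketch that mirrors only the field order shown on this page (author, title, date, distributor, then either a DOI or an access date and URL). The function name `ieee_dataset_citation` and its parameters are hypothetical, and the helper is not a complete implementation of the IEEE reference style.

```python
def ieee_dataset_citation(author, title, date, distributor,
                          url=None, accessed=None, doi=None):
    """Build a dataset citation following the pattern of the examples above."""
    parts = [f"{author}, {title}. ({date}). Distributed by {distributor}."]
    if doi:
        # A DOI is a persistent identifier, so no access date or URL is added.
        parts.append(f"doi: {doi}.")
    elif url:
        if accessed:
            parts.append(f"Accessed: {accessed}.")
        parts.append(f"[Online]. Available: {url}")
    return " ".join(parts)


if __name__ == "__main__":
    # Reproduces the URL example given above.
    print(ieee_dataset_citation(
        author="G.R. Brakenridge",
        title="Global Active Archive of Large Flood Events",
        date="September 2, 2019",
        distributor="Dartmouth Flood Observatory, University of Colorado",
        url="http://floodobservatory.colorado.edu/Archives/index.html",
        accessed="November 21, 2020",
    ))
```

Passing `doi=` instead of `url=` and `accessed=` produces the shorter DOI form. Always check the result against the current IEEE reference guide or your instructor's requirements.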

Linked Research Platforms

Finding Datasets and Dataset Repositories

Dataset search engines

Google Dataset Search

Google Dataset Search is a search engine for datasets. It searches data repositories across the Web, finding datasets with a simple keyword search. 

Microsoft Research Open Data

A collection of free datasets from Microsoft Research to advance state-of-the-art research in areas such as natural language processing, computer vision, and domain-specific sciences. Datasets can be downloaded or copied directly to a cloud-based Data Science Virtual Machine.

Dataset registries

re3data.org

The Registry of Research Data Repositories includes information on more than 2000 research data repositories worldwide.

Open Access Directory (OAD)

The OAD maintains a list of discipline-specific data repositories.