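OpenTalks #15 | Open Science

Source: OpenScience

A Hitchhiker's Guide to Large, Open-Source Neuroimaging Datasets

As open science advances, a growing number of large neuroimaging datasets have been released, or are being released, for use by researchers around the world. Examples include the Human Connectome Project (HCP), the Adolescent Brain Cognitive Development (ABCD) study, and UK Biobank. With thousands to tens of thousands of participants, hundreds to thousands of measures, and large volumes of high-dimensional imaging data, however, traditional approaches to downloading and storing data fall short. Researchers have to work out how to organize and use these large datasets sensibly. Drawing on his recent article in Nature Human Behaviour, Corey will walk through the whole process: from downloading and storing the data, to understanding the data structure and measures, to reporting and sharing results.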

Time

Beijing time [UTC+8]: Saturday, January 23, 21:00

Central European Time [CET]: Saturday, January 23, 14:00

US Eastern Time [EST]: Saturday, January 23, 08:00

Duration: 1 hour

Zoom information

Meeting ID: 913 9401 0836

Format

45-minute talk, followed by 15 to 30 minutes of questions

Language

English

Speaker

Corey Horien, doctoral student
MD-PhD candidate at Yale University working with Professor R. Todd Constable. He studies individual differences in fMRI connectivity data in the developing brain.

Title: A hitchhiker’s guide to working with large, open-source neuroimaging datasets
Abstract

Large datasets are growing increasingly common in neuroimaging. Due to the simultaneous increasing popularity of open science, these state-of-the-art samples are more accessible than ever. Nevertheless, their sheer size presents a new set of challenges that might cause difficulties. In this talk, I will discuss tips for working with large datasets from the end user’s perspective. I will cover all aspects of the data lifecycle: from what to consider when downloading and storing the data to tips on how to become acquainted with a dataset one did not collect and what to share when communicating results. Emphasis is placed on practical solutions, as well as lessons our lab has learned working with large samples.


Host

Han Zhang

Research Fellow, Laboratory for Medical Image Data Sciences, National University of Singapore