Data Workflows Tutorial – CNS 2023, Leipzig


title: "CNS2023 Data Workflows Workshop"
author: "Michael Denker, Moritz Kern, Thomas Wachtler and Reema Gupta"

date: 2023-07-15


This repository contains (or will contain) all files required for the CNS*2023 tutorial "T08: Using open tools to build efficient workflows for data access, management and analysis".

Schedule

The workshop will take place on the 15th of July 2023 in two parts:

Overview

Session I: 9:00 -- 10:10 am

9:00-9:10 Welcome, Introduction
9:10-9:30 Introduction to GIN
9:30-9:40 Introduction to the Dataset
9:40-10:10 Primer on Neo I

----- 10:10-10:40 COFFEE BREAK -----

Session II: 10:40 am -- 12:10 pm

10:40-11:00 Primer on Neo II
11:00-11:30 Data Analysis with Elephant
11:30-12:10 Data Organization and Storage with NIX

Requirements

To benefit from the workshop you need to have some experience with the Python programming language. To follow the tutorial, you may use any one of the following three options:

1. Working offline

Before attending the workshop, please make sure that the machine you are working on can run Jupyter notebooks and install Python packages.

Either download the contents of the CNS2023-Data-Workflows repository via the web or use the command line to clone the repository using git clone https://gin.g-node.org/CNS2023-Leipzig/CNS2023-Data-Workflows.git.

To make sure your machine is set up for the workshop, please install the Python requirements by running pip install -r requirements.txt, then start the requirements notebook (jupyter notebook requirements.ipynb) and run it before the workshop. We recommend working in a dedicated Python virtual environment, for example one created with Anaconda, to make sure you are running the workshop in a clean Python environment.
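
As a quick sanity check after installing the requirements, a short script along the following lines can report whether the main packages are importable. This is only a sketch: the package list below is an assumption based on the tools named in this README, and the function name check_packages is hypothetical. The authoritative list is requirements.txt, and the authoritative check is the requirements notebook.

```python
import importlib.util

# Assumed package list for illustration only; the authoritative list is
# the repository's requirements.txt, which is not reproduced here.
ASSUMED_PACKAGES = ["neo", "elephant", "nixio", "odml"]


def check_packages(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one status line per package without importing it.
    for name, found in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can usually be fixed by re-running pip install -r requirements.txt inside the environment you intend to use for the workshop.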

2. EBRAINS Collaboratory

To interactively follow the tutorials online, we suggest creating a free EBRAINS account (https://www.ebrains.eu/page/sign-up) in advance.

3. Open Source Brain

TODO

Dataset Used

The Reach-2-Grasp experiment

Full data manuscript and dataset

Brochier, T., Zehl, L., Hao, Y., Duret, M., Sprenger, J., Denker, M., Grün, S. & Riehle, A. (2018). Massively parallel recordings in macaque motor cortex during an instructed delayed reach-to-grasp task, Scientific Data, 5, 180055. http://doi.org/10.1038/sdata.2018.55
https://gin.g-node.org/INT/multielectrode_grasp

Tutorial Abstract

Neuroscientists today face challenges in managing the growing volume and complexity of data generated through rapid technological and methodological advancements and sophisticated experimental paradigms. Data management tools and methods provide indispensable solutions for researchers to efficiently handle, organize, and analyze datasets, facilitating model validation, refinement, and simulation, while fostering collaborations. This tutorial presents examples combining multiple tools synergistically into a complete digitized workflow, to help researchers manage and control data and analysis processes.

odML (https://g-node.org/odml) is an open, lightweight and flexible format that provides a common schema (with implementations in XML, JSON, YAML) to collect, organize and share metadata in a human- and machine-readable way.
NIX (https://g-node.org/nix) is a lean data model and file format for storing fully annotated scientific datasets, i.e. the data together with rich metadata (odML) and their relations in a consistent, comprehensive format.
GIN (https://gin.g-node.org) is a platform for version-controlled (git and git-annex) data management and collaboration. It supports any file type and folder structure, provides both web and command-line access, offers an option for local installation, and includes services such as format validation and data publication (DOI).
Neo (http://neuralensemble.org/neo) provides programmatic data objects for working with and representing electrophysiological data, and can read data from many proprietary formats. In combination with NIX, Neo makes electrophysiological data interoperable with generic analysis scripts, tools and services.
Elephant (https://python-elephant.org) provides a large portfolio of standard and advanced methods for analyzing data from neuronal spike trains or time series data, such as LFPs. The Neo data model makes them easily accessible to scientists and applications.
Alpaca (https://alpaca-prov.readthedocs.io) enables simple capture of human-readable provenance of the data processing workflow.
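
To make the idea of human- and machine-readable metadata more concrete, the sketch below models metadata as named properties grouped into nested sections and serializes them to JSON. It uses only the Python standard library and is not the odml package or its file schema. The class names, the helper functions and all values are hypothetical placeholders chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Property:
    """A single named metadata value with an optional unit."""
    name: str
    value: object
    unit: str = ""


@dataclass
class Section:
    """A named group of properties that may contain further sections."""
    name: str
    properties: list = field(default_factory=list)
    subsections: list = field(default_factory=list)


def build_example():
    """Build a small metadata tree filled with placeholder values."""
    recording = Section(
        name="recording",
        properties=[Property("sampling_rate", 1000.0, "Hz")],
    )
    return Section(
        name="session",
        properties=[Property("experimenter", "placeholder name")],
        subsections=[recording],
    )


def to_json(section):
    """Serialize the whole tree to indented JSON text."""
    return json.dumps(asdict(section), indent=2)


if __name__ == "__main__":
    print(to_json(build_example()))
```

The same tree can be read back with json.loads, which is what makes such metadata usable by scripts as well as by people. In the tutorial itself, the dedicated odML and NIX tools take this role.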

Background reading:

Grewe, J., Wachtler, T., Benda, J., 2011. A Bottom-up Approach to Data Annotation in Neurophysiology. Frontiers in Neuroinformatics 5, 16. https://doi.org/10.3389/fninf.2011.00016
Zehl, L., Jaillet, F., Stoewer, A., Grewe, J., Sobolev, A., Wachtler, T., Brochier, T.G., Riehle, A., Denker, M., Grün, S., 2016. Handling Metadata in a Neurophysiology Laboratory. Frontiers in Neuroinformatics 10, 26. https://doi.org/10.3389/fninf.2016.00026
Sprenger, J., Zehl, L., Pick, J., Sonntag, M., Grewe, J., Wachtler, T., Grün, S., Denker, M., 2019. odMLtables: A User-Friendly Approach for Managing Metadata of Neurophysiological Experiments. Front. Neuroinform. 13, 62. https://doi.org/10.3389/fninf.2019.00062
Brochier, T., Zehl, L., Hao, Y., Duret, M., Sprenger, J., Denker, M., Grün, S., Riehle, A., 2018. Massively parallel recordings in macaque motor cortex during an instructed delayed reach-to-grasp task. Scientific Data 5, 180055. https://doi.org/10.1038/sdata.2018.55
Denker, M., Grün, S., Wachtler, T., Scherberger, H., 2021. Reproducibility and efficiency in handling complex neurophysiological data. Neuroforum 27, 27–34. https://doi.org/10.1515/nf-2020-0041
