OpenNeuroPipeline
- class brainsets.utils.openneuro.OpenNeuroPipeline(raw_dir, processed_dir, args, tracker_handle=None, download_only=False)
Bases: BrainsetPipeline, ABC
Abstract base class for OpenNeuro dataset pipelines.
This class provides foundational tools and conventions for preprocessing and handling OpenNeuro datasets within the Brainsets framework. It is designed to be subclassed for specific datasets and supports both EEG and iEEG modalities.
- Attributes (to be defined by subclasses):
dataset_id: Identifier for the OpenNeuro dataset (e.g., “ds005555”).
brainset_id: Unique local identifier for the brainset.
origin_version: Version string corresponding to the raw source dataset.
derived_version: Version or tag indicating the processing version of the derived data.
description: Optional textual description of the dataset.
modality: Data modality for this pipeline. Must be overridden by subclasses.
- Customization points:
This class supports and encourages dataset-specific customizations via:
CHANNEL_NAME_REMAPPING: Map original to standardized channel names.
TYPE_CHANNELS_REMAPPING: Map channel types to specific channel names.
IGNORE_CHANNELS: List channels to exclude from processing.
These can be set as class attributes or managed dynamically by overriding the corresponding methods, such as get_channel_name_remapping() and get_type_channels_remapping().
The process_common() method implements the standard steps shared by all OpenNeuro datasets, providing a consistent entry point for dataset processing. Subclasses may extend or override the process() method to implement dataset-specific processing logic.
See [Creating an OpenNeuro Pipeline](https://brainsets.readthedocs.io/en/latest/concepts/openneuro_pipeline.html) in the official brainsets documentation for the complete guide to building OpenNeuro pipelines.
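As a rough illustration of the attributes and customization points listed above, a subclass might be declared as follows. The base class here is a local stand-in so the snippet runs without brainsets installed; a real pipeline would subclass brainsets.utils.openneuro.OpenNeuroPipeline instead. The brainset_id, version strings, description, and channel names are hypothetical values, not taken from any actual dataset.

```python
from typing import Literal, Optional


class OpenNeuroPipelineStandIn:
    """Hypothetical stand-in for OpenNeuroPipeline (illustration only)."""

    modality: Literal["eeg", "ieeg"]
    CHANNEL_NAME_REMAPPING: Optional[dict] = None
    TYPE_CHANNELS_REMAPPING: Optional[dict] = None
    IGNORE_CHANNELS: Optional[list] = None


class ExampleDatasetPipeline(OpenNeuroPipelineStandIn):
    # Attributes that subclasses are expected to define.
    dataset_id = "ds005555"
    brainset_id = "example_brainset"  # hypothetical identifier
    origin_version = "1.0.0"          # hypothetical version of the raw data
    derived_version = "1.0.0"         # hypothetical version of the processed data
    description = "Example EEG dataset"
    modality = "eeg"

    # Optional customization points (hypothetical channel names).
    CHANNEL_NAME_REMAPPING = {"EEG Fp1": "Fp1", "EEG Fp2": "Fp2"}
    TYPE_CHANNELS_REMAPPING = {"eog": ["EOG1"]}
    IGNORE_CHANNELS = ["Status"]
```

Setting the remappings as class attributes suits datasets where every recording shares one channel layout; the overridable methods are the better fit when layouts differ between recordings.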
- parser: ArgumentParser | None = ArgumentParser(prog='__main__.py', usage=None, description=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)
Argument parser for common OpenNeuro pipeline flags.
- modality: Literal['eeg', 'ieeg']
Data modality for this pipeline. Must be overridden by subclasses.
- origin_version: str
Version of the original data. Must be specified by the author of each pipeline.
- derived_version: str
Version of the processed data. Must be specified by the author of each pipeline.
- CHANNEL_NAME_REMAPPING: dict[str, str] | None = None
Optional dict mapping original channel name to new standardized name.
For more complex configurations (e.g., per-recording mappings), override get_channel_name_remapping() instead.
- TYPE_CHANNELS_REMAPPING: dict[str, list[str]] | None = None
Optional dict mapping channel types to lists of channel names.
For more complex configurations (e.g., per-recording mappings), override get_type_channels_remapping() instead.
- IGNORE_CHANNELS: list[str] | None = None
Optional list of channel names to ignore.
Channel names should be specified as they appear in the original namespace of the raw object (i.e., prior to any remapping or type changes).
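The interaction between the three channel settings can be sketched in plain Python. This is not the library's implementation: the helper name apply_channel_config is hypothetical, and the order of operations (ignore, then rename, then assign types using the renamed names) is an assumption. The only detail taken from the text above is that IGNORE_CHANNELS is matched against the original channel names.

```python
def apply_channel_config(channel_names, name_remapping=None,
                         type_channels=None, ignore_channels=None):
    """Return (kept_channel_names, channel_types) for a list of raw channel names.

    Hypothetical helper for illustration; the step order is an assumption.
    """
    ignore = set(ignore_channels or [])
    name_remapping = name_remapping or {}

    # 1. Drop ignored channels, matching on the ORIGINAL (pre-remapping) names.
    kept = [name for name in channel_names if name not in ignore]

    # 2. Rename the remaining channels; unmapped names pass through unchanged.
    renamed = [name_remapping.get(name, name) for name in kept]

    # 3. Invert {type: [names]} into {name: type} and keep entries that exist.
    #    Assumption: type lists refer to the renamed channel names.
    name_to_type = {
        name: ch_type
        for ch_type, names in (type_channels or {}).items()
        for name in names
    }
    channel_types = {name: name_to_type[name] for name in renamed if name in name_to_type}
    return renamed, channel_types


names, types = apply_channel_config(
    ["EEG Fp1", "EEG Fp2", "EOG1", "Status"],
    name_remapping={"EEG Fp1": "Fp1", "EEG Fp2": "Fp2"},
    type_channels={"eog": ["EOG1"]},
    ignore_channels=["Status"],
)
```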
- static validate_dataset_id(dataset_id)
Validate OpenNeuro dataset identifier format.
OpenNeuro dataset IDs follow the format ‘ds’ followed by exactly 6 digits, where the numeric portion ranges from 000001 to 009999.
- Parameters:
dataset_id (str) – The dataset identifier in strict format:
- Must be lowercase ‘ds’ followed by exactly 6 digits.
- Numeric portion must be between 000001 and 009999.
- Raises:
ValueError – If the dataset ID format is invalid, does not match strict format, or the numeric part is outside the valid range.
- Return type:
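The rule described above can be expressed as a regular expression plus a range check. The following is a minimal sketch of that rule, not the library's code: the function name validate_dataset_id_sketch is hypothetical, and returning the validated ID is an assumption, since the return type is not stated here.

```python
import re

# Lowercase "ds" followed by exactly six ASCII digits.
_DATASET_ID_RE = re.compile(r"ds([0-9]{6})")


def validate_dataset_id_sketch(dataset_id: str) -> str:
    """Validate an OpenNeuro-style dataset ID and return it unchanged.

    Hypothetical re-implementation of the documented rule.
    """
    match = _DATASET_ID_RE.fullmatch(dataset_id)
    if match is None:
        raise ValueError(
            f"Invalid dataset ID {dataset_id!r}: expected 'ds' followed by exactly 6 digits"
        )
    number = int(match.group(1))
    if not 1 <= number <= 9999:
        raise ValueError(
            f"Invalid dataset ID {dataset_id!r}: numeric part must be between 000001 and 009999"
        )
    return dataset_id
```

Under this sketch, "ds005555" passes, while "DS005555" (uppercase), "ds12345" (five digits), "ds000000", and "ds010000" (out of range) all raise ValueError.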
- classmethod get_manifest(raw_dir, args)
Generate a manifest DataFrame by discovering recordings from OpenNeuro.
This implementation queries OpenNeuro S3 and parses BIDS-compliant filenames to discover recordings for the pipeline modality.
- Parameters:
- Returns:
DataFrame with columns:
subject_id: Subject identifier (e.g., ‘sub-01’)
recording_id: Recording identifier (index)
s3_url: S3 URL for downloading
- Return type:
DataFrame
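To make the discovery step concrete, the sketch below turns a list of object keys into manifest rows by parsing BIDS-style filenames. It is an illustration under several assumptions rather than the library's logic: the helper build_manifest_rows is hypothetical, the bucket name and the set of file extensions are assumed defaults, the recording_id is taken to be the filename without its modality suffix, and rows are returned as plain dicts instead of a DataFrame so the snippet needs only the standard library.

```python
import re
from pathlib import PurePosixPath

# A BIDS filename starts with the subject entity, e.g. "sub-01_task-rest_eeg.edf".
_SUBJECT_RE = re.compile(r"^(sub-[A-Za-z0-9]+)_")


def build_manifest_rows(s3_keys, modality, bucket="openneuro.org",
                        extensions=(".edf", ".bdf", ".vhdr", ".set")):
    """Return one dict per recording whose filename ends in `_<modality>`."""
    rows = []
    for key in sorted(s3_keys):
        path = PurePosixPath(key)
        if path.suffix.lower() not in extensions:
            continue  # sidecar files such as .tsv or .json are skipped
        stem = path.stem  # e.g. "sub-01_task-rest_eeg"
        if not stem.endswith(f"_{modality}"):
            continue  # a different modality
        match = _SUBJECT_RE.match(stem)
        if match is None:
            continue  # not a subject-level BIDS file
        rows.append({
            "recording_id": stem[: -len(modality) - 1],  # drop the "_<modality>" suffix
            "subject_id": match.group(1),
            "s3_url": f"s3://{bucket}/{key}",
        })
    return rows


example_keys = [
    "ds005555/participants.tsv",
    "ds005555/sub-01/eeg/sub-01_task-rest_eeg.edf",
    "ds005555/sub-01/eeg/sub-01_task-rest_channels.tsv",
    "ds005555/sub-02/ieeg/sub-02_task-rest_ieeg.edf",
]
rows = build_manifest_rows(example_keys, modality="eeg")
```

With these example keys, only the sub-01 EEG recording is kept; the iEEG file and the sidecar tables are filtered out.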
- download(manifest_item)
Download data for a single recording from OpenNeuro S3.
- Parameters:
manifest_item – A single row of the manifest
- Returns:
Series containing subject_id, recording_id, s3_url, latest_snapshot_tag, age, sex, and species.
- Return type:
Series
- process_common(download_output)
Process data files and create a Data object.
This method handles common OpenNeuro processing tasks:
1. Loads BIDS-structured data files using MNE-BIDS
2. Extracts metadata (subject, session, device, brainset descriptions)
3. Extracts signal and channel information
4. Creates a Data object
- process(download_output)
Process and save the dataset.
Default implementation calls process_common() and persists the result. Subclasses can override to add dataset-specific processing.
- Parameters:
download_output (Series) – Series returned by download()
- Return type:
- get_channel_name_remapping(recording_id=None)
Return channel name remapping for a given recording.
Override this method to provide per-recording channel name remappings. The default implementation returns the class-level CHANNEL_NAME_REMAPPING attribute.
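A per-recording override might look like the sketch below. The base class is a local stand-in that mirrors only the documented default behavior (returning the class-level attribute); the recording IDs, channel names, and the _PER_RECORDING_REMAPPING attribute are hypothetical.

```python
class PipelineStandIn:
    """Hypothetical stand-in mirroring the documented default behavior."""

    CHANNEL_NAME_REMAPPING = None

    def get_channel_name_remapping(self, recording_id=None):
        # Default: return the class-level mapping regardless of recording.
        return self.CHANNEL_NAME_REMAPPING


class ExamplePipeline(PipelineStandIn):
    # Mapping used for most recordings.
    CHANNEL_NAME_REMAPPING = {"EEG Fp1": "Fp1"}

    # Hypothetical exceptions: recordings acquired with different labels.
    _PER_RECORDING_REMAPPING = {
        "sub-02_task-rest": {"FP1": "Fp1"},
    }

    def get_channel_name_remapping(self, recording_id=None):
        # Use the recording-specific mapping when one exists,
        # otherwise fall back to the class-level default.
        default = super().get_channel_name_remapping(recording_id)
        return self._PER_RECORDING_REMAPPING.get(recording_id, default)


pipeline = ExamplePipeline()
special = pipeline.get_channel_name_remapping("sub-02_task-rest")
fallback = pipeline.get_channel_name_remapping("sub-01_task-rest")
```

Falling back to super() keeps the class attribute as the single source for the common case, so only the exceptional recordings need explicit entries.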