BIDS for Spectroscopy

wclarke · May 27, 2020, 5:54pm

Hi Everyone,

I wanted to make a topic here about the proposed MRS BIDS format extension. This is so that we could have a place for discussion and hopefully move the proposal forward as a community. I’ll give a brief overview in this post and then add some of my thoughts in another post below.

For those unfamiliar with BIDS it stands for Brain Imaging Data Structure. BIDS is a specification for arranging brain imaging data measured from multiple formats and modalities in a certain structure in a file system. A dataset adhering to BIDS is immediately more easily understood by other people, software and maybe even you. Much more information is available on the bids website.

BIDS does not currently include any specification for spectroscopy data, focussing mainly on (functional) MRI. However, several modalities (MEG and EEG for example) have successfully been integrated into BIDS using the BIDS community extension proposals. Such a proposal has been made for MRS, originally by Dickson Wong, which can be found here.

The proposal needs to cover a few areas:

A file format for MRS data
A format for any meta data
A specification for both required and optional meta data.
File naming conventions

Currently the proposal contains suggestions for these fields for SVS (edited and non-edited) and proposes using a combination of tab separated files and JSON files. There has been recent discussion on this document about the current suggestions, but it lacks direction and is difficult to monitor. I hope we can move it along here. More on my thoughts below.

wclarke · May 27, 2020, 6:13pm

My thoughts on this are as follows:

We should not limit ourselves to MRS. We are seeing MRSI move on in great strides, and a plethora of new editing techniques. Functional and diffusion weighted MRS is also on the rise. We should aim to encapsulate all of these in any proposal.
We should use this opportunity to try to come up with a standard file format for spectroscopy data. There are a lot of great tools out there for processing and fitting our data, and each of them has reams of code just for dealing with the myriad of proprietary data formats we get off our scanners. Without a standard format modularity of tools is out of reach, and we can’t achieve the success of e.g. FSL, SPM and AFNI.
The above requires a flexible file format which can encode up to three spatial dimensions, a time dimension and some additional flexible dimensions. We should capture orientation information and limited meta data.
We should try to align closely with other (neuro)imaging formats. If we come up with our own special format we will continue to isolate spectroscopy and not be able to leverage existing tools.

For the above reasons I would propose trying to use the NIFTI format. Meta data can be encoded in a JSON “sidecar”. This is already the standard in BIDS and for many neuroimaging tools. It has the ability to contain 3D spatial + time domain data + a few other dimensions.
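As a rough illustration of the NIfTI-plus-sidecar idea, the sketch below labels the axes of a hypothetical MRS array and serialises a matching JSON sidecar. The axis ordering, the key names (`TransmitterFrequency`, `Nucleus`, `Layout`) and the example sizes are assumptions made up for illustration, not part of any agreed specification. The only hard fact relied on is that NIfTI-1 stores at most seven dimensions.

```python
import json

NIFTI_MAX_DIMS = 7  # NIfTI-1 stores at most seven dimensions


def describe_mrs_array(shape):
    """Label each axis of a hypothetical MRS array: 3 spatial, 1 time, rest flexible."""
    if not 4 <= len(shape) <= NIFTI_MAX_DIMS:
        raise ValueError("expected between 4 and 7 dimensions")
    labels = ["x", "y", "z", "time"]
    labels += [f"extra_{i}" for i in range(len(shape) - 4)]
    return dict(zip(labels, shape))


# Single voxel, 2048 complex time points, 32 coils, 64 transients (made-up sizes).
shape = (1, 1, 1, 2048, 32, 64)
layout = describe_mrs_array(shape)

sidecar = {
    "TransmitterFrequency": 123.2,  # MHz; hypothetical key name
    "Nucleus": "1H",                # hypothetical key name
    "Layout": layout,               # hypothetical key name
}
sidecar_text = json.dumps(sidecar, indent=2)
print(sidecar_text)
```

In practice the array itself would live in the NIfTI file and only the descriptive fields would go into the sidecar.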

I have put a little effort into seeing if NIFTI would work for MRS. Paul McCarthy kindly made some adjustments to fsleyes so that complex spectroscopy data could be displayed properly in it (see attached).

I have also got a (very much beta stage) tool for conversion of various spectroscopy formats to NIFTI available on Github and Pypi (https://github.com/wexeee/spec2nii). Currently this tool is focussed on Siemens formats (what I have available), but I would be keen to extend it to other formats if possible.

mmikkel · May 29, 2020, 4:25am

Hi @wclarke, thanks for starting a conversation about the BIDS Extension Proposal for MRS in this forum. I and a few others (including Dickson Wong and some of the BIDS management team) who over a long period have been working on this (very slowly, unfortunately!) had a brief email exchange in January about reigniting this initiative.

I think this is a great medium to make the community aware of the Proposal and to garner input and commentary about what it should include. I agree wholeheartedly with developing a standard file format for MRS data. Another option is the ISMRM Raw Data format.

I’ll reach out to the others I’ve been communicating with to bring them to this forum so we can continue the dialogue in earnest!

wclarke · May 29, 2020, 2:01pm

Hi @mmikkel.

That’s great that there is already a small group interested and making some progress. Please do tell them about this forum. Hopefully we can get something going on here.

I completely agree that ISMRMRD is another potentially suitable format. Most of the effort in converting to either NIfTI or ISMRMRD I think will be the same, and then moving between them would be fairly easy.

nwduncan · June 2, 2020, 7:30am

I’ve been using the BIDS format for my fMRI data and can vouch for it being worthwhile. I’m probably not much help from the technical side but would be happy to contribute to the development of the MRS standard however I can.

martin · June 3, 2020, 9:26am

Hi @mmikkel,

I actually played around with ISMRMRD for storing MRS data a while back for this purpose (probably when the draft BIDS MRS google document was made public) and found the libraries quite difficult to work with. Not due to bad design, but rather the format is quite complex due to its flexible design. NIFTI on the other hand was very quick to get started with, and has the advantage of being the standard for most other MR modalities - so inherently BIDS-friendly. I agree with Will that NIFTI + JSON is probably the way to go here.

Thanks for reaching out to Dickson Wong and the BIDS team. This was on my list of things to do in order to try and figure out how to help move this effort forward. Please let us know when you hear back. Perhaps we should try and organise an online meeting at some point to get the interested parties together?

Martin

mmikkel · June 3, 2020, 4:42pm

It sounds like NIFTI is the way to go then in terms of a standard data format! I admit I have almost zero understanding of the nuts and bolts of NIFTI, so I’d love to hear your insights @martin and @wclarke when it comes to how this would look for raw MRS data.

I am very much on board with an online meeting. My understanding from Dickson is that he has been in touch with Alex Lin and @bsoher about moving all this forward. I’d be happy to organize a Zoom meeting or the like for interested parties.

martin · June 3, 2020, 6:07pm

There is a fairly long thread describing some options (towards the end) proposing how to encode MRS data points as NIFTI on page 4 of:

I found the following as a useful reference to get up to scratch with the NIFTI standard:

Thanks for offering to organise a meeting, Zoom is fine for me.

mmikkel · June 3, 2020, 6:19pm

Great; thanks! I’ll have a look through these.

bsoher · June 4, 2020, 3:37pm

Hi Everyone,

I had the honor of sitting on Dickson’s PhD defense last year, and chatted a bit with him about his BIDS work. It would be great to garner a community consensus on how to store data, organize projects and share information generally.

I’ve been wondering though a bit about the process for all this and thought I’d post some thoughts here to start contributing, and maybe gain clarity. Please correct any of my statements below if they are incorrect/unclear.

NIFTI is a storage format for data, while BIDS is an overall way to organize a scientific project that may have multiple types and formats of data and resources.
What we are proposing here is to pick a consensus format in which MRS data will be stored, and add information to the BIDS schema that describes how to access MRS data via that format as part of a BIDS organized project.
In reality, BIDS (generally, OR the extension that we are doing now) could be extended to handle a variety of MRS data formats, such as the DICOM Spectroscopy Information Object Definition (IOD). It would just (maybe) be easier if we only had to deal with one consensus, though.
Whatever format(s) we propose to support, ideally we would provide a standard code library in multiple languages to parse data from various (manufacturer) formats to the agreed upon standard BIDS MRS format. This would be a ‘do it once and everyone uses it’ sort of time saver. But, there would be upkeep through the years as manufacturers, or labs, change formats, etc.

So, off on a tangent … the ‘need’ that BIDS addresses has been floating around now for quite a while. I can think about a couple of examples just off the top of my head. I’m sure other groups have existing solutions, too.

The work done by Andrew Maudsley in his MIDAS EPSI project since 2004. They have an entire XML schema set up to organize multi-modality MR projects that include EPSI, anatomic MRI, diffusion MRI, QSM, and other data sets taken from the same subject. It’s organized along typical MR scanner terminology: tags for <project>, <subject>, <study>, <series>, and then lots of sub-tags for images, processed result, imported results etc. Here’s a link to one of their (many) PDFs describing the schema.

http://mrir.med.miami.edu:8000/midas/chrome/site/private/Documents/Help_files/MIDAS%20Project%20Description.pdf

XCEDE - XML-based Clinical and Experimental Data Exchange is an XML schema developed by members of the Biomedical Informatics Research Network. It provides an extensive hierarchy for storing, describing and documenting the data generated by scientific studies. Here’s a couple of links to the original publication and github repository resources (which looks inactive for the last 8 years? - or maybe just stable?)

https://www.researchgate.net/publication/226075581_XCEDE_An_extensible_schema_for_biomedical_data

https://github.com/incf-nidash/XCEDE/blob/master/README.md

## XML-based Clinical and Experimental Data Exchange 

The XCEDE schema provides an extensive metadata hierarchy for describing and documenting research and clinical studies. The schema organizes information into five general hierarchical levels: a complete project; studies within a project; subjects involved in the studies; visits for each of the subjects; the full description of the subject's participation during each visit

Each of these sub-schemas is composed of information relevant to that aspect of an experiment and can be stored in separate XML files or spliced into one large file allowing for the XML data to be stored in a hierarchical directory structure along with the primary data. Each sub-schema also allows for the storage of data provenance information allowing for a traceable record of processing and/or changes to the underlying data. Additionally, the sub-schemas contain support for derived statistical data in the form of human imaging activation maps and simple statistical value lists.

XCEDE was originally designed in the context of neuroimaging studies and complements the Biomedical Informatics Research Network (BIRN) Human Imaging Database, an extensible database and intuitive web-based user interface for the management, discovery, retrieval, and analysis of clinical and brain imaging data. This close coupling allows for an interchangeable source-sink relationship between the database and the XML files, which can be used for the import/export of data to/from the database, the standardized transport and interchange of experimental data, the local storage of experimental information within data collections, and human and machine readable description of the actual data.

#### Latest release
* XCEDE 2.0: [schema](https://github.com/incf-nidash/XCEDE/blob/master/xcede-2.0-core.xsd) 

#### User Resources
* [XCEDE Manual](https://github.com/incf-nidash/XCEDE/blob/master/manual/manual_full.pdf)

#### Developer Resources (build applications using NIDM)
* [Schema Documentation](https://github.com/incf-nidash/XCEDE/blob/master/documentation/xcede-2.0-core.xsd.html)
* [Examples](https://github.com/incf-nidash/XCEDE/tree/master/examples)
* [Tools](https://github.com/incf-nidash/XCEDE/tree/master/tools)

https://github.com/incf-nidash/XCEDE/blob/master/manual/manual_full.pdf

So, as we begin this work with BIDS, I think we need to consider what we can learn from these historical (hysterical?) examples.

Yes, for some reason I am organizing myself with a lot of lists today.
I have not extensively compared the BIDS structure to the XCEDE or MIDAS hierarchies, but they are based on similar project organization needs. So, likely there is significant overlap, even if they are called different names. We should delve into these a bit to see what they have learned about handling MRS or MR data generally, so we don’t have to relearn it on our own.
Really, the only long term format that has had any longevity has been DICOM. I’m sure that has been due to adoption as an industry standard. Painfully complicated, but it has lasted.
It’s probably good that we are hanging an MRS extension onto an existing schema with a large supporting community. This seems to be the only way that any code/schema project has longevity.
Likely, we should aim from the start to create a BIDS extension for MRS that does not preclude some sort of industry adoption at a later date. And maybe include ‘things’ that promote long term community/industry adoption.

Alright, enough generalities. Here’s what I’m thinking about NIFTI as a format.

I like how well it handles multi-dimensional data. Very important for MRS data, whether it be SVS, SI, diffusion MRS, longitudinal MRS series, edited MRS, multi-D spectral acquisitions, whatever. I think that NIFTI can store those 1D/2D/3D/4D/5D etc. arrays.
However, the standard NIFTI header is NOT sufficient to describe what all those dimensions are. And the supporting parameters for voxel/slice/slab location, orientation, region of interest, etc.
There is the NIFTI ‘header extensions’ protocol that allows you to stuff any sort of data ‘blob’ you want (XML text string, JSON text string, whatever) after the standard header. This would totally work to allow us to describe everything we wanted to about the data stored in the NIFTI data array.
Using these ‘header extensions’ has not had widespread adoption by community or in coding libraries.
I think that BIDS generally uses an external JSON file to describe what is happening in the project. But, I would not want to depend on an external file to allow me to access my MRS data accurately from a NIFTI file. It’s like going back to the bad old days when, if the external header was lost, the binary data was useless.
I think that at a minimum, we could create a sub-entry for the BIDS JSON file for access to the project in general, but we should also stick that sub-entry into a ‘header extension’ within the NIFTI data file. That would allow the file to be accessed / shared even if the external JSON descriptor file was not available.
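To make the ‘header extension’ idea concrete, here is a minimal sketch of packing a JSON text string into one NIfTI-1 extension block. In a file, such blocks follow the 348-byte header and a 4-byte extender field. Each block is an int32 `esize` (a multiple of 16 that counts the 8 bytes of `esize` and `ecode` themselves), an int32 `ecode`, and then the data. The metadata keys are hypothetical, little-endian byte order is assumed, and code 6 (the registered plain comment code) is used only as a stand-in, since a dedicated MRS code would have to be registered.

```python
import json
import struct

ECODE_COMMENT = 6  # stand-in only; a dedicated MRS ecode would need registering


def pack_extension(metadata, ecode=ECODE_COMMENT):
    """Pack a dict as one NIfTI-1 header extension: esize, ecode, padded data."""
    payload = json.dumps(metadata).encode("utf-8")
    esize = 8 + len(payload)
    esize += -esize % 16  # round up so that esize is a multiple of 16
    padded = payload.ljust(esize - 8, b"\x00")
    return struct.pack("<ii", esize, ecode) + padded


def unpack_extension(block):
    """Inverse of pack_extension for a single little-endian extension block."""
    esize, ecode = struct.unpack("<ii", block[:8])
    payload = block[8:esize].rstrip(b"\x00")
    return ecode, json.loads(payload.decode("utf-8"))


# Hypothetical metadata keys, mirroring what an external JSON sidecar might hold.
block = pack_extension({"TransmitterFrequency": 123.2, "Nucleus": "1H"})
ecode, recovered = unpack_extension(block)
print(len(block), ecode, recovered)
```

The same dictionary could be written both to the external JSON file and into the extension, so that the NIfTI file remains interpretable on its own.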

Alright, gotta get back to the paying job now.

Best,

Brian.

mmikkel · June 4, 2020, 4:13pm

I’d like to follow on from @bsoher and include the following BIDS-related link for those who might not know what BIDS is:

https://bids-specification.readthedocs.io/en/stable/

uzayemir · June 22, 2020, 4:01am

Dear All,
I like the discussion, and it is very timely.

Let me try to provide my understanding.
1- First of all, we need to understand that MRS(I) is not different from any other MR imaging modality. Although it was the first in vivo MR modality, we are behind MRI. But if we start to see MRS as multi-echo MRI, then we could make the entire field work for MRS(I): all the tools and developments can be used for MRS, MRSI or multi-echo MRI.
2- This motivated me to use NIFTI starting from 2009; later Dinesh incorporated it in MRspa for SVS. It was initially for visualization, but as you are all aware, NIFTI can hold many dimensions, so I decided to put the FID in nifti too (I will provide more details).
3- Once I started to develop MRSI methods, I had a difficult time visualizing my results since I could not generate DICOM (I still cannot). I decided to use nifti as the MRSI data format for my reconstructed files (https://www.nature.com/articles/s41598-018-26096-y); also see https://twitter.com/uzemye/status/1095341933675728900. That allowed me to use fslview and afni first, and then fsleyes (after Paul included the power spectrum option). In summary, once you are in nifti format, you can leverage all the tools developed for MRI modalities, like fsl and afni, for MRSI.
For instance you can use tedana for T2* estimation of the FID (GitHub - ME-ICA/tedana: TE-dependent analysis of multi-echo fMRI) (see my hbm2020 talk this year,

or you can use topup for distortion correction (which I am using all the time) or flirt for motion correction (from my cmrr uhf workshop talk

Life then becomes easier if you think of MRS as multi-TE MRI, as you can see from FSL-MRS (thanks to Will and Saad).

To be honest I found the afni-brick (bucket) approach more flexible than nifti (except that bricks do not keep oblique data); however, nifti is the most common one. Also, afni visualization suits MRS better than fsleyes since you can visualize a YxY grid of spectra

(

, see figure 2).

>> And the supporting parameters for voxel/slice/slab location, orientation, region of interest, etc.

You do have the orientation information (in-plane rotation), the VOI size and the orientation in the DICOMs; what you need to calculate for the nifti is the transformation matrix. It is simple for SVS since you have only one voxel and its dimensions. For MRSI you need the grid size, FOV and volume of interest, which come from your acquisition, and then you can create the nifti file. What I do later is mask it with the MPRAGE to generate 3D information (

, see figure 1). After that, I keep the LCModel outputs (water scaling factor, LW, SNR, CRLB and metabolite concentrations, together with the water images, water-suppressed images and LCModel input image) in nifti or afni-bricks (which I find better). For time series I keep another set which includes the FIDs (LCModel input, LCModel fits, …). You can use this for all body organs and for different purposes (they are all afni-buckets or fsl nifti).
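For the single-voxel case mentioned above, the transformation matrix can be sketched in a few lines. This is only an illustration under stated assumptions: the voxel centre and direction cosines are taken to be given in a DICOM-style LPS patient frame, the NIfTI world frame is taken as RAS (so the first two components change sign), and voxel index (0, 0, 0) maps to the voxel centre. The function name and its arguments are hypothetical, and real vendor headers need more care than this.

```python
def svs_affine(center_lps, size_mm, row_dir, col_dir, normal_dir):
    """Build a 4x4 voxel-to-world matrix (RAS) for a single voxel.

    Inputs use a DICOM-style LPS patient frame; directions are unit vectors.
    """
    def lps_to_ras(v):
        # LPS -> RAS: flip the left/right and posterior/anterior components.
        return (-v[0], -v[1], v[2])

    center = lps_to_ras(center_lps)
    axes = [lps_to_ras(d) for d in (row_dir, col_dir, normal_dir)]

    affine = [[0.0] * 4 for _ in range(4)]
    for col, (axis, size) in enumerate(zip(axes, size_mm)):
        for row in range(3):
            # Each column is a direction vector scaled by the voxel size.
            affine[row][col] = axis[row] * size
    for row in range(3):
        # Voxel index (0, 0, 0) maps to the voxel centre.
        affine[row][3] = float(center[row])
    affine[3][3] = 1.0
    return affine


# Made-up example: a 20 mm isotropic voxel with axis-aligned orientation.
affine = svs_affine(
    center_lps=(10.0, -5.0, 30.0),
    size_mm=(20.0, 20.0, 20.0),
    row_dir=(1.0, 0.0, 0.0),
    col_dir=(0.0, 1.0, 0.0),
    normal_dir=(0.0, 0.0, 1.0),
)
for row in affine:
    print(row)
```

For MRSI the same construction would apply per grid axis, with the voxel size given by the FOV divided by the grid size and the translation pointing at the centre of the first voxel rather than the centre of the volume.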

https://twitter.com/uzemye/status/1254158713985908738

martin · June 22, 2020, 6:04pm

Hi Uzay,

Thanks for letting us know what you’ve been up to, it looks like a few of us have been playing around with nifti and MRS for some time!

Personally I think nifti is the way forward for sharing MRS (due to good existing libraries for python, c, matlab, R) - but I’d be interested to know if you think there are any limitations that can only be fixed with a more flexible container like the afni-bricks you mention.

Martin

martin · June 24, 2020, 9:04am

Hi @wclarke,

I’ve had a little play around with spec2nii and it looks very nice - great work.

I was wondering if we could try and define the minimum information we need for MRS processing that can’t be included in the nifti file and should be in the json to be considered “valid bids MRS” or similar. All the orientation information, voxel dimensions and imaging matrix is nicely covered by nifti. The dwell time (1 / sampling frequency) can be in pixdim4.

What’s missing, as far as I can tell, is the transmitter frequency (needed for the ppm scale) and the central frequency of the ppm scale (usually around 4.65 for 1H and 0 for 31P). Alternatively the central frequency could be inferred from the nucleus. So perhaps we require the dicom fields Transmitter Frequency, Nucleus and Chemical Shift Reference (see http://dicom.nema.org/medical/dicom/2016c/output/chtml/part03/sect_C.8.14.html).
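To show why these fields are the minimum, here is a small sketch of building a ppm axis from the dwell time, the transmitter frequency and a reference shift. The function and argument names are hypothetical, and it assumes the spectrum is centred on its middle point with ppm increasing along the array. Axis direction and sign conventions differ between vendors and packages, so treat this as one possible convention only.

```python
def ppm_axis(n_points, dwell_time_s, transmitter_mhz, center_ppm):
    """Return the chemical shift (ppm) of each point of a centred spectrum."""
    bandwidth_hz = 1.0 / dwell_time_s
    step_hz = bandwidth_hz / n_points
    axis = []
    for k in range(n_points):
        offset_hz = (k - n_points // 2) * step_hz
        # An offset in Hz divided by the transmitter frequency in MHz gives ppm.
        axis.append(center_ppm + offset_hz / transmitter_mhz)
    return axis


# Made-up example: 2048 points, 2 kHz bandwidth, 1H at roughly 3 T.
axis = ppm_axis(n_points=2048, dwell_time_s=1 / 2000,
                transmitter_mhz=123.2, center_ppm=4.65)
print(axis[0], axis[1024], axis[-1])
```

Without the transmitter frequency the axis can only be expressed in Hz, and without a reference shift (given explicitly or inferred from the nucleus) it has no absolute position.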

I think we discussed this before, but we also need a way to interpret the other 3 nifti dimensions (dynamics, coils, indirect acquisition dimension etc), either by defining a default interpretation, or requiring information in the json file when they have a dimension length greater than 1. This is important as we really need to start discouraging the storage of SVS as a single averaged spectrum, yet vendor raw formats can be horribly complex.

Of course there is lots of other (optional) useful information to be included in the json file, but I thought we should start with the absolute minimum to be able to ensure some level of comparability/consistency between software packages.

Be good to hear your (and anyone else’s) thoughts on this.

Martin

admin · June 24, 2020, 1:46pm

+1 on storing the transmitter frequency, if this isn’t done already.
Also +1 on the dimensions discussion! Absolutely to be had.

I’d also like to propose storing editing pulse frequencies, durations, and something like contrast matrices describing in which order they are applied in which sub-experiments of the edited acquisition. We should probably also get e.g. @AssafTal’s input as to how we might accommodate other multiplexed acquisitions (multi-TR/TE, fingerprinting).

I’m not even sure how much of this is stored in the actual vendor headers, but would you guys think there’s merit in somehow storing the slice-selective gradient directions? That would help with reconstructing the chemical shift displacement. I think I recall a paper where Thomas Lange (or Thomas Ernst?) did this, but I can’t find it right away.

Best,
Georg

uzayemir · June 24, 2020, 2:15pm

Hello Martin,

Here is what Afni has for fitting.
More in the following link:
https://afni.nimh.nih.gov/pub/dist/doc/program_help/README.attributes.html

Uzay

#define FUNC_FIM_TYPE   0  /* 1 value */
#define FUNC_THR_TYPE   1  /* obsolete */
#define FUNC_COR_TYPE   2  /* fico: correlation */
#define FUNC_TT_TYPE    3  /* fitt: t-statistic */
#define FUNC_FT_TYPE    4  /* fift: F-statistic */
#define FUNC_ZT_TYPE    5  /* fizt: z-score */
#define FUNC_CT_TYPE    6  /* fict: Chi squared */
#define FUNC_BT_TYPE    7  /* fibt: Beta stat */
#define FUNC_BN_TYPE    8  /* fibn: Binomial */
#define FUNC_GT_TYPE    9  /* figt: Gamma */
#define FUNC_PT_TYPE   10  /* fipt: Poisson */
#define FUNC_BUCK_TYPE 11  /* fbuc: bucket */

martin · June 25, 2020, 7:11am

Yes, more pulse sequence info would be good. Maybe we need different levels of conformance:

enough information to plot the raw data with the correct ppm scale (sufficient for sharing simulated data or acquired data that isn’t intended to be mapped onto MRI).
enough information to localise the data in the scanner coordinate system.
detailed information about the acquisition protocol to perform basis simulation, relaxation correction, editing steps, CSD directions…
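One way to picture these levels is as cumulative sets of required sidecar fields. The sketch below is only that: every field name in it is a hypothetical placeholder, and which fields belong to which level is exactly what would need to be agreed.

```python
# Hypothetical required fields per level; the real lists are still to be agreed.
LEVEL_FIELDS = {
    1: {"TransmitterFrequency", "Nucleus", "DwellTime"},  # plot with a ppm scale
    2: {"Affine"},                                        # localise in scanner space
    3: {"EchoTime", "RepetitionTime", "PulseSequence"},   # acquisition protocol
}


def conformance_level(sidecar):
    """Return the highest level whose fields, and all lower levels' fields, exist."""
    level = 0
    for candidate in sorted(LEVEL_FIELDS):
        if LEVEL_FIELDS[candidate].issubset(sidecar):
            level = candidate
        else:
            break  # levels are cumulative, so stop at the first gap
    return level


simulated = {"TransmitterFrequency": 123.2, "Nucleus": "1H", "DwellTime": 0.0005}
print(conformance_level(simulated))
```

Making the levels cumulative keeps the check simple: a file that carries protocol details but no localisation would still only count as the lowest level.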

Martin

martin · June 25, 2020, 7:18am

Hi Uzay,

The func type integer code could be easily added to the json sidecar. Thanks for reminding me about metabolite and statistical maps, these fall under the category of a derivative: https://bids-specification.readthedocs.io/en/derivatives/05-derivatives/01-introduction.html.

Martin

wclarke · June 26, 2020, 8:54pm

Hi @martin,

Glad you liked what you found. Spec2nii is very much still in development, and I’m very happy to make changes as things move on. I could really do with some GE and Philips test data (phantom with structure, different orientations etc with a screenshot of the right orientations!) so I can get the orientation information working well for those.

I like the idea of having various levels of compliance. In terms of what we definitely need for the lowest level, I agree that we need transmitter/imaging frequency. Should we include receiver bandwidth/dwell-time? As you say it would be in the pixel dimensions, but maybe it should appear in both. I’m wary of defining a reference point for the ppm scale, I think it is better to identify the nucleus and leave users/programs to infer from that if they wish to. I’m also not aware of Siemens DICOMS having that field populated, and I’m not sure how recon programs could calculate it reliably anyway.

The second level could indeed be orientation info - should we have a field in the json indicating if this level of compliance is met?

For the dimensions, I agree with your(?) comments back on the google doc. I think we should have defaults, but then also have the option to specify labels in the json sidecar. I also agree with your thoughts on having coils, dynamics and an “indirect acquisition” as defaults. I could imagine that we could define some specific tags for dimensions, then a catch all “indirect” and some user indices. We could specify options like:
DIM_COIL
DIM_DYN
DIM_INDIRECT
DIM_EDIT
DIM_USER_0
DIM_USER_1
…
DIM_USER_N

With the intention that if you run out of dimensions in the NIfTI file you can switch to multiple files with that same naming convention, so we never hit a limit (maybe someone is doing some in vivo 3D+ NMR somewhere).

For handling vendor naming conventions, we could have an additional JSON parameter for each dimension where a freeform name can be specified. That could be used to store “original” names in. E.g. for Siemens twix:
dim_4: DIM_COIL
dim_4_name: “Cha”
dim_5: DIM_USER_0
dim_5_name: “Set”
dim_6: DIM_USER_1
dim_6_name: “Rep”
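As a sketch of how a reader might check such tags, the function below validates the `dim_N` entries of a sidecar against an array shape. It follows the zero-based numbering used in the example above (dim_4 is the fifth dimension) and Martin’s suggestion that a label is only required when a dimension has length greater than 1. The function name, the array shape and the error handling are hypothetical.

```python
import re

KNOWN_TAGS = {"DIM_COIL", "DIM_DYN", "DIM_INDIRECT", "DIM_EDIT"}
USER_TAG = re.compile(r"DIM_USER_\d+")


def dimension_labels(sidecar, shape):
    """Return {dimension number: (tag, vendor name)} for dimensions 4 and above."""
    labels = {}
    for dim in range(4, len(shape)):
        key = f"dim_{dim}"
        tag = sidecar.get(key)
        if tag is None:
            if shape[dim] > 1:
                raise ValueError(f"{key} has length {shape[dim]} but no tag")
            continue  # singleton dimensions may stay unlabelled
        if tag not in KNOWN_TAGS and not USER_TAG.fullmatch(tag):
            raise ValueError(f"unknown tag {tag!r} for {key}")
        labels[dim] = (tag, sidecar.get(f"{key}_name"))
    return labels


twix_sidecar = {
    "dim_4": "DIM_COIL", "dim_4_name": "Cha",
    "dim_5": "DIM_USER_0", "dim_5_name": "Set",
    "dim_6": "DIM_USER_1", "dim_6_name": "Rep",
}
# Made-up shape: 2048 time points, 32 channels, 2 sets, 16 repetitions.
labels = dimension_labels(twix_sidecar, (1, 1, 1, 2048, 32, 2, 16))
print(labels)
```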

@admin I’m not sure how much stuff we could pull automatically from headers, it’s hit and miss whether even simple fields get populated correctly by the vendors. However I think we should define a list of fields that could be included. We can start simple (TE,TR,TI) and get more exotic, but we should also define units.

One way that we could have different values for each of these is to have an array associated with each dimension that can have header parameters which overwrite the top level definitions for iterations of that dimension. E.g.
A 1x1x1x2048x32x10 file (32 coils, 10 echo times). In the json it reads
img_freq: 120.3003
TE: 0.001 <— This isn’t required and gets overwritten
dim_4: DIM_COIL
dim_4_name: “Coils”
dim_5: DIM_USER_0
dim_5_name: “Echo time”
dim_5_header: [{TE:0.001},{TE:0.002},…{TE:0.010}] <— these header values are used for each value of this dimension. You can redefine any number of values, but if it isn’t listed a top level one will be used.
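The override rule described here can be sketched as a small lookup: start from the top level values, then apply the per-dimension entry for the index being read. The key names follow the example above; the function name and the shape of its `index` argument are hypothetical.

```python
def effective_header(sidecar, index):
    """Resolve parameters for one position in the extra dimensions.

    `index` maps a dimension number to the position along that dimension.
    """
    # Start from the top level values, leaving out the dim_* bookkeeping keys.
    params = {k: v for k, v in sidecar.items() if not k.startswith("dim_")}
    for dim, position in sorted(index.items()):
        overrides = sidecar.get(f"dim_{dim}_header")
        if overrides:
            # Only the listed values are replaced; the rest stay top level.
            params.update(overrides[position])
    return params


sidecar = {
    "img_freq": 120.3003,
    "TE": 0.001,
    "dim_4": "DIM_COIL", "dim_4_name": "Coils",
    "dim_5": "DIM_USER_0", "dim_5_name": "Echo time",
    "dim_5_header": [{"TE": round(0.001 * (i + 1), 3)} for i in range(10)],
}
# Coil 7, third echo time of a 1x1x1x2048x32x10 file.
params = effective_header(sidecar, {4: 7, 5: 2})
print(params)
```

Here the coil dimension has no `dim_4_header`, so only the echo time dimension changes anything, and `img_freq` falls through from the top level.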

Anyway that’s some thoughts from me.

mmikkel · June 26, 2020, 11:53pm

Agreed on all these points. Yes to including dwell time. And of course, all these things can be put into the .json sidecar file.

For reference, here’s what the current BIDS Specification (v1.4.0) looks like for MRI metadata fields: https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html