BIDS for Spectroscopy

@wclarke yes all sounds reasonable.

I’m still awaiting a response from Dickson Wong, but will give him a few more days before contacting the BIDS team to ask whether we should consider BEP022 orphaned and try to find someone to officially take over. (@wclarke, would you be happy to? Either way, I guess we should consider having a BIDS lead and an MRS lead who could be first/last author if it gets published.)

For now we should probably separate the NIFTI MRS file format from the BIDS spec, so I’ve started a new document and will try and get levels 1 and 2 drafted over the next few days:

Once we’re happy with the basics we can invite others to comment, ideally people who know about the less common MRS techniques: 2D/3D MRS (States, TPPI, etc.), hyper-polarised, heteronuclear stuff, etc.


Would it make sense to create a spreadsheet of vendor file formats and the information they contain? Level 1 should probably cover the intersection of parameters that are available in all headers.
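To make the "intersection of parameters" idea concrete, here is a minimal sketch of how such a spreadsheet could be queried once it exists. The format names and field names below are hypothetical placeholders, not a statement of what any real vendor format contains.

```python
# Hypothetical placeholder entries: these are NOT real vendor capabilities.
header_fields = {
    "format_a": {"fid", "dwell_time", "transmitter_frequency", "echo_time"},
    "format_b": {"fid", "dwell_time", "transmitter_frequency"},
    "format_c": {"fid"},
}


def common_fields(table, formats=None):
    """Return the fields present in every selected format."""
    selected = list(table) if formats is None else list(formats)
    if not selected:
        return set()
    return set.intersection(*(set(table[name]) for name in selected))


def missing_fields(table, required):
    """For each format, list the required fields it cannot supply."""
    return {
        name: sorted(set(required) - set(fields))
        for name, fields in table.items()
    }


# Intersection over every format: the lowest common denominator.
lowest_common = common_fields(header_fields)
# Intersection over a subset shows what a stricter level could require.
level_1_candidates = common_fields(header_fields, ["format_a", "format_b"])
```

With a table like this, `missing_fields` would show at a glance which formats fall short of a proposed level and by which parameters.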

@admin, yes a list of conformance would be a good idea. As far as I’m aware, the only way to export individual coil data from Philips scanners is the list/data format which doesn’t give the sampling or transmitter frequency which I think we need for Level 1. I guess we could define level 0 as the lowest common denominator, which is just the raw FIDs and nothing else.

@martin, happy to take the lead on either part, with a slight preference for the NIfTI section. Brian had a response from Dickson at the start of the month. I’ll forward you what he said. It sounds like Alex Lin had been picking up some bits as well.

@admin Definitely sensible to see where the overlap in vendor format meta-data is/isn’t. Though I think we should not necessarily conform to the lowest common denominator - so a level 0 seems sensible.

Incidentally, I was going to send an email out today to everyone who has shown an interest in the BIDS BEP for MRS. But it looks like @martin has beaten me to it! :slight_smile:

@wclarke, if you prefer to be the NIFTI MRS format lead, I’d be happy to take the lead on BEP022 as I’ve already been having a back-and-forth with the BIDS team and Dickson Wong about this.

@mmikkel Sounds good, but happy to confer with others before anointing myself :wink: .

I didn’t see an email, maybe I’m being blind.

Hi @mmikkel,

I sent Dickson an e-mail about 9 days ago but haven’t had a response yet. I haven’t got in touch with anyone from BIDS, so please send the e-mail you were planning (unless I’m misunderstanding you).

Apologies, my reply above was poorly worded! I will send an email round to all relevant parties.

And also to echo @wclarke: happy to hear from anyone else who would like to take the lead on BEP022!

Just watched a presentation at OHBM on the BEP001 for quantitative imaging with multiple contrasts. We can probably take a cue or two from those folks and try to stay in line:



William/Brian (or anyone who would care to do so), could you explain what the purpose of such a format is? What would you envision another MRS researcher doing with such a dataset once they get their hands on it, and why? Are we trying to create a format that will allow other researchers to:

  1. Peruse the data of a paper for quality assurance?
  2. Recreate an entire processing pipeline described in a paper? (For reproducibility)
  3. Use a single format which will make it easier for software packages to load/save such data?
  4. Something else?

How are those things not addressed by existing formats, such as NIfTI or DICOM?

I feel that clearly defining the purpose of such a format would immediately dictate what needs to be in it, and what can be safely discarded.

@AssafTal these are excellent questions, I tried to address some of them in the following:

From my perspective, the most important end goal is to have a good way of sharing data sets which requires a decent format for MRS and a defined structure to combine MRS with other modalities (BIDS).

Hi Assaf,

Welcome to the forum. I’ll try and answer some of the questions. I think it is indeed a good way of figuring out what we need.

On the topic of having an MRS NIfTI standard.
Perusing data e.g. from a publication
I think that this is an important application. We don’t often share much data in our publications, and those data we do share are sometimes liable to be “typical” spectra.

I think one of the reasons we don’t is that there isn’t a standard way to do so. Much of our analysis happens once data is loaded into memory. From there it never emerges again except as fitting-specific formats (e.g. .RAW files), downstream summary statistics, or publication-type plots.

If there was an obvious choice of format to share and selection of programs to view it in then I think most of the hurdles would be removed.

Recreate an entire processing pipeline
A data standard will help with this, removing some of the format dependent steps. However, there are other issues that must also be solved to make this a reality.

Single data format for software packages

  1. Currently every software package must reimplement the same functions for data I/O from many different formats. This is done to varying standards as each developer generally has experience with only a subset of the formats. This is a large overhead for any developer. If there was a single format typically handled, we could pool our knowledge and experience in creating a single point of conversion.

  2. Having a single format means that users can use software packages in a much more modular way. Users could pick and choose the best or most relevant pipelines from a collection of toolboxes. In the neuroimaging community people often mix steps from FSL, SPM, Freesurfer and AFNI; this is enabled by them all reading a single format.

DICOM doesn’t achieve this currently mostly because of insufficient vendor reconstruction pipelines. Uncombined coils, or even resolved averages aren’t available in DICOM format on many scanners. In the absence of this the default reconstruction software can apply some very non-optimal steps.

Whilst efforts to solve this on the reconstruction side in either proprietary environments or third-party ones (Gadgetron) might work for the technical research community, I don’t foresee great uptake elsewhere.

Interfacing with imaging tools
To interpret our data fully we are generally reliant on accurately aligning low-resolution spectroscopy data with higher-resolution imaging data. If we adopt the de facto neuroimaging data standard, registration and visualisation become a somewhat solved problem.

I realise that this is a neuroimaging-centric view though; NIfTI is not a standard for other anatomies.

On the topic of having an MRS BIDS standard.
I’m not so rehearsed in the arguments for BIDS, and I’ve run out of time for now. I’ll have a think and reply later.


@wclarke et al. This all sounds like a great idea, and I would definitely like to be involved and help to support the adoption of more MRS standards with Suspect and OpenMRSLab. Personally I am reluctant to use a multi-file format to contain anything important; it is too easy for the files to get separated. However, I agree about the value of aligning our practices with other MR disciplines. I think NIFTI does support internal extensions; is this a viable alternative?

The idea of having a spectroscopy format with relevant meta-data accompanying it is very appealing. One idea that I have been thinking about is keeping a processing history in the metadata, showing what steps have been performed, and with what parameters etc. This would make it much easier to recreate a processing pipeline, or just to remember what you did when you get around to writing up the paper.

@martin has done a great job of putting together an initial proposal for the format, but I wonder whether it is sufficiently flexible at the moment. The assumption is that the data will always be stored with any spatial dimensions in image space and the spectral data in the time domain. That seems to mean that a number of processing steps may have to be performed on raw data (for MRSI at least) before it can be saved in NIFTI MRS. It would also make it difficult to take the output from e.g. an LCModel fit in the frequency domain, which is only defined for part of the original spectral width, and export that in NIFTI MRS. I therefore agree with Assaf Tal that we need to define the applications first, and that will define the format.

I am going to borrow from the UX lexicon and suggest that we need to define some “user stories”, which describe how people see NIFTI MRS fitting into their MRS workflow, and this will guide what kinds of data we need to support. The possible examples I have identified so far are:

  1. Dave is developing software to process spectroscopy data. He only needs to support loading data from NIFTI MRS because all manufacturer raw data formats can be converted to NIFTI MRS by a free, open-source tool provided by the community.
  2. Susan is building a processing pipeline for a new MRS study. She wants to use Nipype to chain together different processing steps provided by different software tools and uses NIFTI MRS as the format to pass data from one tool to the next.
  3. Mohamed is doing a big, multi-centre study with data from several different scanners. He wants to run all the data through a common pipeline, so he wants the individual sites to send him data in the NIFTI MRS format.
  4. Lasya is submitting a paper and wants to include her processed data in NIFTI MRS so that readers can open it in fsleyes or similar rather than just looking at a static figure.

Any other suggestions, or criticism of these would be most welcome, I think that will help us to build the discussion out from there about what features need to be included to support these users.

Hi Ben,

Good to have you onboard.

On the topic of the current proposal @martin and I have indeed put something together. I think it is now in a position to have comments and suggestions from a wider group. Please do so at this link. You can add comments and also propose edits to the text.

I understand the concerns about the multi-file format. I find myself seeing both sides of the arguments. Having the second file does increase the fragility of the data, but it is quite desirable to have the meta-data in a human readable and very easily parseable format. On the last point does anyone know of widely used readers that don’t handle the header extensions? The current proposal does include the option of storing the JSON header in a NIFTI header extension. I have contacted Bob Cox and Rick Reynolds to get us an ecode: 44 (“NIFTI_ECODE_MRS”). The idea is that any sidecar file would take precedence but you could use the header extension if there wasn’t one.
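As a rough illustration of the "sidecar takes precedence, header extension as fallback" rule, here is a minimal standard-library sketch. It relies on the NIfTI-1 extension layout (an 8-byte `esize`/`ecode` preamble followed by data, with the whole block padded to a multiple of 16 bytes). The function names, the little-endian byte order, and the example metadata key are assumptions for illustration only; the 4-byte extender flag that precedes the extensions in a real file is not handled here.

```python
import json
import struct
from typing import Optional, Tuple

NIFTI_ECODE_MRS = 44  # extension code mentioned in the post above


def pack_extension(meta: dict, ecode: int = NIFTI_ECODE_MRS) -> bytes:
    """Encode metadata as one NIfTI-1 header extension block."""
    payload = json.dumps(meta).encode("utf-8")
    esize = 8 + len(payload)   # esize counts its own 8-byte preamble
    esize += (-esize) % 16     # blocks must be a multiple of 16 bytes
    # Assumption: little-endian file; a real writer follows the header's order.
    return struct.pack("<ii", esize, ecode) + payload.ljust(esize - 8, b"\x00")


def unpack_extension(block: bytes) -> Tuple[int, dict]:
    """Decode one extension block back into (ecode, metadata)."""
    esize, ecode = struct.unpack("<ii", block[:8])
    payload = block[8:esize].rstrip(b"\x00")  # drop the zero padding
    return ecode, json.loads(payload.decode("utf-8"))


def resolve_metadata(sidecar: Optional[dict], extension: Optional[bytes]) -> dict:
    """Sidecar wins if present; otherwise fall back to the header extension."""
    if sidecar is not None:
        return sidecar
    if extension is not None:
        ecode, meta = unpack_extension(extension)
        if ecode == NIFTI_ECODE_MRS:
            return meta
    return {}


# "TransmitterFrequency" is a hypothetical key, not taken from the proposal.
block = pack_extension({"TransmitterFrequency": 123.25})
```

A reader built this way never has to care which of the two storage options the writer chose, which might take some of the sting out of the multi-file concern.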

I like the idea of having something in the JSON header to track processing steps. Could you make a suggestion in the document along those lines? We can store anything we like in that header (within reason), but how do we make it not rigidly prescribed to deal with novel processing steps, whilst also making it interpretable?
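One possible shape for such a history, sketched under assumptions: a list of entries with a small fixed set of keys (so it stays interpretable) but a free-form method name and an open parameter dictionary (so novel steps are not excluded). The key names `ProcessingHistory`, `Method`, `Parameters` and `Software`, and the example step names, are all hypothetical.

```python
import json


def append_step(meta, method, parameters=None, software=None):
    """Return a copy of `meta` with one more processing-history entry.

    All key names used here are hypothetical, not part of any proposal.
    """
    history = list(meta.get("ProcessingHistory", []))
    entry = {"Method": method, "Parameters": dict(parameters or {})}
    if software is not None:
        entry["Software"] = software
    history.append(entry)
    updated = dict(meta)  # leave the caller's dictionary untouched
    updated["ProcessingHistory"] = history
    return updated


meta = {}
meta = append_step(meta, "coil_combination", {"weighting": "snr"})
meta = append_step(
    meta,
    "frequency_alignment",
    {"ppm_range": [1.8, 4.2]},
    software="example-tool 0.1",
)

# The history is plain JSON, so it can travel in the header alongside the data.
serialised = json.dumps(meta, indent=2)
```

Because the entries are ordered, replaying a pipeline would amount to walking the list from first to last.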

Regarding storing data in k-space or the frequency domain: I proposed the time domain as it is nicely compatible with e.g. FSLeyes for viewing the data. I’ve found that you can store a fit that is discontinuous at some frequency domain limits (0.2 and 4.2 ppm in this case) in the time domain without much artefact.
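A toy sketch of the idea, using a plain discrete Fourier transform so it needs nothing beyond the standard library. It assumes a complex-valued fit sampled on the full spectral grid and simply set to zero outside the fitted band; the band limits and amplitudes are made-up numbers, not real data.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse is scaled by 1/N)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
            for m, v in enumerate(values)
        )
        out.append(total / n if inverse else total)
    return out


n_points = 64
band = range(20, 40)  # bins where the toy "fit" is defined

# Non-zero only inside the band, so there is a sharp step at both edges.
fit_spectrum = [
    complex(1.0 + 0.1 * i, 0.05 * i) if i in band else 0j
    for i in range(n_points)
]

fit_fid = dft(fit_spectrum, inverse=True)  # what would be written to file
recovered = dft(fit_fid)                   # what a reader would reconstruct

max_error = max(abs(a - b) for a, b in zip(fit_spectrum, recovered))
```

On this grid the round trip is exact up to floating-point error, so nothing is lost by storing the time-domain version; how the stored signal looks in a viewer is a separate question.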

Perhaps you could do the same with an LCModel fit, though as it does casually discard the imaginary data it might need some reconstruction.

I think that NIFTI is the wrong format to try and store k-space data; as far as I’m aware that isn’t done for any other modality/contrast typically stored in NIFTI. Even if we went against the grain and did allow it, to be useful we would also need to store additional information about the trajectory to handle the increasingly common non-Cartesian MRSI sequences. My view is that MRSI reconstruction is currently too specialised to try and handle in the general case, and so is the preprocessing that goes along with it. The ISMRMRD format might be a better format for that. My vision for this format was that it would be for reconstructed data, prior to fitting (the primary MRS analysis step), therefore aligning with its use in the imaging world. I’m happy to be told I’m wrong, though, and that what I say is too limiting.

I like the examples - it’s a good way of thinking about it. I think what we propose does cover those specific ones. I guess one case we don’t handle yet is:

  1. Barry is using a collaborator’s MRSI sequence and has been given raw (unreconstructed) data from the scanner by the radiographer. He would like to derive concentration values of a couple of specific metabolites to inform a specific clinical hypothesis of his. His target journal requires that data is made available with the publication. His collaborator provides a MATLAB script to reconstruct the data.

What stage of this process should we try to provide for?

Just because it got a little lost in the longer post above…

Please do look at Martin’s and my suggestion for a NIFTI format specification. Edits can be suggested here.

We would appreciate feedback on all of it, and also suggestions of what MRS-specific metadata should be defined in the specification (see Appendix A).

@bennyrowland thanks for offering to help. Regarding your point about storing analysis results, I’ve had some similar thoughts and came to the conclusion this should come under a different specification. So we could have something like:

mrs_data_level_X - for storing data points (what myself and Will are working on right now)
mrs_fit - for storing fit plots (baseline, residual, individual metabolite signals etc)
mrs_basis - for storing basis set signals, names
mrs_??? - for storing metabolite maps

where these data descriptors are stored in the intent_name. Of course all this will take a fair bit of time to thrash out, so better to focus on getting agreement on how to store the data points (and ideally have multiple software packages providing basic support for the format) - before moving out to the other datatypes in my opinion. However, I do agree we should think a bit more widely to avoid unintentionally restricting ourselves in the future.


I think we might need to be careful not to make a specific format for everything. I.e. do we need something for storing metabolite maps? Isn’t that just an image for which we can use a normal appropriately named NIFTI file? I might also argue that basis spectra are just spectra and can therefore be described by the current format.


@wclarke yes fair point - no need to add a new type if we can just use what has already been defined


@wclarke, similarly to putting FIDs in NIfTI, I am stacking metabolite maps as arrays along the same dimension. In FSL or AFNI I just slide through the timepoints to visualise the different metabolite maps. However, the problem is that we need to know the order of the metabolites in advance, or ideally carry definitions with the file (similar to the statistical maps in AFNI buckets).

I agree that having too many file formats, particularly if they all have the same extension and you can’t readily distinguish them without opening them, is going to be as bad as the situation we currently have. On the other hand, I do think we have different kinds of data that need to be supported. Bearing in mind that we are talking about the value of the intent_name field, it doesn’t seem unreasonable to define “mrs_fit”, for example. It is true that data, fit, baseline, residual and individual metabolite signals could simply be another optional dimension under a global format, but the use cases for raw and fitted data are different enough that I like the idea of being able to open a file and know from a simple flag what kind of data I have and what to do with it (speaking as a piece of software). I can easily imagine some software that knows how to visualise fitted spectra on an anatomical scan but that would balk at raw MRS data, and doesn’t want to try and guess what is in the file by looking at the dimension labels. @wclarke, for your images above in fsleyes, how do you store e.g. the FID, baseline, fit and metabolite components? Is each one a separate NIfTI file, or is that one axis of the nd-dataset?
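To show what "know from a simple flag" could look like from the software side, here is a minimal sketch that reads the 16-byte intent_name field (byte offset 328 in a NIfTI-1 header) and picks a handler from it. The handler table and the prefix-matching rule are assumptions for illustration; only the descriptor names come from the suggestion earlier in the thread.

```python
INTENT_NAME_OFFSET = 328  # NIfTI-1 header: char intent_name[16]
INTENT_NAME_LENGTH = 16


def read_intent_name(header: bytes) -> str:
    """Extract intent_name, stopping at the first null byte if there is one."""
    raw = header[INTENT_NAME_OFFSET:INTENT_NAME_OFFSET + INTENT_NAME_LENGTH]
    return raw.split(b"\x00", 1)[0].decode("ascii")


def write_intent_name(header: bytes, name: str) -> bytes:
    """Return a copy of the header with intent_name replaced."""
    encoded = name.encode("ascii")
    if len(encoded) > INTENT_NAME_LENGTH:
        raise ValueError("intent_name is limited to 16 bytes")
    field = encoded.ljust(INTENT_NAME_LENGTH, b"\x00")
    end = INTENT_NAME_OFFSET + INTENT_NAME_LENGTH
    return header[:INTENT_NAME_OFFSET] + field + header[end:]


# Hypothetical mapping from descriptor prefix to what a viewer might do.
HANDLERS = {
    "mrs_data": "load as time-domain spectroscopy data",
    "mrs_fit": "overlay fit, baseline and residual",
    "mrs_basis": "load as basis set",
}


def choose_handler(header: bytes) -> str:
    """Pick an action from intent_name without inspecting the data itself."""
    name = read_intent_name(header)
    for prefix, action in HANDLERS.items():
        if name.startswith(prefix):
            return action
    return "treat as a generic NIfTI image"


blank_header = bytes(348)  # zero-filled stand-in for a real NIfTI-1 header
fit_header = write_intent_name(blank_header, "mrs_fit")
```

One practical constraint worth noting: the field holds at most 16 bytes, so a name such as `mrs_data_level_1` fills it exactly and leaves no room for anything longer.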

It is not really clear to me how you store metabolite concentrations and spectral data in a single nd-array, other than putting the concentrations into the JSON. That would work for SVS, but not for MRSI. I like @uzayemir’s suggestion of metabolite maps with metabolite name as just another axis of the data, but that does require some JSON giving the list of metabolite names, at the very least, and I don’t see how to combine it with the spectral data in a single file. Not the end of the world but unfortunate.
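For the metabolite-maps-as-an-axis idea, a minimal sketch of how the accompanying JSON could remove the need to know the metabolite order in advance. The key name `MetaboliteNames`, the 2x2 grid and all the numbers are toy assumptions, not real data or part of any proposal.

```python
import json

# Hypothetical sidecar content; the key name is an assumption.
sidecar = json.loads('{"MetaboliteNames": ["NAA", "Cr", "Cho"]}')

# Toy 2x2 "MRSI grid" with the metabolite index as the last axis.
maps = [
    [[10.0, 8.0, 2.0], [9.5, 7.5, 1.8]],
    [[11.0, 8.2, 2.1], [10.2, 7.9, 2.0]],
]


def check_consistency(volume, names):
    """Every voxel must hold exactly one value per listed metabolite."""
    return all(len(voxel) == len(names) for row in volume for voxel in row)


def metabolite_map(volume, names, metabolite):
    """Select one metabolite's map by name instead of by position."""
    index = names.index(metabolite)  # raises ValueError for unknown names
    return [[voxel[index] for voxel in row] for row in volume]


names = sidecar["MetaboliteNames"]
cr_map = metabolite_map(maps, names, "Cr")
```

This covers the maps on their own; it does not solve the problem of combining them with the spectral data in a single array.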

Although I can see advantages to NIfTI in its alignment with existing packages outside spectroscopy, I could also imagine an HDF5-based format (not necessarily ISMRMRD) which would essentially be a container for different nd-arrays, so you could store raw data (of however many dimensions), fitted data, basis set, metabolite maps etc. as different entries in the same file. There would be a place for shared metadata and also for component-specific metadata. Another advantage of this would be that you could support k-space data with accompanying trajectories quite easily.

It would even be possible to use a format like that as a processing history, where each operation creates a new entry in the file. Obviously for MRSI data that could get large quite quickly, but it would be really nice if you could share a single file with your metabolite maps and fitted spectra, and recipients could dig all the way back to the raw data and see each stage of the processing pipeline along the way.