Bootstrapping averages / scrambling non-consecutive datapoints

Hello, I was hoping to do a bootstrap analysis on my MRS data (HERCULES, 3T) to get a sense of how much variability there is. Is that a reasonable idea? I know that HERCULES has four sub-experiments (A, B, C, D), so is it a good idea to bootstrap averages within each sub-experiment, re-concatenate the bootstrapped data, and then recalculate metabolite concentrations for each bootstrapped sample?

In other words, if I have 250 averages for each of the 4 sub-experiments (1000 averages total), is it okay to randomly select 100 of those averages from different times in the scan? Or are there issues with scrambling / not using consecutive timepoints?

The reason I want to run the bootstrap on a subset of the data is that I only want to calculate concentrations during periods of wakefulness (measured with concurrent EEG), which are scattered throughout the 35 min.

Pseudocode for bootstrapping the sub-experiments:
% 0 - define arrays with the sample indices for each sub-experiment
subA_samples = [1 5 9 … ]
subB_samples = [2 6 10 …]
subC_samples = [3 7 …]
subD_samples = [4 8 …]

% 1 - define which averages/samples are usable, based on a separate EEG analysis
wakefulness_samples = [1:50 300:400] % samples during which the subject was awake
% ismember returns a logical mask, so use it to index into the sample list
subA_usable_samples = subA_samples(ismember(subA_samples, wakefulness_samples))
% (same for subB, subC, subD)

% 2 - randomly select 100 of the usable samples from each sub-experiment,
%     with replacement, as in a standard bootstrap
subA_bootstrap = subA_usable_samples(randi(numel(subA_usable_samples), 1, 100))
% (same for subB, subC, subD)

% 3 - concatenate the data back into a bootstrapped dataset that alternates A B C D
allcycles = []
for i = 1:100
    one_cycle = [subA_bootstrap(i) subB_bootstrap(i) subC_bootstrap(i) subD_bootstrap(i)]
    allcycles = [allcycles one_cycle] % append the data
end

% 4 - feed this dataset into an edited version of Osprey that allows me to select only certain samples out of the dataset

Hi @sdw,

This is a pretty complex thing to do and you’d certainly need to modify the code of your Osprey copy, but it’s generally possible.

I would probably recommend that you bootstrap sets of four (ABCD) with the same index, rather than sampling completely at random within each sub-experiment. Otherwise you will exacerbate the effects of drift and movement by matching transients that may have experienced slightly different editing-pulse frequency offsets.
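
To make the "sets of four" idea concrete, here is a minimal sketch of how the index selection could look. It is written in Python purely for illustration and is not part of Osprey. The function name `bootstrap_abcd_cycles`, the wake ranges, and the rule that a cycle only counts as usable when all four of its transients fall inside a wake period are assumptions of mine. The interleaving (A = 1, 5, 9, …; B = 2, 6, 10, …) and the 1-based numbering follow the pseudocode in the question.

```python
import random

def bootstrap_abcd_cycles(n_cycles, awake_samples, n_draws, seed=None):
    """Resample whole ABCD cycles with replacement; return 1-based transient indices."""
    rng = random.Random(seed)
    awake = set(awake_samples)
    # Cycle k (0-based) holds transients 4k+1 .. 4k+4, i.e. A, B, C, D.
    # Assumption: a cycle is usable only if all four transients were acquired awake.
    usable_cycles = [
        k for k in range(n_cycles)
        if all(4 * k + j in awake for j in (1, 2, 3, 4))
    ]
    if not usable_cycles:
        raise ValueError("no complete ABCD cycle falls inside the wake periods")
    # Draw cycle indices with replacement (the bootstrap step).
    drawn = rng.choices(usable_cycles, k=n_draws)
    # Expand each drawn cycle into its four transients, keeping the A B C D order.
    return [4 * k + j for k in drawn for j in (1, 2, 3, 4)]

# Illustrative wake periods only, not real data.
awake = list(range(1, 51)) + list(range(300, 401))
indices = bootstrap_abcd_cycles(n_cycles=250, awake_samples=awake, n_draws=100, seed=0)
```

Each bootstrap replicate is then a list of transient indices in which every A is followed by the B, C and D from the same acquisition cycle, so transients that saw similar drift stay together. Changing `seed` gives a different replicate.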

I think it makes the most sense to do this at the end of the OspreyLoad procedure.

Happy to help with further pointers,
