I know the residual should be as flat as possible. But how flat is flat enough? Is there a quantitative metric for this purpose? Like R-squared in linear regression, the closer to 1 it is, the better the fit is.
I also know that the variance of the residual can be used as such a metric. But the problem is, with different datasets, even if they have the same residual variance, the fit quality can be quite different.
What is the best/common practice in this area to quantitatively determine how good a fit is in AMARES?
One metric is the fit quality number (FQN), originally proposed by Slotboom and recommended in the data analysis consensus paper. It's the variance of the fit residual divided by the variance of the noise; if that ratio approaches 1, the residual is noise-like. If it's substantially larger, it indicates structured residuals.
It’s not particularly widely used, but I think it’s a really useful indicator to identify modeling failure.
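For concreteness, here's a minimal sketch of the FQN as described above (the function name and signature are my own, not from any package; it assumes you already have a residual vector and a noise estimate from a signal-free region):

```python
import numpy as np

def fit_quality_number(residual, noise):
    """Fit quality number (FQN): variance of the fit residual divided
    by the variance of the noise. FQN ~ 1 means the residual is
    noise-like; FQN >> 1 suggests structured (unmodeled) residuals."""
    return np.var(residual) / np.var(noise)

# Example: if the residual really is just noise, FQN is close to 1.
rng = np.random.default_rng(0)
noise = rng.normal(size=1000)
print(fit_quality_number(noise, noise))  # exactly 1.0 here
```

For complex FID data you would apply this to the real and imaginary parts (or to `np.abs(...)` consistently for both residual and noise); the key practical requirement is a reliable noise region, as noted above.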
Hello,
I assume you are already very familiar with the pros and cons of CRLB – while it has limitations, it’s commonly used by both AMARES and non-AMARES users to threshold the goodness of fit for each peak (typically <=20%, though sometimes higher).
It seems that you are looking for a metric for the global AMARES fitting quality. It is true that the raw residual variance is dataset-dependent. FQN is an excellent idea - it compares the residual variance to the noise variance, which makes it intuitive (it tests whether the residual is just noise) and able to flag overfitting (FQN < 1 means the residual variance is smaller than the noise variance). A key advantage of FQN is that it remains relatively stable across different SNR levels. However, it does require accurate noise estimation from a signal-free region.
If you use OXSA for AMARES fitting, it prints out dataNormSq (squared norm of the demeaned data), resNormSq (squared norm of residual), and their ratio (resNormSq/dataNormSq). This ratio behaves similarly to (1-R²) in that closer to 0 means a better fit. However, it should be noted that this ratio is SNR-dependent - lower SNR data will have a higher ratio even with good fitting.
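To make the ratio concrete, here's a small sketch of how one could compute it (my own function names; the quantities follow the description above: squared norm of the demeaned data, squared norm of the residual, and their ratio):

```python
import numpy as np

def fit_ratio(data, fitted):
    """Ratio of residual energy to demeaned-data energy, analogous to
    OXSA's printed resNormSq / dataNormSq. Behaves like (1 - R^2):
    closer to 0 means a better fit, but it is SNR-dependent."""
    data = np.asarray(data)
    residual = data - np.asarray(fitted)
    data_norm_sq = np.sum(np.abs(data - np.mean(data)) ** 2)
    res_norm_sq = np.sum(np.abs(residual) ** 2)
    return res_norm_sq / data_norm_sq
```

As noted, the same model fit to noisier data will give a larger ratio even when the fit itself is equally good, so this number is best used to compare fits on the same dataset.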
When I benchmarked pyAMARES while developing it, I implemented these metrics in a Compare_to_OXSA function, so pyAMARES prints them out as well. If you use OXSA and pyAMARES to fit the same data with the same prior knowledge, the resNormSq and dataNormSq outputs should be almost identical (and the other fitting results, too).
There are other metrics such as chi-square, RMSE, etc. I’m not sure which one is the best; perhaps you can use a combination of them, and perhaps even with visual inspection of the fitting. I’m here to learn, too.
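Of the other metrics mentioned, RMSE is the simplest to write down; here is a one-line sketch (my own helper, shown only to make the definition explicit):

```python
import numpy as np

def rmse(data, fitted):
    """Root-mean-square error of the fit residual. Like the raw
    residual variance, it is in data units and dataset-dependent,
    so it is mainly useful for comparing fits on the same data."""
    return np.sqrt(np.mean(np.abs(np.asarray(data) - np.asarray(fitted)) ** 2))
```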
I have a new question regarding FQN when apodization is applied before AMARES fit.
If I use the tail of FID_apodized as noise, then this noise has been suppressed by the apodization, which makes the FQN very large.
If I use the tail of FID_unapodized as noise: since the residual has gone through the apodization, its long tail is very close to 0 and hence has low variance. That makes the FQN quite small, often below 1.
Thus, I'm wondering whether I should use the tail of FID_unapodized as noise, but before computing the variance of the residual, apply the inverse of the apodization to the residual. If apodization multiplies the FID by exp(-t/T), then the inverse multiplies the residual by exp(t/T). Does that make any sense? Or is there a better way?
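The proposed correction is easy to sketch (purely illustrative, my own names; it assumes a simple exponential line broadening exp(-t/T) was applied before fitting, and that the noise estimate comes from the tail of the unapodized FID):

```python
import numpy as np

def fqn_inverse_apodized(residual_apod, noise_unapod, t, T):
    """FQN with the exp(-t/T) apodization undone on the residual first,
    so its variance is compared against noise from the unapodized FID."""
    residual = residual_apod * np.exp(t / T)  # inverse of exp(-t/T) weighting
    return np.var(residual) / np.var(noise_unapod)
```

One caveat worth checking: multiplying by exp(t/T) strongly amplifies the late-time points, where the residual is essentially pure noise, so the corrected FQN can be dominated by the tail; restricting the variance computation to an early portion of the FID might be a useful sanity check.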