I would like to use the lmfit module to fit a function to a variable number of data-sets, with some shared and some individual parameters.
Here is an example generating Gaussian data, and fitting to each data-set individually:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters, report_fit
def func_gauss(params, x, data=[]):
A = params['A'].value
mu = params['mu'].value
sigma = params['sigma'].value
model = A*np.exp(-(x-mu)**2/(2.*sigma**2))
if data == []:
return model
return data-model
x = np.linspace( -1, 2, 100 )
data = []
for i in np.arange(5):
params = Parameters()
params.add( 'A' , value=np.random.rand() )
params.add( 'mu' , value=np.random.rand()+0.1 )
params.add( 'sigma', value=0.2+np.random.rand()*0.1 )
data.append(func_gauss(params,x))
plt.figure()
for y in data:
fit_params = Parameters()
fit_params.add( 'A' , value=0.5, min=0, max=1)
fit_params.add( 'mu' , value=0.4, min=0, max=1)
fit_params.add( 'sigma', value=0.4, min=0, max=1)
minimize(func_gauss, fit_params, args=(x, y))
report_fit(fit_params)
y_fit = func_gauss(fit_params,x)
plt.plot(x,y,'o',x,y_fit,'-')
plt.show()
# ideally I would like to write:
#
# fit_params = Parameters()
# fit_params.add( 'A' , value=0.5, min=0, max=1)
# fit_params.add( 'mu' , value=0.4, min=0, max=1)
# fit_params.add( 'sigma', value=0.4, min=0, max=1, shared=True)
# minimize(func_gauss, fit_params, args=(x, data))
#
# or:
#
# fit_params = Parameters()
# fit_params.add( 'A' , value=0.5, min=0, max=1)
# fit_params.add( 'mu' , value=0.4, min=0, max=1)
#
# fit_params_shared = Parameters()
# fit_params_shared.add( 'sigma', value=0.4, min=0, max=1)
# call_function(func_gauss, fit_params, fit_params_shared, args=(x, data))
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
I think you’re most of the way there. You need to put the data sets into an array or structure that can be used in a single, global objective function that you give to minimize() and fits all data sets with a single set of Parameters for all the data sets. You can share this set among data sets as you like. Expanding on your example a bit, the code below does work to do a single fit to the 5 different Gaussian functions. For an example of tying parameters across data sets, I used nearly identical value for sigma the 5 datasets the same value. I created 5 different sigma Parameters (‘sig_1’, ‘sig_2’, …, ‘sig_5’), but then forced these to have the same values using a mathematical constraint. Thus there are 11 variables in the problem, not 15.
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters, report_fit
def gauss(x, amp, cen, sigma):
"basic gaussian"
return amp*np.exp(-(x-cen)**2/(2.*sigma**2))
def gauss_dataset(params, i, x):
"""calc gaussian from params for data set i
using simple, hardwired naming convention"""
amp = params['amp_%i' % (i+1)].value
cen = params['cen_%i' % (i+1)].value
sig = params['sig_%i' % (i+1)].value
return gauss(x, amp, cen, sig)
def objective(params, x, data):
""" calculate total residual for fits to several data sets held
in a 2-D array, and modeled by Gaussian functions"""
ndata, nx = data.shape
resid = 0.0*data[:]
# make residual per data set
for i in range(ndata):
resid[i, :] = data[i, :] - gauss_dataset(params, i, x)
# now flatten this to a 1D array, as minimize() needs
return resid.flatten()
# create 5 datasets
x = np.linspace( -1, 2, 151)
data = []
for i in np.arange(5):
params = Parameters()
amp = 0.60 + 9.50*np.random.rand()
cen = -0.20 + 1.20*np.random.rand()
sig = 0.25 + 0.03*np.random.rand()
dat = gauss(x, amp, cen, sig) + np.random.normal(size=len(x), scale=0.1)
data.append(dat)
# data has shape (5, 151)
data = np.array(data)
assert(data.shape) == (5, 151)
# create 5 sets of parameters, one per data set
fit_params = Parameters()
for iy, y in enumerate(data):
fit_params.add( 'amp_%i' % (iy+1), value=0.5, min=0.0, max=200)
fit_params.add( 'cen_%i' % (iy+1), value=0.4, min=-2.0, max=2.0)
fit_params.add( 'sig_%i' % (iy+1), value=0.3, min=0.01, max=3.0)
# but now constrain all values of sigma to have the same value
# by assigning sig_2, sig_3, .. sig_5 to be equal to sig_1
for iy in (2, 3, 4, 5):
fit_params['sig_%i' % iy].expr='sig_1'
# run the global fit to all the data sets
result = minimize(objective, fit_params, args=(x, data))
report_fit(result)
# plot the data sets and fits
plt.figure()
for i in range(5):
y_fit = gauss_dataset(fit_params, i, x)
plt.plot(x, data[i, :], 'o', x, y_fit, '-')
plt.show()
For what it’s worth, I would consider holding the multiple data sets in a dictionary or list of DataSet class instead of a multi-dimensional array. Anyway, I hope this helps get you going onto what you really need to do.
Method 2
I’ve used simple approach: define a function firs n( = cargsnum) of arguments is common
for all data set’s other is individual
{
def likelihood_common(var, xlist, ylist, mlist, cargsnum):
cvars = var[:cargsnum]
iargnum = [model.func_code.co_argcount - 1 - cargsnum for model in mlist]
argpos = [cargsnum,] + list(np.cumsum(iargnum[:-1]) + cargsnum)
args = [list(cvars) + list(var[pos:pos+iarg]) for pos, iarg in zip(argpos, iargnum)]
res = [likelihood(*arg) for arg in zip(args, xlist, ylist, mlist)]
return np.sum(res)
}
here supposed that each data set have the same weight.
The issue I faced in this approach is weary low computation speed and instability
in case of large number of fitted parameters and data sets.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0