I am trying to predict house prices in the California housing dataset with a random forest. I do not understand why this simple code raises KeyError: 'squared_error':
from sklearn.datasets import fetch_california_housing
import sklearn.ensemble
housing = fetch_california_housing()
rfr = sklearn.ensemble.RandomForestRegressor(
    n_estimators=100,
    max_depth=int(25),
    max_features="auto",
    n_jobs=-1,
    oob_score=True,
    min_samples_leaf=20,
    criterion='squared_error')
rfr.fit(housing.data, housing.target)
Error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/sklearn/ensemble/_forest.py", line 387, in fit
    trees = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 1054, in __call__
    self.retrieve()
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 933, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/kratz/anaconda3/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/home/kratz/anaconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/sklearn/utils/fixes.py", line 222, in __call__
    return self.function(*args, **kwargs)
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/sklearn/ensemble/_forest.py", line 169, in _parallel_build_trees
    tree.fit(X, y, sample_weight=curr_sample_weight, check_input=False)
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/sklearn/tree/_classes.py", line 1247, in fit
    super().fit(
  File "/home/kratz/anaconda3/lib/python3.8/site-packages/sklearn/tree/_classes.py", line 350, in fit
    criterion = CRITERIA_REG[self.criterion](self.n_outputs_,
KeyError: 'squared_error'
Answers:
Method 1
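It’s probably due to the version of scikit-learn in your environment. According to the docs for RandomForestRegressor, criterion='squared_error' was introduced in v1.0, so if you have an earlier version, use criterion='mse' instead. The last frame of the traceback, CRITERIA_REG[self.criterion], is a plain dictionary lookup, and before v1.0 that dictionary has no 'squared_error' key, hence the KeyError.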
You can run pip freeze to check the versions of the libraries in your environment; for scikit-learn you can also run:
import sklearn
sklearn.__version__
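If the same script has to run on both older and newer installations, one option is to choose the criterion name from the installed version. The following is only a sketch: the helper names version_tuple, mse_criterion_name and build_forest are made up for illustration, and it assumes the version string starts with "major.minor". It also leaves out max_features="auto" from the question, since that value has been deprecated in later releases as far as I know.

```python
import re


def version_tuple(version):
    """Return (major, minor) from a version string such as '0.24.2' or '1.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def mse_criterion_name(sklearn_version):
    """Name of the mean-squared-error criterion for the given scikit-learn version.

    'squared_error' exists from v1.0 onwards; earlier versions call it 'mse'.
    """
    if version_tuple(sklearn_version) >= (1, 0):
        return "squared_error"
    return "mse"


def build_forest():
    """Hypothetical helper: build the regressor from the question with a
    criterion name that the installed scikit-learn understands."""
    # Imported here so the helpers above can be used without scikit-learn.
    import sklearn
    from sklearn.ensemble import RandomForestRegressor

    return RandomForestRegressor(
        n_estimators=100,
        max_depth=25,
        min_samples_leaf=20,
        oob_score=True,
        n_jobs=-1,
        criterion=mse_criterion_name(sklearn.__version__),
    )
```

Calling build_forest().fit(housing.data, housing.target) should then work on either side of the 1.0 boundary. Upgrading scikit-learn is usually the simpler fix, but a check like this helps when you cannot control the environment.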
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.