All Questions
1,924 questions
0 votes, 2 answers, 35 views
Python sklearn.model_selection giving error: numpy.dtype size changed
I have the following train/test split code:
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(new_cleaned_df, test_size=0.05, random_state=42, shuffle=True)
train_df....
1 vote, 2 answers, 140 views
'numpy.ndarray' object has no attribute 'groupby'
I am trying to apply target encoding to categorical features using the category_encoders.TargetEncoder in Python. However, I keep getting the following error:
AttributeError: 'numpy.ndarray' object ...
1 vote, 0 answers, 60 views
How to reduce the size of Numpy data type
I am using Python to compute cosine similarity:
similarity_matrix = cosine_similarity(tfidf_matrix)
The problem is that I am getting this error:
MemoryError: Unable to allocate 44.8 GiB for an array with ...
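For the dtype part of this question, here is a minimal sketch of how element size affects memory. The array `features` and the helper `similarity_matrix_gib` are hypothetical names for illustration, not the asker's data or an sklearn API. It only shows that float32 halves the footprint of float64 and how to estimate the size of a dense n-by-n result before allocating it.

```python
import numpy as np


def similarity_matrix_gib(n_rows, dtype=np.float64):
    """Estimate the size in GiB of a dense n_rows x n_rows matrix (hypothetical helper)."""
    return n_rows * n_rows * np.dtype(dtype).itemsize / 2**30


# Hypothetical stand-in for a dense feature matrix; NumPy defaults to float64.
features = np.random.default_rng(0).random((1000, 50))

# Casting to float32 halves the number of bytes per element.
features32 = features.astype(np.float32)

print(features.dtype, features.nbytes)      # float64 400000
print(features32.dtype, features32.nbytes)  # float32 200000

# A dense 10,000 x 10,000 result needs about 0.75 GiB in float64, half that in float32.
print(round(similarity_matrix_gib(10_000, np.float64), 3))
print(round(similarity_matrix_gib(10_000, np.float32), 3))
```

A smaller dtype only buys a constant factor. If the n-by-n result itself is too large, computing it in row blocks or keeping only the top matches per row may be the more practical route.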
0 votes, 1 answer, 46 views
How do I pass actual data series, not single values, as input arguments to sklearn's train_test_split?
I want to train an LSTM-based RNN model for binary classification, and for that I want to use a TensorFlow Keras model with LSTM layers. To do so, I need testing input and output as well as ...
0 votes, 0 answers, 407 views
ImportError: numpy.core.multiarray failed to import with newest numpy version
I am trying to run this code in VSCode and keep running into this error message:
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
...
0 votes, 0 answers, 38 views
Are there any neat ways to get warnings or errors if a numpy or scikit-learn operation is slow because of non-contiguous arrays?
This is something that seems to bite me repeatedly when using numpy for anything (most often things like scikit-learn).
It's very easy to make an array non-contiguous (all it takes is a.T), then pass ...
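One way to surface this, sketched under the assumption that a warning plus an explicit copy is acceptable: check `arr.flags.c_contiguous` before handing the array on. The helper name `ensure_c_contiguous` is hypothetical. `ndarray.flags` and `np.ascontiguousarray` are standard NumPy.

```python
import warnings

import numpy as np


def ensure_c_contiguous(arr, name="array"):
    """Warn and return a C-contiguous copy if arr is not C-contiguous (hypothetical helper)."""
    if arr.flags.c_contiguous:
        return arr
    warnings.warn(f"{name} is not C-contiguous; making a contiguous copy")
    return np.ascontiguousarray(arr)


a = np.arange(6).reshape(2, 3)
t = a.T  # transposing returns a view that is no longer C-contiguous

print(a.flags.c_contiguous, t.flags.c_contiguous)

fixed = ensure_c_contiguous(t, name="t")
print(fixed.flags.c_contiguous)
```

Whether a given scikit-learn estimator is actually slower on a non-contiguous input depends on the estimator, so a check like this is a diagnostic aid rather than a guaranteed speedup.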
0 votes, 0 answers, 35 views
ValueError: setting an array element with a sequence (no irregular dtype or jagged array; not a duplicate)
OK, I find this weird.
I am building a text classifier. I used gensim's word2vec with aggregation, but when I try to run the result through an sklearn classifier it gives me the ValueError: setting an ...
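A quick diagnostic for this kind of error, as a sketch only: collect the distinct shapes of the per-document vectors before stacking them. The helper `distinct_shapes` and the sample `doc_vectors` are hypothetical. The scalar entry imitates one possible cause (a document whose aggregation produced a single number instead of a vector), which may or may not be what happened here.

```python
import numpy as np


def distinct_shapes(vectors):
    """Return the set of shapes found among the given vectors (hypothetical helper)."""
    return {np.shape(v) for v in vectors}


# Hypothetical aggregated document vectors; the last entry is a scalar, not a vector.
doc_vectors = [np.zeros(3), np.ones(3), np.float64(0.0)]

shapes = distinct_shapes(doc_vectors)
print(shapes)

if len(shapes) > 1:
    print("Inconsistent shapes; these cannot be stacked into one 2-D float array.")
```

If this prints a single shape, the input really is regular and the cause lies elsewhere, for example in an object-dtype column.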
0 votes, 1 answer, 47 views
Loading a pipeline with a dense-array conversion step
I trained and saved the following model using joblib:
def to_dense(x):
    return np.asarray(x.todense())
to_dense_array = FunctionTransformer(to_dense, accept_sparse=True)
model = make_pipeline(
...
0 votes, 1 answer, 214 views
Installing old version of scikit-learn: ModuleNotFoundError: No module named 'numpy' [duplicate]
I have an old Python project that uses scikit-learn version 0.22.2.post1. Unfortunately I am unable to update to a newer version of scikit-learn as the training data has long been lost, and I ...
0 votes, 1 answer, 170 views
np.load fails with ValueError: cannot reshape array of size (838715,) into shape (838710,)
I'm trying to save the scaling parameters of a dataset into a .npy file on the disk, so I avoid having to recalculate them every time I re-run the code.
For now, I'm using MaxAbsScaler() from sklearn ...
2 votes, 0 answers, 224 views
Comparing specific linear regression functions in numpy, scipy, and sklearn: Differences in pre-processing and penalty functions
Looking at least-squares regression methods in various Python libraries, and going by their documentation and the two threads below (amongst others), is it right to say the following?
scipy.linalg.lstsq() ...
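To illustrate the pre-processing point, here is a small sketch using `numpy.linalg.lstsq` (not `scipy.linalg.lstsq`, though it solves the same kind of least-squares problem). The data are made up. The routine fits exactly the design matrix it is given, with no centering, scaling, or penalty, so an intercept has to be added by hand as a column of ones.

```python
import numpy as np

# Made-up data lying exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# lstsq does not add an intercept term, so append a column of ones ourselves.
A = np.column_stack([x, np.ones_like(x)])

coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

print(slope, intercept)
```

Passing `x.reshape(-1, 1)` alone would instead force the fitted line through the origin.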
0 votes, 1 answer, 62 views
Python3 skglm - 'Poisson' object has no attribute 'get_lipschitz'
I'm working with count data and want to fit a Poisson regression with an L1 norm. I have the following code, which throws the error and is reproducible:
import numpy as np
import skglm
import sklearn
X =...
1 vote, 1 answer, 58 views
How to save single Random Forest model with cross validation?
I am using 10 fold cross validation, trying to predict binary labels (Y) based on the embedding inputs (X).
I want to save one of the models (perhaps the one with the highest ROC AUC). I'm not sure ...
1 vote, 0 answers, 67 views
scikit-learn: ValueError: Input contains NaN, infinity or a value too large for dtype('float64') while predicting with GP
I've been using scikit-learn for Gaussian process regressors for a while, working with adaptively constructed models where the existing GP is used to select new datapoints for the GP. Recently I've ...
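A first step for this error, sketched with made-up data: locate the rows that contain NaN or infinity before calling predict. The helper `non_finite_rows` and the array `X_new` are hypothetical names. `np.isfinite` is standard NumPy.

```python
import numpy as np


def non_finite_rows(X):
    """Return indices of rows containing NaN or +/-inf (hypothetical helper)."""
    X = np.asarray(X, dtype=np.float64)
    return np.flatnonzero(~np.isfinite(X).all(axis=1))


# Made-up candidate points; rows 1 and 2 contain a NaN and an inf respectively.
X_new = np.array([[0.0, 1.0],
                  [np.nan, 2.0],
                  [3.0, np.inf]])

bad = non_finite_rows(X_new)
print(bad)  # [1 2]
```

If the inputs turn out to be finite, the non-finite values may be arising inside the model instead, which this check cannot detect.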
2 votes, 2 answers, 118 views
Nearest neighbor for list of arrays
I have a list of arrays like this (in x, y coordinates):
coordinates = array([[ 300, 2300],
       [ 670, 2360],
       [ 400, 2300]]), array([[1500, 1960],
       [1620, 2200],
       [1505, 1975]]), ...
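The excerpt is cut off, so the exact goal is unclear. Under one possible reading (for each point, find the closest other point within the same array), a brute-force NumPy sketch could look like this. The helper `nearest_neighbor_indices` is a hypothetical name, and only the first array from the excerpt is used.

```python
import numpy as np


def nearest_neighbor_indices(points):
    """For each point, return the index of its nearest other point (brute force)."""
    points = np.asarray(points, dtype=float)
    # Pairwise differences via broadcasting: shape (n, n, 2).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    return dist.argmin(axis=1)


first = np.array([[300, 2300],
                  [670, 2360],
                  [400, 2300]])

print(nearest_neighbor_indices(first))  # [2 2 0]
```

This builds an n-by-n distance matrix, so for large arrays a tree-based search would scale better.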