All Questions
18,790 questions
2
votes
2
answers
71
views
Pandas: Fill in missing values with an empty numpy array
I have a pandas DataFrame that I derive from a process like this:
df1 = pd.DataFrame({'c1':['A','B','C','D','E'],'c2':[1,2,3,4,5]})
df2 = pd.DataFrame({'c1':['A','B','C'],'c2':[1,2,3],'c3': [np.array((...
2
votes
4
answers
108
views
Pandas - fillna multiple columns with a given series, matching by index?
I'd like to use fillna to fill multiple columns of a pandas DataFrame with the same Series, matching by index:
import numpy as np
import pandas as pd
df_1 = pd.DataFrame(index = [0, 1, 2],...
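One possible approach to the index-matched fill described above, as a minimal sketch. The frame, the column names `a` and `b`, and the `filler` series are hypothetical stand-ins, since the question's own data is cut off. It relies on `Series.fillna` aligning a Series argument on the row index, whereas `DataFrame.fillna` matches a Series argument against column labels.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two columns with gaps, and a filler series sharing the row index.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]},
    index=[0, 1, 2],
)
filler = pd.Series([10.0, 20.0, 30.0], index=[0, 1, 2])

# Fill each column separately so the filler is aligned on the row index.
filled = df.apply(lambda col: col.fillna(filler))
print(filled)
```

Applying column by column keeps the existing non-missing values and only takes the filler value at the same index label where a gap occurs.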
2
votes
1
answer
112
views
Why does pd.isnull behave differently on DataFrame vs. single element?
I'm noticing an inconsistency in how pd.isnull behaves.
Given that pd.isnull('nan') returns False, but pd.isnull(float('nan')) returns True, I would have expected that applying pd.isnull to a ...
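The two scalar cases quoted in the excerpt can be checked directly. The small Series at the end is a hypothetical addition showing the same rule applied element by element; it does not try to reproduce the DataFrame case that the truncated question goes on to describe.

```python
import pandas as pd

# The string 'nan' is ordinary text, so it is not treated as missing.
text_result = pd.isnull("nan")
print(text_result)   # False

# float('nan') is a real NaN value, so it is treated as missing.
nan_result = pd.isnull(float("nan"))
print(nan_result)    # True

# Hypothetical object Series holding one of each: only the real NaN is flagged.
mixed = pd.Series(["nan", float("nan")])
mask = pd.isnull(mixed)
print(mask.tolist())
```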
2
votes
1
answer
64
views
Shuffle a dataset w.r.t a column value
I have the following DataFrame, which contains, among others, UserID and rank_group as attributes:
    UserID  Col2  Col3  rank_group
0        1     2     3           1
1        1     5     6           1
...
20       1     8  ...
4
votes
2
answers
146
views
Arrange consecutive zeros in pandas by a specific rule
I have a pandas Series like the following:
1 1
2 2
3 3
4 4
5 0
6 0
7 1
8 2
9 3
10 0
11 0
12 0
13 0
14 1
15 2
I have to ...
2
votes
4
answers
129
views
How to list a 2d array in a tabular form along with two 1d arrays from which it was generated?
I'm trying to calculate a 2d variable z = x + y where x and y are 1d arrays of unequal dimensions (say, x- and y-coordinate points on a spatial grid). I'd like to display the result row-by-row in ...
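A minimal sketch of one way to build and tabulate such a grid, assuming NumPy broadcasting and a pandas DataFrame for the display. The sample values of `x` and `y`, and the choice of `x` along the rows, are assumptions, since the excerpt is cut off before the desired layout.

```python
import numpy as np
import pandas as pd

# Hypothetical grid coordinates of unequal length.
x = np.array([0.0, 1.0, 2.0])
y = np.array([10.0, 20.0])

# Broadcasting: x as a column (3, 1) plus y as a row (1, 2) gives z with
# shape (3, 2), where z[i, j] = x[i] + y[j].
z = x[:, None] + y[None, :]

# Label rows with x and columns with y so all three arrays appear in one table.
table = pd.DataFrame(z, index=pd.Index(x, name="x"), columns=pd.Index(y, name="y"))
print(table)
```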
-1
votes
1
answer
64
views
Can't get grouped data into numpy array
I have a CSV file like this:
Ngày(Date),Số(Number)
07/03/2025,8
07/03/2025,9
...
06/03/2025,6
06/03/2025,10
06/03/2025,18
06/03/2025,14
...
(Each day has 27 numbers)
I want to predict a list of 27 ...
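A hedged sketch of getting per-day groups into a 2D NumPy array, assuming every day has the same number of rows. The inline CSV is a hypothetical miniature with 3 numbers per day instead of 27, and the values are illustrative rather than the asker's data.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical miniature of the file: same header, 3 numbers per day.
csv_text = """Ngày(Date),Số(Number)
07/03/2025,8
07/03/2025,9
07/03/2025,4
06/03/2025,6
06/03/2025,10
06/03/2025,18
"""
df = pd.read_csv(io.StringIO(csv_text))

# Collect each day's numbers in file order; sort=False keeps the days in the
# order they first appear in the file.
per_day = df.groupby("Ngày(Date)", sort=False)["Số(Number)"].apply(list)

# Stack the equal-length lists into an array of shape (days, numbers_per_day).
data = np.array(per_day.tolist())
print(data.shape)
```

If some days have fewer rows than others, the lists are ragged and this stacking step no longer yields a rectangular array.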
-5
votes
1
answer
103
views
assigning data in a pythonic way
Looking for suggestions on how to compress this code into a couple of lines.
One line for assigning columns, and the other for data.
df_input = pd.DataFrame(columns=['supply_temp', 'liquid_mass_flow','...
3
votes
2
answers
128
views
Why is this python code not running faster with parallelization?
This is a MWE of some code I'm writing to do some monte carlo exercises. I need to estimate models across draws and I'm parallelizing across models. In the MWE a "model" is just parametrized ...
0
votes
1
answer
77
views
adding interval to multiple bin counts in python/pandas
Context: Given a piece of paper with text/drawings sporadically printed on it, determine how many unmarked strips of a given width could be cut from it (without doing anything clever like offsetting ...
0
votes
2
answers
54
views
Automate initializing random integers for various columns in data frame Python
I am trying to reduce the following code to a for loop, or to something with fewer lines of code, as I have 14 similar columns. What would be a pythonic way to do it?
df['fan1_rpm'] = np....
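One way such a loop could look, as a sketch only. The excerpt shows just `fan1_rpm` before it is cut off, so the naming pattern `fan1_rpm` through `fan14_rpm`, the value range, the row count, and the use of NumPy's `Generator.integers` are all assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch repeatable
n_rows = 5                           # hypothetical row count

df = pd.DataFrame(index=range(n_rows))

# Hypothetical column names fan1_rpm ... fan14_rpm and value range [1000, 2000).
for i in range(1, 15):
    df[f"fan{i}_rpm"] = rng.integers(low=1000, high=2000, size=n_rows)

print(df.shape)
```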
2
votes
3
answers
108
views
Comparing dataframes
The goal is to compare two pandas dataframes considering a margin of error.
To reproduce the issue:
Importing pandas
import pandas as pd
Case one - dataframes with the same data
df1 = pd.DataFrame({"A...
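A minimal sketch of a tolerance-based comparison, assuming both frames are numeric and share the same shape and labels. The data and the `tolerance` value are hypothetical, since the question's own frames are cut off.

```python
import numpy as np
import pandas as pd

# Hypothetical frames that differ only by small amounts.
df1 = pd.DataFrame({"A": [1.00, 2.00], "B": [3.00, 4.00]})
df2 = pd.DataFrame({"A": [1.01, 2.00], "B": [3.00, 3.99]})

tolerance = 0.05  # hypothetical absolute margin of error

# Element-wise check: True where |df1 - df2| <= tolerance (rtol disabled).
close = np.isclose(df1.to_numpy(), df2.to_numpy(), rtol=0.0, atol=tolerance)
all_close = bool(close.all())
print(all_close)

# The pandas testing helper raises AssertionError if any value is outside the margin.
pd.testing.assert_frame_equal(df1, df2, check_exact=False, rtol=0.0, atol=tolerance)
```

The element-wise mask is useful for locating which cells differ, while the testing helper suits a pass/fail check.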
0
votes
1
answer
58
views
Groupby and add calculated columns based on multiple conditions from other columns
I have a dataset that I want to groupby, and then add some calculated columns based on conditions from other columns. I want the status to only include 'open' and 'closed', and I want the state to ...
0
votes
0
answers
42
views
CuPy ROI Analysis significantly slower than NumPy version on RTX 4090
I have two versions of an ROI analysis class - one using NumPy and one using CuPy. The CuPy version is running much slower despite using an RTX 4090. Both versions perform the same operations:
# ...
0
votes
1
answer
130
views
TypeError: category dtype does not support aggregation 'mean' for Movies
I used the following code with the groupby() function to find the top averages, budgets, revenue, etc. for movies as part of my Exploratory Data Analysis:
movies = df.groupby('Title')
movies.mean()....
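A sketch of one common way around this kind of error: select the numeric columns before aggregating, so a categorical column is never passed to `mean`. The frame and the column names `Genre`, `Budget`, and `Revenue` are hypothetical; only `Title` comes from the excerpt.

```python
import pandas as pd

# Hypothetical movie data with one categorical column.
df = pd.DataFrame({
    "Title": ["A", "A", "B"],
    "Genre": pd.Categorical(["Drama", "Drama", "Comedy"]),
    "Budget": [10.0, 20.0, 30.0],
    "Revenue": [15.0, 25.0, 60.0],
})

# Restrict the aggregation to numeric columns; the categorical Genre is left out.
numeric_cols = ["Budget", "Revenue"]
means = df.groupby("Title")[numeric_cols].mean()
print(means)
```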