All Questions

2 votes
2 answers
71 views

Pandas: Fill in missing values with an empty numpy array

I have a pandas DataFrame that I derive from a process like this: df1 = pd.DataFrame({'c1':['A','B','C','D','E'],'c2':[1,2,3,4,5]}) df2 = pd.DataFrame({'c1':['A','B','C'],'c2':[1,2,3],'c3': [np.array((...
cbw • 289
2 votes
4 answers
108 views

Pandas - fillna multiple columns with a given series, matching by index?

I'd like to use fillna to fill multiple columns of a pandas DataFrame with the same Series, matching by index: import numpy as np import pandas as pd df_1 = pd.DataFrame(index = [0, 1, 2],...
Faraz Masroor
2 votes
1 answer
112 views

Why does pd.isnull behave differently on DataFrame vs. single element?

I'm noticing an inconsistency in how pd.isnull behaves. Given that pd.isnull('nan') returns False, but pd.isnull(float('nan')) returns True, I would have expected that applying pd.isnull to a ...
pommelador
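
The excerpt above is cut off, so the DataFrame case the asker has in mind is not known. As a minimal sketch, the two scalar results quoted in the excerpt can be reproduced as follows; the small DataFrame and the names `string_nan`, `float_nan`, `df`, and `mask` are hypothetical and only illustrate element-wise behaviour, not the asker's actual data.

```python
import pandas as pd

# Scalar calls quoted in the excerpt.
string_nan = pd.isnull("nan")         # the text 'nan' is an ordinary string
float_nan = pd.isnull(float("nan"))   # a float NaN is a missing value

# Hypothetical DataFrame holding both kinds of value in one column.
df = pd.DataFrame({"value": ["nan", float("nan")]})

# pd.isnull applied to a DataFrame returns a boolean DataFrame, element by element.
mask = pd.isnull(df)

print(string_nan, float_nan)
print(mask)
```

In this sketch the string `'nan'` is treated as regular data in both the scalar and the DataFrame call; whether that matches the inconsistency described in the question cannot be told from the truncated excerpt.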
2 votes
1 answer
64 views

Shuffle a dataset w.r.t a column value

I have the following DataFrame, which contains, among others, UserID and rank_group as attributes:
    UserID  Col2  Col3  rank_group
0        1     2     3           1
1        1     5     6           1
...
20       1     8 ...
Carlo Allocca
4 votes
2 answers
146 views

Arrange consecutive zeros in pandas by a specific rule

I have a pandas Series as follows:
1     1
2     2
3     3
4     4
5     0
6     0
7     1
8     2
9     3
10    0
11    0
12    0
13    0
14    1
15    2
I have to ...
prem • 439
2 votes
4 answers
129 views

How to list a 2d array in a tabular form along with two 1d arrays from which it was generated?

I'm trying to calculate a 2d variable z = x + y where x and y are 1d arrays of unequal dimensions (say, x- and y-coordinate points on a spatial grid). I'd like to display the result row-by-row in ...
Schat17 • 33
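
One possible reading of the setup described above (z = x + y for 1d arrays of unequal length) can be sketched with NumPy broadcasting and a labelled pandas DataFrame. The sample values and the names `x`, `y`, `z`, and `table` are assumptions for illustration; the excerpt is truncated, so the layout the asker actually wants is not known.

```python
import numpy as np
import pandas as pd

x = np.array([0.0, 1.0, 2.0])   # hypothetical x-coordinates (length 3)
y = np.array([10.0, 20.0])      # hypothetical y-coordinates (length 2)

# Broadcasting: a (3, 1) column plus a (1, 2) row gives a (3, 2) grid,
# where z[i, j] = x[i] + y[j].
z = x[:, np.newaxis] + y[np.newaxis, :]

# Label rows with x and columns with y so each row prints next to its x value.
table = pd.DataFrame(
    z,
    index=pd.Index(x, name="x"),
    columns=pd.Index(y, name="y"),
)
print(table)
```

Using the coordinate arrays as index and column labels keeps the 1d inputs visible alongside the 2d result without building the table by hand.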
-1 votes
1 answer
64 views

Can't get grouped data into numpy array

I have a CSV file like this:
Ngày(Date),Số(Number)
07/03/2025,8
07/03/2025,9
...
06/03/2025,6
06/03/2025,10
06/03/2025,18
06/03/2025,14
...
(Each day has 27 numbers) I want to predict a list of 27 ...
gialociubc
-5 votes
1 answer
103 views

Assigning data in a Pythonic way

Looking for suggestions on how to compress this code into a couple of lines. One line for assigning columns, and the other for data. df_input = pd.DataFrame(columns=['supply_temp', 'liquid_mass_flow','...
Jesh Kundem
3 votes
2 answers
128 views

Why is this Python code not running faster with parallelization?

This is an MWE of some code I'm writing to do some Monte Carlo exercises. I need to estimate models across draws and I'm parallelizing across models. In the MWE a "model" is just parametrized ...
jtorca • 1,601
0 votes
1 answer
77 views

Adding interval to multiple bin counts in Python/pandas

Context: Given a piece of paper with text/drawings sporadically printed on it, determine how many unmarked strips of a given width could be cut from it (without doing anything clever like offsetting ...
Idle_92 • 69
0 votes
2 answers
54 views

Automate initializing random integers for various columns in data frame Python

I am trying to reduce the following code to a for loop, or otherwise to fewer lines of code, as I have 14 similar columns. What would be a Pythonic way to do it? df['fan1_rpm'] = np....
Jesh Kundem
2 votes
3 answers
108 views

Comparing dataframes

The goal is to compare two pandas DataFrames within a margin of error. To reproduce the issue, import pandas: import pandas as pd Case one, DataFrames with the same data: df1 = pd.DataFrame({"A...
Paulo Marques
0 votes
1 answer
58 views

Groupby and add calculated columns based on multiple conditions from other columns

I have a dataset that I want to group, and then add some calculated columns based on conditions from other columns. I want the status to only include 'open' and 'closed', and I want the state to ...
lala345 • 125
0 votes
0 answers
42 views

CuPy ROI Analysis significantly slower than NumPy version on RTX 4090

I have two versions of an ROI analysis class - one using NumPy and one using CuPy. The CuPy version is running much slower despite using an RTX 4090. Both versions perform the same operations: # ...
Santi • 368
0 votes
1 answer
130 views

TypeError: category dtype does not support aggregation 'mean' for Movies

I used this code with the groupby() function to find the top averages, budgets, revenue, etc. for movies as part of my exploratory data analysis: movies = df.groupby('Title') movies.mean()....
Nike Cage 675