
All Questions

0 votes
0 answers
12 views

Performance difference during fetching between pandas-gbq and bigquery_storage api in python

I can fetch data from gbq using two methods: df = pd.io.gbq.read_gbq( query, project_id=project_id, use_bqstorage_api=True, credentials=credentials, configuration=dict( ...
KJon's user avatar
  • 1
-4 votes
1 answer
24 views

How do I read a `.arrow` (Apache Arrow aka Feather V2 format) file with Python Pandas?

I'm trying to read an .arrow format file with Python pandas. pandas does not have a read_arrow function. However, it does have read_csv, read_parquet, and other similarly named functions. How can I ...
user2138149's user avatar
  • 17.9k
0 votes
1 answer
58 views

How to match a substring using a pattern and replace by passing a variable in RegEx, Python

I am trying to iterate through a Pandas dataframe's column values one by one to detect a substring with a RegEx pattern and replace it wherever it shows up. The string values in the dataframe's target ...
SimonsWorld's user avatar
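
For the pattern-and-variable replacement described in this question, a minimal sketch in plain Python may help. The pattern, the sample strings, and the helper name `replace_matches` are all hypothetical, since the excerpt is truncated and does not show the actual data. Passing the replacement through a callable is one way to keep backslashes or group references inside the variable from being interpreted by `re.sub`.

```python
import re

def replace_matches(values, pattern, replacement):
    """Replace every match of `pattern` in each string with `replacement`.

    `replace_matches` is a hypothetical helper, not part of the question.
    """
    compiled = re.compile(pattern)
    # The lambda returns the variable verbatim, so characters such as
    # backslashes in `replacement` are not treated as regex escapes.
    return [compiled.sub(lambda _match: replacement, value) for value in values]

# Assumed sample data standing in for one dataframe column.
values = ["order-123 shipped", "no match here", "order-77 and order-8"]
new_label = "ORDER"
result = replace_matches(values, r"order-\d+", new_label)
```

On a pandas column, `Series.str.replace(pattern, repl, regex=True)` applies the same idea without an explicit loop, and `repl` may also be a callable.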
0 votes
0 answers
58 views

How to compare every 2 rows (rows 1 and 2, rows 3 and 4, etc.) against each other and output the results to a table

I am working on a project that requires me to compare 2 rows (1 and 2, 3 and 4, etc.) and output the differences to a table. Now I have been able to compare the columns and create the table with ...
Ajlec12's user avatar
  • 45
1 vote
1 answer
51 views

xlsxwriter not applying the border to the full dataset

I'm simply trying to create a nice border for my dataset. It applies it nicely to the entire dataset except for the first row where the data actually starts. import pandas as pd import io # In-memory ...
user22083723's user avatar
1 vote
0 answers
35 views

How to convert from Python pandas Timestamp to repeated google.protobuf.Timestamp? (Python + Google Protocol Buffers)

I am trying to write some code which converts the contents of a pandas.DataFrame to a protobuf object which can be serialized and written to a file. Here is my protobuf definition. syntax = "...
user2138149's user avatar
  • 17.9k
1 vote
2 answers
71 views

Efficiently calculate time to first 'purchase' event per user in Pandas DataFrame

How can I compute time to first target event per user using Pandas efficiently (with edge cases)? I'm analyzing user behavior using a Pandas DataFrame that logs events on an app. Each row includes a ...
Samuel Olayiwola's user avatar
0 votes
0 answers
23 views

Modin: switch to Pandas because of "Mixed Partitioning columns in Parquet"

I would like to use Modin to read a partitioned parquet. The parquet has a single partition key of type int. When I run it, Modin automatically switches to the default pandas implementation with the ...
MarcelloDG's user avatar
1 vote
0 answers
69 views

How to fix read_csv system error in pandas?

I am getting a system error when using pd.read_csv(): import pandas as pd df = pd.read_csv('MLproject/color_names.csv', usecols=['Name', 'Hex']) The error I'm getting is: SystemError ...
user372087's user avatar
0 votes
2 answers
79 views

Merge more than 2 dataframes if they exist and are initialised

I am trying to merge three dataframes using intersection(). How can we check that all dataframes exist and are initialised before running intersection(), without multiple if-else check blocks? If any ...
RKIDEV's user avatar
  • 347
1 vote
1 answer
43 views

Down-sampling with Dask - Python

I'm trying to update the dependencies in our repository (running with Python 3.12.8) and stumbled across this phenomenon when updating Dask from dask[complete]==2023.12.1 to dask[complete]==2024.12.1: ...
Mina's user avatar
  • 81
3 votes
3 answers
84 views

Convert month abbreviation to full name

I have this function which converts an English month to a French month: def changeMonth(month): global CurrentMonth match month: case "Jan": return "Janvier" ...
user29295031's user avatar
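
One common alternative to a long `match` statement for this kind of conversion is a dictionary lookup. The sketch below is an assumption about what the asker needs: only the `"Jan"` to `"Janvier"` pair appears in the excerpt, so the remaining entries, the table name `MONTHS_FR`, and the function name `change_month` are hypothetical.

```python
# Hypothetical lookup table; only "Jan" -> "Janvier" is shown in the question.
MONTHS_FR = {
    "Jan": "Janvier", "Feb": "Février", "Mar": "Mars", "Apr": "Avril",
    "May": "Mai", "Jun": "Juin", "Jul": "Juillet", "Aug": "Août",
    "Sep": "Septembre", "Oct": "Octobre", "Nov": "Novembre", "Dec": "Décembre",
}

def change_month(month: str) -> str:
    """Return the French month name for an English three-letter abbreviation."""
    try:
        return MONTHS_FR[month]
    except KeyError:
        # Fail loudly on unexpected input instead of silently returning None.
        raise ValueError(f"unknown month abbreviation: {month!r}") from None
```

Returning the value rather than assigning to a global keeps the function easy to apply to a whole column, for example with `Series.map(MONTHS_FR)` in pandas.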
0 votes
0 answers
32 views

Can't build pandas_ods_reader through pip

I recently upgraded my OS and had to rebuild several packages (I use a virtual env) starting from pip itself. However, pandas_ods_reader fails to build now (had it working earlier): copying ...
user2751530's user avatar
0 votes
1 answer
52 views

Custom Shaping in pandas for Excel Output

I have a dataset with world population (for clarity, countries are limited to Brazil, Canada, Denmark): import pandas as pd world = pd.read_csv("../data/worldstats.csv") cond = world[...
Demeter P. Chen's user avatar
1 vote
2 answers
69 views

Python Pandas.read_csv header and index column not lining up

I have a bunch of csv files read from a teensy adc onto an SD card and am trying to extract them to be able to do some basic stats over each row. I have tried everything I can think of to try and fix ...
N Mastick's user avatar
