All Questions
249,336 questions
0
votes
0
answers
12
views
Performance difference during fetching between pandas-gbq and bigquery_storage api in python
I can fetch data from gbq using two methods:
df = pd.io.gbq.read_gbq(
    query,
    project_id=project_id,
    use_bqstorage_api=True,
    credentials=credentials,
    configuration=dict(
...
-4
votes
1
answer
24
views
How do I read a `.arrow` (Apache Arrow aka Feather V2 format) file with Python Pandas?
I'm trying to read an .arrow format file with Python pandas.
pandas does not have a read_arrow function. However, it does have read_csv, read_parquet, and other similarly named functions.
How can I ...
0
votes
1
answer
58
views
How to match a substring using a pattern and replace by passing a variable in RegEx, Python
I am trying to iterate through a Pandas dataframe's column values one by one to detect a substring with a RegEx pattern and replace it wherever it shows up.
The string values in the dataframe's target ...
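A minimal sketch of one way such a replacement could be written, without looping over rows. The column name `text`, the sample values, and the variables `target` and `replacement` are all hypothetical, since the excerpt is cut off before the real data appears:

```python
import re

import pandas as pd

# Hypothetical data; the real column values are not shown in the excerpt.
df = pd.DataFrame(
    {"text": ["app v1.0 released", "v1x0 is not a match", "upgrade from v1.0 to v2"]}
)

target = "v1.0"       # assumed substring to find, held in a variable
replacement = "v1.1"  # assumed replacement, also held in a variable

# re.escape makes the "." in the variable match a literal dot only;
# \b anchors the match to word boundaries.
pattern = rf"\b{re.escape(target)}\b"

# Vectorised replacement over the whole column.
df["text"] = df["text"].str.replace(pattern, replacement, regex=True)
print(df["text"].tolist())
```

Without `re.escape`, the dot in `target` would match any character, so the second row would be rewritten as well.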
0
votes
0
answers
58
views
How to compare every 2 rows (rows 1 and 2, rows 3 and 4, etc.) against each other and output the results to a table
I am working on a project that requires me to compare 2 rows (1 and 2, 3 and 4, etc.) and output the differences to a table. Now I have been able to compare the columns and create the table with ...
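The pairwise comparison described above can be sketched by splitting the frame into its odd and even rows and comparing the two halves. The column names and values are hypothetical, and the sketch assumes an even number of rows:

```python
import pandas as pd

# Hypothetical data: rows 1-2 form the first pair, rows 3-4 the second.
df = pd.DataFrame({"a": [1, 1, 5, 6], "b": ["x", "y", "z", "z"]})

# Split into the first and second member of each pair and give both
# halves the same 0..n-1 index so they line up pair by pair.
first = df.iloc[0::2].reset_index(drop=True)
second = df.iloc[1::2].reset_index(drop=True)

# Boolean table: True where the two rows of a pair differ.
changed = first.ne(second)

# Side-by-side table of differing values; cells that match are NaN.
diff = first.compare(second)
print(diff)
```

`DataFrame.compare` requires both frames to share the same labels, which is why the index is reset on each half; with an odd row count the halves differ in length and the call raises an error.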
1
vote
1
answer
51
views
xlsxwriter not applying the border to the full dataset
I'm simply trying to create a nice border for my dataset. It applies it nicely to the entire dataset except for the first row, where the data actually starts.
import pandas as pd
import io
# In-memory ...
1
vote
0
answers
35
views
How to convert from Python pandas Timestamp to repeated google.protobuf.Timestamp? (Python + Google Protocol Buffers)
I am trying to write some code which converts the contents of a pandas.DataFrame to a protobuf object which can be serialized and written to a file.
Here is my protobuf definition.
syntax = "...
1
vote
2
answers
71
views
Efficiently calculate time to first 'purchase' event per user in Pandas DataFrame
How can I compute time to first target event per user using Pandas efficiently (with edge cases)?
I'm analyzing user behavior using a Pandas DataFrame that logs events on an app. Each row includes a ...
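A minimal sketch of one vectorised approach, assuming hypothetical column names `user_id`, `event`, and `timestamp` (the excerpt does not show the real schema). Users who never purchase end up with `NaT`:

```python
import pandas as pd

# Hypothetical event log; user 2 never purchases, user 3 purchases at once.
events = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2, 3],
        "event": ["open", "view", "purchase", "open", "view", "purchase"],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 10:00", "2024-01-01 10:05", "2024-01-01 10:30",
                "2024-01-02 09:00", "2024-01-02 09:10", "2024-01-03 12:00",
            ]
        ),
    }
)

# Earliest event of any kind per user.
first_seen = events.groupby("user_id")["timestamp"].min()

# Earliest purchase per user; users without a purchase are absent here.
first_purchase = (
    events.loc[events["event"] == "purchase"]
    .groupby("user_id")["timestamp"]
    .min()
)

# Reindexing onto all users leaves NaT for those who never purchased.
time_to_purchase = first_purchase.reindex(first_seen.index) - first_seen
print(time_to_purchase)
```

Two `groupby` passes avoid a per-user Python loop; whether "time to first purchase" should start from the first event or from some other reference point is an assumption of this sketch.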
0
votes
0
answers
23
views
Modin: switch to Pandas because of "Mixed Partitioning columns in Parquet"
I would like to use Modin to read a partitioned parquet. The parquet has a single partition key of type int. When I run it, it automatically switches to the default pandas implementation with the ...
1
vote
0
answers
69
views
How to fix read_csv system error in pandas?
I am getting a system error when using pd.read_csv():
import pandas as pd
df = pd.read_csv('MLproject/color_names.csv', usecols=['Name', 'Hex'])
The error I'm getting is:
SystemError ...
0
votes
2
answers
79
views
Merge more than 2 dataframes if they exist and are initialised
I am trying to merge three dataframes using intersection(). How can we check that all dataframes exist and are initialised before running the intersection(), without multiple if-else check blocks? If any ...
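One possible way to avoid the chain of if-else checks is to collect the candidates in a sequence, drop the missing ones, and fold the rest together. This sketch swaps in an inner `merge` on a hypothetical `key` column in place of the `intersection()` call mentioned above, and it assumes an uninitialised frame is represented by `None`:

```python
from functools import reduce

import pandas as pd

# Hypothetical frames; df2 stands in for one that was never initialised.
df1 = pd.DataFrame({"key": [1, 2, 3], "a": [10, 20, 30]})
df2 = None
df3 = pd.DataFrame({"key": [2, 3, 4], "c": [200, 300, 400]})

# Keep only the frames that actually exist.
frames = [df for df in (df1, df2, df3) if df is not None]

# Fold the remaining frames together with an inner join on the shared key;
# fall back to an empty frame when nothing is available.
merged = (
    reduce(lambda left, right: left.merge(right, on="key", how="inner"), frames)
    if frames
    else pd.DataFrame()
)
print(merged)
```

The same filter-then-reduce shape works for any number of frames, and for other pairwise operations such as intersecting indexes.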
1
vote
1
answer
43
views
Down-sampling with Dask - Python
I'm trying to update the dependencies in our repository (running with Python 3.12.8) and stumbled across this phenomenon when updating Dask from dask[complete]==2023.12.1 to dask[complete]==2024.12.1:
...
3
votes
3
answers
84
views
Convert month abbreviation to full name
I have this function which converts an English month to a French month:
def changeMonth(month):
    global CurrentMonth
    match month:
        case "Jan":
            return "Janvier" ...
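A long `match` statement like the one above can often be replaced by a dictionary lookup. This is a sketch under assumptions: only the "Jan" to "Janvier" pair appears in the excerpt, the other entries are the standard French month names, and the function name `change_month` and the fallback behaviour are hypothetical:

```python
# Mapping from English three-letter abbreviations to French month names.
# Capitalisation follows the "Janvier" shown in the excerpt.
FRENCH_MONTHS = {
    "Jan": "Janvier", "Feb": "Février", "Mar": "Mars", "Apr": "Avril",
    "May": "Mai", "Jun": "Juin", "Jul": "Juillet", "Aug": "Août",
    "Sep": "Septembre", "Oct": "Octobre", "Nov": "Novembre", "Dec": "Décembre",
}


def change_month(month: str) -> str:
    """Return the French name for an English abbreviation.

    Unknown inputs are returned unchanged (an assumption of this sketch).
    """
    return FRENCH_MONTHS.get(month, month)


print(change_month("Jan"))
```

The lookup needs no global state, and adding or correcting a month only means editing the dictionary.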
0
votes
0
answers
32
views
Can't build pandas_ods_reader through pip
I recently upgraded my OS and had to rebuild several packages (I use a virtual env) starting from pip itself. However, pandas_ods_reader fails to build now (had it working earlier):
copying ...
0
votes
1
answer
52
views
Custom Shaping in pandas for Excel Output
I have a dataset with world population (for clarity, countries are limited to Brazil, Canada, Denmark):
import pandas as pd
world = pd.read_csv("../data/worldstats.csv")
cond = world["...
1
vote
2
answers
69
views
Python Pandas.read_csv header and index column not lining up
I have a bunch of CSV files read from a Teensy ADC onto an SD card and am trying to extract them to be able to do some basic stats over each row.
I have tried everything I can think of to try and fix ...