All Questions

2 votes
2 answers
157 views

How to efficiently process a large CSV file with pandas when memory is limited?

I'm working with a very large CSV file (around 10GB) that doesn't fit into my computer's memory. When I try to load it into a pandas DataFrame using pd.read_csv(), I get a MemoryError. What's the most ...
rysch
  • 96
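
One common way to approach the memory problem described in this question is to read the file in chunks rather than all at once. The sketch below is only an illustration under assumptions: the helper name `sum_column_in_chunks`, the column name `value`, and the tiny demo file are hypothetical, and a running sum stands in for whatever per-chunk processing the real task needs.

```python
import csv
import os
import tempfile

import pandas as pd


def sum_column_in_chunks(path, column, chunksize=100_000):
    """Aggregate one column without loading the whole file into memory."""
    total = 0.0
    rows = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    # usecols limits parsing to the column we actually need.
    for chunk in pd.read_csv(path, usecols=[column], chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return total, rows


# Small demo file standing in for the 10GB CSV (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "value"])
        writer.writerows((i, i * 2) for i in range(10))

    demo_total, demo_rows = sum_column_in_chunks(demo_path, "value", chunksize=3)
```

Only one chunk is held in memory at a time, so peak usage depends on `chunksize` and the selected columns rather than on the file size. Passing explicit `dtype` values to `pd.read_csv` may reduce memory further.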
3 votes
2 answers
71 views

How to preprocess multivalue attributes in a dataframe?

Description: Input is a CSV file. The CSV file contains columns of different data types: ordinal values, nominal values, numerical values, and multi-value. For the multivalue columns, the minimum is 1, ...
DILF Unboxing
1 vote
1 answer
62 views

How can I extract specific objects, transpose, and combine multiple, complex-nested JSON files into a CSV using python and pandas?

I'm aware of the several posts that cover this topic, and I apologise in advance. I've been reading and trying several times. Here are three example JSON files that I save into fildir: https://data.sec....
shrykullgod
0 votes
0 answers
65 views

Is there a function to extract datetime from CSV without parsing it to a string?

I have raw measurement data in the form of a CSV file (RAW CSV data format). I have code that averages the milliseconds to the second, and then I need to apply a time correction of 1 hour 10 ...
Pushkar K
0 votes
1 answer
54 views

Pandas adds "." + digit to the header of a csv

I would like to import a CSV file with headers into pandas. Somehow, pandas appends a ".7" to the last header's name. The last header in the CSV contains a "?" as the last character (...
Adler
  • 2,817
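
The excerpt above is truncated, so the actual cause in that question is not known here. One pandas behaviour that is known to produce a "." + digit suffix is the de-duplication of repeated column names in `read_csv`. The sketch below demonstrates only that behaviour; the header names and data are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV in which the same header name appears twice.
raw = "id,score?,score?\n1,2,3\n"

# read_csv renames repeated headers by appending ".1", ".2", ...
df = pd.read_csv(io.StringIO(raw))
mangled = list(df.columns)

# To inspect the header row exactly as written, read it as ordinary data.
raw_names = pd.read_csv(io.StringIO(raw), header=None, nrows=1).iloc[0].tolist()
```

Comparing `mangled` with `raw_names` shows whether a suffix comes from pandas or is already present in the file.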
0 votes
1 answer
59 views

Splitting into multiple dataframes if criteria is met

I am working on a small program and need some guidance. Basically I am trying to read a CSV, put the attributes into a data frame and filter where "video = 1". This has been done. What I ...
Brian Hamilton
0 votes
0 answers
52 views

I cannot get all data to export to CSV

# Collect batting stats for the 2022, 2023, and 2024 seasons try: print("Collecting batting stats from 2022 to 2024...") batting_data = batting_stats(2021, 2024, league="all&...
Chad Broussard
1 vote
2 answers
93 views

How can I clean a year column with messy values?

I have a project I'm working on for a data analysis course, where we pick a data set and go through the steps of cleaning and exploring the data with a question to answer in mind. I want to be able to ...
Jubilbee Draws
2 votes
2 answers
98 views

How to Write a Pandas DataFrame to CSV With Strings Quoted and Integers/Empty Cells Unaltered Without Adding Escape Characters for Commas?

I am working on a Python script to write a DataFrame to a CSV file. My goal is to: Enclose all string values in double quotes ("). Keep numeric values unchanged (no quotes). Leave empty cells as ...
Aswany Mahendran
1 vote
1 answer
55 views

How to prevent Pandas to_csv double quoting empty fields in output csv

I currently have a sample Python script that reads a CSV with double quotes as a text qualifier and removes ASCII characters and line feeds in the fields via a dataframe. It then outputs the dataframe ...
Eseosa Omoregie
0 votes
1 answer
40 views

Dataframe of dataframes: writing and reading

I have a set of images. In each image, a program finds objects with attributes X and type. The number of objects varies from image to image. Hence for one image I have a df_objects with N_objects rows ...
Bruno Mansoulie
0 votes
0 answers
62 views

Error tokenizing data when merging multiple CSV files into one Excel file

I'm having an issue with an "Error tokenizing data" error. I'm trying to merge multiple CSV files into one Excel file, and my files are large and contain some special characters. This is the error I ...
noel feng
0 votes
1 answer
65 views

Pandas dataframe is mangled when writing to csv

I have written a pipeline to send queries to UniProt, but am having a strange issue with one of the queries. I've put this into a small test case below. I am getting the expected dataframe (df) ...
Tim Kirkwood
0 votes
0 answers
48 views

How to handle a cell with a comma when using pd.read_csv? Error tokenizing data. C error: Expected 1 fields in line 88, saw 2

I'm reading a set of .csv files and adding them to one giant data frame called 'df', but I kept getting this error in some of my files: Error tokenizing data. C error: Expected 1 fields in line 88, ...
gracemcmc
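
The error quoted in this question typically appears when a row has more delimiter-separated fields than the header. The following is a minimal sketch with hypothetical data, not a diagnosis of the asker's files: it reproduces the error with an unquoted comma, then shows two possible ways around it. It assumes pandas 1.3 or later, where `on_bad_lines` is available.

```python
import io

import pandas as pd

# Hypothetical one-column file where one value contains an unquoted comma.
unquoted = "name\nalice\nbob, jr\ncarol\n"

try:
    pd.read_csv(io.StringIO(unquoted))
    failed = False
except pd.errors.ParserError:
    # Line 3 has 2 fields but the header declares 1.
    failed = True

# Option 1: drop the rows that have too many fields.
skipped = pd.read_csv(io.StringIO(unquoted), on_bad_lines="skip")

# Option 2: if the source can be fixed, quote the field so the comma is data.
quoted = 'name\nalice\n"bob, jr"\ncarol\n'
kept = pd.read_csv(io.StringIO(quoted))
```

Skipping silently discards data, so `on_bad_lines="warn"` may be a safer first step while investigating. If the file actually uses another delimiter, passing it via `sep` is the more direct fix.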
0 votes
1 answer
272 views

Expected String or bytes-like object, got 'float'

I'm trying to make an ETL (extract, transform, and load) algorithm with Python. I have an Amazon review database, but when I use the DataFrame.apply() method to apply a function with a regex, I get the ...
Filipy
  • 29
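
A frequent source of this message is a missing value: pandas stores missing text as `NaN`, which is a float, and `re` functions accept only strings or bytes. The sketch below uses hypothetical review text and a hypothetical `strip_digits` helper to reproduce the error and show one possible workaround; it is not necessarily the cause in the asker's data.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical review column with one missing entry.
reviews = pd.Series(["Great product 10/10", np.nan, "Broke after 2 days"])


def strip_digits(text):
    # re.sub requires a str; a float NaN raises TypeError.
    return re.sub(r"\d+", "", text)


try:
    reviews.apply(strip_digits)
    raised = False
except TypeError:
    raised = True

# The vectorised string method skips missing values instead of failing.
cleaned = reviews.str.replace(r"\d+", "", regex=True)
```

Alternatives include calling `fillna("")` before `apply`, or guarding inside the function with `isinstance(text, str)`.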