How do I dump a 2D NumPy array into a csv file in a human-readable format?
13 Answers
numpy.savetxt
saves an array to a text file.
import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")
-
Is this preferred over looping through the array by dimension? I'm guessing so. Commented May 21, 2011 at 10:13
-
You can also change the format of each value with the fmt keyword. The default is '%.18e', which can be hard to read; with '%.3e' only 3 decimals are shown. Commented May 22, 2011 at 17:25
-
Your method works well for numerical data, but it throws an error for a numpy.array of strings. Could you prescribe a method to save as csv for a numpy.array object containing strings? Commented Mar 25, 2016 at 14:31
-
@ÉbeIsaac You can specify the format as a string as well: fmt='%s' – Luis, commented Apr 6, 2017 at 16:34
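A minimal sketch of the fmt='%s' suggestion for string arrays. The array contents and the file name are made up for illustration, and the file goes to a temporary directory. Note that savetxt does no quoting, so a value that itself contains a comma would break the columns.

```python
import os
import tempfile

import numpy as np

# Hypothetical string data; any 2D array of str behaves the same way.
names = np.array([["alice", "paris"], ["bob", "oslo"]])

path = os.path.join(tempfile.mkdtemp(), "names.csv")

# '%s' formats each element with str(), so non-numeric dtypes are accepted.
np.savetxt(path, names, fmt="%s", delimiter=",")

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```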
Use the pandas library's DataFrame.to_csv. It does take some extra memory, but it's very fast and easy to use.
import pandas as pd
df = pd.DataFrame(np_array)
df.to_csv("path/to/file.csv")
If you don't want a header or index, use:
df.to_csv("path/to/file.csv", header=False, index=False)
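As a self-contained sketch of the same idea, the snippet below builds a small stand-in array, names its columns (the names a, b, c are invented for the example), and round-trips it through CSV text. It relies on to_csv returning a string when no path is given.

```python
import io

import numpy as np
import pandas as pd

np_array = np.array([[1, 2, 3], [4, 5, 6]])  # stand-in for your own array
df = pd.DataFrame(np_array, columns=["a", "b", "c"])  # hypothetical column names

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)

# Read the text back and recover a NumPy array.
restored = pd.read_csv(io.StringIO(csv_text)).to_numpy()
```

Passing a file path instead of omitting it writes the same text to disk.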
-
I find again and again that the best csv exports are when 'piped' into pandas' to_csv – mork, commented Apr 2, 2017 at 8:03
-
Not good. This creates a df and consumes extra memory for nothing – Tex, commented May 31, 2017 at 23:05
-
Worked like a charm, it's very fast; the tradeoff is extra memory usage. The parameters header=None, index=None remove the header row and index column. Commented Nov 24, 2017 at 6:39
-
The numpy.savetxt method is great, but it puts a hash symbol at the start of the header line. – Dave C, commented Dec 12, 2018 at 16:35
-
@DaveC: You have to set the comments keyword argument to ''; the # will then be suppressed. – Milind R, commented Jan 14, 2019 at 20:31
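For reference, a small sketch of writing a header row with numpy.savetxt without the leading hash, using the header and comments keywords. The column names x, y, z and the file name are assumptions for the example.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
path = os.path.join(tempfile.mkdtemp(), "with_header.csv")

# header is written as the first line; comments="" drops the default "# " prefix.
np.savetxt(path, a, fmt="%d", delimiter=",", header="x,y,z", comments="")

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```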
ndarray.tofile is a convenient function to do this:
import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')
The documentation has some useful notes:
This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.
Note: this function does not produce multi-line csv files; it flattens the array and saves everything on one line.
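To make the single-line behaviour concrete, here is a small sketch that writes a 2D array with tofile, inspects the raw text, and reads it back with np.fromfile. The file name is arbitrary, and the shape has to be restored by hand because it is not stored in the file.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
path = os.path.join(tempfile.mkdtemp(), "flat.csv")

# In text mode (sep given), tofile writes the flattened array with no row breaks.
a.tofile(path, sep=",", format="%d")

with open(path) as f:
    content = f.read()
print(repr(content))

# Reading back yields a 1-D array; the original shape must be supplied again.
restored = np.fromfile(path, sep=",", dtype=int).reshape(a.shape)
```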
-
As far as I can tell, this does not produce a csv file, but puts everything on a single line. – Peter, commented Jan 14, 2016 at 18:46
-
@Peter, good point, thanks, I've updated the answer. For me it does save ok in csv format (albeit limited to one line). Also, it's clear that the asker's intent is to "dump it in human-readable format", so I think the answer is relevant and useful. – Lee, commented Jan 15, 2016 at 10:35
-
Actually, np.savetxt() provides the newline argument, not ndarray.tofile() – eaydin, commented Aug 26, 2018 at 0:48
As already discussed, the best way to dump the array into a CSV file is the .savetxt(...) method. However, there are a few things to know to do it properly.
For example, if you have a NumPy array with dtype = np.int32, such as
narr = np.array([[1, 2],
                 [3, 4],
                 [5, 6]], dtype=np.int32)
and save it using savetxt as
np.savetxt('values.csv', narr, delimiter=",")
it will store the data in floating-point exponential format:
1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00
You will have to change the formatting with the fmt parameter:
np.savetxt('values.csv', narr, fmt="%d", delimiter=",")
to store the data in its original integer format.
Saving Data in Compressed gz format
Also, savetxt can store data in .gz compressed format, which might be useful when transferring data over a network.
We just need to give the file a .gz extension and NumPy will compress it automatically:
np.savetxt('values.gz', narr, fmt="%d", delimiter=",")
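A quick way to check the compressed output, sketched under the assumption that a temporary file named values.csv.gz is acceptable: the file can be opened with the standard gzip module, and np.loadtxt also decompresses .gz files transparently.

```python
import gzip
import os
import tempfile

import numpy as np

narr = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int32)
path = os.path.join(tempfile.mkdtemp(), "values.csv.gz")

# The .gz extension makes savetxt write gzip-compressed text.
np.savetxt(path, narr, fmt="%d", delimiter=",")

# The file is an ordinary gzip archive containing plain CSV text.
with gzip.open(path, "rt") as f:
    lines = f.read().splitlines()
print(lines)

# loadtxt detects the .gz extension and decompresses on the fly.
restored = np.loadtxt(path, delimiter=",", dtype=np.int32)
```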
Hope it helps
Writing record arrays as CSV files with headers requires a bit more work.
This example reads from a CSV file (example.csv) and writes its contents to another CSV file (out.csv).
import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
# (np.recfromcsv is deprecated in NumPy 2.0; np.genfromtxt with
# delimiter=',', names=True, dtype=None is the suggested replacement)
ar = np.recfromcsv('example.csv', encoding='ascii')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')],
#           dtype=[('col1', '<i8'), ('col2', '<f8'), ('col3', '<U13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')
Note that the above example cannot handle values which are strings with commas. To always enclose non-numeric values within quotes, use the built-in csv module:
import csv

with open('out2.csv', 'w', newline='') as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())
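To illustrate why the quoting matters, the sketch below builds a small structured array directly (so it does not depend on example.csv) with one string that contains a comma, writes it to an in-memory buffer, and reads it back. The field values are invented for the example.

```python
import csv
import io

import numpy as np

# Hypothetical structured array; the second row's string contains a comma.
ar = np.array(
    [(1, 100.1, "plain"), (2, 222.2, "has, comma")],
    dtype=[("col1", "i8"), ("col2", "f8"), ("col3", "U16")],
)

buf = io.StringIO(newline="")
writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(ar.dtype.names)
writer.writerows(ar.tolist())  # tolist() yields tuples of plain Python values

# With QUOTE_NONNUMERIC the reader converts unquoted fields to float
# and keeps quoted fields as strings, commas included.
buf.seek(0)
rows = list(csv.reader(buf, quoting=csv.QUOTE_NONNUMERIC))
print(rows)
```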
-
This is where pandas again helps. You can do: pd.DataFrame(out, columns=['col1', 'col2']), etc. – EFreak, commented May 11, 2020 at 21:51
To store a NumPy array to a text file, use savetxt from the NumPy module.
Suppose your NumPy array is named train_df:
import numpy as np
np.savetxt('train_df.txt', train_df, fmt='%s')
OR
from numpy import savetxt
savetxt('train_df.txt', train_df, fmt='%s')
-
Since you are calling np.savetxt(...), you don't need the import call from numpy import savetxt. If you do import it, you can simply call it as savetxt(...). – Atybzz, commented Jan 20, 2022 at 19:29
I believe you can also accomplish this quite simply as follows:
- Convert Numpy array into a Pandas dataframe
- Save as CSV
e.g. #1:
# Libraries to import
import pandas as pd
import numpy as np
# N x N numpy array (dimensions don't matter)
corr_mat = np.eye(3)  # placeholder; substitute your own numpy array
my_df = pd.DataFrame(corr_mat)  # converting it to a pandas dataframe
e.g. #2:
#save as csv
my_df.to_csv('foo.csv', index=False) # "foo" is the name you want to give
# to csv file. Make sure to add ".csv"
# after whatever name like in the code
If you want to write in column:
for x in np.nditer(a.T, order='C'):
    file.write(str(x))
    file.write("\n")
Here 'a' is the name of the numpy array and 'file' is an open, writable file object.
If you want to write in row:
import csv
writer = csv.writer(file, delimiter=',')
row = []  # collect every element, then write them as a single row
for x in np.nditer(a.T, order='C'):
    row.append(str(x))
writer.writerow(row)
In Python we use the csv.writer() function from the csv module to write data into csv files. It is the counterpart of csv.reader().
import csv
person = [['SN', 'Person', 'DOB'],
['1', 'John', '18/1/1997'],
['2', 'Marie','19/2/1998'],
['3', 'Simon','20/3/1999'],
['4', 'Erik', '21/4/2000'],
['5', 'Ana', '22/5/2001']]
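csv.register_dialect('myDialect',
                     delimiter='|',
                     quoting=csv.QUOTE_NONE,
                     skipinitialspace=True)
# newline='' lets the csv module control the line endings
with open('dob.csv', 'w', newline='') as f:
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
        writer.writerow(row)
# no f.close() needed: the with block closes the file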
A delimiter is a one-character string used to separate fields. The default value is a comma (,).
-
This has already been suggested: stackoverflow.com/a/41009026/8881141 Please only add new approaches, don't repeat previously published suggestions. – Mr. T, commented Nov 8, 2018 at 12:16
numpy.savetxt() saves a NumPy array to an output text file; however, by default it uses scientific notation.
If you'd like to avoid this, specify an appropriate format using the fmt argument. For example,
import numpy as np
np.savetxt('output.csv', arr, delimiter=',', fmt='%f')
If you want to save your numpy array (e.g. your_array = np.array([[1,2],[3,4]])) to one cell, you could convert it first with your_array.tolist().
Then save it the normal way to one cell, with delimiter=';', and the cell in the csv file will look like this: [[1, 2], [3, 4]]
Then you could restore your array like this (after import ast):
your_array = np.array(ast.literal_eval(cell_string))
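One possible way to put this together, sketched with the standard csv module and an in-memory buffer. The column names sample_id and matrix and the id value 7 are invented for the example; the ';' delimiter keeps the commas inside the cell from being treated as field separators.

```python
import ast
import csv
import io

import numpy as np

your_array = np.array([[1, 2], [3, 4]])

buf = io.StringIO(newline="")
writer = csv.writer(buf, delimiter=";")
writer.writerow(["sample_id", "matrix"])          # hypothetical header
writer.writerow([7, str(your_array.tolist())])    # whole array in one cell

# Read the cell back and rebuild the array from its list representation.
buf.seek(0)
rows = list(csv.reader(buf, delimiter=";"))
cell_string = rows[1][1]
restored = np.array(ast.literal_eval(cell_string))
print(cell_string)
```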
-
Well, that is literally going to destroy all the memory savings of using a numpy array. Commented Apr 16, 2018 at 8:00
You can also do it with pure Python, without importing any modules.
# format as a block of csv text to do whatever you want
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)
# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)
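The snippet above unpacks exactly two columns per row. A small variation, shown here as a sketch with a made-up nested list, handles any number of columns by joining each row's values:

```python
# Hypothetical data; a 2D NumPy array can be iterated row by row the same way.
array = [[1, 2, 3], [4, 5, 6]]

# Join the values of each row with commas, then join the rows with newlines.
csv_rows = [",".join(str(value) for value in row) for row in array]
csv_text = "\n".join(csv_rows)

print(csv_text)
```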
As other answers mentioned, it's important to pass the fmt= argument in order to save a "human-readable" file. In fact, if you pass a separate format for each column, you don't need to pass a delimiter.
arr = np.arange(9).reshape(3, 3)
np.savetxt('out.csv', arr, fmt='%f,%.2f,%.1f')
It saves a file whose contents look like:
0.000000,1.00,2.0
3.000000,4.00,5.0
6.000000,7.00,8.0
Now to read the file from csv, use np.loadtxt():
np.loadtxt('out.csv', delimiter=',')
If you want to append to an existing file (as well as create a new file), use a context manager and open a file with mode='ab'.
with open('out.csv', 'ab') as f:
    np.savetxt(f, arr, delimiter=',', fmt='%.1f')
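A short sketch of the append behaviour: writing the same array twice in 'ab' mode and loading the result should give the rows stacked one after the other. The temporary file name is an assumption for the example.

```python
import os
import tempfile

import numpy as np

arr = np.arange(9).reshape(3, 3)
path = os.path.join(tempfile.mkdtemp(), "appended.csv")

# The first pass creates the file; the second pass appends three more rows.
for _ in range(2):
    with open(path, "ab") as f:
        np.savetxt(f, arr, delimiter=",", fmt="%.1f")

stacked = np.loadtxt(path, delimiter=",")
print(stacked.shape)
```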