csvRead
Read comma-separated value file
Calling Sequence
M = csvRead(filename)
M = csvRead(filename, separator)
M = csvRead(filename, separator, decimal)
M = csvRead(filename, separator, decimal, conversion)
M = csvRead(filename, separator, decimal, conversion, substitute)
M = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range)
M = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range, header)
[M, comments] = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range, header)
Arguments
- filename
a 1-by-1 matrix of strings, the file path.
- separator
a 1-by-1 matrix of strings, the field separator used.
- decimal
a 1-by-1 matrix of strings, the decimal mark used.
If decimal is different from [] and conversion is set to "string", the decimal conversion is still performed.
- conversion
a 1-by-1 matrix of strings, the type of the output M. Available values are "string" or "double" (the default).
Note that read_csv has "string" as default.
- substitute
an m-by-2 matrix of strings, a replacement map (default = [], meaning no replacements). The first column substitute(:,1) contains the searched strings and the second column substitute(:,2) contains the replacement strings. Every occurrence of a searched string in the file is replaced.
- regexpcomments
a string: a regexp. Lines that match it are removed. (default: [])
- range
a 1-by-4 matrix of floating point integers, the range of rows and columns to be read (default range=[], meaning all the rows and columns). Specify range using the format [R1 C1 R2 C2], where (R1,C1) is the upper left corner of the data to be read and (R2,C2) is the lower right corner.
Note that the file has to be correctly formatted. The range is applied in memory, on the parsed elements.
- header
a 1-by-1 matrix of floating point integers, the number of lines to be ignored at the beginning of the file.
- M
an m-by-n matrix of strings or doubles.
- comments
an m-by-n matrix of strings matched by regexp.
Description
Given an ASCII file whose fields are delimited by commas, this function returns the corresponding Scilab matrix of strings or doubles.
For example, the .csv data file may have been created by spreadsheet software using the "Text and comma" format.
It might happen that the columns are separated by a non-comma separator. In this case, use csvRead(filename, separator) for another choice of separator.
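As an illustration of a non-comma separator, here is a minimal sketch that writes a small semicolon-separated file and reads it back. The file name and the variable names demofile and Msep are hypothetical and only serve this example.

```scilab
// Hypothetical file name, used only for this illustration.
demofile = fullfile(TMPDIR, "semicolon_demo.csv");

// Write three rows whose fields are separated by ";".
mputl(["1;2;3"; "4;5;6"; "7;8;9"], demofile);

// Pass the separator explicitly as the second argument.
Msep = csvRead(demofile, ";");
disp(Msep);
```

With the default conversion, Msep is expected to be a 3-by-3 matrix of doubles.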
The default values of the optional input arguments are defined by the csvDefault function.
Any optional input argument equal to the empty matrix [] is set to its default value.
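The sketch below illustrates both points: it queries one default with csvDefault, then uses [] to leave separator and decimal at their defaults while customizing only conversion. The file name and the variables demofile, sep and S are hypothetical, and the example assumes the defaults have not been changed in the current session.

```scilab
// Query the current default separator (assumed unchanged in this session).
sep = csvDefault("separator");

// Hypothetical file name, used only for this illustration.
demofile = fullfile(TMPDIR, "defaults_demo.csv");
mputl(["a,b"; "c,d"], demofile);

// separator and decimal are left at their defaults with [];
// only the conversion is customized.
S = csvRead(demofile, [], [], "string");
disp(S);
```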
When the input argument "conversion" is equal to "double", the non-numeric fields within the .csv (e.g. strings) are converted into NaN.
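A minimal sketch of this behavior, assuming a small file that mixes numbers and text. The file name and the variables demofile and N are hypothetical.

```scilab
// Hypothetical file name, used only for this illustration.
demofile = fullfile(TMPDIR, "nan_demo.csv");

// Two rows, each containing one non-numeric field.
mputl(["1,abc,3"; "4,5,xyz"], demofile);

// Read as doubles: per the description above, the text fields
// "abc" and "xyz" should come back as NaN.
N = csvRead(demofile, ",", ".", "double");

// Locate the fields that could not be converted.
isnan(N)
```

To keep the original text of such fields, read the file with conversion set to "string" instead.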
csvRead is able to handle both UTF-8 and ASCII text files.
Examples
The following script presents some basic uses of the csvRead function.
// Create a file with some data separated with tabs.
M = 1:50;
filename = fullfile(TMPDIR, "data.csv");
csvWrite(M, filename, ascii(9), '.');

// Read the csv file as strings
M1 = csvRead(filename, ascii(9), [], 'string')

// Returns a double
M2 = csvRead(filename, ascii(9), '.', 'double')

// Compares original data and result.
and(M == M2)

// Use the substitute argument to manage
// special data files.
content = [
"1"
"Not-A-Number"
"2"
"Not-A-Number"
];
substitute = [
"Not-A-Number" "Nan"
];
mputl(content, filename);
M = csvRead(filename, ",", ".", "double", substitute)
isnan(M(2,1)) // Expected=%t
isnan(M(4,1)) // Expected=%t
The following script presents more practical uses of the csvRead function.
// Define a matrix of strings
Astr = [
"1" "8" "15" "22" "29" "36" "43" "50"
"2" "9" "16" "23" "30" "37" "44" "51"
"3" "10" "17" "6+3*I" "31" "38" "45" "52"
"4" "11" "18" "25" "32" "39" "46" "53"
"5" "12" "19" "26" "33" "40" "47" "54"
"6" "13" "20" "27" "34" "41" "48" "55"
"+0" "-0" "Inf" "-Inf" "Nan" "1.D+308" "1.e-308" "1.e-323"
];

// Create a file with some data separated with commas
filename = fullfile(TMPDIR , 'foo.csv');
sep = ",";
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
end
mclose(fd);
// To see the file: edit(filename)

// Read this file
Bstr = csvRead ( filename )

// Create a file with a particular separator: here ";"
filename = fullfile(TMPDIR , 'foo.csv');
sep = ";";
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
end
mclose(fd);
//
// Read the file and customize the separator
csvRead ( filename , sep )
The following script shows how to remove lines with the regexp argument of the csvRead function.
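A minimal sketch of such a call, assuming a file whose comment lines contain "//". The file name, the variables demofile, D and removed, and the particular regexp are choices made for this illustration only.

```scilab
// Hypothetical file name, used only for this illustration.
demofile = fullfile(TMPDIR, "comments_demo.csv");

// Two data rows interleaved with two comment lines.
mputl(["// a comment line"; "1,2,3"; "// another comment"; "4,5,6"], demofile);

// regexpcomments is the 6th argument. Lines matching the regexp
// (here: lines containing "//") are removed before parsing.
[D, removed] = csvRead(demofile, [], [], [], [], "/\/\//");

disp(D);       // the remaining numeric data
disp(removed); // the second output, described above as the matched lines
```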
Empty fields are handled by csvRead:
csvWrite(['1','','3';'','','6'], TMPDIR + "/example.csv")
csvRead(TMPDIR + "/example.csv", [], [], "string")
csvRead(TMPDIR + "/example.csv", [], [], "double")
In the following script, the file "filename" is read in blocks of 5000 rows. The algorithm stops when the number of rows actually read from the file differs from 5000, i.e. when the end of the file has been reached.
blocksize = 5000;
C1 = 1;
C2 = 3;
iblock = 1;
while (%t)
    R1 = (iblock-1) * blocksize + 1;
    R2 = blocksize + R1-1;
    irange = [R1 C1 R2 C2];
    mprintf("Block #%d, rows #%d to #%d\n",iblock,R1,R2);
    tic();
    // range is the 7th input argument of csvRead
    M = csvRead(filename, [], [], [], [], [], irange);
    t = toc();
    nrows = size(M,"r");
    ncols = size(M,"c");
    if ( nrows > 0 ) then
        // Time per cell, in microseconds
        p = t/(nrows*ncols)*1.e6;
        mprintf(" Actual #rows=%d\n",nrows);
        mprintf(" T=%.3f (s)\n",t);
        mprintf(" T=%.1f (us/cell)\n",p);
    end
    if ( nrows < blocksize ) then
        mprintf("... End of the file.\n");
        break
    end
    iblock = iblock + 1;
end
This produces:
Block #1, rows #1 to #5000
 Actual #rows=5000
 T=3.135 (s)
 T=209.0 (us/cell)
Block #2, rows #5001 to #10000
 Actual #rows=5000
 T=3.139 (s)
 T=209.3 (us/cell)
Block #3, rows #10001 to #15000
 Actual #rows=5000
 T=3.151 (s)
 T=210.1 (us/cell)
etc....
CSV = ["1,0,0,0,0"; ..
"0,1,0,0,0"; ..
"0,0,1,0,0"; ..
"4,4,1,2,0"; ..
"4,63,1,2,0"; ..
"4,63,1,4,233"; ..
"42,3,23,2,23"; ..
];
filename = fullfile(TMPDIR , 'foo.csv');
mputl(CSV, filename);
// Extract a subset of the csv file
csvRead(filename, [], [], "double", [], [], [5 3 7 6])
comments = [
"// Copyright (C) INRIA"
"// This file must be used under the terms of the CeCILL."];
filename = fullfile(TMPDIR , 'foo.csv');
csvWrite(rand(2,3), filename, ascii(9), ",", [], comments);
header = 2;
[M, c] = csvRead(filename, ascii(9), ",", "double", [], [], [], header) // Ignore the two first lines (the header)
History
Version | Description |
5.4.0 | Function introduced. Based on the 'csv_readwrite' module. The only difference in behavior compared to read_csv is that csvRead tries to convert values to double by default, whereas read_csv returns values as strings. |
5.4.1 | If decimal is different from [] and conversion is set to "string", the decimal conversion is performed. |
5.5 | Addition of the "header" input argument, to ignore headers. |