csvRead

Read comma-separated value file

Calling Sequence

M = csvRead(filename)
M = csvRead(filename, separator)
M = csvRead(filename, separator, decimal)
M = csvRead(filename, separator, decimal, conversion)
M = csvRead(filename, separator, decimal, conversion, substitute)
M = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range)
[M, comments] = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range)

Parameters

filename

a 1-by-1 matrix of strings, the file path.

separator

a 1-by-1 matrix of strings, the field separator used.

decimal

a 1-by-1 matrix of strings, the decimal mark used.

conversion

a 1-by-1 matrix of strings, the type of the output M. Available values are "string" or "double" (by default).

Note that read_csv has "string" as default.

substitute

an m-by-2 matrix of strings, a replacement map (default = [], meaning no replacements). The first column substitute(:,1) contains the searched strings and the second column substitute(:,2) contains the replacement strings. Every occurrence of a searched string in the file is replaced.

regexpcomments

a string: a regular expression; the lines of the file that match it are removed from the data. (default: [])

range

a 1-by-4 matrix of floating point integers, the range of rows and columns which must be read (default range=[], meaning that all the rows and columns are read). Specify range using the format [R1 C1 R2 C2] where (R1,C1) is the upper left corner of the data to be read and (R2,C2) is the lower right corner.

M

an m-by-n matrix of strings or doubles.

comments

an m-by-n matrix of strings, the lines matched by the regexp.
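
As an illustration of the range argument described above, here is a minimal sketch that writes a small matrix to a temporary file and reads back only a sub-block. The file name range_example.csv and the matrix A are made up for this example; the empty matrices leave the other optional arguments at their default values.

```scilab
// Write a 4-by-3 matrix to a temporary file (hypothetical file name)
A = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
filename = fullfile(TMPDIR, "range_example.csv");
csvWrite(A, filename);

// range = [R1 C1 R2 C2]: upper left corner (2,1), lower right corner (3,2)
// The [] arguments keep the defaults for separator, decimal,
// conversion, substitute and the comments regexp.
B = csvRead(filename, [], [], [], [], [], [2 1 3 2]);

// B is expected to hold the same values as A(2:3, 1:2)
disp(B)
```

Because the range is the seventh input argument, the five optional arguments before it must all be supplied, even if only as [].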

Description

Given an ASCII file of comma-separated values, this function returns the corresponding Scilab matrix of strings or doubles.

For example, the .csv data file may have been created by spreadsheet software using the "Text and comma" format.

It might happen that the columns are separated by a non-comma separator. In this case, use csvRead(filename, separator) for another choice of separator.
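
A minimal sketch of this case, assuming a file that uses ";" as the field separator and "," as the decimal mark (a layout that some spreadsheet locales produce). The file name and its content are invented for the illustration.

```scilab
// Hypothetical file content: ";" separates the fields,
// "," is the decimal mark
content = ["1,5;2,25"; "3,75;4,0"];
filename = fullfile(TMPDIR, "semicolon_example.csv");
mputl(content, filename);

// Pass the separator and the decimal mark explicitly
M = csvRead(filename, ";", ",", "double");
disp(M)
```

Passing both arguments explicitly keeps the call independent of the session-wide defaults.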

The default values of the optional input arguments are defined by the csvDefault function.

Any optional input argument equal to the empty matrix [] is set to its default value.
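
The short sketch below shows one way to inspect the current defaults and to rely on them by passing []. It assumes that csvDefault() called without arguments returns a matrix of strings listing the fields and their values, and that csvDefault(field) returns the value of one field; the file name is made up for the example.

```scilab
// Display every default used by the csv* functions
disp(csvDefault())

// Query individual fields
sep = csvDefault("separator");
conv = csvDefault("conversion");
mprintf("separator=[%s], conversion=[%s]\n", sep, conv);

// [] for separator and decimal: both fall back to the defaults above
filename = fullfile(TMPDIR, "defaults_example.csv");
csvWrite([1 2; 3 4], filename);
S = csvRead(filename, [], [], "string");
disp(S)
```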

When the input argument "conversion" is equal to "double", the non-numeric fields within the .csv (e.g. strings) are converted into NaN.
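
To make this behaviour concrete, the following sketch reads the same invented file twice, once as doubles and once as strings. The file name and the field values "abc" and "xyz" are only placeholders.

```scilab
// Hypothetical file mixing numeric and non-numeric fields
content = ["1,abc,3"; "4,5,xyz"];
filename = fullfile(TMPDIR, "nan_example.csv");
mputl(content, filename);

// With "double", the non-numeric fields become %nan
M = csvRead(filename, ",", ".", "double");
disp(isnan(M))

// With "string", every field is kept as text
S = csvRead(filename, ",", ".", "string");
disp(S)
```

Reading as "string" first can help locate the fields that would otherwise be silently turned into NaN.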

csvRead is able to handle both UTF-8 and ASCII text files.

Examples

The following script presents some basic uses of the csvRead function.

// Create a file with some data separated with tabs.
M = 1:50;
filename = fullfile(TMPDIR, "data.csv");
csvWrite(M, filename, ascii(9), '.');

// Read the csv file as strings
M1 = csvRead(filename,ascii(9), [], 'string')

// Returns a double
M2 = csvRead(filename,ascii(9), '.', 'double')

// Compares original data and result.
and(M == M2)

// Use the substitute argument to manage
// special data files.
content = [
"1"
"Not-A-Number"
"2"
"Not-A-Number"
];

substitute = [
"Not-A-Number" "Nan"
];

mputl(content,filename);
M = csvRead(filename,",",".","double",substitute)
isnan(M(2,1)) // Expected=%t
isnan(M(4,1)) // Expected=%t

The following script presents more practical uses of the csvRead function.

// Define a matrix of strings
Astr = [
"1" "8" "15" "22" "29" "36" "43" "50"
"2" "9" "16" "23" "30" "37" "44" "51"
"3" "10" "17" "6+3*I" "31" "38" "45" "52"
"4" "11" "18" "25" "32" "39" "46" "53"
"5" "12" "19" "26" "33" "40" "47" "54"
"6" "13" "20" "27" "34" "41" "48" "55"
"+0" "-0" "Inf" "-Inf" "Nan" "1.D+308" "1.e-308" "1.e-323"
];

// Create a file with some data separated with commas
filename = fullfile(TMPDIR , 'foo.csv');
sep = ",";
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
end
mclose(fd);
// To see the file : edit(filename)

// Read this file
Bstr = csvRead ( filename )

// Create a file with a particular separator: here ";"
filename = fullfile(TMPDIR , 'foo.csv');
sep = ";";
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
end
mclose(fd);

// Read the file and customize the separator
csvRead ( filename , sep )

The following script shows how to remove lines with the regexp argument of the csvRead function.

CSV = ["// tata"; ..
"1,0,0,0,0"; ..
"// titi"; ..
"0,1,0,0,0"; ..
"// toto"; ..
"0,0,1,0,0"; ..
"// tutu"];
filename = fullfile(TMPDIR , 'foo.csv');
mputl(CSV, filename);

// Remove the lines that begin with //
[M, comments] = csvRead(filename, [], [], [], [], '/\/\//')

Empty fields are handled by csvRead.

csvWrite(['1','','3';'','','6'], TMPDIR + "/example.csv")
csvRead(TMPDIR + "/example.csv", [], [], "string")
csvRead(TMPDIR + "/example.csv", [], [], "double")

In the following script, the file "filename" is read in blocks of 5000 rows. The algorithm stops when the number of rows actually read from the file is less than 5000, i.e. when the end of the file has been reached.

blocksize = 5000;
C1 = 1;
C2 = 3;
iblock = 1;
while (%t)
    R1 = (iblock-1) * blocksize + 1;
    R2 = blocksize + R1-1;
    irange = [R1 C1 R2 C2];
    mprintf("Block #%d, rows #%d to #%d\n",iblock,R1,R2);
    tic();
    // The range is the 7th input argument
    M=csvRead(filename , [] , [] , [] , [] , [] , irange );
    t = toc();
    nrows = size(M,"r");
    ncols = size(M,"c");
    if ( nrows > 0 ) then
        // Time per cell, in microseconds
        p = t/(nrows*ncols)*1.e6;
        mprintf("  Actual #rows=%d\n",nrows);
        mprintf("  T=%.3f (s)\n",t);
        mprintf("  T=%.1f (us/cell)\n",p);
    end
    if ( nrows < blocksize ) then
        mprintf("... End of the file.\n");
        break
    end
    iblock = iblock + 1;
end

This produces:

Block #1, rows #1 to #5000
  Actual #rows=5000
  T=3.135 (s)
  T=209.0 (us/cell)
Block #2, rows #5001 to #10000
  Actual #rows=5000
  T=3.139 (s)
  T=209.3 (us/cell)
Block #3, rows #10001 to #15000
  Actual #rows=5000
  T=3.151 (s)
  T=210.1 (us/cell)
etc....

History

Version   Description
5.4.0     Function introduced. Based on the 'csv_readwrite' module. The only difference in behavior compared to read_csv is that csvRead tries to convert values to double by default, whereas read_csv returns values as strings.

Scilab Enterprises
Copyright (c) 2011-2017 (Scilab Enterprises)
Copyright (c) 1989-2012 (INRIA)
Copyright (c) 1989-2007 (ENPC)
with contributors
Last updated: Mon Oct 01 17:34:57 CEST 2012