Reader Connection
The League\Csv\Reader class extends the general connection capabilities to ease selecting and manipulating CSV document records. Starting with version 9.6.0, the class implements the League\Csv\TabularDataReader interface.
The Reader provides a convenient and straightforward API to access and handle CSV documents. While most of its capabilities are explained in the Tabular Data Reader documentation page, the current page focuses on Reader-specific features and properties.
CSV example
Many examples in this reference require a CSV file. We will use a file named file.csv containing the following data:
"First Name","Last Name",E-mail
john,doe,john.doe@example.com
jane,doe,jane.doe@example.com
john,john,john.john@example.com
jane,jane
Records normalization
General Rules
The returned records are normalized using the following rules:
- Stream filters are applied if present.
- Empty records are skipped if present.
- The document BOM sequence is skipped if present.
- If a header record was provided, the number of fields is normalized to the number of fields contained in that record:
  - Extra fields are truncated.
  - Missing fields are added with a null value.
- Field values are formatted if formatters are provided (since version 9.11).
use League\Csv\Reader;
$reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
$reader->setHeaderOffset(0);
$records = $reader->getRecords();
foreach ($records as $offset => $record) {
//$offset : represents the record offset
//var_export($record) returns something like
// array(
// 'First Name' => 'jane',
// 'Last Name' => 'jane',
// 'E-mail' => null
// );
}
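The header-based normalization rules above can be pictured with a small plain-PHP sketch. It does not reproduce the library's internals: normalizeRecord is a hypothetical helper written only for illustration, and it assumes the header is a list of unique strings.

```php
<?php

/**
 * Hypothetical helper, not part of league/csv: aligns the fields of a
 * record with a header by truncating extra fields and padding missing ones.
 */
function normalizeRecord(array $header, array $fields): array
{
    $count = count($header);

    // Extra fields are truncated.
    $fields = array_slice($fields, 0, $count);

    // Missing fields are added with a null value.
    $fields = array_pad($fields, $count, null);

    // Header names become the record keys.
    return array_combine($header, $fields);
}

$header = ['First Name', 'Last Name', 'E-mail'];

// The last record of file.csv only has two fields,
// so the 'E-mail' key is filled with null.
var_export(normalizeRecord($header, ['jane', 'jane']));
```

With this sketch, a record that is longer than the header loses its trailing fields, and a shorter one is padded, which matches the output shown in the example above.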
Record Formatter
A formatter is a callable which accepts a single CSV record as an array on input and returns an array representing the formatted CSV record according to its inner rules.
function(array $record): array
You can attach as many formatters as you want to the Reader class using the Reader::addFormatter method. Formatters are applied following the First In First Out rule.
Formatting happens AFTER the header and the field values are combined, if a header is available, BUT BEFORE you can access the actual value.
use League\Csv\Reader;
$csv = <<<CSV
firstname,lastname,e-mail
john,doe,john.doe@example.com
CSV;
$formatter = fn (array $row): array => array_map(strtoupper(...), $row);
$reader = Reader::createFromString($csv)
->setHeaderOffset(0)
->addFormatter($formatter);
[...$reader];
// [
// [
// 'firstname' => 'JOHN',
// 'lastname' => 'DOE',
// 'e-mail' => 'JOHN.DOE@EXAMPLE.COM',
// ],
//];
echo $reader->toString(); //returns the original $csv value without the formatting.
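To illustrate why the First In First Out rule matters, here is a minimal plain-PHP sketch of chaining formatters. The applyFormatters helper is hypothetical and only mimics the ordering described above; it is not how the Reader is implemented.

```php
<?php

/**
 * Hypothetical helper, not part of league/csv: applies each formatter
 * to the record in the order in which it was registered (FIFO).
 *
 * @param array<callable(array): array> $formatters
 */
function applyFormatters(array $formatters, array $record): array
{
    foreach ($formatters as $formatter) {
        $record = $formatter($record);
    }

    return $record;
}

// Two formatters whose combined result depends on their order.
$upper = fn (array $row): array => array_map('strtoupper', $row);
$capitalize = fn (array $row): array => array_map(
    fn (string $value): string => ucfirst(strtolower($value)),
    $row
);

$record = ['firstname' => 'john', 'lastname' => 'doe'];

$first = applyFormatters([$upper, $capitalize], $record);
// ['firstname' => 'John', 'lastname' => 'Doe']

$second = applyFormatters([$capitalize, $upper], $record);
// ['firstname' => 'JOHN', 'lastname' => 'DOE']
```

Because each formatter receives the output of the previous one, registering the same formatters in a different order can produce a different record.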
Controlling the presence of empty records
By default, the CSV document normalization removes empty records, but you can control the presence of such records using the following methods:
Reader::skipEmptyRecords(): self;
Reader::includeEmptyRecords(): self;
Reader::isEmptyRecordsIncluded(): bool;
- Calling Reader::includeEmptyRecords will ensure empty records are left in the Iterator returned by Reader::getRecords; conversely, Reader::skipEmptyRecords will ensure empty records are skipped.
- At any given time you can ask your Reader instance if empty records will be stripped or included using the Reader::isEmptyRecordsIncluded method.
- If no header offset is specified, an empty record will be represented by an empty array. Conversely, if a header offset is specified, an empty record will be represented, for consistency, by an array filled with null values, as expected from header presence normalization.
use League\Csv\Reader;
$source = <<<EOF
"parent name","child name","title"


"parentA","childA","titleA"
EOF;
$reader = Reader::createFromString($source);
$reader->isEmptyRecordsIncluded(); //returns false
iterator_to_array($reader, true);
// [
// 0 => ['parent name', 'child name', 'title'],
// 3 => ['parentA', 'childA', 'titleA'],
// ];
$reader->includeEmptyRecords();
$reader->isEmptyRecordsIncluded(); //returns true
iterator_to_array($reader, true);
// [
// 0 => ['parent name', 'child name', 'title'],
// 1 => [],
// 2 => [],
// 3 => ['parentA', 'childA', 'titleA'],
// ];
$reader->setHeaderOffset(0);
iterator_to_array($reader, true);
// [
// 1 => ['parent name' => null, 'child name' => null, 'title' => null],
// 2 => ['parent name' => null, 'child name' => null, 'title' => null],
// 3 => ['parent name' => 'parentA', 'child name' => 'childA', 'title' => 'titleA'],
// ];
$reader->skipEmptyRecords();
$reader->isEmptyRecordsIncluded(); //returns false
$res = iterator_to_array($reader, true);
// [
// 3 => ['parent name' => 'parentA', 'child name' => 'childA', 'title' => 'titleA'],
// ];
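The two representations of an empty record described above can be summarized in a short plain-PHP sketch. The emptyRecord function is a hypothetical name used for illustration only; it simply shows the shape of the value, not the library's implementation.

```php
<?php

/**
 * Hypothetical helper, not part of league/csv: returns the shape an
 * empty record takes once it is kept in the iteration.
 *
 * - without a header, the record is an empty array;
 * - with a header, every header name maps to null.
 */
function emptyRecord(array $header = []): array
{
    // array_fill_keys() returns [] when given an empty list of keys.
    return array_fill_keys($header, null);
}

$withoutHeader = emptyRecord();
// []

$withHeader = emptyRecord(['parent name', 'child name', 'title']);
// ['parent name' => null, 'child name' => null, 'title' => null]
```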
Document header
Accessing the CSV header is done via the getHeader method, which is part of the TabularDataReader API. Because CSV documents come in different shapes and forms, the class also exposes a way to select the document header record and to retrieve its offset via the setHeaderOffset and getHeaderOffset methods.
Description
public Reader::setHeaderOffset(?int $offset): self
public Reader::getHeaderOffset(void): ?int
public Reader::getHeader(void): array
Example
use League\Csv\Reader;
$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$csv->setHeaderOffset(0);
$header_offset = $csv->getHeaderOffset(); //returns 0
$header = $csv->getHeader(); //returns ['First Name', 'Last Name', 'E-mail']
If no header offset is set:
- the Reader::getHeader method will return an empty array;
- the Reader::getHeaderOffset method will return null.
use League\Csv\Reader;
$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$csv->setHeaderOffset(1000); //valid offset but the CSV does not contain 1000 records
$header_offset = $csv->getHeaderOffset(); //returns 1000
$header = $csv->getHeader(); //throws a SyntaxError exception
Because the CSV document is treated as tabular data, the header cannot contain duplicate entries. If the header contains duplicates, an exception will be thrown on usage.
use League\Csv\Reader;
$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$csv->nth(0); //returns ['field1', 'field2', 'field1', 'field4']
$csv->setHeaderOffset(0); //valid offset but the record contains duplicates
$header_offset = $csv->getHeaderOffset(); //returns 0
$records = $csv->getRecords(); //throws a SyntaxError exception
use League\Csv\Reader;
use League\Csv\SyntaxError;
$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$csv->nth(0); //returns ['field1', 'field2', 'field1', 'field4']
$csv->setHeaderOffset(0); //valid offset but the record contains duplicates
$header_offset = $csv->getHeaderOffset(); //returns 0
try {
$records = $csv->getRecords(); //throws a SyntaxError exception
} catch (SyntaxError $exception) {
$duplicates = $exception->duplicateColumnNames(); //returns ['field1']
}
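If you want to inspect a candidate header record before selecting it, the duplicate check can be approximated in plain PHP. This is only a sketch under the assumption that header values are strings; findDuplicateColumnNames is a hypothetical helper and is not the check performed by the library.

```php
<?php

/**
 * Hypothetical helper, not part of league/csv: lists the column names
 * that appear more than once in a header record.
 *
 * Note: array_count_values() turns numeric-string names such as '1'
 * into integer keys, so such names come back as integers.
 */
function findDuplicateColumnNames(array $header): array
{
    // Count how many times each name appears.
    $counts = array_count_values($header);

    // Keep only the names seen more than once.
    $duplicates = array_filter($counts, fn (int $count): bool => $count > 1);

    return array_keys($duplicates);
}

$duplicates = findDuplicateColumnNames(['field1', 'field2', 'field1', 'field4']);
// ['field1']
```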
Document records
To access the CSV records you will need to use the getRecords or the getRecordsAsObjects methods. Both methods return an Iterator containing all CSV document records, as arrays or as objects respectively. The records are extracted using the CSV control characters.
use League\Csv\Reader;
$reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
$records = $reader->getRecords();
foreach ($records as $offset => $record) {
//$offset : represents the record offset
//var_export($record) returns something like
// array(
// 'john',
// 'doe',
// 'john.doe@example.com'
// );
}
Records selection with Reader::setHeaderOffset
Just like the getHeader method, the getRecords output depends on the header record selected using setHeaderOffset.
use League\Csv\Reader;
$reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
$reader->setHeaderOffset(0);
$records = $reader->getRecords();
foreach ($records as $offset => $record) {
//$offset : represents the record offset
//var_export($record) returns something like
// array(
// 'First Name' => 'jane',
// 'Last Name' => 'doe',
// 'E-mail' => 'jane.doe@example.com'
// );
}
use League\Csv\Reader;
$reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
$reader->setHeaderOffset(0);
$records = $reader->getRecords(['firstname', 'lastname', 'email']);
foreach ($records as $offset => $record) {
//$offset : represents the record offset
//var_export($record) returns something like
// array(
// 'firstname' => 'jane',
// 'lastname' => 'doe',
// 'email' => 'jane.doe@example.com'
// );
}
//the first record will still be skipped!!
Selecting records
Please head over to the TabularDataReader documentation page for more information on the class features. If you require a more advanced record selection, you should use a Statement or a FragmentFinder class to process the Reader object. The found records are returned as a ResultSet object.
Records conversion
Json serialization
The Reader class implements the JsonSerializable interface. As such you can use the json_encode function directly on the instantiated object. The interface is implemented using PHP's iterator_to_array on the Reader::getRecords method. As such, the returned JSON string data depends on the presence or absence of a header.
use League\Csv\Reader;
$records = [
['firstname', 'lastname', 'e-mail', 'phone'],
['john', 'doe', 'john.doe@example.com', '0123456789'],
];
$tmp = new SplTempFileObject();
foreach ($records as $record) {
$tmp->fputcsv($record);
}
$reader = Reader::createFromFileObject($tmp);
echo '<pre>', PHP_EOL;
echo json_encode($reader, JSON_PRETTY_PRINT), PHP_EOL;
//display
//[
// [
// "firstname",
// "lastname",
// "e-mail",
// "phone"
// ],
// [
// "john",
// "doe",
// "john.doe@example.com",
// "0123456789"
// ]
//]
$reader->setHeaderOffset(0);
echo '<pre>', PHP_EOL;
echo json_encode($reader, JSON_PRETTY_PRINT), PHP_EOL;
//display
//[
// {
// "firstname": "john",
// "lastname": "doe",
// "e-mail": "john.doe@example.com",
// "phone": "0123456789"
// }
//]
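The difference between the two outputs above comes from how json_encode treats PHP arrays: a list becomes a JSON array, while an array with string keys becomes a JSON object. The following plain-PHP sketch reproduces that behaviour with hand-written arrays standing in for the records; the variable names are illustrative only.

```php
<?php

// Records as returned when no header offset is set: plain lists.
$withoutHeader = [
    ['firstname', 'lastname'],
    ['john', 'doe'],
];

// Records as returned when a header offset is set: keys are header names.
$withHeader = [
    ['firstname' => 'john', 'lastname' => 'doe'],
];

echo json_encode($withoutHeader), PHP_EOL;
// [["firstname","lastname"],["john","doe"]]

echo json_encode($withHeader), PHP_EOL;
// [{"firstname":"john","lastname":"doe"}]
```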
Other conversions
If you wish to convert your CSV document to XML or HTML, please refer to the converters bundled with this library.