The League of Extraordinary Packages

BOM sequences

Detecting the BOM sequence

<?php

interface ByteSequence
{
    const BOM_UTF8 = "\xEF\xBB\xBF";
    const BOM_UTF16_BE = "\xFE\xFF";
    const BOM_UTF16_LE = "\xFF\xFE";
    const BOM_UTF32_BE = "\x00\x00\xFE\xFF";
    const BOM_UTF32_LE = "\xFF\xFE\x00\x00";
}

To improve interoperability with programs interacting with CSV, the package now provides an interface ByteSequence to help you detect the appropriate BOM sequence.

Constants

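The ByteSequence interface provides one constant per supported BOM sequence: BOM_UTF8 (UTF-8), BOM_UTF16_BE (UTF-16 big-endian), BOM_UTF16_LE (UTF-16 little-endian), BOM_UTF32_BE (UTF-32 big-endian) and BOM_UTF32_LE (UTF-32 little-endian).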

bom_match

<?php

function League\Csv\bom_match(string $str): string

The League\Csv\bom_match function expects a string and returns the BOM sequence found at its start or an empty string otherwise.

<?php

use League\Csv\ByteSequence;
use function League\Csv\bom_match;

bom_match('hello world!'); //returns ''
bom_match(ByteSequence::BOM_UTF8.'hello world!'); //returns "\xEF\xBB\xBF"
bom_match('hello world!'.ByteSequence::BOM_UTF16_BE); //returns ''
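
To illustrate what matching a BOM at the start of a string involves, here is a minimal stand-alone sketch. It is not the League\Csv implementation: the function name sketch_bom_match and the constant SKETCH_BOMS are hypothetical and exist only for this example. The one point worth noticing is ordering: the UTF-32 LE sequence begins with the same two bytes as the UTF-16 LE sequence, so the longer sequences are tested first.

```php
<?php

// Hypothetical list for this sketch only, ordered so that the 4-byte
// sequences are tested before the shorter ones they overlap with.
const SKETCH_BOMS = [
    "\x00\x00\xFE\xFF", // UTF-32 BE
    "\xFF\xFE\x00\x00", // UTF-32 LE (starts with the UTF-16 LE bytes)
    "\xEF\xBB\xBF",     // UTF-8
    "\xFE\xFF",         // UTF-16 BE
    "\xFF\xFE",         // UTF-16 LE
];

// Hypothetical helper: returns the BOM found at the start of $str,
// or an empty string when none of the known sequences matches.
function sketch_bom_match(string $str): string
{
    foreach (SKETCH_BOMS as $bom) {
        // strncmp is binary safe and only compares the leading bytes
        if (strncmp($str, $bom, strlen($bom)) === 0) {
            return $bom;
        }
    }

    return '';
}

sketch_bom_match('hello world!');             // returns ''
sketch_bom_match("\xFF\xFE\x00\x00hello");    // returns "\xFF\xFE\x00\x00"
sketch_bom_match("hello world!\xFE\xFF");     // returns ''
```

Testing the 2-byte UTF-16 LE sequence first would report a UTF-32 LE document as UTF-16 LE, which is why the order of the list matters in this sketch.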

Managing CSV documents BOM sequence

Detecting the BOM sequence

<?php

public AbstractCsv::getInputBOM(void): string

The CSV document's current BOM sequence is detected using the getInputBOM method. This method returns the BOM sequence found at the start of the document, or an empty string if none is found or recognized. The detection is done using the bom_match function.

<?php

use League\Csv\Writer;

$csv = Writer::createFromPath('/path/to/file.csv');
$bom = $csv->getInputBOM();
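
For readers curious about what detecting a document's BOM amounts to, the following sketch does it by hand on a plain PHP stream, without the library. The helper sketch_stream_bom and the sample CSV content are hypothetical, and this is only an assumption about the general approach, not a description of how getInputBOM is implemented. With League\Csv you would simply call getInputBOM as shown above.

```php
<?php

// Hypothetical helper, not part of League\Csv: reads at most four bytes
// (the length of the longest BOM) from the start of a stream and returns
// the BOM sequence found there, or an empty string.
function sketch_stream_bom($stream): string
{
    rewind($stream);
    $head = (string) fread($stream, 4);
    rewind($stream); // leave the stream positioned at its start

    // Longer sequences first: the UTF-32 LE BOM starts with the UTF-16 LE one.
    $sequences = ["\x00\x00\xFE\xFF", "\xFF\xFE\x00\x00", "\xEF\xBB\xBF", "\xFE\xFF", "\xFF\xFE"];
    foreach ($sequences as $bom) {
        if (strncmp($head, $bom, strlen($bom)) === 0) {
            return $bom;
        }
    }

    return '';
}

// Sample in-memory document starting with a UTF-8 BOM (made-up content).
$stream = fopen('php://memory', 'r+');
fwrite($stream, "\xEF\xBB\xBFname,city\nJane,Paris\n");

$bom = sketch_stream_bom($stream); // "\xEF\xBB\xBF"

// Once the BOM is known, its length tells you how many bytes to skip
// before the actual CSV data begins.
$body = substr(stream_get_contents($stream), strlen($bom));
```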

Setting the outputted BOM sequence

<?php

public AbstractCsv::setOutputBOM(string $sequence): self
public AbstractCsv::getOutputBOM(void): string

All connection classes implement the ByteSequence interface, so its constants are also available through them, for example Reader::BOM_UTF8.

<?php

use League\Csv\Reader;

$csv = Reader::createFromPath('/path/to/file.csv', 'r');
$csv->setOutputBOM(Reader::BOM_UTF8);
$bom = $csv->getOutputBOM(); //returns "\xEF\xBB\xBF"

The default output BOM sequence is an empty string.

The output BOM sequence is never saved to the CSV document.
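
One way to picture the last two statements is a toy class that keeps the stored content and the output BOM separate. The class SketchDocument and its render and stored methods are hypothetical names invented for this sketch; only setOutputBOM and getOutputBOM mirror the method names documented above, and the sketch makes no claim about how the library handles a BOM already present in the input.

```php
<?php

// Hypothetical toy class, not the League\Csv implementation.
final class SketchDocument
{
    private string $content;

    // The output BOM defaults to an empty string.
    private string $outputBom = '';

    public function __construct(string $content)
    {
        $this->content = $content;
    }

    public function setOutputBOM(string $sequence): self
    {
        $this->outputBom = $sequence;

        return $this;
    }

    public function getOutputBOM(): string
    {
        return $this->outputBom;
    }

    // Hypothetical: what gets sent to the consumer, BOM included.
    public function render(): string
    {
        return $this->outputBom.$this->content;
    }

    // Hypothetical: what is stored, which the output BOM never touches.
    public function stored(): string
    {
        return $this->content;
    }
}

$doc = new SketchDocument("name,city\nJane,Paris\n");
$doc->setOutputBOM("\xEF\xBB\xBF");

$doc->render(); // starts with "\xEF\xBB\xBF"
$doc->stored(); // still "name,city\nJane,Paris\n"
```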