Map

Introduction #

The Map process is an advanced way to transform your data. With Map, you can control both the metadata of the data flow and the row values.

Stateless Map #

Use the stateless Map pattern when each calculation depends only on the current row.

...
->pipe(new Map(array(
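    //'{value}' receives a row and returns an array of output rows
    '{value}' => function($row, $metaData) {
        $row['orderQuarter'] = 'Q ' . $row['orderQuarter'];
        return array($row);
    },
    //'{meta}' receives the incoming metadata and returns the updated metadata
    '{meta}' => function($metaData) {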
        $metaData['columns']['productName'] = array(
            'label' => 'Products',
        );
        $metaData['columns']['orderYear'] = array(
            'label' => 'Year',
            'type' => 'string',
        );
        $metaData['columns']['orderQuarter'] = array(
            'label' => 'Quarter',
            'type' => 'string',
        );
        $metaData['columns']['dollar_sales'] = array(
            'label' => 'Sales',
            'type' => 'number',
            'prefix' => '$',
        );
        return $metaData;
    },
)))
    ...
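To make the data flow of the stateless pattern concrete, here is a minimal plain-PHP sketch that runs the same kind of callbacks without KoolReport. The `runStatelessMap` driver, the sample rows, and the choice to pass the incoming metadata to the value callback are assumptions made for illustration; they are not the library's implementation.

```php
<?php
// Value callback: reads only the current row, so no state is needed.
$valueCallback = function ($row, $metaData) {
    $row['orderQuarter'] = 'Q ' . $row['orderQuarter'];
    return array($row); // a list of output rows
};

// Meta callback: receives the metadata and returns the updated version.
$metaCallback = function ($metaData) {
    $metaData['columns']['orderQuarter'] = array(
        'label' => 'Quarter',
        'type' => 'string',
    );
    return $metaData;
};

// Hypothetical driver (NOT part of KoolReport): transforms the metadata
// once, then applies the value callback to every row independently.
function runStatelessMap(array $rows, array $metaData, callable $valueCallback, callable $metaCallback)
{
    $newMeta = $metaCallback($metaData);
    $output = array();
    foreach ($rows as $row) {
        foreach ($valueCallback($row, $metaData) as $mappedRow) {
            $output[] = $mappedRow;
        }
    }
    return array('meta' => $newMeta, 'rows' => $output);
}

// Made-up sample data for the sketch.
$rows = array(
    array('productName' => 'Widget', 'orderQuarter' => 1, 'dollar_sales' => 100),
    array('productName' => 'Gadget', 'orderQuarter' => 2, 'dollar_sales' => 250),
);
$metaData = array('columns' => array('orderQuarter' => array('type' => 'number')));

$result = runStatelessMap($rows, $metaData, $valueCallback, $metaCallback);
```

Because each output row is computed from its input row alone, the rows could be processed in any order with the same result, which is what distinguishes this pattern from the stateful one.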

Stateful Map #

Use a stateful Map when a calculation depends on other rows or on the results of earlier calculations.

->pipe(new Map([
    '{value}' => function($row, $meta, $index, $mapState) {
        $numTopRows = 2;
        //If a row is among the first 2 rows
        if ($index < $numTopRows) {
            $mappedRows = [$row];
            //return it to send to next process or datastore
            return ['{rows}' => $mappedRows]; 
        }
        //Otherwise,
        //initialise a key of this Map's state to use for sum
        //(Util is assumed to be an alias of \koolreport\core\Utility)
        $sum = Util::init($mapState, 'sumOthers', []);
        foreach ($row as $columnName => $value) {
            Util::init($sum, $columnName, 0);
            //sum 'dollar_sales'; label every other column 'Other Customers'
            $sum[$columnName] = $columnName === 'dollar_sales' ?
                $sum[$columnName] + $value : 'Other Customers';
        }
        //Save the sum to this Map's state
        $mapState['sumOthers'] = $sum;
        $mappedRows = [];
        //Skip rows after the first 2 rows (they won't be sent to the next process or datastore)
        //and return this Map's state to save it
        return ['{rows}' => $mappedRows, '{state}' => $mapState];
    },
    '{end}' => function($count, $mapState) {
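        //After all rows have been processed,
        //retrieve this Map's state and send it at the end of the Map process
        //(nothing is sent if no rows came after the first 2)
        $rowsToSend = isset($mapState['sumOthers']) ? [$mapState['sumOthers']] : [];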
        return $rowsToSend;
    },
]))
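The stateful flow above can be traced end to end with a small plain-PHP sketch. The `runStatefulMap` driver is a hypothetical stand-in written for illustration, not KoolReport's implementation; the `customerName` column, the sample rows, and the use of the input row count as `$count` are likewise assumptions. The accumulator is initialised by hand so the sketch does not need the `Util` helper.

```php
<?php
// Hypothetical driver (NOT part of KoolReport): feeds rows to the value
// callback one at a time and carries the state from one call to the next.
function runStatefulMap(array $rows, callable $valueCallback, callable $endCallback)
{
    $state = array();
    $output = array();
    $count = 0;
    foreach ($rows as $index => $row) {
        $result = $valueCallback($row, array(), $index, $state);
        foreach ($result['{rows}'] as $mappedRow) {
            $output[] = $mappedRow;
        }
        // Persist the state only when the callback returns one.
        if (isset($result['{state}'])) {
            $state = $result['{state}'];
        }
        $count++;
    }
    // After the last row, the end callback may emit rows built from the state.
    foreach ($endCallback($count, $state) as $extraRow) {
        $output[] = $extraRow;
    }
    return $output;
}

$valueCallback = function ($row, $meta, $index, $mapState) {
    if ($index < 2) {
        return array('{rows}' => array($row)); // top rows pass through
    }
    // Initialise the accumulator on first use, then add this row's sales.
    if (!isset($mapState['sumOthers'])) {
        $mapState['sumOthers'] = array('customerName' => 'Other Customers', 'dollar_sales' => 0);
    }
    $mapState['sumOthers']['dollar_sales'] += $row['dollar_sales'];
    return array('{rows}' => array(), '{state}' => $mapState);
};

$endCallback = function ($count, $mapState) {
    // Emit the summary row only if something was accumulated.
    return isset($mapState['sumOthers']) ? array($mapState['sumOthers']) : array();
};

// Made-up sample data for the sketch.
$rows = array(
    array('customerName' => 'A', 'dollar_sales' => 500),
    array('customerName' => 'B', 'dollar_sales' => 300),
    array('customerName' => 'C', 'dollar_sales' => 120),
    array('customerName' => 'D', 'dollar_sales' => 80),
);
$result = runStatefulMap($rows, $valueCallback, $endCallback);
// $result holds rows A and B, followed by one 'Other Customers' row
// whose dollar_sales is 120 + 80 = 200.
```

Returning the state explicitly from the value callback keeps the callback free of hidden side effects: everything it remembers between rows is visible in its return value.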