ZetCode

C# yield

last modified March 29, 2021

C# yield tutorial shows how to use yield keyword in C# language.

The yield keyword

The yield keyword is use to do custom stateful iteration over a collection. The yield keyword tells the compiler that the method in which it appears is an iterator block.

yield return <expression>;
yield break;

The yield return statement returns one element at a time. The return type of yield keyword is either IEnumerable or IEnumerator. The yield break statement is used to end the iteration.

We can consume the iterator method that contains a yield return statement either by using foreach loop or LINQ query. Each iteration of the loop calls the iterator method. When a yield return statement is reached in the iterator method, the expression is returned, and the the current location in code is retained. Execution is restarted from that location the next time that the iterator function is called.

Two important aspects of using yield are:

C# yield example

In the first examples, we work with Fibonacci sequence.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

The Fibonacci sequence is the series of numbers, where the next number is found by adding up the two numbers before it.

Program.cs
using System;
using System.Linq;
using System.Collections.Generic;

var data = Fibonacci(10);

foreach (int e in data)
{
    Console.WriteLine(e);
}

IEnumerable<int> Fibonacci(int n)
{
    var vals = new List<int>();

    for (int i = 0, n1 = 0, n2 = 1; i < n; i++)
    {
        int fib = n1 + n2;
     
        n1 = n2;

        vals.Add(fib);
        n2 = fib;
    }

    return vals;
}

Here, we compute the sequence without the yield keyword. We print the first ten values of the sequence.

var vals = new List<int>();

This implementation requires a new list. Imagine that we worked hundreds of millions of values. This would significantly slow our computation and would require huge amount of memory.

$ dotnet run 
1
2
3
5
8
13
21
34
55
89

Next, we use the yield keyword to generate the Fibonacci sequence.

Program.cs
using System;
using System.Linq;
using System.Collections.Generic;

foreach (int fib in Fibonacci(10))
{
    Console.WriteLine(fib);
}

IEnumerable<int> Fibonacci(int n)
{
    for (int i = 0, n1 = 0, n2 = 1; i < n; i++)
    {
        yield return n1;

        int temp = n1 + n2;
        n1 = n2;

        n2 = temp;
    }
}

This implementation starts producing numbers before reaching the specified end of the sequence.

for (int i = 0, n1 = 0, n2 = 1; i < n; i++)
{
    yield return n1;

    int temp = n1 + n2;
    n1 = n2;

    n2 = temp;
}

The yield return returns the currently computed value to the above foreach statement. The n1, n2, temp values are remembered; C# creates a class behind the scenes to keep these values.

C# yield running total

The yield stores state; the next program demonstrates this.

Program.cs
using System;
using System.Collections.Generic;

var vals = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

foreach (int e in RunningTotal())
{
    Console.WriteLine(e);
}

IEnumerable<int> RunningTotal()
{
    int runningTotal = 0;

    foreach (int val in vals)
    {
        runningTotal += val;
        yield return runningTotal;
    }
}

The example calculates the running total for a list of integers. The runningTotal is stored when the control goes between the iterator and the consumer of the iterator.

$ dotnet run 
1
3
6
10
15
21
28
36
45
55

C# yield partition example

In the next example, we compare the efficiency of two approaches to partitioning a huge list.

Program.cs
using System;
using System.Linq;
using System.Collections.Generic;
using System.Collections.ObjectModel;

var vals = Enumerable.Range(1, 100_000_000);

var option = int.Parse(args[0]);

IEnumerable<IEnumerable<int>> result;

if (option == 1)
{
    result = Partition1(vals, 5);
} else 
{
    result = Partition2(vals, 5);
}

foreach (var part in result)
{
    // Console.WriteLine(string.Join(", ", part));
}

Console.WriteLine(string.Join(", ", result.First()));
Console.WriteLine(string.Join(", ", result.Last()));

Console.WriteLine("-------------------");
Console.WriteLine("Finished");

IEnumerable<IEnumerable<int>> Partition1(IEnumerable<int> source, int size)
{
    int[] array = null;
    int count = 0;

    var data = new List<IEnumerable<int>>();

    foreach (int item in source)
    {
        if (array == null)
        {
            array = new int[size];
        }

        array[count] = item;
        count++;

        if (count == size)
        {
            data.Add(new ReadOnlyCollection<int>(array));
            array = null;
            count = 0;
        }
    }

    if (array != null)
    {
        Array.Resize(ref array, count);
        data.Add(new ReadOnlyCollection<int>(array));
    }

    return data;
}

IEnumerable<IEnumerable<int>> Partition2(IEnumerable<int> source, int size)
{
    int[] array = null;
    int count = 0;

    foreach (int item in source)
    {
        if (array == null)
        {
            array = new int[size];
        }

        array[count] = item;
        count++;

        if (count == size)
        {
            yield return new ReadOnlyCollection<int>(array);
            array = null;
            count = 0;
        }
    }

    if (array != null)
    {
        Array.Resize(ref array, count);
        yield return new ReadOnlyCollection<int>(array);
    }
}

We have a sequence of a hundred million vallues. We partition them into groups of five values with and without the yield keyword and compare the efficiency.

var vals = Enumerable.Range(1, 100_000_000);

A sequence of one hundred million values is generated with Enumerable.Range.

var option = int.Parse(args[0]);

IEnumerable<IEnumerable<int>> result;

if (option == 1)
{
    result = Partition1(vals, 5);
} else 
{
    result = Partition2(vals, 5);
}

The program is run with a parameter. The option 1 invokes Partition1 function. The yield keyword is used in Partition2 and is invoked with option other than 1.

var data = new List<IEnumerable<int>>();
...
return data;

The Partition1 function builds a list with values partitioned inside. For one hundred million values, this requires a significant chunk of memory. Also, if there is not enough free memory, the operating system starts swapping the memory to disk which slows down the computation.

if (array != null)
{
    Array.Resize(ref array, count);
    yield return new ReadOnlyCollection<int>(array);
}

In Partition2, we return one partitioned collection at a time. We don't wait for the whole process to finish. This approach requires less memory.

$ /usr/bin/time -f "%M KB %e s" bin/Release/net5.0/Partition 1
1, 2, 3, 4, 5
99999996, 99999997, 99999998, 99999999, 100000000
-------------------
Finished
1696712 KB 6.38 s

$ /usr/bin/time -f "%M KB %e s" bin/Release/net5.0/Partition 2
1, 2, 3, 4, 5
99999996, 99999997, 99999998, 99999999, 100000000
-------------------
Finished
30388 KB 2.99 s

We use the time command to compare the two functions. In our case it was 1.7 GB vs 30 MB.

In this tutorial, we have worked with C# yield keyword.

Visit C# tutorial or list all C# tutorials.