C# yield
last modified February 1, 2024
In this article we show how to use yield keyword in C# language.
The yield keyword
The yield
keyword is use to do custom stateful iteration over a
collection. The yield keyword tells the compiler that the method in which it
appears is an iterator block.
yield return <expression>; yield break;
The yield return
statement returns one element at a time. The
return type of yield keyword is either IEnumerable
or
IEnumerator
. The yield break
statement is used to end
the iteration.
We can consume the iterator method that contains a yield return statement either by using foreach loop or LINQ query. Each iteration of the loop calls the iterator method. When a yield return statement is reached in the iterator method, the expression is returned, and the the current location in code is retained. Execution is restarted from that location the next time that the iterator function is called.
Two important aspects of using yield are:
- lazy evaluation
- deferred execution
C# yield example
In the first examples, we work with Fibonacci sequence.
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
The Fibonacci sequence is the series of numbers, where the next number is found by adding up the two numbers before it.
var data = Fibonacci(10); foreach (int e in data) { Console.WriteLine(e); } IEnumerable<int> Fibonacci(int n) { var vals = new List<int>(); for (int i = 0, n1 = 0, n2 = 1; i < n; i++) { int fib = n1 + n2; n1 = n2; vals.Add(fib); n2 = fib; } return vals; }
Here, we compute the sequence without the yield
keyword. We print
the first ten values of the sequence.
var vals = new List<int>();
This implementation requires a new list. Imagine that we worked hundreds of millions of values. This would significantly slow our computation and would require huge amount of memory.
$ dotnet run 1 2 3 5 8 13 21 34 55 89
Next, we use the yield
keyword to generate the Fibonacci sequence.
foreach (int fib in Fibonacci(10)) { Console.WriteLine(fib); } IEnumerable<int> Fibonacci(int n) { for (int i = 0, n1 = 0, n2 = 1; i < n; i++) { yield return n1; int temp = n1 + n2; n1 = n2; n2 = temp; } }
This implementation starts producing numbers before reaching the specified end of the sequence.
for (int i = 0, n1 = 0, n2 = 1; i < n; i++) { yield return n1; int temp = n1 + n2; n1 = n2; n2 = temp; }
The yield return
returns the currently computed value to the
above foreach
statement. The n1
, n2
,
temp
values are remembered; C# creates a class behind the scenes
to keep these values.
We can have multiple yields statements.
int n = 10; IEnumerable<string> res = FibSeq().TakeWhile(f => f.n <= n).Select(f => $"{f.fib}"); Console.WriteLine(string.Join(" ", res)); IEnumerable<(int n, int fib)> FibSeq() { yield return (0, 0); yield return (1, 1); var (x, y, n) = (1, 0, 0); while (x < int.MaxValue - y) { (x, y, n) = (x + y, x, n + 1); yield return (n, x); } }
In the example, we calculate the Fibonacci sequence with the help of tuples.
IEnumerable<string> res = FibSeq().TakeWhile(f => f.n <= n).Select(f => $"{f.fib}");
We consume the Fibonacci sequence with the LINQ's TakeWhile
method.
Console.WriteLine(string.Join(" ", res));
The sequence of strings is joined.
IEnumerable<(int n, int fib)> FibSeq() { yield return (0, 0); yield return (1, 1); var (x, y, n) = (1, 0, 0); while (x < int.MaxValue - y) { (x, y, n) = (x + y, x, n + 1); yield return (n, x); } }
The FibSeq
method returns a sequence of tuple values. Each tuple
contains the n
value, up to which we generate the sequence and
the current Fibonacci value fib
.
yield return (0, 0); yield return (1, 1);
The first two tuples are returned with yield return
.
var (x, y, n) = (1, 0, 0); while (x < int.MaxValue - y) { (x, y, n) = (x + y, x, n + 1); yield return (n, x); }
The rest of the sequence is calculated with a while loop. The sequence goes
up to int.MaxValue
.
C# yield running total
The yield
stores state; the next program demonstrates this.
List<int> vals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; foreach (int e in RunningTotal()) { Console.WriteLine(e); } IEnumerable<int> RunningTotal() { int runningTotal = 0; foreach (int val in vals) { runningTotal += val; yield return runningTotal; } }
The example calculates the running total for a list of integers. The
runningTotal
is stored when the control goes between the iterator
and the consumer of the iterator.
$ dotnet run 1 3 6 10 15 21 28 36 45 55
C# yield partition example
In the next example, we compare the efficiency of two approaches to partitioning a huge list.
using System.Collections.ObjectModel; var vals = Enumerable.Range(1, 100_000_000); var option = int.Parse(args[0]); IEnumerable<IEnumerable<int>> result; if (option == 1) { result = Partition1(vals, 5); } else { result = Partition2(vals, 5); } foreach (var part in result) { // Console.WriteLine(string.Join(", ", part)); } Console.WriteLine(string.Join(", ", result.First())); Console.WriteLine(string.Join(", ", result.Last())); Console.WriteLine("-------------------"); Console.WriteLine("Finished"); IEnumerable<IEnumerable<int>> Partition1(IEnumerable<int> source, int size) { int[] array = null; int count = 0; var data = new List<IEnumerable<int>>(); foreach (int item in source) { if (array == null) { array = new int[size]; } array[count] = item; count++; if (count == size) { data.Add(new ReadOnlyCollection<int>(array)); array = null; count = 0; } } if (array != null) { Array.Resize(ref array, count); data.Add(new ReadOnlyCollection<int>(array)); } return data; } IEnumerable<IEnumerable<int>> Partition2(IEnumerable<int> source, int size) { int[] array = null; int count = 0; foreach (int item in source) { if (array == null) { array = new int[size]; } array[count] = item; count++; if (count == size) { yield return new ReadOnlyCollection<int>(array); array = null; count = 0; } } if (array != null) { Array.Resize(ref array, count); yield return new ReadOnlyCollection<int>(array); } }
We have a sequence of a hundred million vallues. We partition them into groups
of five values with and without the yield
keyword and compare the
efficiency.
var vals = Enumerable.Range(1, 100_000_000);
A sequence of one hundred million values is generated with
Enumerable.Range
.
var option = int.Parse(args[0]); IEnumerable<IEnumerable<int>> result; if (option == 1) { result = Partition1(vals, 5); } else { result = Partition2(vals, 5); }
The program is run with a parameter. The option 1 invokes
Partition1
function. The yield
keyword is used in
Partition2
and is invoked with option other than 1.
var data = new List<IEnumerable<int>>(); ... return data;
The Partition1
function builds a list with values partitioned
inside. For one hundred million values, this requires a significant chunk of
memory. Also, if there is not enough free memory, the operating system starts
swapping the memory to disk which slows down the computation.
if (array != null) { Array.Resize(ref array, count); yield return new ReadOnlyCollection<int>(array); }
In Partition2
, we return one partitioned collection at a time. We
don't wait for the whole process to finish. This approach requires less memory.
$ /usr/bin/time -f "%M KB %e s" bin/Release/net5.0/Partition 1 1, 2, 3, 4, 5 99999996, 99999997, 99999998, 99999999, 100000000 ------------------- Finished 1696712 KB 6.38 s $ /usr/bin/time -f "%M KB %e s" bin/Release/net5.0/Partition 2 1, 2, 3, 4, 5 99999996, 99999997, 99999998, 99999999, 100000000 ------------------- Finished 30388 KB 2.99 s
We use the time
command to compare the two functions. In our case
it was 1.7 GB vs 30 MB.
Source
yield statement - language reference
In this article we have worked with C# yield
keyword.
Author
List all C# tutorials.