
Lambda Expressions

Introduced in C# 3.0, lambda expressions are a more succinct syntax for anonymous functions than anonymous methods. Anonymous function is the general term that covers both lambda expressions and anonymous methods. Lambda expressions themselves come in two forms: statement lambdas and expression lambdas. Figure 12.2 shows the hierarchical relationship between the terms.

Figure 12.2. Anonymous Function Terminology

Statement Lambdas

With statement lambdas, C# 3.0 provides a reduced syntax for anonymous methods, a syntax that does not include the delegate keyword and adds the lambda operator, =>. Listing 12.14 shows equivalent functionality to Listing 12.12, except that Listing 12.14 uses a statement lambda rather than an anonymous method.

Listing 12.14. Passing a Delegate with a Statement Lambda

class DelegateSample
{

  // ...

   static void Main(string[] args)
  {

      int i;
      int[] items = new int[5];

      for (i=0; i<items.Length; i++)
      {
          Console.Write("Enter an integer:");
          items[i] = int.Parse(Console.ReadLine());
      }

      BubbleSort(items,
          (int first, int second) =>
          {
             return first < second;
          }
      );

      for (i = 0; i < items.Length; i++)
      {
          Console.WriteLine(items[i]);
      }
  }
}

When reading code that includes a lambda operator, you would replace the lambda operator with the words "go/goes to." For example, you would read n => { return n.ToString();} as "n goes to return n dot ToString." In Listing 12.14, you would read the second BubbleSort() parameter as "integers first and second go to returning the result of first less than second."

As readers will observe, the syntax in Listing 12.14 is virtually identical to that in Listing 12.12, apart from the changes already outlined. However, statement lambdas allow for an additional shortcut via parameter type inference. Rather than explicitly declaring the data type of each parameter, a statement lambda can omit parameter types as long as the compiler can infer them. In Listing 12.15, the delegate data type is bool ComparisonHandler(int first, int second), so the compiler verifies that the return type is a bool and infers that the input parameters are both integers (int).

Listing 12.15. Omitting Parameter Types from Statement Lambdas

// ...

    BubbleSort(items,
        (first, second) =>
        {
             return first < second;
        }
    );
  
// ...

In general, statement lambdas do not need parameter types as long as the compiler can infer the types or can implicitly convert them to the requisite expected types. In cases where inference is not possible, the data type is required, although even when it is not required, you can specify the data type explicitly to increase readability; once the statement lambda includes one type, all types are required.
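The inference rules above can be sketched in a few lines. This is only an illustration: the ComparisonHandler declaration follows the signature quoted in the text, while the class name InferenceSample and its field names are made up for this example.

```csharp
using System;

// Signature as quoted in the text; declared here so the sample stands alone.
public delegate bool ComparisonHandler(int first, int second);

public static class InferenceSample
{
    // Explicit parameter types are always allowed.
    public static readonly ComparisonHandler Explicit =
        (int first, int second) => { return first < second; };

    // Types omitted: the compiler infers int from ComparisonHandler.
    public static readonly ComparisonHandler Inferred =
        (first, second) => { return first < second; };

    // Mixing typed and untyped parameters is not allowed;
    // the following line would not compile:
    // ComparisonHandler mixed = (int first, second) => { return first < second; };

    public static void Main()
    {
        Console.WriteLine(Explicit(1, 2));  // True
        Console.WriteLine(Inferred(2, 1));  // False
    }
}
```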

In general, C# requires a lambda expression to have parentheses around the parameter list regardless of whether the data type is specified. Even parameterless statement lambdas, representing delegates that have no input parameters, are coded using empty parentheses (see Listing 12.16).

Listing 12.16. Parameterless Statement Lambdas

using System;
  // ...
  Func<string> getUserInput =
      () =>
      {
          string input;
          do
          {
              input = Console.ReadLine();
          }
          while(input.Trim().Length==0);
          return input;
      };
  // ...

The exception to the parentheses rule is that if there is only a single input parameter and the compiler can infer its data type, the statement lambda does not require parentheses (see Listing 12.17).

Listing 12.17. Statement Lambdas with a Single Input Parameter

using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
  // ...
  IEnumerable<Process> processes = Process.GetProcesses().Where(
      process => { return process.WorkingSet64 > 1000000000; });
  // ...

(In Listing 12.17, Where() returns a query for processes that have a physical memory utilization greater than 1GB.)

Note that back in Listing 12.16, the body of the statement lambda includes multiple statements inside the statement block (via curly braces). Although there can be any number of statements in a statement lambda, typically a statement lambda uses only two or three statements in its statement block. In contrast, the body of an expression lambda does not even make up a full statement since there is no statement block.

Expression Lambdas

Unlike a statement lambda, which includes a statement block and, therefore, zero or more statements, an expression lambda has only an expression, with no statement block. Listing 12.18 is the same as Listing 12.14, except that it uses an expression lambda rather than a statement lambda.

Listing 12.18. Passing a Delegate with an Expression Lambda

class DelegateSample
{

  // ...
  
  static void Main(string[] args)
  {

      int i;
      int[] items = new int[5];

      for (i=0; i<items.Length; i++)
      {
          Console.Write("Enter an integer:");
          items[i] = int.Parse(Console.ReadLine());
      }

      BubbleSort(items, (first, second) => first < second);

      for (i = 0; i < items.Length; i++)
      {
          Console.WriteLine(items[i]);
      }
  }
}

The difference between a statement and an expression lambda is that the statement lambda has a statement block on the right side of the lambda operator, whereas the expression lambda has only an expression (no return statement or curly braces, for example).
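To make the comparison concrete, the following sketch writes the same less-than comparison three ways: as an anonymous method, as a statement lambda, and as an expression lambda. The class name LambdaForms and its field names are hypothetical, and Func<int, int, bool> stands in for the chapter's ComparisonHandler delegate.

```csharp
using System;

public static class LambdaForms
{
    // Anonymous method: delegate keyword, explicit types, statement block.
    public static readonly Func<int, int, bool> AnonymousMethod =
        delegate(int first, int second) { return first < second; };

    // Statement lambda: lambda operator followed by a statement block.
    public static readonly Func<int, int, bool> StatementLambda =
        (first, second) => { return first < second; };

    // Expression lambda: lambda operator followed by a single expression.
    public static readonly Func<int, int, bool> ExpressionLambda =
        (first, second) => first < second;

    public static void Main()
    {
        Console.WriteLine(AnonymousMethod(1, 2));   // True
        Console.WriteLine(StatementLambda(1, 2));   // True
        Console.WriteLine(ExpressionLambda(1, 2));  // True
    }
}
```

All three produce delegates that behave identically; only the amount of syntax differs.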

Generally, you would read a lambda operator in an expression lambda in the same way you would in a statement lambda: "go/goes to." In addition, "becomes" is sometimes clearer. In cases such as the BubbleSort() call, where the expression lambda is a predicate (it returns a Boolean), it is frequently clearer to replace the lambda operator with "such that." This changes the reading of the expression lambda in Listing 12.18 to "first and second such that first is less than second." One of the most common places for a predicate to appear is in a call to System.Linq.Enumerable's Where() method. In cases such as this, neither "such that" nor "goes to" is needed. We would read names.Where(name => name.Contains(" ")) as "names where name dot Contains a space," for example. One pronunciation difference between statement lambdas and expression lambdas is that "such that" applies more to expression lambdas than to statement lambdas, since the latter tend to be more complex.

An anonymous function does not have any intrinsic type associated with it, although implicit conversion is possible to any delegate type as long as the parameters and return type are compatible. In other words, an anonymous function is no more a ComparisonHandler than it is another delegate type such as LessThanHandler. As a result, you cannot use the typeof() operator (see Chapter 17) on an anonymous function, and calling GetType() is possible only after assigning or casting the anonymous function to a delegate variable.
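As a small sketch of this point, the same lambda text below is assigned to two unrelated delegate types, and GetType() is called only on the resulting delegate instances. The two delegate declarations are assumptions modeled on the names used in the text, and LambdaTypeSample is a hypothetical class name.

```csharp
using System;

// Assumed declarations; both delegate types share the same signature.
public delegate bool ComparisonHandler(int first, int second);
public delegate bool LessThanHandler(int first, int second);

public static class LambdaTypeSample
{
    public static Type[] GetAssignedTypes()
    {
        // The same lambda text converts to either compatible delegate type.
        ComparisonHandler comparison = (first, second) => first < second;
        LessThanHandler lessThan = (first, second) => first < second;

        // GetType() is called on the delegate instances, not on the lambda.
        return new Type[] { comparison.GetType(), lessThan.GetType() };
    }

    public static void Main()
    {
        foreach (Type type in GetAssignedTypes())
        {
            Console.WriteLine(type.Name);
        }
        // Prints ComparisonHandler, then LessThanHandler.
    }
}
```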

Table 12.1 contains additional lambda expression characteristics.

Table 12.1. Lambda Expression Notes and Examples

Statement

Example

Lambda expressions themselves do not have type. In fact, there is no concept of a lambda expression in the CLR. Therefore, there are no members to call directly from a lambda expression. The . operator on a lambda expression will not compile, eliminating even the option of calls to object methods.

// ERROR: Operator '.' cannot be applied to
// operand of type 'lambda expression'
string text = ((int x) => x).ToString();

Given that a lambda expression does not have an intrinsic type, it cannot appear on the left of an is operator.

// ERROR: The first operand of an 'is' or 'as'
// operator may not be a lambda expression or
// anonymous method
bool boolean = ((int x) => x) is Func<int, int>;

Although a lambda expression has no type on its own, once assigned or cast, it takes on a type. Therefore, developers commonly refer informally to the type of a lambda expression, when discussing type compatibility, for example.

// ERROR: Lambda expression is not compatible with
//        Func<int, bool> type.
Func<int, bool> expression = ((int x) => x);

A lambda expression cannot be assigned to an implicitly typed local variable since the compiler does not know what type to make the variable given that lambda expressions do not have type.

// ERROR: Cannot assign lambda expression to an
//        implicitly typed local variable
var thing = (x => x);

C# does not allow jump statements (break, goto, continue) inside anonymous functions if the target is outside the lambda expression. Similarly, you cannot target a jump statement from outside the lambda expression (or anonymous methods) into the lambda expression.

// ERROR: Control cannot leave the body of an
// anonymous method or lambda expression
string[] args;
Func<string> expression;
switch(args[0])
{
  case "/File":
      expression = () =>
      {
          if (!File.Exists(args[1]))
          {
              break;
          }
          // ...
          return args[1];
      };
      // ...
}

Variables introduced within a lambda expression are visible only within the scope of the lambda expression body.

// ERROR: The name 'first' does not
//        exist in the current context
Func<int, int, bool> expression =
    (first, second) => first > second;
first++;

The compiler's flow analysis is unable to detect initialization of local variables in lambda expressions.

int number;
Func<string, bool> expression =
  text => int.TryParse(text, out number);
if (expression("1"))
{
  // ERROR: Use of unassigned local variable
  System.Console.Write(number);
}

int number;
Func<int, bool> isFortyTwo =
  x => 42 == (number = x);
if (isFortyTwo(42))
{
  // ERROR: Use of unassigned local variable
  System.Console.Write(number);
}

Outer Variables

Local variables (including parameters) declared outside an anonymous function (such as a lambda expression), but captured (accessed) within the lambda expression, are outer variables of that anonymous function. this is also an outer variable. Outer variables captured by anonymous functions live on until after the anonymous function's delegate is destroyed. In Listing 12.20, it is relatively trivial to use an outer variable to count how many times swap is called by BubbleSort(). Output 12.2 shows the results of this listing.

Listing 12.20. Using an Outer Variable in a Lambda Expression

class DelegateSample
{

  // ...
  
  static void Main(string[] args)
  {

      int i;
      int[] items = new int[5];
      int swapCount=0;

      for (i=0; i<items.Length; i++)
      {
          Console.Write("Enter an integer:");
          items[i] = int.Parse(Console.ReadLine());
      }

      BubbleSort(items,
          (int first, int second) =>
          {
              bool swap = first < second;
              if(swap)
              {
                  swapCount++;
              }
              return swap;

          }

      );

      for (i = 0; i < items.Length; i++)
      {
          Console.WriteLine(items[i]);
      }

      Console.WriteLine("Items were swapped {0} times.",
                        swapCount);
    }
}

Output 12.2:

Enter an integer:5
Enter an integer:1
Enter an integer:4
Enter an integer:2
Enter an integer:3
5
4
3
2
1
Items were swapped 4 times.

swapCount appears outside the lambda expression and is incremented inside it. After calling the BubbleSort() method, swapCount is printed out to the console.

As this code demonstrates, the C# compiler takes care of generating CIL code that shares swapCount between the lambda expression and the call site, even though swapCount is never passed as a parameter, either to the lambda expression or to the BubbleSort() method. Because the variable is shared, it will not be garbage-collected until the delegate that references it is no longer reachable.
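The extended lifetime of a captured variable is easiest to see when the delegate outlives the method that declared the variable. The following is a minimal sketch; CounterSample and CreateCounter() are hypothetical names, not part of the chapter's listings.

```csharp
using System;

public static class CounterSample
{
    // Returns a delegate that captures the local variable count.
    public static Func<int> CreateCounter()
    {
        int count = 0;         // outer variable of the lambda below
        return () => ++count;  // count lives on after this method returns
    }

    public static void Main()
    {
        Func<int> next = CreateCounter();
        Console.WriteLine(next());   // 1
        Console.WriteLine(next());   // 2

        // A second call creates a separate captured variable.
        Func<int> other = CreateCounter();
        Console.WriteLine(other());  // 1
    }
}
```

Each call to CreateCounter() produces its own captured count, which stays alive for as long as the returned delegate is reachable.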

Expression Trees

Lambda expressions provide a succinct syntax for defining a method inline within your code. The compiler converts the code so that it is executable and callable later, potentially passing the delegate to another method. One feature for which it does not offer intrinsic support, however, is a representation of the expression as data—data that may be traversed and even serialized.

Consider the lambda expression in the following code:

persons.Where( person => person.Name.ToUpper() == "INIGO MONTOYA");

Assuming that persons is an array of Person objects, the compiler compiles the lambda expression to a Func<Person, bool> delegate type and then passes the delegate instance to the Where() method. Code and execution like this work very well. (The Where() method is an IEnumerable<T> extension method from the class System.Linq.Enumerable, but this is irrelevant within this section.)

What if persons were not a Person array, but rather a collection of Person objects sitting on a remote computer, or perhaps in a database? Rather than returning all items in the persons collection, it would be preferable to send data describing the expression over the network and have the filtering occur remotely so that only the resultant selection returns over the network. In scenarios such as this, the data about the expression is needed, not the compiled CIL. The remote computer then compiles or interprets the expression data.

This need to interpret expressions is the motivation for adding expression trees to the language. Lambda expressions that represent data about expressions rather than compiled code are expression trees. Since the expression tree represents data rather than compiled code, it is possible to convert the data to an alternative format. For example, the expression tree received by Where() may be converted into SQL code (SQL is the language generally used to query data from databases) that executes on a database (see Listing 12.22).

Listing 12.22. Converting an Expression Tree to a SQL where Clause

Recognizing the original Where() call parameter as data, you can see that it is made up of the following:

  • The call to the Person property, Name
  • A call to a string method called ToUpper()
  • A constant value, "INIGO MONTOYA"
  • An equality operator, ==

The Where() method takes this data and converts it to the SQL where clause by iterating over the data and building a SQL query string. However, SQL is just one example of what an expression tree may convert to.
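The iteration described above might look roughly like the following toy translator. It is a sketch only: the Person class, the WhereClauseSketch class, and the ToSql() helper are all hypothetical, the code handles just the four node kinds listed above, it renders every constant as a quoted string, and a real LINQ provider is far more thorough.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity type with the Name property used in the text.
public class Person
{
    public string Name { get; set; }
}

public static class WhereClauseSketch
{
    // Hypothetical helper: converts a tiny subset of expression trees to SQL text.
    public static string ToSql(Expression<Func<Person, bool>> predicate)
    {
        return "WHERE " + Translate(predicate.Body);
    }

    private static string Translate(Expression node)
    {
        switch (node.NodeType)
        {
            case ExpressionType.Equal:         // the == operator
                BinaryExpression binary = (BinaryExpression)node;
                return Translate(binary.Left) + " = " + Translate(binary.Right);
            case ExpressionType.Call:          // the call to ToUpper()
                MethodCallExpression call = (MethodCallExpression)node;
                if (call.Method.Name == "ToUpper" && call.Object != null)
                {
                    return "UPPER(" + Translate(call.Object) + ")";
                }
                break;
            case ExpressionType.MemberAccess:  // the Name property
                return ((MemberExpression)node).Member.Name;
            case ExpressionType.Constant:      // the string constant
                object value = ((ConstantExpression)node).Value;
                return "'" + Convert.ToString(value).Replace("'", "''") + "'";
        }
        throw new NotSupportedException("Unsupported node: " + node.NodeType);
    }

    public static void Main()
    {
        Console.WriteLine(ToSql(
            person => person.Name.ToUpper() == "INIGO MONTOYA"));
        // WHERE UPPER(Name) = 'INIGO MONTOYA'
    }
}
```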

Both a lambda expression for a delegate and a lambda expression for an expression tree are compiled, and in both cases the syntax of the expression is verified at compile time with full semantic analysis. The difference is that the former is compiled into a delegate in CIL, whereas the latter is compiled into a data structure of type System.Linq.Expressions.Expression. As a result, when a lambda expression is converted to a delegate, it can execute: it is CIL instructions for what the runtime should do. When the lambda expression is converted to an expression tree, however, it is not a set of CIL instructions but a data structure. Although an expression tree includes a method, Compile(), that turns it into a delegate, it is more likely that the expression tree (data) will be converted into a different format or set of instructions.
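The contrast between the two compilation targets can be sketched as follows. The class name CompileSample and the Evaluate() method are hypothetical; Compile() is the method on Expression<TDelegate> that converts the tree into a delegate at runtime.

```csharp
using System;
using System.Linq.Expressions;

public static class CompileSample
{
    public static bool Evaluate(int x, int y)
    {
        // Assigned to Expression<TDelegate>, the lambda becomes data.
        Expression<Func<int, int, bool>> tree = (a, b) => a > b;

        // Compile() turns that data into an executable delegate at runtime.
        Func<int, int, bool> compiled = tree.Compile();
        return compiled(x, y);
    }

    public static void Main()
    {
        // Assigned to a delegate type, the lambda is compiled to CIL up front.
        Func<int, int, bool> direct = (a, b) => a > b;

        Console.WriteLine(direct(2, 1));    // True
        Console.WriteLine(Evaluate(2, 1));  // True
        Console.WriteLine(Evaluate(1, 2));  // False
    }
}
```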

System.Linq.Enumerable versus System.Linq.Queryable

Let us consider an example that highlights the difference between a delegate and an expression tree. System.Linq.Enumerable and System.Linq.Queryable are very similar. They each provide virtually identical extension methods to the collection interfaces they extend (IEnumerable<T> and IQueryable<T>, respectively). Consider, for example, the Where() method from Listing 12.22. Given a collection that supports IEnumerable<T>, a call to Where() could be as follows:

persons.Where( person => person.Name.ToUpper() ==
    "INIGO MONTOYA");

Conceptually, the Enumerable extension method signature is defined on IEnumerable<TSource> as follows:

public IEnumerable<TSource> Where<TSource>(
    Func<TSource, bool> predicate);

However, the equivalent call through the Queryable extension method on IQueryable<TSource> is identical, even though the conceptual Where() method signature (shown here) is not:

public IQueryable<TSource> Where<TSource>(
    Expression<Func<TSource, bool>> predicate);

The calling code for the argument is identical because the lambda expression itself does not have type until it is assigned/cast.

Enumerable's Where() implementation receives the lambda expression as a delegate, which the method's implementation then calls. In contrast, when calling Queryable's Where(), the compiler converts the lambda expression into an expression tree, that is, into data. The object implementing IQueryable receives the expression data and manipulates it. As suggested before, the expression tree received by Where() may be converted into a SQL query that is passed to a database.
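A small in-memory sketch shows the identical calling code binding to both methods. The class and method names here are hypothetical, and AsQueryable() is used only to obtain an IQueryable<string> without a database; a real provider would translate the tree rather than run it locally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryableSample
{
    private static readonly string[] Names =
        { "Inigo Montoya", "Vizzini", "Fezzik" };

    // Enumerable.Where(): the lambda is converted to a Func<string, bool>.
    public static IEnumerable<string> FilterWithDelegate()
    {
        return Names.Where(name => name.Contains(" "));
    }

    // Queryable.Where(): the same lambda text is converted to an expression tree.
    public static IQueryable<string> FilterWithTree()
    {
        return Names.AsQueryable().Where(name => name.Contains(" "));
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FilterWithDelegate())); // Inigo Montoya
        Console.WriteLine(string.Join(",", FilterWithTree()));     // Inigo Montoya

        // The query object exposes the tree it was built from.
        Console.WriteLine(FilterWithTree().Expression.NodeType);   // Call
    }
}
```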

Examining an Expression Tree

Capitalizing on the fact that lambda expressions don't have intrinsic type, assigning a lambda expression to a System.Linq.Expressions.Expression<TDelegate> creates an expression tree rather than a delegate.

In Listing 12.23, we create an expression tree for the Func<int, int, bool>. (Recall that Func<int, int, bool> is functionally equivalent to the ComparisonHandler delegate.) Notice that the simple act of writing an expression to the console, Console.WriteLine(expression), where expression is of type Expression<TDelegate>, results in a call to the expression's ToString() method. However, this doesn't cause the expression to be evaluated or even to write out the fully qualified name of Func<int, int, bool> (as would happen if we used a delegate instance). Rather, displaying the expression writes out the data (in this case, the expression code) corresponding to the value of the expression tree.

Listing 12.23. Examining an Expression Tree

using System;
using System.Linq.Expressions;

class Program
{
    static void Main()
    {
        Expression<Func<int, int, bool>> expression;
        expression = (x, y) => x > y;
        Console.WriteLine("-------------{0}-------------",
            expression);
        PrintNode(expression.Body, 0);
        Console.WriteLine();
        Console.WriteLine();
        expression = (x, y) => x * y > x + y;
        Console.WriteLine("-------------{0}-------------",
            expression);
        PrintNode(expression.Body, 0);
        Console.WriteLine();
        Console.WriteLine();
    }
    public static void PrintNode(Expression expression,
        int indent)
    {
        if (expression is BinaryExpression)
            PrintNode(expression as BinaryExpression, indent);
        else
            PrintSingle(expression, indent);
    }
    private static void PrintNode(BinaryExpression expression,
      int indent)
    {
        PrintNode(expression.Left, indent + 1);
        PrintSingle(expression, indent);
        PrintNode(expression.Right, indent + 1);
    }
    private static void PrintSingle(
        Expression expression, int indent)
    {
        Console.WriteLine("{0," + indent * 5 + "}{1}",
          "", NodeToString(expression));
    }
    private static string NodeToString(Expression expression)
    {
        switch (expression.NodeType)
        {
            case ExpressionType.Multiply:
                return "*";
            case ExpressionType.Add:
                return "+";
            case ExpressionType.Divide:
                return "/";
            case ExpressionType.Subtract:
                return "-";
            case ExpressionType.GreaterThan:
                return ">";
            case ExpressionType.LessThan:
                return "<";
            default:
                return expression.ToString() +
                    " (" + expression.NodeType.ToString() + ")";
        }
    }
}

In Output 12.3, we see that the Console.WriteLine() statements within Main() print out the body of the expression trees as text.

Output 12.3:

-------------(x, y) => (x > y)-------------
     x (Parameter)
>
     y (Parameter)


-------------(x, y) => ((x * y) > (x + y))-------------
          x (Parameter)
     *
          y (Parameter)
>
          x (Parameter)
     +
          y (Parameter)

The output of the expression as text is due to conversion from the underlying data of an expression tree, a conversion similar to the PrintNode() and NodeToString() methods, only more comprehensive. The important point to note is that an expression tree is a collection of data, and by iterating over the data, it is possible to convert the data to another format. In the PrintNode() method, Listing 12.23 converts the data to a horizontal text interpretation of the data. However, the interpretation could be virtually anything.

Using recursion, the PrintNode() method demonstrates that an expression tree is a tree of zero or more expression trees. The root LambdaExpression stores its contained expression tree in its Body property, and a BinaryExpression stores its operands in its Left and Right properties. In addition, each expression includes an ExpressionType property called NodeType, where ExpressionType is an enum with a value for each different kind of expression. There are numerous types of expressions: BinaryExpression, ConditionalExpression, LambdaExpression (the root of an expression tree), MethodCallExpression, ParameterExpression, and ConstantExpression are examples. Each type derives from System.Linq.Expressions.Expression.

Generally, you can use statement lambdas interchangeably with expression lambdas. However, you cannot convert statement lambdas into expression trees. You can express expression trees only by using expression lambda syntax.
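The restriction can be illustrated with a short sketch that also builds the same tree by hand through the Expression factory methods, which is roughly what the compiler generates for an expression lambda. ManualTreeSample and BuildGreaterThan() are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTreeSample
{
    // Builds the tree for (x, y) => x > y by hand, node by node.
    public static Expression<Func<int, int, bool>> BuildGreaterThan()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");
        BinaryExpression body = Expression.GreaterThan(x, y);
        return Expression.Lambda<Func<int, int, bool>>(body, x, y);
    }

    public static void Main()
    {
        // An expression lambda converts to an expression tree.
        Expression<Func<int, int, bool>> fromLambda = (x, y) => x > y;

        // A statement lambda does not; this line would fail to compile:
        // Expression<Func<int, int, bool>> bad = (x, y) => { return x > y; };

        Expression<Func<int, int, bool>> manual = BuildGreaterThan();
        Console.WriteLine(fromLambda.Body.NodeType);  // GreaterThan
        Console.WriteLine(manual.Body.NodeType);      // GreaterThan
        Console.WriteLine(manual.Compile()(3, 2));    // True
    }
}
```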
