Tuesday, June 11, 2013

LINQ: Func<T> vs. Expression<Func<T>>

Continuing the previous post about the differences between IEnumerable<T> and IQueryable<T>, I would like to describe the differences between Func<T> and Expression<Func<T>>, the basic features of lambda expressions and expression trees, and their possible impact on Entity Framework.

What is Func<T>

Func<T> is just a predefined generic delegate which encapsulates a method that accepts no parameters and returns a value of type T. It's declared like:


public delegate T Func<out T>();


A Func<T> variable can be assigned a named method or an anonymous method, and the anonymous method can be written in delegate syntax or in lambda expression syntax. All of the following assignments are valid and produce the same result:

void Main()
{
    Func<int> theDelegate; // Declare a delegate that has no parameters and returns an integer value.

    theDelegate = NamedMethod; // Assign to a named method.
    theDelegate = delegate() { return 0; }; // Assign an anonymous method through delegate syntax.
    theDelegate = delegate { return 0; }; // It has no parameters, so the parentheses can be omitted.
    theDelegate = () => { return 0; }; // The same anonymous method through lambda expression syntax.
    theDelegate = () => 0; // A single return statement can be written even more simply.

    int result = theDelegate(); // Call the delegate and get the result.
}

int NamedMethod()
{
    return 0;
}

What is Expression<Func<T>>

Expression<Func<T>>, or more generally Expression<TDelegate>, represents a strongly typed lambda expression in the form of an expression tree. In other words, the code in an Expression is represented as a data structure. The declaration:

public sealed class Expression<TDelegate> : LambdaExpression

We can see that Expression<TDelegate> derives from LambdaExpression, which is the base class for representing lambda expressions in the form of expression trees.
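Because of that inheritance, any Expression<TDelegate> can be passed where a LambdaExpression is expected. Here is a small sketch that reads a few members of the base class; the class name and the Summarize helper are made up for this illustration.

```csharp
using System;
using System.Linq.Expressions;

static class LambdaExpressionMembersSketch
{
    // Accepts the base type, so any Expression<TDelegate> can be passed in.
    public static string Summarize(LambdaExpression lambda)
    {
        return lambda.Body.NodeType + ", "
            + lambda.Parameters.Count + " parameter(s), returns "
            + lambda.ReturnType.Name;
    }

    static void Main()
    {
        Expression<Func<int>> expression = () => 0;

        // The implicit upcast to LambdaExpression happens at the call site.
        Console.WriteLine(Summarize(expression)); // Constant, 0 parameter(s), returns Int32
    }
}
```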

Expression tree can be constructed through lambda expression syntax or through API syntax

The following two assignments build equivalent trees:

static void ConstructSimplestExpression()
{
    Expression<Func<int>> expression;

    expression = () => 0; // Construct expression tree through lambda expression syntax.

    expression = Expression.Lambda<Func<int>>(Expression.Constant(0)); // Construct the tree through API syntax.
}

The first technique is simpler and neater, but the second one makes it possible to build expressions dynamically at run time.
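To illustrate what "dynamic" can mean here, the following sketch picks the comparison operator at run time. The BuildComparison helper and the operator strings it accepts are assumptions made for this example; only the Expression factory methods come from the framework.

```csharp
using System;
using System.Linq.Expressions;

static class DynamicExpressionSketch
{
    // Builds x => x <op> threshold, where the operator is only known at run time.
    public static Expression<Func<int, bool>> BuildComparison(string op, int threshold)
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ConstantExpression limit = Expression.Constant(threshold);

        BinaryExpression body;
        switch (op)
        {
            case ">": body = Expression.GreaterThan(x, limit); break;
            case "<": body = Expression.LessThan(x, limit); break;
            case "==": body = Expression.Equal(x, limit); break;
            default: throw new ArgumentException("Unsupported operator: " + op, "op");
        }

        return Expression.Lambda<Func<int, bool>>(body, x);
    }

    static void Main()
    {
        Expression<Func<int, bool>> greaterThanFive = BuildComparison(">", 5);

        Console.WriteLine(greaterThanFive);               // x => (x > 5)
        Console.WriteLine(greaterThanFive.Body.NodeType); // GreaterThan
    }
}
```

A lambda written in source code cannot vary its operator like this, which is why the API syntax is the usual choice for filters assembled from user input.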

Expression tree can be compiled to a normal delegate (Func<T>)

How to convert an expression tree to a delegate?

static void CompileAndRunSimplestExpression()
{
    Expression<Func<int>> expression = () => 0;

    Func<int> func = expression.Compile();

    int result = func();
}

Let's have a look into the Locals View of the debugger and see how it's represented:

We can see that expression is a LambdaExpression with a ConstantExpression in its Body, func is a MulticastDelegate, and the result is zero.

The .Compile() method compiles the expression tree to Intermediate Language (IL) and returns a delegate that runs the generated code.
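The same works for expressions with parameters. In the sketch below (the class and member names are invented for the example), the tree is compiled once and the resulting delegate is kept in a static field, on the assumption that compiling is noticeably more expensive than calling the delegate afterwards.

```csharp
using System;
using System.Linq.Expressions;

static class CompileWithParametersSketch
{
    private static readonly Expression<Func<int, int, int>> Formula =
        (x, y) => x + x * y + 3;

    // Compile once and reuse the delegate for every call.
    private static readonly Func<int, int, int> CompiledFormula = Formula.Compile();

    public static int Evaluate(int a, int b)
    {
        return CompiledFormula(a, b);
    }

    static void Main()
    {
        Console.WriteLine(Evaluate(2, 5)); // 15, because 2 + 2 * 5 + 3 = 15
    }
}
```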

Is there any way to convert Func<T> to Expression<Func<T>>?

We should understand that a Func<T>, like any delegate, refers to a compiled method, so we would need to decompile the method and then convert its IL instructions to an expression tree. Theoretically it's possible, but there is no built-in functionality for that because it's not a straightforward process. Of course we can invoke a Func<T> inside an expression, but that only gives us a node that wraps the call (an InvocationExpression when the delegate is invoked as func()), not the internal expression tree of the delegate.
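A quick way to check what such a wrapper contains is to inspect the Body of the resulting tree. The Wrap helper below is a hypothetical name used only for this sketch. With the C# compiler, invoking the delegate as func() inside the lambda yields a node of type Invoke.

```csharp
using System;
using System.Linq.Expressions;

static class WrapDelegateSketch
{
    // The tree only records "invoke this delegate"; the body of func stays opaque.
    public static Expression<Func<int>> Wrap(Func<int> func)
    {
        return () => func();
    }

    static void Main()
    {
        Func<int> func = () => 0;
        Expression<Func<int>> wrapped = Wrap(func);

        Console.WriteLine(wrapped.Body.NodeType);                // Invoke
        Console.WriteLine(wrapped.Body is InvocationExpression); // True
        Console.WriteLine(wrapped.Compile()());                  // 0
    }
}
```

A query provider that receives this tree sees only the opaque invocation, so it has nothing to translate.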

Expression tree can be observed and translated

Because the code in an expression tree is just a hierarchy of expression nodes and is treated as data, we can inspect and analyze the tree.

Of course, we can take an expression and walk the tree by hand, casting each property to its concrete node type and reading its values, but that quickly becomes tedious. Thankfully, Microsoft provided a smarter way to explore the tree: ExpressionVisitor.
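To see why the manual approach does not scale, here is a sketch that handles exactly one shape of tree, parameter + constant. The Describe helper is made up for this example, and every cast in it fails with an InvalidCastException as soon as the lambda has a different shape.

```csharp
using System;
using System.Linq.Expressions;

static class ManualWalkSketch
{
    public static string Describe(Expression<Func<int, int>> expression)
    {
        // Each level must be cast to its concrete node type before its children are reachable.
        var add = (BinaryExpression)expression.Body;
        var left = (ParameterExpression)add.Left;
        var right = (ConstantExpression)add.Right;

        return left.Name + " " + add.NodeType + " " + right.Value;
    }

    static void Main()
    {
        Console.WriteLine(Describe(a => a + 3)); // a Add 3
    }
}
```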

ExpressionVisitor is an abstract class that we can inherit from, overriding the virtual methods for the node types we care about. Let's write a simple implementation that prints binary expressions made of parameters and constants to the Console:

public class ConsoleVisitor : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        Console.Write("(");

        this.Visit(node.Left);

        Console.Write(" {0} ", node.NodeType);

        this.Visit(node.Right);

        Console.Write(")");

        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        Console.Write("parameter({0})", node.Name);
        return base.VisitParameter(node);
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        Console.Write("constant({0})", node.Value);
        return base.VisitConstant(node);
    }
}

Now let's take a simple lambda expression that takes two parameters and describes some arithmetic on them, and then walk the tree with the ConsoleVisitor:

static void AnalyzeExpressionTree()
{
    Expression<Func<int, int, int>> expression = (a, b) => a + a * b + 3;

    var visitor = new ConsoleVisitor();
    visitor.Visit(expression.Body);
}

By the way, as mentioned before, exactly the same expression can be constructed through the API:

static void AnalyzeExpressionConstructedThroughAPI()
{
    var parameterA = Expression.Parameter(typeof(int), "a");
    var parameterB = Expression.Parameter(typeof(int), "b");
    var constant3 = Expression.Constant(3);

    var multiplyAB = Expression.Multiply(parameterA, parameterB);
    var innerSum = Expression.Add(parameterA, multiplyAB);
    var sum = Expression.Add(innerSum, constant3);

    Expression<Func<int, int, int>> expression = Expression.Lambda<Func<int, int, int>>(
        sum,
        new[] { parameterA, parameterB });

    var visitor = new ConsoleVisitor();
    visitor.Visit(expression.Body);
}

What are we going to get? Here is the result on Console:

((parameter(a) Add (parameter(a) Multiply parameter(b))) Add constant(3))

Notice that the tree preserves operator precedence: the multiplication is nested inside the first addition, so it is evaluated before it.

The sample is obviously simplified, but we can guess that this technique makes it possible to translate an expression into a completely different programming language. That's what the Entity Framework query provider actually does, isn't it?
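To make the idea concrete, here is a toy visitor that renders a predicate as SQL-like text. It is only a sketch under stated assumptions: SqlLikeVisitor, its bracketed parameter names and its tiny operator table are invented for this post and have nothing to do with how Entity Framework is actually implemented.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

public class SqlLikeVisitor : ExpressionVisitor
{
    private readonly StringBuilder sql = new StringBuilder();

    public static string Translate(Expression expression)
    {
        var visitor = new SqlLikeVisitor();
        visitor.Visit(expression);
        return visitor.sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        sql.Append("(");
        Visit(node.Left);
        sql.Append(" ").Append(ToOperator(node.NodeType)).Append(" ");
        Visit(node.Right);
        sql.Append(")");
        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        // Pretend that every lambda parameter maps to a column of the same name.
        sql.Append("[").Append(node.Name).Append("]");
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        sql.Append(node.Value);
        return node;
    }

    // Only a handful of node types are supported in this sketch.
    private static string ToOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Add: return "+";
            case ExpressionType.Multiply: return "*";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.AndAlso: return "AND";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}

static class TranslationDemo
{
    static void Main()
    {
        Expression<Func<int, int, bool>> expression = (a, b) => a + a * b > 3 && b > 0;

        Console.WriteLine(SqlLikeVisitor.Translate(expression.Body));
        // ((([a] + ([a] * [b])) > 3) AND ([b] > 0))
    }
}
```

The visitor never executes the lambda. It only reads the tree, which is exactly what a plain Func<T> would not allow.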

IEnumerable works with Func and IQueryable works with Expression<Func>

That's what we should understand. Entity Framework and LINQ to SQL work with expression trees to construct SQL queries from lambda expressions. If we look at the source code of the corresponding extension methods, we can easily see that Enumerable and Queryable take different argument types for the predicate:

// Type: System.Linq.Enumerable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Enumerable
{
    public static IEnumerable<TSource> Where<TSource>(
        this IEnumerable<TSource> source, 
        Func<TSource, bool> predicate)
    {
        return (IEnumerable<TSource>) new Enumerable.WhereEnumerableIterator<TSource>(source, predicate);
    }
}


// Type: System.Linq.Queryable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Queryable
{
    public static IQueryable<TSource> Where<TSource>(
        this IQueryable<TSource> source, 
        Expression<Func<TSource, bool>> predicate)
    {
        return source.Provider.CreateQuery<TSource>(
            Expression.Call(
                null, 
                ((MethodInfo) MethodBase.GetCurrentMethod()).MakeGenericMethod(
                    new Type[] { typeof(TSource) }), 
                    new Expression[] { source.Expression, Expression.Quote(predicate) }));
    }
}

We should keep this in mind to avoid unintended consequences (or to create intended ones) when we work with delegates and expressions.
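One such consequence can be reproduced without any database. In the sketch below an in-memory IQueryable<int> stands in for an Entity Framework set, and the class and method names are made up for the example. Passing a Func to Where on an IQueryable silently binds to Enumerable.Where, so the result is no longer an IQueryable and the filter can no longer be translated by a query provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

static class WhereOverloadSketch
{
    public static IEnumerable<int> FilterWithDelegate(
        IQueryable<int> source, Func<int, bool> predicate)
    {
        return source.Where(predicate); // binds to Enumerable.Where
    }

    public static IEnumerable<int> FilterWithExpression(
        IQueryable<int> source, Expression<Func<int, bool>> predicate)
    {
        return source.Where(predicate); // binds to Queryable.Where
    }

    static void Main()
    {
        // No database is involved; AsQueryable wraps the array in an in-memory provider.
        IQueryable<int> source = new[] { 1, 2, 3, 4 }.AsQueryable();

        IEnumerable<int> byDelegate = FilterWithDelegate(source, x => x > 2);
        IEnumerable<int> byExpression = FilterWithExpression(source, x => x > 2);

        Console.WriteLine(byDelegate is IQueryable<int>);   // False
        Console.WriteLine(byExpression is IQueryable<int>); // True
        Console.WriteLine(string.Join(",", byDelegate));    // 3,4
        Console.WriteLine(string.Join(",", byExpression));  // 3,4
    }
}
```

Both calls return the same elements here, which is why the difference is easy to miss. With a real database provider, I would expect the delegate version to do its filtering in memory after the rows have been loaded.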
