Tuesday, June 11, 2013

LINQ: Func<T> vs. Expression<Func<T>>

Continuing the previous post about the differences between IEnumerable<T> and IQueryable<T>, I would like to describe the differences between Func<T> and Expression<Func<T>>, the basic features of lambda expressions and expression trees, and their possible impact on Entity Framework.

What is Func<T>

Func<T> is just a predefined generic delegate which encapsulates a method that accepts no parameters and returns a value of type T. It's declared like:


public delegate T Func<out T>();

A named method, or an anonymous method written in delegate syntax or lambda expression syntax, can be assigned to it. All the following assignments are correct and produce the same result:

void Main()
{
    Func<int> theDelegate; // Declare a delegate that has no parameters and returns an integer value.

    theDelegate = NamedMethod; // Assign a named method.
    theDelegate = delegate() { return 0; }; // Assign an anonymous method through delegate syntax.
    theDelegate = delegate { return 0; }; // It has no parameters, so the parentheses can be omitted.
    theDelegate = () => { return 0; }; // The same anonymous method through lambda expression syntax.
    theDelegate = () => 0; // A single return statement can be written even more simply.

    int result = theDelegate(); // Call the delegate and get the result.
}

int NamedMethod()
{
    return 0;
}

What is Expression<Func<T>>

Expression<Func<T>>, or more generally Expression<TDelegate>, represents a strongly typed lambda expression in the form of an expression tree. In other words, the code in an Expression is represented as a data structure. The declaration:

public sealed class Expression<TDelegate> : LambdaExpression

We can see that Expression<TDelegate> derives from LambdaExpression. LambdaExpression is the base class for representing lambda expressions in the form of expression trees.

An expression tree can be constructed through lambda expression syntax or through the API

The following declarations are equivalent:

static void ConstructSimplestExpression()
{
    Expression<Func<int>> expression;

    expression = () => 0; // Construct expression tree through lambda expression syntax.

    expression = Expression.Lambda<Func<int>>(Expression.Constant(0)); // Construct the tree through API syntax.
}

The first technique is simpler, more straightforward and neater, but the second one makes it possible to build expressions dynamically at run time.
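To make the "dynamic" part concrete, here is a minimal sketch of building a predicate whose operator is only known at run time. The class and method names (DynamicPredicateDemo, BuildComparison) are hypothetical and exist only for this illustration; the Expression factory methods are the standard ones from System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

public static class DynamicPredicateDemo
{
    // Hypothetical helper: builds x => x <op> threshold,
    // where the comparison operator is chosen at run time.
    public static Expression<Func<int, bool>> BuildComparison(string op, int threshold)
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ConstantExpression limit = Expression.Constant(threshold);

        BinaryExpression body;
        switch (op)
        {
            case ">": body = Expression.GreaterThan(x, limit); break;
            case "<": body = Expression.LessThan(x, limit); break;
            case "==": body = Expression.Equal(x, limit); break;
            default: throw new ArgumentException("Unsupported operator: " + op, "op");
        }

        // Wrap the body and its parameter into a strongly typed lambda.
        return Expression.Lambda<Func<int, bool>>(body, x);
    }
}
```

For example, DynamicPredicateDemo.BuildComparison(">", 10).Compile() gives a delegate that returns true for 42 and false for 5. Lambda syntax cannot express this, because the shape of the tree is fixed at compile time.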

An expression tree can be compiled to a normal delegate (Func<T>)

How do we convert an expression tree to a delegate?

static void CompileAndRunSimplestExpression()
{
    Expression<Func<int>> expression = () => 0;

    Func<int> func = expression.Compile();

    int result = func();
}

If we look at the Locals window of the debugger, we can see that expression is a LambdaExpression with a ConstantExpression in its Body, func is a MulticastDelegate, and the obtained result is zero.

The .Compile() method compiles the specified expression tree to Intermediate Language to get executable code and produces the delegate.

Is there any way to convert Func<T> to Expression<Func<T>>?

We should understand that a Func<T>, like any delegate, refers to a compiled method, so we would need to decompile the method and then convert the IL instructions to an expression tree. Theoretically it's possible, but there is no built-in functionality for that, as it's not a straightforward process. Of course we can call a Func<T> inside an expression, but that only gives us a single node that invokes the delegate (an InvocationExpression), not the internal expression tree of the delegate.
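A small sketch of what "calling a Func<T> inside an expression" produces. The WrapDelegateDemo class is a made-up name for this example. When the wrapper is written with lambda syntax, the C# compiler emits a delegate-invocation node (node type Invoke) whose target is the captured variable; the body of the original delegate stays opaque.

```csharp
using System;
using System.Linq.Expressions;

public static class WrapDelegateDemo
{
    // Hypothetical helper: wraps an already compiled delegate in an expression tree.
    public static Expression<Func<int>> Wrap(Func<int> func)
    {
        // 'func' is captured in a closure; the tree only records
        // "invoke this delegate", not what the delegate does internally.
        return () => func();
    }

    // Reports the kind of node found at the root of the wrapper's body.
    public static ExpressionType BodyNodeType(Func<int> func)
    {
        return Wrap(func).Body.NodeType;
    }
}
```

Calling WrapDelegateDemo.BodyNodeType(() => 0) yields ExpressionType.Invoke, and the wrapper still compiles and runs, but a query provider inspecting this tree has nothing to translate beyond the invocation itself.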

An expression tree can be observed and translated

Since the code in an expression tree is just a hierarchical set of expressions and is treated as data, we can analyze and observe the tree.

Obviously we can take an expression and expand the tree hierarchically, casting its properties and exploring their values, but that quickly becomes unwieldy. Thankfully, Microsoft provides a smarter way to explore the tree: ExpressionVisitor.

ExpressionVisitor is an abstract class that we can inherit from, overriding its specific virtual methods. Let's write a simple implementation that prints simple binary expressions with parameters and constants to the Console:
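For comparison, this is roughly what the manual, cast-based exploration looks like for the simplest possible case. The ManualWalkDemo class is a hypothetical name, and the sketch assumes the lambda has exactly the shape (a, b) => a <op> b; any other shape makes the casts throw, which is precisely why this approach does not scale.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualWalkDemo
{
    // Assumes the body is a binary node whose operands are both parameters.
    public static string Describe(Expression<Func<int, int, int>> expression)
    {
        var body = (BinaryExpression)expression.Body;    // throws for any other node kind
        var left = (ParameterExpression)body.Left;
        var right = (ParameterExpression)body.Right;

        return left.Name + " " + body.NodeType + " " + right.Name;
    }
}
```

ManualWalkDemo.Describe((a, b) => a + b) returns "a Add b". Handling nested operations, constants or method calls this way would need a type check and a cast for every node kind, which is the work a visitor dispatches for us.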

public class ConsoleVisitor : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        Console.Write("(");

        this.Visit(node.Left);

        Console.Write(" {0} ", node.NodeType);

        this.Visit(node.Right);

        Console.Write(")");

        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        Console.Write("parameter({0})", node.Name);
        return base.VisitParameter(node);
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        Console.Write("constant({0})", node.Value);
        return base.VisitConstant(node);
    }
}

Now let's take a simple lambda expression that accepts two parameters and describes some arithmetic with them, and observe the tree with the ConsoleVisitor:

static void AnalyzeExpressionTree()
{
    Expression<Func<int, int, int>> expression = (a, b) => a + a * b + 3;

    var visitor = new ConsoleVisitor();
    visitor.Visit(expression.Body);
}

By the way, as mentioned before, exactly the same expression can be constructed through the API:

static void AnalyzeExpressionConstructedThroughAPI()
{
    var parameterA = Expression.Parameter(typeof(int), "a");
    var parameterB = Expression.Parameter(typeof(int), "b");
    var constant3 = Expression.Constant(3);

    var multiplyAB = Expression.Multiply(parameterA, parameterB);
    var summ1 = Expression.Add(parameterA, multiplyAB);
    var summ = Expression.Add(summ1, constant3);

    Expression<Func<int, int, int>> expression = Expression.Lambda<Func<int, int, int>>(
        summ,
        new[] { parameterA, parameterB });

    var visitor = new ConsoleVisitor();
    visitor.Visit(expression.Body);
}

What are we going to get? Here is the result on the Console:

((parameter(a) Add (parameter(a) Multiply parameter(b))) Add constant(3))

Notice that the tree preserves operator precedence: the multiplication is nested inside the first addition, and the constant is added last.

The sample is obviously simplified, but we can guess that this technique makes it possible to translate an expression into a completely different programming language. That's what the Entity Framework query provider actually does, isn't it?
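As a toy illustration of such a translation, the visitor below emits a SQL-like string instead of writing to the Console. It is only a sketch: SqlLikeTranslator and its operator table are invented for this post, it handles just parameters, constants and a few binary operators, and it is not how Entity Framework's provider is implemented.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical translator: turns a simple predicate body into SQL-like text.
public class SqlLikeTranslator : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();

    public static string Translate(LambdaExpression predicate)
    {
        var translator = new SqlLikeTranslator();
        translator.Visit(predicate.Body);
        return translator._sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append("(");
        Visit(node.Left);
        _sql.Append(" ").Append(ToOperator(node.NodeType)).Append(" ");
        Visit(node.Right);
        _sql.Append(")");
        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        // Parameters are rendered as bracketed column names (an assumption of this sketch).
        _sql.Append("[").Append(node.Name).Append("]");
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Only numeric constants are expected here; strings would need quoting.
        _sql.Append(node.Value);
        return node;
    }

    private static string ToOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.Add: return "+";
            case ExpressionType.Multiply: return "*";
            default: throw new NotSupportedException("Unsupported node type: " + type);
        }
    }
}
```

Given Expression<Func<int, int, bool>> p = (age, score) => age > 18 && score <= 100, the call SqlLikeTranslator.Translate(p) returns "(([age] > 18) AND ([score] <= 100))". The same lambda assigned to a Func<int, int, bool> could only be executed, never translated.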

IEnumerable works with Func and IQueryable works with Expression<Func>

That's what we should understand. Entity Framework and LINQ to SQL work with expression trees to construct SQL queries from lambda expressions. If we look into the source code of the corresponding extension methods, we can easily notice that Enumerable and Queryable accept different argument types for the predicate:

// Type: System.Linq.Enumerable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Enumerable
{
    public static IEnumerable<TSource> Where<TSource>(
        this IEnumerable<TSource> source, 
        Func<TSource, bool> predicate)
    {
        return (IEnumerable<TSource>) new Enumerable.WhereEnumerableIterator<TSource>(source, predicate);
    }
}


// Type: System.Linq.Queryable
// Assembly: System.Core, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
// Assembly location: C:\Windows\Microsoft.NET\Framework\v4.0.30319\System.Core.dll
public static class Queryable
{
    public static IQueryable<TSource> Where<TSource>(
        this IQueryable<TSource> source, 
        Expression<Func<TSource, bool>> predicate)
    {
        return source.Provider.CreateQuery<TSource>(
            Expression.Call(
                null, 
                ((MethodInfo) MethodBase.GetCurrentMethod()).MakeGenericMethod(
                    new Type[] { typeof(TSource) }), 
                new Expression[] { source.Expression, Expression.Quote(predicate) }));
    }
}

We should keep this in mind to prevent unintended consequences (or to create the intended ones) when we work with delegates and expressions.
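One such consequence can be sketched without a database. Below, an in-memory list exposed through AsQueryable() stands in for a real query source; OverloadDemo and its method names are made up for the illustration. Passing a Func<T, bool> variable to Where on an IQueryable<T> silently binds to Enumerable.Where, so the result is no longer a query that a provider could translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    public static bool IsQueryableAfterFuncFilter()
    {
        IQueryable<int> numbers = new List<int> { 1, 2, 3 }.AsQueryable();
        Func<int, bool> predicate = n => n > 1;

        // Binds to Enumerable.Where: the static result type is IEnumerable<int>.
        IEnumerable<int> filtered = numbers.Where(predicate);
        return filtered is IQueryable<int>;
    }

    public static bool IsQueryableAfterExpressionFilter()
    {
        IQueryable<int> numbers = new List<int> { 1, 2, 3 }.AsQueryable();
        Expression<Func<int, bool>> predicate = n => n > 1;

        // Binds to Queryable.Where: the predicate is appended to the query's expression tree.
        IQueryable<int> filtered = numbers.Where(predicate);
        return filtered is IQueryable<int>;
    }
}
```

The first method returns false and the second returns true. Against a database-backed IQueryable<T>, the Func version would typically mean the filter runs in memory on the client rather than as part of the generated SQL.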
