[WP7Dev] Using the WebClient with Reactive Extensions for Effective Asynchronous Downloads

By jay at June 22, 2010 21:07 Tags: , , , , ,

There’s a very cool framework that has slipped into the Windows Phone SDK : The Reactive Extensions.

It's actually a quite misunderstood framework, mainly because it is a bit hard to harness, but when you get a handle on it, it is very handy ! I particularly like the MemoizeAll extension, a tricky one, but very powerfull.

But I digress.

 

A Non-Reactive String Download Sample

On Windows Phone 7, the WebClient class only has a DownloadStringAsync method and a corresponding DownloadStringCompleted event. That means that you're forced to be asynchronous, to be nice to the UI and not make the application freeze on the user, because of the bad coding habit of being synchronous on remote calls.

In a world without the reactive extensions, you would use it like this :

public void StartDownload()
{
    var wc = new WebClient();
    wc.DownloadStringCompleted += 
      (e, args) => DownloadCompleted(args.Result);
                  
    // Start the download
    wc.DownloadStringAsync(new Uri("http://www.data.com/service"));
}

public void DownloadCompleted(string value)
{
    myLabel.Text = value;
}

Pretty easy. But you soon find out that the execution of the DownloadStringCompleted event is performed on the UI Thread. That means that if, for some reason you need to perform some expensive calculation after you’ve received the string, you’ll freeze the UI for the duration of your calculation, and since the Windows Phone 7 is all about fluidity and you don't want to be the bad guy, you then have to queue it in the ThreadPool.

But you also have to update the UI in the dispatcher, so you have to come back from the thread pool.

You then have :

 public void StartDownload()
 {
     WebClient wc = new WebClient();
     wc.DownloadStringCompleted += 
        (e, args) => ThreadPool.QueueUserWorkItem(d => DownloadCompleted(args.Result));

     // Start the download
     wc.DownloadStringAsync(new Uri("http://www.data.com/service"));
  }

 public void DownloadCompleted(string value)
 {
     // Some expensive calculation
     Thread.Sleep(1000);

     Dispatcher.BeginInvoke(() => myLabel.Text = value);
 }

That’s a bit more complex. And then you notice that you also have to handle exceptions because, well, it’s the Web. It’s unreliable.

So, let’s add the exception handling :

public void StartDownload()
{
    WebClient wc = new WebClient();

    wc.DownloadStringCompleted += (e, args) => {
        try {
            ThreadPool.QueueUserWorkItem(d => DownloadCompleted(args.Result));
        }
        catch (WebException e) {
            myLabel.Text = "Error !";
        }
    };
   
    // Start the download
    wc.DownloadStringAsync(new Uri("http://www.data.com/service"));
}

public void DownloadCompleted(string  value)
{
    // Some expensive calculation
    Thread.Sleep(1000);
    Dispatcher.BeginInvoke(() => myLabel.Text = value);
}

That’s starting to be a bit complex. But then you have to wait for an other call from an other WebClient to end its call and show both results.

Oh well. Fine, I'll spare you that one.

 

The Same Using the Reactive Extensions

The reactive extensions treats asynchronous events like a stream of events. You subscribe to the stream of events and leave, and you let the reactive framework do the heavy lifting for you.

I’ll spare you the explanation of the duality between IObservable and IEnumerable, because Erik Meijer explains it very well.

So, I’ll start again with the simple example, and after adding the System.Observable and System.Reactive references, I can downloading a string :

public void StartDownload()
{
    WebClient wc = new WebClient();

    var o = Observable.FromEvent<DownloadStringCompletedEventArgs>(wc, "DownloadStringCompleted")

                      // When the event fires, just select the string and make
                      // an IObservable<string> instead
                      .Select(newString => newString.EventArgs.Result);

    // Subscribe to the observable, and set the label text
    o.Subscribe(s => myLabel.Text = s);


    // Start the download
    wc.DownloadStringAsync(new Uri("http://www.data.com/service"));
}

This does the same thing the very first example did. You’ll notice the use of Observable.FromEvent to transform the event into a string from the DownloadStringCompleted event args. For this exemple, the event stream will only contain one event, since the download only occurs once. Each of these ocurrence of the event is then “projected”, using the Select statement, to a string that represents the result of the web request.

It’s a bit more complex for the simple case, because of the additional plumbing.

But now we want to handle the threads context changes. The Reactive Extensions has the concept of scheduler, to observe an IObservable in a specific context.

So, we use the scheduler like this :

public void StartDownload()
{
    WebClient wc = new WebClient();

    var o = Observable.FromEvent<DownloadStringCompletedEventArgs>(wc, "DownloadStringCompleted")

                      // Let's make sure that we’re on the thread pool
                      .ObserveOn(Scheduler.ThreadPool)

                      // When the event fires, just select the string and make
                      // an IObservable<string> instead
                      .Select(newString => ProcessString(newString.EventArgs.Result))

                      // Now go back to the UI Thread
                      .ObserveOn(Scheduler.Dispatcher)

                      // Subscribe to the observable, and set the label text
                      .Subscribe(s => myLabel.Text = s);

    wc.DownloadStringAsync(new Uri("http://www.data.com/service"));
}

public string ProcessString(string s)
{
    // A very very very long computation
    return s + "1";
}
 

In this example, we’ve changed contexts twice to suit our needs, and now, it’s getting a bit less complex than the original sample.

And if we want to handle exceptions, well, easy :

    .Subscribe(s => myLabel.Text = s, e => myLabel.Text = "Error ! " + e.Message);

And you have it !

 

Combining the Results of Two Downloads

Combining two or more asynchronous operations can be very tricky, and you have to handle exceptions, rendez-vous and complex states. That make a very complex piece of code that I won’t write here, I promised, but instead I’ll give you a sample using Reactive Extensions :

public IObservable<string> StartDownload(string uri)
{
    WebClient wc = new WebClient();

    var o = Observable.FromEvent<DownloadStringCompletedEventArgs>(wc, "DownloadStringCompleted")

                      // Let's make sure that we're not on the UI Thread
                      .ObserveOn(Scheduler.ThreadPool)

                      // When the event fires, just select the string and make
                      // an IObservable<string> instead
                      .Select(newString => ProcessString(newString.EventArgs.Result));

    wc.DownloadStringAsync(new Uri(uri));

    return o;
}

public string ProcessString(string s)
{
    // A very very very long computation
    return s + "<!-- Processing End -->";
}

public void DisplayMyString()
{
    var asyncDownload = StartDownload("http://bing.com");
    var asyncDownload2 = StartDownload("http://google.com");

    // Take both results and combine them when they'll be available
    var zipped = asyncDownload.Zip(asyncDownload2, (left, right) => left + " - " + right);

    // Now go back to the UI Thread
    zipped.ObserveOn(Scheduler.Dispatcher)

          // Subscribe to the observable, and set the label text
          .Subscribe(s => myLabel.Text = s);
}

You’ll get a very interesting combination of google and bing :)

[WP7Dev] Beware of the [ThreadStatic] attribute on Silverlight for Windows Phone 7

By Admin at June 19, 2010 21:36 Tags: , , , , , , ,

Cet article est disponible en francais.

In other words, it is not supported !

And the worst in all this is that you don’t even get warned that it’s not supported... The code compiles, but the attribute has no effect at all ! Granted that you can read the msdn article about the differences between silverlight on Windows and Windows Phone, but well, you may still miss it. Maybe a custom code analysis rule could prevent this.

Still, you want to use ThreadStatic because you probably need it, somehow. But since it is not supported, you could try the Thread.GetNamedDataSlot, mind you.

Well, too bad. It’s not supported either.

That leaves us implementing or own TLS implementation, by hand...

 

Updating Umbrella for Silverlight on Windows Phone

I’m a big fan of Umbrella, and the first time I had to use Dictionary<>.TryGetValue and its magically aweful out parameter in my attempt to rewrite my Remote Control app for Windows Phone 7, I decided to port Umbrella to it. So I could use GetValueOrDefault without rewriting it, again.

I managed to get almost all the desktop unit tests to pass, except for those who emit code, use web features, use xml and binary serializers, call private methods using reflection, and so on.

There are a few parts where the code needed to be updated, because TypeDescriptor class is not available on WP7, you have to crash and burn to see if a value is convertible from one type to the other. But that’s not too bad, it works as expected.

 

Umbrella’s ThreadLocalSource

Umbrella has this nice ThreadLocalSource class that wraps the TLS behavior, and you can easily create a static variable of that type instead of the ThreadStatic static variable.

The Umbrella quick start samples make that kind of use for it :

    ISource<int> threadLocal = new ThreadLocalSource<int>(1);

    int valueOnOtherThread = 0;

    Thread thread = new Thread(() => valueOnOtherThread = threadLocal.Value);
    thread.Start();
    thread.Join();

    Assert.Equal(1, threadLocal.Value);
    Assert.Equal(0, valueOnOtherThread);

The main thread set the value to 1, and the other thread tries to get the same value from the other thread and it should be different (the default value of an int, which is 0).

 

Updating the ThreadLocalSource to avoid the use of ThreadStatic

The TLS in .NET is basically a dictionary of string/object pairs that is attached to each running threads. So, to mimic this, we just need to make a list of all threads that want to store something for themselves and wrap it nicely.

We can create a variable of this type :

    private static Tuple<WeakReference, IDictionary<string, T>>[] _tls;

That variable is intentionally an array to try to make use of memory spacial locality, and since on that platform we won’t get a lot of threads, this should be fine when we got through the array to find one. This approach is trying to be lockless, by using a retry mechanism to update the array. The WeakReference is used to avoid keeping a reference to the thread after it has been terminated.

So, to update the array, we can do as follows :

    private static IDictionary<string, T> GetValuesForThread(Thread thread)
    {
        // Find the TLS for the specified thread
        var query = from entry in _tls

                    // Only get threads that are still alive
                    let t = entry.T.Target as Thread

                    // Get the requested thread
                    where t != null && t == thread
                    select entry.U;

        var localStorage = query.FirstOrDefault();

        if (localStorage == null)
        {
            bool success = false;

            // The storage for the new Thread
            localStorage = new Dictionary<string, T>();

            while(!success)
            {
                // store the original array so we can check later if there has not
                // been anyone that has updated the array at the same time we did
                var originalTls = _tls;

                var newTls = new List<Tuple<WeakReference, IDictionary<string, T>>>();

                // Add the slots for which threads references are still alive
                newTls.AddRange(_tls.Where(t => t.T.IsAlive));

                var newSlot = new Tuple<WeakReference, IDictionary<string, T>>()
                {
                    T = new WeakReference(thread),
                    U = localStorage
                };

                newTls.Add(newSlot);

                // If no other thread has changed the array, replace it.
                success = Interlocked.CompareExchange(ref _tls, newTls.ToArray(), originalTls) != _tls;
            }
        }

        return localStorage;
    }

Instead of the array, another dictionary could be created but I’m not sure of the actual performance improvement that would provide, particularly for very small arrays.

Using a lockless approach like this one will most likely limit the contention around the use of that TLS-like class. There may be, from time to time, computations that are performed multiple times in case of race conditions on the update of the _tls array, but that is completely acceptable. Additionally, livelocks are also out of the picture on that kind of preemptive systems.

I think developing on that platform is going to be fully of little workarounds like this one... This is going to be fun !

[LINQ] Finding the next available file name

By jay at June 10, 2010 20:16 Tags: , ,

Cet article est disponible en francais.


Sometimes, the most simple examples are the best.

 

Let’s say you have a configuration file, but you want to make a copy of it before you modify it. Easy, you copy that file to “filename.bak”. But what happens there’s already that file ? Well, either you replace it, or you create an autoincremented file.

 

If you want to do the latter, you could do it using a for loop. But since you’re a happy functional programming guy, you want to make it using LINQ.

 

You then can do it like this :

    public static string CreateNewFileName(string filePath)
    {
        if (!File.Exists(filePath))
            return filePath;

        // Don't do that for each file.
        var name = Path.GetFileNameWithoutExtension(filePath);
        var extension = Path.GetExtension(filePath);

        // Now find the next available file
        var fileQuery = from index in Enumerable.Range(2, 10000)

                        // Build the file name
                        let fileName = string.Format("{0} ({1}){2}", name, index, extension)

                        // Does it exist ?
                        where !File.Exists(fileName)

                        // No ? Select it.
                        select fileName;

        // Return the first one.
        return fileQuery.First();
    }

Note the use of the let operator, which allows the reuse of what is called a “range variable”. In this case, it avoids using string.Format multiple times.

 

The case of Infinity

There’s actually one problem with this implementation, which is the arbitrary “10000”. This might be fine if you don’t intend to make more than 10000 backups of your configuration file. But if you do, to lift that limit, we could write this iterator method :

    public static IEnumerable<int> InfiniteRange(int start)
    {
         while(true)
         {
             yield return start++;
         }
    }

Which basically will return an new value each time you ask for one. To use that method you have to make sure that you have an exit condition (the file does not exist, in the previous example), or you may well be enumerating until the end of times... Actually up to int.MaxValue, for the nit-pickers, but .NET 4.0 adds System.Numerics.BigInteger to be sure to get to the end of times. You never know.

 

To use this iterator, just replace :

        var fileQuery = from index in Enumerable.Range(2, 10000)

by

        var fileQuery = from index in InfiniteRange()

And you’re done.

Reactive Framework: MemoizeAll

By jay at February 04, 2010 18:21 Tags: , , ,

Cet article est disponible en francais.

For some time now, with the release of the Rx Framework and the reactive/interactive programming, some new features were highlighted through a very good article of Bart De Smet dealing with System.Interactive and the “lazy-caching”.

When using LINQ, one can find two sorts of operators: The “lazy” operators that take elements one by one and forward them when they a requested (Select, Where, SelectMany, First, …), and the operators that I would call “Rendez-vous” for which the entirety of the elements of the enumerators need to be enumerated (Count, ToArray/ToList, OrderBy, …) to produce a result.

 

“Lazy” Operators

Lazy operators are pretty useful as they offer a good performance when it is not required to enumerate all the elements of an enumerable. This can also be useful when it may take a very long time to enumerate each element of an enumerator, and that we only want to get the first few elements.

For instance this :
         

static IEnumerable<int> GetItems()
{
    for (int i = 0; i < 5; i++)
    {
        Console.WriteLine(i);
        yield return i + 1;
    }
}

static void Main()
{
   Console.WriteLine(GetItems().First());
   Console.WriteLine(GetItems().First());
}

Will output :


0
1
0
1

Only the first element of the enumerator will be enumerated from GetItems().

However, these operators expose a behavior that is important to know about: Each time they are enumerated, they also enumerate their source again. That could either be a advantage (Enumerating multiple times a changing source) or a problem (enumerating multiple times a resource intensive source).

 

“Rendez-vous” Operators

These operators are also interesting because they force the enumeration of all the elements of the enumerable, and in the particular case of ToArray, this allows the creation of an immutable version of the content of the enumerable. These are useful in conjunction with lazy operators to prevent them to enumerate their source again, when enumerated multiple times.

If we the previous sample, and update it a bit:


static void Main()
{
   var items = GetItems().ToArray();

   Console.WriteLine(items.Count());
   Console.WriteLine(items.Count());
}

We get this result :


0
1
2
3
4
5
5

Because Count() needs to know all the elements of the source enumerator to determine the count.

These operators also enumerate their source with each use, but using ToArray/ToList prevents their result to enumerate the source again.

The case of multiple enumerations

A concrete example of the problem posed by the multiple enumerations is the creation of an enumerable partitionning operator. In this example, we can see that the enumerable passed as the source is used by to "Where" different operators, which implies that the source enumerable will be enumerated twice. Storing the whole content of the source enumerable by means of a ToArray/ToList is possible, but that would be a possible waste of resource, mainly because we can't know if the output enumerable will be enumerated completely (If that is possible, as in the case of an infinite enumerable, ToArray is not applicable).

An intermediate operator between "Lazy" and "Rendez-vous" would be useful.

EnumerableEx.MemoizeAll

The EnumerableEx class brings us an extension, MemoizeAll (built from the Memoization concept), that is just the middle ground we're looking for, and will cache elements from the source enumerator when they are requested. A sort of "lazy" ToArray.

If we take the example of Mark Needham, we would modify it like this :


var evensAndOdds = Enumerable.Range(1, 10)
                             .MemoizeAll()
                             .Partition(x => x % 2 == 0);

In this example, the MemoizeAll does not have a real benefit on the performance side, since Enumerable.Range is not a very expensive operator. But in the case where the source of the "Partition" operator would be a most expensive enumerable, like a Linq2Sql query, the lazy caching could be very effective.

One of the comments suggests that a GroupBy based implementation could be written, but this operator also evaluates the source operator when a group is enumerated. The MemoizeAll is then again appropriate for better performance, but as always, this is a tradeoff between processing and memory.

By the way, Bart de Smeth discusses the part of the elimination of side effects linked the multiple enumeration of enumerables by using Memoize and MemoizeAll, which is not really an issue in the previous example, but is nonetheless a very interesting subject.

 

.NET 4.5 ?

On a side note, I find regrettable that the EnumerableEx extensions did not make their way in .NET 4.0... They are very useful, and not very complex. They may have arrived too late in the development cycle of .NET 4.0... Maybe in .NET 4.5 :)

WinForms, DataBinding and Updates from multiple Threads

By jay at January 02, 2010 23:08 Tags: ,

Cet article est disponible en francais.

When one is trying to use the MVC model on the WinForms, it is possible to use the INotifyPropertyChanged interface to allow DataBinding between the controler and form.

It is then possible to write a controller like this :

    

public class MyController : INotifyPropertyChanged
{
// Register a default handler to avoid having to test for null
public event PropertyChangedEventHandler PropertyChanged = delegate { };

public void ChangeStatus()
{
Status = DateTime.Now.ToString();
}

private string _status;

public string Status
{
get { return _status; }
set
{
_status = value;

// Notify that the property has changed
PropertyChanged(this, new PropertyChangedEventArgs("Status"));
}
}
}

The form is defined like this :


public partial class MyForm : Form
{
private MyController _controller = new MyController();

public MyForm()
{
InitializeComponent();

// Make a link between labelStatus.Text and _controller.Status
labelStatus.DataBindings.Add("Text", _controller, "Status");
}

private void buttonChangeStatus_Click(object sender, EventArgs e)
{
_controller.ChangeStatus();
}

}


The form will update the “labelStatus” when the “Status” property of controller changes.

All of this code is executed in the main thread, where the message pump of the main form is located.

 

A touch of asynchronism

Let’s imagine now that the controller is going to perform some operations asynchronously, using a timer for instance.

We update the controller by adding this :


private System.Threading.Timer _timer;

public MyController()
{
_timer = new Timer(
d => ChangeStatus(),
null,
TimeSpan.FromSeconds(1), // Start in one second
TimeSpan.FromSeconds(1) // Every second
);
}


By altering the controller this way, the “Status” property is going to be updated regularly

The operation model of the System.Threading.Timer implies that the ChangeStatus method is called from a different thread than the thread that created the main form. Thus, when the code is executed, the update of the label is halted by the following exception :

   Cross-thread operation not valid: Control 'labelStatus' accessed from a thread other than the thread it was created on.

The solution is quite simple, the update of the UI must be performed on the main thread using Control.Invoke().

That said, in our example, it’s the DataBinding engine that hooks on the PropertyChanged event. We must make sure that the PropertyChanged event is called “decorated” by a call to Control.Invoke().

We could update the controller to invoke the event on the main Thread:


set
{
_status = value;

// Notify that the property has changed
Action action = () => PropertyChanged(this, new PropertyChangedEventArgs("Status"));
_form.Invoke(action);
}


But that would require the addition of WinForms depend code in the controller, which is not acceptable. Since we want to put the controller in a Unit Test, calling the Control.Invoke() method would be problematic, as we would need a Form instance that we would not have in this context.

 

Delegation by Interface

The idea is to delegate to the view (here the form) the responsibility of placing the call to the event on the main thread. We can do so by using an interface passed as a parameter of the controller’s constructor. It could be an interface like this one :


public interface ISynchronousCall
{
void Invoke(Action a);
}


The form would implement it:


void ISynchronousCall.Invoke(Action action)
{
// Call the provided action on the UI Thread using Control.Invoke()
Invoke(action);
}


We would then raise the event like this :


_synchronousInvoker.Invoke(
() => PropertyChanged(this, new PropertyChangedEventArgs("Status"))
);

But like every efficient programmer (read lazy), we want to avoid writing an interface.

 

Delegation by Lambda

We will try to use lambda functions to call the method Control.Invoke() method. For this, we will update the constructor of the controller, and instead of taking an interface as a parameter, we will use :


public MyController(Action<Action> synchronousInvoker)
{
_synchronousInvoker = synchronousInvoker;
...
}

To clarify, we give to the constructor an action that has the responsibility to call an action that is passed to it by parameter.

It allows to build the controller like this :


_controller = new MyController(a => Invoke(a));

Here, no need to implement an interface, just pass a small lambda that invokes an actions on the main thread. And it is used like this :


_synchronousInvoker(
() => PropertyChanged(this, new PropertyChangedEventArgs("Status"))
);

This means that the lambda specified as a parameter will be called on the UI Thread, in the proper context to update the associated label.

The controller is still isolated from the view, but adopts anyway the behavior of the view when updating “databound” properties.

If we would have wanted to use the controller in a unit test, it would have been constructed this way :


_controller = new MyController(a => a());

The passed lambda would only need to call the action directly.

 

Bonus: Easier writing of the notification code

A drawback of using INotifyPropertyChanged is that it is required to write the name of the property as string. This is a problem for many reasons, mainly when using refactoring or obfuscation tools.

C# 3.0 brings expression trees, a pretty interesting feature that can be used in this context. The idea is to use the expression trees to make an hypothetical “memberof” that would get the MemberInfo of a property, much like typeof gets the System.Type of a type.

Here is a small helper method that raises events :


private void InvokePropertyChanged<T>(Expression<Func<T>> expr)
{
var body = expr.Body as MemberExpression;

if (body != null)
{
PropertyChanged(this, new PropertyChangedEventArgs(body.Member.Name));
}
}

A method that can be used like this :


_synchronousInvoker(
() => InvokePropertyChanged(() => Status)
);

The “Status” property is used as a property in the code, not as a string. It is then easier to rename it with a refactoring tool without breaking the code logic.

Note that the lambda () => Status is never called. It is only analyzed by the InvokePropertyChanged method as being able to provide the name of a property.

 

The Whole Controller


public class MyController : INotifyPropertyChanged
{
// Register a default handler to avoid having to test for null
public event PropertyChangedEventHandler PropertyChanged = delegate { };

private System.Threading.Timer _timer;
private readonly Action<Action> _synchronousInvoker;

public MyController(Action<Action> synchronousInvoker)
{
_synchronousInvoker = synchronousInvoker

_timer = new Timer(
d => Status = DateTime.Now.ToString(),
null,
1000, // Start in one second
1000 // Every second
);
}

public void ChangeStatus()
{
Status = DateTime.Now.ToString();
}

private string _status;

public string Status
{
get { return _status; }
set
{
_status = value;

// Notify that the property has changed
_synchronousInvoker(
() => InvokePropertyChanged(() => Status)
);
}
}

/// <summary>
/// Raise the PropertyChanged event for the property “get” specified in the expression
/// </summary>
/// <typeparam name="T">The type of the property</typeparam>
/// <param name="expr">The expression to get the property from</param>
private void InvokePropertyChanged<T>(Expression<Func<T>> expr)
{
var body = expr.Body as MemberExpression;

if (body != null)
{
PropertyChanged(this, new PropertyChangedEventArgs(body.Member.Name));
}
}
}

A C# Traverse extension method, with an F# detour

By Jay at May 17, 2009 09:10 Tags: , ,

Cet article est disponible en Français.

The Traverse extension method in C#

Occasionally, you'll come across data structures that take the form of single linked lists, like for instance the MethodInfo class and its GetBaseDefinition method.

Let's say for a virtual method you want, for a specific type, discover which overriden method in the hierarchy is marked with a specific attribute. I assume in this example that the expected attribute is not inheritable.

You could implement it like this :

    

    private static MethodInfo GetTaggedMethod(MethodInfo info)
    {
        MethodInfo ret = null;

        do
        {
            var attr = info.GetCustomAttributes(typeof(MyAttribute), false) as MyAttribute[];

            if (attr.Length != 0)
                return info;

            ret = info;

            info = info.GetBaseDefinition();
        }
        while (ret != info);

        return null;
    }


This method has two states variables and a loop, which makes it a bit harder to stabilize. This is a method that could easily be expressed as a LINQ query, but (as far as I know) there is no way to make a enumeration of a data structure which is part of a linked list.

To be able to do this, which is "traverse" a list of objects of the same type that are linked from one to the next, an extension method containing a generic iterator can be written like this :


    public static class Extensions
    {
        public static IEnumerable<T> Traverse<T>(this T source, Func<T, T> next)
        {
            while (source != null)
            {
                yield return source;
                source = next(source);
            }
        }
    }


This is a really simple iterator method, which calls a method to get the next element using the current element and stops if the next value is null.

It can be used easily like this, using the GetBaseDefinition example :


   var methodInfo = typeof(Dummy).GetMethod("Foo");

   IEnumerable<MethodInfo> methods = methodInfo.Traverse(m => m != m.GetBaseDefinition() ? m.GetBaseDefinition() : null);


Just to be precise, the lambda is not exactly perfect, as it is calling GetBaseDefinition twice. It can definitely be optimised a bit.

Anyway, to go back at the first example, the GetTaggedMethod function can be written as a single LINQ query, using the Traverse extension :


    private static MethodInfo GetTaggedMethod(MethodInfo info)
    {
        var methods = from m in methodInfo.Traverse(m => m != m.GetBaseDefinition() ? m.GetBaseDefinition() : null)
                      let attributes = m.GetCustomAttributes(typeof(MyAttribute), false)
                      where attributes.Length != 0
                      select m;

        return methods.FirstOrDefault();
    }


I, for one, find this code more readable... But this is a question of taste :)

Nonetheless, the MethodInfo linked-list is not the perfect example, because the end of the chain is not a null reference but rather the same method we're testing. Most of the time, a chain will end with a null, which is why the Traverse method uses null to end the enumeration. I've been using this method to perform queries on a hierarchy of objects that have parent objects of the same type, and the parent's root set to null. It has proven to be quite useful and concise when used in a LINQ query.

An F# Detour

As I was here, I also tried to find out what an F# version of this code would be. So, with the help of recursive functions, I came up with this :

    let rec traverse(m, n) =
       let next = n(m)
       if next = null then
           [m]
       else
           [m] @ traverse(next, n)


The interesting part here is that F# does not require to specify any type. "m" is actually an object, and "n" a (obj -> obj) function, but returns a list of objects. And it's used like this :

    let testMethod = typeof<Dummy>.GetMethod("Foo")

    for m in  traverse(testMethod, fun x -> if x = x.GetBaseDefinition() then null else x.GetBaseDefinition()) do
       Printf.printfn "%s.%s" m.DeclaringType.Name m.Name


Actually, the F# traverse method is not exactly like the C# traverse method, because it is not an extension method, and it is not lazily evaluated. It is also a bit more verbose, mainly because I did not find an equivalent of the ternary operator "?:".

After digging a bit in the F# language spec, I found out it exists an somehow equivalent to the yield keyword. It is used like this :

    let rec traverse(m, n) =
       seq {
           let next = n(m)
           if next = null then
               yield m
           else
               yield m
               yield! traverse(next, n)
       }


It is used the same way, but the return value is not a list anymore but a sequence.

I also find interesting that F# is able to return tuples out of the box, and for my attribute lookup, I'd have the method and I'll also have the attribute instance that has been found. Umbrella also defines tuples useable from C#, but it's an addon.

F# is getting more and more interesting as I dig into its features and capabilities...

Using Multiple Where Clauses in a LINQ Query

By Jay at December 06, 2008 15:32 Tags: , , ,
Cet article est disponible en francais.
 
After writing my previous article where I needed to intercept exceptions in a LINQ Query, I found out that it is possible to specify multiple where clauses in a LINQ Query.

Here is the query :


var q = from file in Directory.GetFiles(@"C:\Windows\Microsoft.NET\Framework\v2.0.50727", "*.dll")
        let asm = file.TryWith(f => Assembly.LoadFile(f))
        where asm != null
        let types = asm.TryWith(a => a.GetTypes(), (Exception e) => new Type[0])
        where types.Any()
        select new { asm, types };


This query is able to find assemblies for which it is possible to list types. The point of using multiple Where clauses is to avoir evaluating chunks of a query if previous chunks can prevent it. By the way, the TryWith around the Assembly.GetTypes() is there to intercept exceptions raised when loading types, in case dependencies would not be available at the moment of the enumeration.

A useful LINQ trick to remember !

F#, TryWith, Maybe and Umbrella

By Jay at December 06, 2008 13:49 Tags: , , ,
Cet article est disponible en Français.
 
I've thrown myself a bit in the discovery of F#, and even though I do not intend to make it my first language, I intend to use techniques and features found in it and try to port them into C#. New additions in C# 3.0 make it a good target for functional concepts.

There seem to be a consensus for the fact that F# is not a multi-purpose language, as C# is also not, for instance with the writing of parallel code. C# is not a perfect language for this, but F# seems to be. At the opposite, F# does not seem to be a language of choice for writing GUI code. For my part, and considering that F# if not really official, reusing concepts will be enough for now.

TryWith Extension

Using F#, I had to write this:


   let AllValidAssemblies = [
        for file in Directory.GetFiles(@"C:\Windows\Microsoft.NET\Framework\v2.0.50727", "*.dll") ->
        System.Reflection.Assembly.LoadFile(file)
    ]

This code creates a list of assemblies that can be loaded in the current AppDomain. There is however an issue with the invocation of the Assembly.LoadFile method, because it raises an exception when the file is not loadable for some reason. This is a non-modifiable behavior, even though we would like to return a null instead of an exception.

To work around this, there is a feature in F# that can do this :


    let EnumAllTypes = [
        for file in Directory.GetFiles(@"C:\Windows\Microsoft.NET\Framework\v2.0.50727", "*.dll") ->
        try System.Reflection.Assembly.LoadFile(file) with _ -> null
    ]

The point of the try/with block is to transform any exception into a null reference.

To transpose the creation of this list in C# with a LINQ query, the same problem arises. We must intercept the exception raised by LoadFile and convert it to a null reference.

Here is the equivalent in C#, without the exception handling :


    var q = from file in Directory.GetFiles(@"C:\Windows\Microsoft.NET\Framework\v2.0.50727", "*.dll")
            let asm = Assembly.LoadFile(file)
            select asm;

When LoadFile raises an exception, the execution of the request is interrupted, which is a problem.

Extension Methods can be of a great value here, and even though a normal method could do the trick, we can write this :


    public static class Extensions
    {
        public static TResult TryWith<TInstance, TResult>(this TInstance instance, Func<TInstance, TResult> action)
        {
           try {
              return action(instance);
           }
           catch {
              return default(TResult);
           }
        }
    }


The idea behind this method is to reproduce the behavior of the try/with F# construct. With this method, we can update the LINQ query into this :


    var q = from file in Directory.GetFiles(@"C:\Windows\Microsoft.NET\Framework\v2.0.50727", "*.dll")
            let asm = file.TryWith(f => Assembly.LoadFile(f))
            select asm;


This is creates the same list the F# code does, with null references for assemblies that could not be loaded.

The TryWith method can be overloaded to be a bit more flexible, like calling a method for a specific exception :



    public static TResult TryWith<TInstance, TResult, TException>(
           this TInstance instance, Func<TInstance, TResult> action,
           Func<TException, TResult> exceptionHandler
        )
        where TException : Exception
    {
        try {
           return action(instance);
        }
        catch (TException e) {
           return exceptionHandler(e);
        }
        catch {
           return default(TResult);
        }
    }


By the way, there is an interesting bug with this code. If we execute this :


    string value = null;
    var res = value.TryWith(s => s.ToString(), (Exception e) => "Exception");


The behavior is different depending on whether it is executed with or without the debugger with the x86 runtime. It seems that the code generator "forgets" to add the handler for the TException typed exception, which is annoying. This is not a big bug, mainly because it only appears when the x86 debugger is present. With the x64 runtime debugger, there is no problem though. For those interesting in this, the bug is on Connect.

Maybe Extension

I also added recently in Umbrella an extension named Maybe, which has a behavior rather similar to TryWith, but without the exceptions :


    public static TResult Maybe<TResult, TInstance>(
         this TInstance instance,
         Func<TInstance, TResult> function)
    {
       return instance == null ? default(TResult) : function(instance);
    }


The point of this method is to be able to execute code if the original value is not null. For instance :


    object instance = null;
    Console.WriteLine("{0}", instance.Maybe(o => o.GetType());


This allows the evaluation of the GetType call, only if "instance" is not null. With a method call like this, it is possible to write an "if" block, with inside a LINQ query, this because a bit more complex.

The idea for the code is not new and is similar to the functional Monad concept. It has been covered numerous times, and an implementation more in line with F# can be found on Matthew Podwysocki's Blog.

Pollution ?

When we're talking about pollution linked to Extension Methods, we're talking about Intellisense pollution. We can quickly find ourselves dealing with a bunch of extensions that are useful in the context of the current code, which renders Intellisense unusable. With Umbrella, these two extensions are somehow polluting all types, because they are generic without constraints.

Although these are only two very generic extensions, this can apply to almost any block of code, but they could find themselves better placed in an Umbrella Extension Point, rather than directly on every types.

We could have this :



    object instance = null;
    Console.WriteLine("{0}", instance.Control().Maybe(o => o.GetType());


But I have some issues with this approach : To be able to create an extension point the Control methd has to create an instance of the IExtensionPoint. This add a new instantiation during the execution, although the lambda also creates itself a hidden instance, we're counting anymore... There is also the fact that it lengthens the code line, but is only aesthetics. We'll see what pops out ...

Anyway, it is interesting to see the impact the learning a new language has on the style of writing code with another language that one's been using for a long time...

Local Variables in Lambda Expressions

By Jay at November 20, 2008 19:58 Tags: ,
Cet article est disponible en français.
 
After a quick chat with Eric Lippert about a post  on the use in lambda expressions of a local variable declared in a foreach loop, Eric pointed me out that this piece of code :

    int a = 0;
    Action action = () => Console.WriteLine(a);
    action();

 
Is actually not expanded by the compiler to this code :

    [CompilerGenerated]
    private sealed class <>c__DisplayClass1
    {
       public int a;

       public void <Main>b__0()
       {
          Console.WriteLine(this.a);
       }
    }

    void Main()
    {
       int a = 0;

       var display = new <>c__DisplayClass1();
       display.a = a;

       var action = new Action(display.<Main>b__0);

       action();
    }

 
I made the assumption that a local variable was simply copied in the "DisplayClass" if it is not used after the creation of the lambda, which  is not the case.

If we take this slightly different sample :


    int a = 0;
    Action action = () => Console.WriteLine(a);
    a = 42;
    action();

 
My assumption would have made this code display "0". This is correct because lambda expressions (and anonymous methods) "capture" the variable and not the value; The execution must display 42.

Actually, this latter piece of code is expanded like this:

    var display = new <>c__DisplayClass1();

    display.a = 0;

    var action = new Action(display.<Main>b__0);

    display.a = 42;

    action();


 
We can see that in fact, the variable that was previously local, on the stack, has been "promoted" as a field memberr of the "Display Class". This means that all references to this "local" variable, inside or outside of the lambda, are replaced to point to the current instanc e of the "DisplayClass".

This is quite simple actually, but we can feel that the "magic" behind the C# 3.0 syntactic sugar is the result of a lot of thinking !

I will end this post by a big thanks to Eric Lippert, who took the time to answer me, even though he's probably under heavy load with the developement of C# 4.0. (With the contravariance of generics, yey !)

Lambda Expression and ForEach loops

By Jay at November 18, 2008 20:58 Tags: , ,

Cet article est disponible en français.

To enhance the performances of a type serializer, and to use a small extension that I recently wrote for Umbrella I stumbled upon an interesting small "Side Effect" seen when creating lambda expressions inside a foreach loop.

Let's take this simple piece of code :


    var actionList = new List<Func<int>>();

    foreach (var value in Enumerable.Range(0, 10))
    {
       actionList.Add(() => value);
    }

    actionList.ForEach(func => Console.Write("{0} ", func()));

Which outputs this :


   9 9 9 9 9 9 9 9 9 9



Which is, of course, what we could have expected.

Lambda expression have the ability to use variables that are in the scope when they are declared. This makes them very interesting, but to properly use them, it is best to understand how they are "materialized" by the compiler.

Like a lot of features of C#, like the using, foreach, iterators or lock, lambdas are syntactic sugar destined to simplify the writing of code that is most of the time pretty verbose. It is possible to write the expanded code for these keyswords in C#.

Let's take this other piece of code :


    int a = 0;
    Action action = () => Console.WriteLine(a);
    action();


The lambda expression is "materialized" by the C# compiler under the form of a "Display Class", that allows the storage of the local variable "a" :



    [CompilerGenerated]
    private sealed class <>c__DisplayClass1
    {
       public int a;

       public void <Main>b__0()
       {
          Console.WriteLine(this.a);
       }
    }

We can see that the indentifiers for the generated class are not valid in C#, but are valide from the CLR point of view. We can also see that the local variable used during the declaration is present as a member variable, in the class that contains the code of the lambda expression. The compiler will then write this to create an instance of the lambda expression :


    int a = 0;

    var display = new <>c__DisplayClass1();
    display.a = a;

    Action action = new Action(display.<Main>b__0);

    action();


There also, this is not valid C#.

But then, what happens for the foreach case so that the content of the variable is repeated ?

If we analyze the first code sample generated by the compiler with Reflector, there is nothing much fancy to see with the C# visualizer :


    List<Func<int>> actionList = new List<Func<int>>();
    using (IEnumerator<int> CS$5$0000 = Enumerable.Range(0, 10).GetEnumerator())
    {
       while (CS$5$0000.MoveNext())
       {
          int value = CS$5$0000.Current;
          actionList.Add(delegate {
             return value;
          });
       }
    }



The lambda expression is represented as an anonymous method, which is a synonym of lambda, but that does not explain the behavior.

We must look at the generated IL to understand the behavior, and this is the correct C# code that is generated :


    List<Func<int>> actionList = new List<Func<int>>();
    using (IEnumerator<int> CS$5$0000 = Enumerable.Range(0, 10).GetEnumerator())
    {
       var myLambda = new <>c__DisplayClass4();

       while (CS$5$0000.MoveNext())
       {
          int value = CS$5$0000.Current;

          myLambda.value = value;

          actionList.Add(new Func<int>(myLambda.b_0));
       }
    }



We can easily see what the problem is : The instance of the class containing the lambda is created only once, and reused many times to assign a new value for each iteration. This explains why the execution of all the lambdas return the last enumerated value, because they all refer to the same instance of the "DisplayClass" type.

However, if we write the code this way :


    foreach (var value in Enumerable.Range(0, 10))
    {
       int myValue = value;
       actionList.Add(() => myValue);
    }

The behavior changes, and this time, each lambda has the correct value.

From the compiler point of view, the creation of a new instance for the container class must be the consequence of the creation of a new variable. For the case of a ForEach statement, this is not the case and the variable is treated as created only once, then reused..

So, from the compiler point of view, a ForEach loop is expanded like this :


    using (var it = Enumerable.Range(0, 10).GetEnumerator())
    {
       int value;

       while (it.MoveNext())
       {
          value = it.Current;
          actionList2.Add(() => value);
       }
    }



This is probably a question of interpretation, but I wasn't exactly expecting this...

So, one must pay attention to the way local variables are used in lambda expressions, depending on their declaration location.

I'll explain in a later post why I did have to use lamda expressions in a ForEach loop.

About me

My name is Jerome Laban, I am a Software Architect, C# MVP and .NET enthustiast from Montréal, QC. You will find my blog on this site, where I'm adding my thoughts on current events, or the things I'm working on, such as the Remote Control for Windows Phone.