[WP7] HttpWebRequest and the Flickr app 'Black Screen' issue
TL;DR: While trying to fix the "Black Screen" issue of the Windows Phone 7 Flickr app 1.3, I found out that HttpWebRequest internally makes a synchronous call to the UI thread, so that every network call negatively impacts the UI. The entire building of an asynchronous web query is performed on the UI thread, and you can't do anything about it.
Edit: This post was formerly named "About the UI Thread performance and HttpWebRequest", but was in fact about Yahoo's Flickr application and was enhanced accordingly.
When programming on Windows Phone 7, you'll often hear that to improve the perceived performance, you need to get off the UI thread (i.e. the dispatcher) to perform non-UI-related operations. By good perceived performance, I mean having the UI respond immediately and not stall while some background processing is done.
To achieve this, you'll need to use the common asynchrony techniques, like queueing work in the ThreadPool, creating a new thread, or using the Begin/End pattern.
All of this is very true, and one very good example of bad UI thread use is the processing of the response body of a web request, particularly when using WebClient, where the raised events are in the context of the dispatcher. From a beginner's perspective, not having to care about changing contexts when developing a simple app that updates the UI provides a particularly good and simple experience.
But that has the annoying effect of degrading the perceived performance of the application, because many parts of the application tend to run on the UI thread.
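To make that concrete, here is a minimal sketch of the WebClient approach, assuming it is called from the UI thread. The `WebClientExample` class and the `parse` and `display` delegates are hypothetical names used only for illustration.

[code:c#]
using System;
using System.Net;

public static class WebClientExample
{
    // Hypothetical helper: downloads a string, parses it, and displays it.
    public static void Download(Uri uri, Func<string, string> parse, Action<string> display)
    {
        var client = new WebClient();

        // When the download is started from the UI thread, this handler is
        // raised in the context of the dispatcher.
        client.DownloadStringCompleted += (sender, e) =>
        {
            if (e.Error != null)
            {
                return; // Error handling is omitted in this sketch.
            }

            // Convenient, but the parsing also runs on the UI thread,
            // and the UI stalls for as long as it takes.
            var parsed = parse(e.Result);
            display(parsed);
        };

        client.DownloadStringAsync(uri);
    }
}
[/code]

No context switch is needed to update the UI, which is what makes this pleasant to write and costly to run.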
HttpWebRequest to the rescue?
You'll find that HttpWebRequest is a better choice in that regard. It uses the Begin/End pattern, and the AsyncCallback is executed on a ThreadPool thread. The code in that callback therefore runs without impacting the perceived performance of the application.
Using the Reactive Extensions, this can be written as follows:
[code:c#]
var request = WebRequest.Create("http://www.google.com");

// Wrap the Begin/End pair into a function that returns an observable response.
var queryBuilder = Observable.FromAsyncPattern<WebResponse>(
    (h, o) => request.BeginGetResponse(h, o),
    ar => request.EndGetResponse(ar));

queryBuilder()
    /* Perform the expensive work in the context of the AsyncCallback */
    /* from the WebRequest. This will be the ThreadPool. */
    .Select(response => DoSomeExpensiveWork(response))
    /* Go back to the UI Thread to execute the OnNext method on the subscriber */
    .ObserveOnDispatcher()
    .Subscribe(result => DisplayResult(result));
[/code]
That way, you'll get most of your code to execute off the UI thread, where it does not impact the perceived performance of the application.
Why would it not be to the rescue, then?
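For readers not using the Reactive Extensions, the same flow can be sketched with the raw Begin/End pattern. This is only an illustration: the `BeginEndExample` class and the `doSomeExpensiveWork` and `displayResult` delegates are hypothetical stand-ins, and error handling is left out.

[code:c#]
using System;
using System.Net;
using System.Windows;

public static class BeginEndExample
{
    // Hypothetical helper: fetches a resource, processes it, and displays the result.
    public static void Fetch(
        Uri uri,
        Func<WebResponse, string> doSomeExpensiveWork,
        Action<string> displayResult)
    {
        var request = WebRequest.Create(uri);

        request.BeginGetResponse(
            asyncResult =>
            {
                // This callback runs on a ThreadPool thread.
                // EndGetResponse may throw a WebException; not handled in this sketch.
                using (var response = request.EndGetResponse(asyncResult))
                {
                    var result = doSomeExpensiveWork(response);

                    // Only the final UI update is queued on the dispatcher.
                    Deployment.Current.Dispatcher.BeginInvoke(() => displayResult(result));
                }
            },
            null);
    }
}
[/code]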
Actually, it will always be (as of Windows Phone NoDo), but there's a catch. And that's a big deal, from a performance perspective.
Consider this code:
[code:c#]
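public App()
{
    /* some application initialization code */
    ManualResetEvent ev = new ManualResetEvent(false);

    ThreadPool.QueueUserWorkItem(
        d =>
        {
            // Begin the request on a thread pool thread...
            var r = WebRequest.Create("http://www.google.com");
            r.BeginGetResponse((r2) => { }, null);

            // ...and signal the UI thread once the request has begun.
            ev.Set();
        }
    );

    // Block the UI thread until the request has begun.
    ev.WaitOne();
}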
[/code]
This code begins a request on the thread pool while blocking the UI thread in the App.xaml.cs file. This makes the construction of the WebRequest (but not the actual call on the network) synchronous, and makes the application wait for the request to begin before showing any page to the user.
While this code is definitely not a best practice, there was a code path in the Flickr 1.3 application that was doing something remotely similar, in a more convoluted way. And if you try it for yourself, you'll find that the application deadlocks during startup, meaning that our event is never set.
What's happening?
If you dig a bit, you'll find that the stack trace for a thread in the thread pool is the following:
mscorlib.dll!System.PInvoke.PAL.Threading_Event_Wait()
mscorlib.dll!System.Threading.EventWaitHandle.WaitOne()
System.Windows.dll!System.Windows.Threading.Dispatcher.FastInvoke(...)
System.Windows.dll!System.Net.Browser.AsyncHelper.BeginOnUI(...)
System.Windows.dll!System.Net.Browser.ClientHttpWebRequest.BeginGetResponse(...)
WindowsPhoneApplication2.dll!WindowsPhoneApplication2.App..ctor.AnonymousMethod__0(...)
The BeginGetResponse method is trying to execute something on the UI thread. In our example, the UI thread is blocked on the manual reset event, while the thread pool thread that should set that event is itself waiting for the dispatcher to run its work. Neither can make progress, and the application hangs in a deadlock.
This is also the case for the EndGetResponse method.
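For this particular example, the deadlock disappears as soon as the UI thread no longer waits for the request to begin. The following is a minimal sketch of that variant, assuming nothing in the startup path actually needs the request to have started. It is not what the Flickr application does.

[code:c#]
using System.Net;
using System.Threading;
using System.Windows;

public partial class App : Application
{
    public App()
    {
        /* some application initialization code */

        ThreadPool.QueueUserWorkItem(
            d =>
            {
                var r = WebRequest.Create("http://www.google.com");

                // BeginGetResponse still goes through the dispatcher, but the
                // UI thread is free to serve it once the constructor returns.
                r.BeginGetResponse(r2 => { /* handle the response here */ }, null);
            }
        );

        // No WaitOne() here: the constructor returns immediately.
    }
}
[/code]

This only removes the hang. The request is still built on the UI thread, so the cost described here remains.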
But if you dig even deeper, you'll find in the version of the System.Windows.dll assembly in the WP7 emulator (the one in the SDK is a stub for all public types) that the BeginGetResponse method is doing all the work of actually building the web query on the UI thread!
That is particularly disturbing. I'm still wondering why that network-only code would need to be executed on the UI thread.
What's the impact, then?
The impact is fairly simple: the more web requests you make, the less responsive your UI will be, both when a web request begins and when it ends. Each call to BeginGetResponse and EndGetResponse implicitly goes to the UI thread.
Remote control applications like mine, which try to provide remote mouse control, are all affected by the same lagging mouse behavior. That's partly because the UI thread is particularly busy processing Manipulation events, which explains a lot about the poor performance of web requests performed at the same time, even when using HttpWebRequest instead of WebClient. It also explains why web requests are strongly slowed down until the user stops touching the screen.
The Flickr 1.3 "Black Screen" issue
In the Flickr application, which I've been able to work on, a lot of people were reporting a "black screen" issue, where the application stopped working after a few days.
The application was actually trying to update a resource during application startup in an asynchronous fashion, using HttpWebRequest. Because of a race condition between another lock in the application and the UI thread, which was waiting during the app's initialization, this resulted in an infinite "Black Screen" that could only be bypassed by reinstalling the application.
Interestingly enough, at this point in the application's initialization, in the App's class constructor, the application is not killed after 10 seconds if it is not showing a page to the user. However, if the application stalls in the constructor of the first page, the application is automatically killed by the OS after something like 10 seconds.
Regarding the use of the UI thread inside the HttpWebRequest code: for network-intensive applications that fetch a lot of small web resources such as images, this has a negative impact on performance. The UI thread is constantly interrupted to process network queries and responses.
Can I do something about it?
During the analysis of the emulator version of the System.Windows.dll assembly, I noticed that BeginGetResponse checks whether the current context is the UI thread and, if it is, does not push the execution onto the dispatcher.
This means that if you can group the calls to BeginGetResponse on the UI thread, you'll spend less time switching between contexts. That's no panacea, but at the very least you can gain on this side.
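As a rough sketch of that grouping, assuming the list of resources is known up front, a whole batch of requests can be started from a single dispatcher callback. The `RequestBatcher` class and the `onResponse` delegate are hypothetical names, and error handling is omitted.

[code:c#]
using System;
using System.Collections.Generic;
using System.Net;
using System.Windows;

public static class RequestBatcher
{
    // Hypothetical helper: begins all the requests in one pass on the UI thread.
    public static void BeginAll(IEnumerable<Uri> uris, Action<Uri, WebResponse> onResponse)
    {
        // Snapshot the list so it cannot change before the dispatcher runs.
        var pending = new List<Uri>(uris);

        // One switch to the UI thread for the whole batch, instead of one
        // implicit switch per BeginGetResponse call.
        Deployment.Current.Dispatcher.BeginInvoke(() =>
        {
            foreach (var uri in pending)
            {
                // Local copies, so that each callback captures its own values.
                var requestUri = uri;
                var request = WebRequest.Create(requestUri);

                request.BeginGetResponse(
                    asyncResult =>
                    {
                        using (var response = request.EndGetResponse(asyncResult))
                        {
                            onResponse(requestUri, response);
                        }
                    },
                    null);
            }
        });
    }
}
[/code]

The work of building each query still happens on the UI thread, and EndGetResponse still goes through it as well. Only the number of context switches at the beginning of the requests goes down.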
What about future versions of Windows Phone?
On the good news side, Scott Gu announced at MIX11 that the manipulation events will be moved off the UI thread, making the UI "buttery smooth", to use his words. A lot of applications will benefit from this change.
Anyway, let's wait for Mango. I'm guessing that this will change in a very positive way and allow us to build high-performance apps on the Windows Phone platform.