XPerf: A CPU sampler for Silverlight

[originally posted on http://blogs.msdn.com/seema, moving hosting sites]


For those of you who are a) building graphics-intensive applications or b) trying to debug your performance, I would like to introduce xperf and xperfview. These are two profiling tools which can be used to analyze the performance of any Microsoft technology, including Silverlight.

These tools have existed internally for quite some time, and I’m excited to see that they have been released. I use xperf as my first step for profiling any app: it’s a simple CPU sampler that gives you a peek into why on earth something is taking up so many CPU cycles. XPerf plugs into the Event Tracing for Windows library and listens to the internally embedded events that Windows products trigger. With the publicly available Silverlight symbols, you can see how these events map back to what Silverlight is doing under the hood while your app is running.

Note: with xperf, managed code (your app code) will show up as samples on the methods that your app code calls. XPerf does not have the capabilities to show CPU cycles in managed code.

With XPerf, one can find answers to questions like:

  • Is my app asking Silverlight to constantly spin on CPU cycles?
  • Whether one UI layout or design is more expensive than the other.
  • Whether the time is spent in drawing (agcore.dll), the plug-in’s interactions with the browser (npctrl.dll), or in compilation/JIT (coreclr.dll)
  • How much is stretching/blending/rotating that video going to cost in terms of CPU cycles? How about if you encode it differently?

Please note that the tools only work on Vista and Windows Server 2k3. For profiling on a Mac, see my previous post on how to use Shark from the Apple CHUD tools.

The first-time instructions are as follows:

1. Install XPerf (formerly xperfinfo) and xperfview (formerly xperf) as available here: http://msdn.microsoft.com/en-us/library/cc305187.aspx

[10/9-9am I updated the above link: download link http://www.microsoft.com/whdc/system/sysperf/perftools.mspx]

Close everything that is not user-specific (close VS, but keep your virus checker, etc).

2. Start up your sample

I personally start up my sample without profiling because the startup sampling is very different from that of the runtime.

3. Open an Administrator-level command prompt. Set your symbol path in the cmd shell:
set _NT_SYMBOL_PATH=srv*C:\symbols*http://msdl.microsoft.com/downloads/symbols

This will set your symbol path to point to the Microsoft symbol server, using C:\symbols as the local cache. The symbol server hosts debugging symbols for almost all Microsoft products.
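To make the shape of that value explicit, here is a minimal sketch in JavaScript. It only assembles the string in the srv*<local cache>*<symbol server> form used above; buildSymbolPath is a hypothetical helper name and is not part of xperf.

```javascript
// Hypothetical helper: assembles a symbol path in the
// srv*<local cache>*<symbol server> form used in step 3.
function buildSymbolPath(cacheDir, serverUrl) {
  return "srv*" + cacheDir + "*" + serverUrl;
}

var symbolPath = buildSymbolPath(
  "C:\\symbols",
  "http://msdl.microsoft.com/downloads/symbols"
);

// Print the cmd line to paste into the Admin shell.
// Note there is no space after '=': cmd would keep it as part of the value.
console.log("set _NT_SYMBOL_PATH=" + symbolPath);
```

Changing the first argument moves the local cache to another folder; the server URL stays the same.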

[update: cd into C:\Program Files\Microsoft Windows Performance Toolkit]

4. xperf -on base

Run the app for maybe 10 seconds for this initial run. We want to understand what method is taking up the most cycles. At the end of this, you will see a graph of CPU usage over time – so you can use various methods to see where time is going: repeatedly execute a user action (expand/collapse, scroll, etc), let the media play for a while, etc. If you are trying to compare several different options, you can execute different actions at widely-spaced intervals, and compare the graphs for each.

5. xperf -d myprofile.etl

Step #5 will both stop the profile and start writing the event trace log (etl) file. This can take a while, depending on the complexity of your application, how many other apps were running, and how long the profile ran.

6. xperfview myprofile.etl

We now see multiple graphs. We are most interested in the one for CPU cycles, which will look something like this:

[Image: xperfview full graph]

In my sample, I have a dual-core machine so you see two lines, one for each CPU. This graph is of a media file playing, which should have an almost steady hit for the CPU. I can drag and select any section of the graph and take a closer look:

7. Trace → Load Symbols

8. Select the area of the CPU graph that you want to see

        • Right-click and select Summary Table

[Image: how to open the summary table]

9. Accept the EULA for using symbols, expand IExplore.exe (or firefox.exe)

At this point, you’ll see how the time breaks down between the different processes and modules running within Windows. Agcore.dll, npctrl.dll, and coreclr.dll are all Silverlight.

10. If the symbol server has been set correctly, you can expand agcore.dll and see the breakdown of time in each method call within it:

[Image: Summary Table of Olympics on Beta2 Bits]

The way to read this table is that IE is taking up 40% of the CPU on average during the selected sample. Within that, 32% is spent in the Silverlight core (graphics, decoding, property engine, etc. are all represented in agcore.dll), and of that agcore.dll time, 29% (9.24/32.33) is spent in one method call.
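The 29% figure is just a ratio of two rows in the table. As a small worked sketch (JavaScript, assuming the two numbers are the weight values read off the summary table for the method and for agcore.dll; shareOfParent is a hypothetical helper name):

```javascript
// Share of a parent row's time that a child row accounts for, in percent.
function shareOfParent(childWeight, parentWeight) {
  return (childWeight / parentWeight) * 100;
}

// 9.24 is the single method's weight, 32.33 is agcore.dll's weight.
var methodShare = shareOfParent(9.24, 32.33);

// Rounds to the 29% quoted in the text.
console.log(Math.round(methodShare) + "% of agcore.dll time");
```

The same ratio works one level up, for example a module's weight against its process's weight.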

Note: with xperf, managed code (your app code) will show up as samples on the methods that your app code calls. XPerf does not have the capabilities to look at app code.

Now what? Well, with just the above, think about how you can answer the questions at the top:

  • Is my app asking Silverlight to constantly spin on CPU cycles?
    Do I see activity when the app does not seem to be doing anything? Where does the summary table say that time is going? Perhaps turn on EnableRedrawRegions and see if there is a draw being forced.
  • Whether one UI layout or design is more expensive than the other.
    When I scroll now, is my average CPU higher or lower than before?
  • Whether the time is spent in drawing (agcore.dll), the plug-in’s interactions with the browser (npctrl.dll), or in compilation/JIT (coreclr.dll)
  • How much is stretching/blending/rotating that video going to cost in terms of CPU cycles? How about if you encode it differently?
    Are there new methods that you see in the stack, now that you have stretched the media? Blended it? What is the average time for CPU cycles compared to your baseline? I know that it is cheaper for me to embed an icon overlay directly into the video, but what does it really cost in terms of CPU time?

Note: The tools only work on Vista and Windows Server 2k3

For later runs, the routine is quicker:

  1. Set your symbol path in an Admin-level cmd shell:
    set _NT_SYMBOL_PATH=srv*C:\symbols*http://msdl.microsoft.com/downloads/symbols
  2. xperf -on base
  3. xperf -d myprofile.etl
  4. xperfview myprofile.etl

The platform code and the CPU samples will change version to version, so please keep that in mind when profiling between upgrades.

See anything interesting? Drop me a note!

Have fun!

.seema.


Perf Debugging Tips: EnableRedrawRegions; a performance bug in VideoBrush

[originally posted on http://blogs.msdn.com/seema, I am moving hosting sites]

Was chatting with Andy Beaulieu at Remix Boston, and he was commenting that it seems that Silverlight only draws when needed — it is true, we try to not waste your CPU cycles. For Perf debugging, a way to tell when you are causing a redraw is to turn on the control’s EnableRedrawRegions property.

agControl.settings.EnableRedrawRegions = true;

With this feature on, when a section of the plugin causes a draw, that section will draw in a different color. This setting is not for those susceptible to seizures =P
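If you flip this on and off a lot while debugging, a small wrapper can help. This is only a sketch: setPerfDebugFlags is a hypothetical helper, and it assumes the control exposes the settings.EnableRedrawRegions and settings.EnableFrameRateCounter properties used in these posts. The stand-in object is there so the snippet runs outside a browser.

```javascript
// Hypothetical helper: toggles the two perf-debugging settings together.
// In a page, `control` would be the Silverlight plug-in element, e.g.
// document.getElementById("SilverlightControl").
function setPerfDebugFlags(control, enabled) {
  control.settings.EnableRedrawRegions = enabled;    // tint regions as they redraw
  control.settings.EnableFrameRateCounter = enabled; // show the framerate counter
  return control;
}

// Stand-in for the plug-in element (assumption, for illustration only).
var fakeControl = { settings: {} };
setPerfDebugFlags(fakeControl, true);
console.log(fakeControl.settings.EnableRedrawRegions);
```

Remember to turn both flags off again before shipping.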

Bug found: With this setting, I investigated a performance issue with VideoBrush. In our 1.0 Silverlight bits, VideoBrush always requests a redraw in the next frame. This is a bug. Unfortunately, if the control framerate is set to the default (60 fps), then any VideoBrush will be redrawn 60 times in a second. In actuality, VideoBrush should draw at the framerate that its media is getting refreshed (eg. 30fps, 15fps, or 0fps if paused), and should not take up so many CPU cycles. We have a fix for this bug in 1.1. If you are blocked by this bug, please let me know and I can pass the info along to our servicing team.

[update] By the way, this bug was fixed in a servicing pack in Dec 2007.

A workaround is to set the framerate on your Silverlight control to be the same as the framerate of your video. There are visual artifacts with this workaround, but they are slight.

Sys.Silverlight.createObjectEx({
    source: "xaml/Scene.xaml",
    parentElement: document.getElementById("SilverlightControlHost"),
    id: "SilverlightControl",
    properties: {
        width: "500",
        height: "500",
        background: "black",
        framerate: 30 // only as much as needed
    }
});
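To see why matching the framerates helps, here is a rough back-of-the-envelope sketch. It assumes the buggy 1.0 behavior described above (one VideoBrush redraw per control frame) and that the only redraws actually needed are one per media frame; redundantRedrawsPerSecond is a hypothetical helper name.

```javascript
// Redraws per second that do no useful work, under the assumptions above.
function redundantRedrawsPerSecond(controlFps, mediaFps) {
  return Math.max(0, controlFps - mediaFps);
}

// Default control framerate of 60 with 30fps media: half the redraws are wasted.
console.log(redundantRedrawsPerSecond(60, 30));
// Paused media (0fps): every redraw is wasted.
console.log(redundantRedrawsPerSecond(60, 0));
// Workaround applied: control framerate matches the media.
console.log(redundantRedrawsPerSecond(30, 30));
```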

Macs and Silverlight Perf

[this post was originally posted on http://blogs.msdn.com/seema on Oct 2, 2007. It is reposted here as I’m moving hosting sites -Seema]

I met recently with two designers trying to figure out “what was Silverlight doing under the covers? did I accidentally turn on some feature?”

My first thought is to reference my post on how to minimize CPU usage. We have to learn how to conserve not only energy but CPU cycles (yes, even if you don’t see a difference on your dev box)

My second thought, as I stared at the designers’ laptop: have you checked out Apple’s Shark tool made for the Mac? Pretty quick and easy to learn — you can navigate through the entire callstack of Silverlight.

For the Silverlight mac developers out there, check out Shark:

My general usage:

  1. Launch my Silverlight application in a browser
  2. Direct Shark towards the browser’s process
  3. Click on Start (it should have a 30sec timeout by default)

When you see the callstack, anything with agcore is from Silverlight.

Why is it named agcore? From Wikipedia: “Ag from the Latin Argentum is the chemical symbol for silver.”

Enjoy, tell me about your experiences. Any successes? Any questions about what is showing up in the trace?

Silverlight: A few thoughts on minimizing CPU usage

[this post was originally posted on http://blogs.msdn.com/seema on Aug 9, 2007. It is reposted here as I’m moving hosting sites -Seema]

The first two suggestions will have the most drastic improvement on the performance of your Silverlight application, and can affect CPU usage, framerate, and application responsiveness.

  1. IsWindowless=false is faster

Do not turn on isWindowless unless your design requires overlay of other HTML content on top of Silverlight content.

  2. Opaque Background is faster

Do not set a transparent channel in the Silverlight HTML control background property. Setting the background to a transparent or semi-transparent value will add tremendous cost, as each render call is forced through the blending pipeline regardless of whether the transparency has any visual effect. (A background of 0, transparent, #11aabbcc, etc. will cause blending.)

If you simply want to set the control’s background property to match that of the HTML, background:document.body.style.backgroundColor will suffice.

Note: I am referring to the background of the Silverlight HTML control. Setting an opacity on elements within your Silverlight app has relatively minimal cost.

Note: if you do decide to use transparency, please make sure to test the performance on Mac:Safari.
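As a quick sanity check on the background advice above, here is a rough sketch of a helper that flags background values likely to force blending. It is an assumption-laden illustration: backgroundForcesBlending is a hypothetical name, it only recognizes the forms mentioned in this post (0, transparent, and 8-digit hex values like #11aabbcc), and it conservatively flags every 8-digit hex value because the post does not say whether a fully opaque alpha byte avoids the blending path.

```javascript
// Hypothetical helper: returns true when a control background value
// looks like it carries an alpha channel.
function backgroundForcesBlending(background) {
  var value = String(background).trim().toLowerCase();
  if (value === "transparent" || value === "0") {
    return true;
  }
  // 8 hex digits means an alpha byte is present (e.g. #11aabbcc).
  // Flagged conservatively, even if the alpha byte happens to be ff.
  if (/^#[0-9a-f]{8}$/.test(value)) {
    return true;
  }
  // Named colors and 6-digit hex values have no alpha channel.
  return false;
}

console.log(backgroundForcesBlending("black"));     // opaque named color
console.log(backgroundForcesBlending("#11aabbcc")); // has an alpha byte
```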

  3. Only offer the quality that is needed for your design.

- Framerate: you can set the max framerate for the entire contents of the control in properties. Many websites run all animations and media at ~15fps, and most users do not notice.
Note: I set the value of the framerate property below.

- Media: When encoding your media file, remember that the average media file on the web is roughly encoded at 320×230, with ~15fps.

  4. Test & Debug

- Quality and performance vary on different machines, and even on different OS/browser configurations.

- The below onLoadHandler shows how to display the framerate, for debugging purposes, in IE or Safari’s status bar. If your desired framerate is out of reach, you should set the framerate property lower so as not to peg the user’s CPU.

Sys.Silverlight.createObjectEx({
    source: "xaml/Scene.xaml",
    parentElement: document.getElementById("SilverlightControlHost"),
    id: "SilverlightControl",
    properties: {
        width: "500",
        height: "500",
        background: "black", // NO ALPHA =)
        version: "0.9",
        framerate: "15" // only as much as needed
    },
    events: { onLoad: onLoadHandler }
});

function onLoadHandler() {
    /* To see the framerate displayed in the browser status bar */
    agControl = document.getElementById("SilverlightControl");
    agControl.settings.EnableFrameRateCounter = true;
}
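Once the counter shows what the machine can actually sustain, the "set the framerate property lower" advice can be sketched as a simple cap. This is a minimal illustration, not part of the Silverlight API: chooseFramerate is a hypothetical helper, and the floor of 1 is my own guard against passing 0.

```javascript
// Hypothetical helper: never ask for more frames than the machine
// was observed to deliver, and never more than the design needs.
function chooseFramerate(desiredFps, measuredFps) {
  var capped = Math.min(desiredFps, Math.floor(measuredFps));
  return Math.max(1, capped); // guard: keep at least 1fps (assumption)
}

// Design wants 30fps, but the counter showed roughly 22.7fps.
console.log(chooseFramerate(30, 22.7));
// Design wants 15fps and the machine has headroom: keep 15.
console.log(chooseFramerate(15, 40));
```

The result is what you would pass as the framerate property in createObjectEx.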

WPF Layered Windows, update for Hardware Acceleration

Good news! A QFE has been released to enable WPF HW acceleration on layered windows on XP:

http://support.microsoft.com/kb/937106/en-us

enjoy!

WPF: “Why do my bitmaps look blurry?” by Anthony Hodsdon & Miles Cohen

[this post was originally posted on http://blogs.msdn.com/seema on Nov 07, 2006. It is reposted here as I’m moving hosting sites -Seema]

Guest Writers: Anthony & Miles are Developers on the WPF 2D Graphics team. Anthony specializes on our Geometry, Miles focuses on the Brushes codepath.

We in WPF-land have been fielding a lot of questions from folks concerned that bitmaps viewed in WPF tend to look a little soft; to be precise, that crisp lines in bitmaps tend to smear across pixels. As a lot of you have surmised, the central reason for this is that WPF is a resolution-independent framework: instead of talking about pixels, we talk about inches (or more precisely, DIPs, which are defined to be 1/96th of an inch). The precise location of the pixels gets abstracted out as an implementation detail.

In order for bitmaps to look crisp in WPF, the centers of the bitmap pixels have to align precisely with the centers of the device (monitor) pixels. This essentially means two things: the DPI of the bitmap has to be the same as the DPI of the device and the top-left-hand corner of the bitmap needs to be an integral number of pixels from the top-left-hand corner of the device. Because layout doesn’t know anything about pixels (either on the bitmap or on the device), though, there is no easy way to satisfy these two conditions.
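The two conditions can be checked with a little arithmetic. The sketch below is plain JavaScript rather than WPF code, and the function names are hypothetical; it only relies on the definition above that one DIP is 1/96th of an inch.

```javascript
// Converts a length in DIPs (1/96 inch) to device pixels at a given DPI.
function dipsToPixels(dips, deviceDpi) {
  return dips * deviceDpi / 96;
}

// Condition 1: the bitmap is drawn 1:1 only when its DPI matches the device.
function bitmapScale(bitmapDpi, deviceDpi) {
  return deviceDpi / bitmapDpi;
}

// Condition 2: the offset from the device origin is a whole number of pixels.
// (Exact comparison is fine for these sample values; real layout offsets
// may need a small tolerance.)
function landsOnPixelGrid(offsetDips, deviceDpi) {
  return Number.isInteger(dipsToPixels(offsetDips, deviceDpi));
}

console.log(bitmapScale(96, 120));      // a 96 DPI bitmap is stretched at 120 DPI
console.log(landsOnPixelGrid(10, 96));  // 10 DIPs is 10 pixels at 96 DPI
console.log(landsOnPixelGrid(10, 120)); // 10 DIPs is 12.5 pixels at 120 DPI
```

So an offset that is perfectly aligned at 96 DPI can land between pixels at 120 DPI, which is why layout alone cannot guarantee crispness.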

What can I do to fix this?

In V1, we strongly urge developers to stay away from bitmaps with crisp edges (also known as “high-frequency” bitmaps). As a general rule of thumb, you should only use bitmaps that you’d be comfortable compressing with JPEG; if you need the visual quality of a lossless codec, it’s probably not going to look good in WPF.

If you absolutely must use crisp bitmaps (and we fully understand that there are a few scenarios where this is necessary), there are a couple of things that you can try doing:

  1. Always use bitmaps that are the same DPI as the device on which you’re displaying. Since you don’t know at compile time what DPI your device is, in practice this means creating at least 3 copies of every bitmap: one for 96 DPI (the most common today), one for 120 DPI (the up-and-coming standard), and one for 144 DPI (common on some laptops). Of course, this still doesn’t handle every case: users are free to set whatever DPI they like on their machine and bitmaps might still look blurry in accessibility scenarios as well. You also have the problem of determining what the device DPI is in the first place – something that may not be possible in certain scenarios (e.g. Partial Trust).
  2. Ensure that the bitmap is placed an integral number of pixels from the top-left-hand corner of the device. This one is more difficult. If all your elements are fixed-layout (e.g. Canvases), you can precisely position your content relative to the window (and in V1 windows are always pixel-aligned). If you make use of any of the more complicated layout elements (Grids, DockPanels, FlowDocuments, or the like), though, you must walk up the visual tree to calculate your offset from the window origin, and you have to do this every time layout changes (for instance, every time the user scrolls through the document).
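The first suggestion amounts to a lookup by device DPI. Here is a minimal sketch of that selection logic in JavaScript (not WPF API); the file naming scheme and the function name are hypothetical, and how you obtain the device DPI is left out since, as noted above, that is not always possible.

```javascript
// The three DPI values suggested above.
var BITMAP_VARIANT_DPIS = [96, 120, 144];

// Hypothetical helper: returns the matching bitmap variant, or null when
// no variant matches and the bitmap will be resampled (and may look soft).
function pickBitmapVariant(baseName, deviceDpi) {
  if (BITMAP_VARIANT_DPIS.indexOf(deviceDpi) === -1) {
    return null;
  }
  return baseName + "_" + deviceDpi + "dpi.png";
}

console.log(pickBitmapVariant("icon", 120)); // a crisp variant exists
console.log(pickBitmapVariant("icon", 110)); // custom DPI: no crisp variant
```

Returning null rather than the nearest variant keeps the "this will still be blurry" case visible to the caller.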

What about pixel snapping?

Even though layout doesn’t know anything about pixels, WPF does in fact offer a mechanism for making vector content look crisp. This is the much talked about SnapsToDevicePixels property that WPF designers have grown to know and love (okay, “love” may be an exaggeration :) ). This works wonders on things like Controls and Shapes, but surprisingly it does nothing to improve the crispness of bitmaps. To understand why, recall how pixel-snapping works: Snapping defines a set of guidelines that specify which edges should be made crisp. Then, deep in our rasterizer (deep enough that we’re interacting with device pixels), we nudge vector content that lies along these edges to be pixel-aligned. Portions of geometry that don’t lie along the edges are stretched ever-so-slightly by the corresponding amount.

The key point in all of this is that it’s geometry that’s pixel-snapped, not what’s used to fill that geometry (in this case, bitmaps). And geometry is really the only thing we can snap: if we “snapped” the fill, too, we’d end up stretching it by a couple of pixels, and any amount of stretching will cause the bitmap pixels to be misaligned with the device pixels – exactly what makes bitmaps look soft in the first place.

Is there ever going to be a solution?

While we weren’t able to solve this problem in V1, we fully appreciate its importance among designers, and it is a high priority item for us in the future, both because of the crispness issue and because our current approach effectively halves the fidelity of the bitmap (do an internet search for “Nyquist frequency” for more info on this). When we do introduce a fix, though, it probably won’t be part of pixel-snapping. As I hope I’ve conveyed, pixel-snapping is a fundamentally different operation than what’s desired (despite the name).

In any case, thank you all very much for your feedback on this issue, and rest assured that we are investigating this for the future.


Actually, if a crisp bitmap is stretched by only a few pixels, it’ll look worse than soft. Because the bitmap and device will be almost “in phase” the bitmap will “beat”, causing parts of the bitmap to look crisp and other parts to look blurry. In many cases, this looks worse than if the bitmap is uniformly blurry:

Crisp:
Offset by .5 pixels (uniformly blurry):
Scaled by .99375 (two pixels on a 120 dpi screen):

On some monitors, it seems that thin WPF lines are blurred across two pixels instead of one. Ick. How do I get sharply rendered lines?

[this post was originally posted on http://blogs.msdn.com/seema on Oct 31, 2006. It is reposted here as I’m moving hosting sites -Seema]

[Image: an anti-aliased rectangle zoomed in on Magnifier, without pixel snapping and with pixel snapping]

WPF offers a way to get sharp lines and keep anti-aliasing, by auto-magically aligning the horizontal and vertical edges of a UIElement to land on the pixel grid when you set UIElement.SnapsToDevicePixels=true. The property inherits: set SnapsToDevicePixels on your topmost UIElement, and on the first child of every VisualBrush that you want sharp lines on.

When SnapsToDevicePixels=true, the layout system produces 2 horizontal guidelines and 2 vertical guidelines that coincide with the bounding box of any UIElement. These ‘guidelines’ are produced during the Measure/Arrange passes, and hint to MIL to place those boundary lines on the pixel edge.

SnapsToDevicePixels only affects horizontal and vertical lines. Setting this property will also help situations where you want abutting edges. Here are some pictures of the old ListBox style with and without SnapsToDevicePixels set:

[Image: old ListBox style at 96 dpi, without pixel snapping and with pixel snapping]

Why should I care about the pixel grid? WPF graphics are anti-aliased and device-independent, which is a great story for high-resolution machines. However, if you have a black one-pixel line that lands in between two pixels on a low-resolution machine, we will color the two pixels 50% black and 50% white (also known as Grey).
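The 50/50 split falls straight out of area coverage. Here is a small sketch in JavaScript (not the WPF rasterizer, just the arithmetic, with a hypothetical function name) that computes how much of each pixel row a horizontal line covers:

```javascript
// Returns, per pixel row, the fraction of the row covered by a horizontal
// line whose top edge is at `top` and whose height is `thickness`,
// both measured in device pixels.
function rowCoverage(top, thickness) {
  var bottom = top + thickness;
  var coverage = {};
  for (var row = Math.floor(top); row < Math.ceil(bottom); row++) {
    // Overlap of [top, bottom) with the pixel row [row, row + 1).
    coverage[row] = Math.min(bottom, row + 1) - Math.max(top, row);
  }
  return coverage;
}

// A one-pixel line sitting exactly on the grid fills a single row.
console.log(rowCoverage(2, 1));
// The same line shifted by half a pixel covers two rows at 50% each,
// which is the grey two-pixel line described above.
console.log(rowCoverage(2.5, 1));
```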

Shall I just adjust my line offsets manually? Should I round my doubles to ints? No, this will make your application device-dependent. On a machine with 120-dpi, an app optimized for 96 dpi will have blurry lines.

How does it work? The WPF graphics layer, MIL, takes any specified guideline and ensures that that coordinate lands on a whole pixel. To do this, MIL might shift your lines up to .5 pixel in either direction. Any line affected by the horizontal guideline will be adjusted the same distance as the guideline. Thus, if you have a horizontal line with the same y-coordinate as the horizontal guideline, that line will be placed on the pixel grid. If you have a vertical line with the same x-coordinate as the vertical guideline, that line will be placed on the pixel grid.
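As a rough illustration of that adjustment (plain JavaScript, not MIL code; the function names are hypothetical and the tie-breaking direction at exactly .5 is whatever Math.round does, which may differ from MIL):

```javascript
// Distance a guideline must move to land on a whole pixel.
// The magnitude is never more than 0.5.
function snapOffset(guidelinePx) {
  return Math.round(guidelinePx) - guidelinePx;
}

// Every line affected by a guideline moves by the guideline's offset.
function applySnap(linePx, guidelinePx) {
  return linePx + snapOffset(guidelinePx);
}

// A guideline at 5 DIPs on a 120 dpi screen sits at 6.25 device pixels.
var guideline = 5 * 120 / 96;

console.log(snapOffset(guideline));           // shift applied to the guideline
console.log(applySnap(guideline, guideline)); // a coincident line lands on the grid
console.log(applySnap(7, guideline));         // other lines shift by the same amount
```

Only lines that share the guideline's coordinate end up on a whole pixel; the rest simply move along with it.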

Pixel snapping only affects horizontal and vertical lines that have a specified GuidelineSet.  SnapsToDevicePixels creates guidelines that coincide with the bounding box of your element — for a Rectangle, having SnapsToDevicePixels generate those guidelines will ensure that the lines of the Rectangle land on the pixel grid.

A few things to remember:

This is a hinting mechanism, not sure-fire.

We still respect the anti-aliasing rules: if you specify a line that is thicker than a pixel, we will still spread that line across 2 pixels.

WPF: Layered windows…SW is sometimes faster than HW

[this post was originally posted on http://blogs.msdn.com/seema on Oct 25, 2006. It is reposted here as I’m moving hosting sites -Seema]

Previously, I had posted that Avalon’s layered windows on XP will be rendered via the software pipeline. One can create a layered window by setting Window.AllowsTransparency=”true”. I’ve seen a few forum posts about performance issues, and the key takeaway point is that your mileage will vary depending on your video card.

In order to render layered windows via our hardware pipeline, we need to circumvent the fact that DX Present cannot handle a window with transparency. Instead of using DX present, we render via DX, grab the surface (via GetDC) and present with GDI. GDI is not hardware accelerated on the Vista WDDM video driver model. The above results in a chain of bitblits from video memory to system memory to video memory which creates a performance hit on many video cards. However, with a unified memory model (eg. Intel Lakeport 945g), the rendering of layered windows is faster in hardware than in software, as the bitblit goes from system memory to system memory.

To substantiate the performance hit that could arise from the aforementioned bitblit chain with a few numbers, I did a bit of testing on a card without a unified memory model (nvidia geforce 6800, WDDM video driver). My setup is a recent Vista build, Aero Glass theme, 2.5GHz P4, 1.5 GB RAM, single-monitor, single-video card.

Animating the translation
Scenario: Translating a single 50×50 rectangle around a 300×300 semi-transparent window
Observed: Framerate with hardware acceleration was at ~50 frames/sec. With software, ~ 60frames/sec.

Resizing, forcing an entire redraw
Scenario: With the mouse, resizing an empty 300×300 semi-transparent window with no window style to 300×600
Observed: Framerate was ~8 frames/sec when hardware accelerated. With software, ~60 frames/sec.
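Framerates are easier to compare as time per frame. A small worked sketch in JavaScript, using only the approximate numbers observed above (frameTimeMs is a hypothetical helper name):

```javascript
// Milliseconds spent per frame at a given framerate.
function frameTimeMs(fps) {
  return 1000 / fps;
}

// Translation scenario: ~50fps in hardware vs ~60fps in software.
console.log(frameTimeMs(50).toFixed(1) + " ms vs " + frameTimeMs(60).toFixed(1) + " ms");

// Resize scenario: ~8fps in hardware vs ~60fps in software.
console.log(frameTimeMs(8).toFixed(1) + " ms vs " + frameTimeMs(60).toFixed(1) + " ms");

// How many times longer each hardware frame takes during the resize.
var resizeSlowdown = frameTimeMs(8) / frameTimeMs(60);
console.log(resizeSlowdown.toFixed(1) + "x slower per frame");
```

The translation case costs a few extra milliseconds per frame; the resize case, where the whole window is redrawn and copied, costs over a hundred.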

Frames with sufficiently complex rendering, particularly 3D, can pass the inflection point and will see a boon from having hardware accelerated layered windows. For layered windows with simple 2D content, our hardware codepath typically has a worse framerate than the software rendering codepath on video cards without a unified memory model. The enabling of layered windows came late in the game for v1, and we’re looking at developing a more refined codepath for rendering layered windows in vNext; stay tuned.

Key takeaway: a semi-transparent window is a particularly complex feature for WPF’s rendering system. Figure out early what type of hardware you expect your app to run on, the performance that is acceptable for your users, and whether you want to use Window transparency.

WPF: After animating text, the text seems to pause for 1 second and then render more sharply than before. Why is that?

[this post was originally posted on http://blogs.msdn.com/seema on Oct 20, 2006. It is reposted here as I’m moving hosting sites -Seema]

In all 4 images, we see anti-aliased, sub-pixel positioned, ClearType text. The rendering on the right column is pixel snapped, which sends the glyphs through a refined codepath by Mikhail Lyapunov for rendering the final text in an extremely polished state; every time one changes the location or size of the text, it runs through this codepath and smoothly renders the text for that location on the pixel grid.

We turn off pixel snapping during an animation/scrolling primarily because animation looks best if the text is anti-aliased but not pixel snapped. Snapping animated glyphs to the pixel grid causes weird visual artifacts during the animation.

For V1, the rendering system automatically detects an animation or scrolling of text, turns off pixel snapping, and turns it back on when the animation is completed.

So why 1 second? We animate snapping back on at a hard-coded speed, and that speed was optimized for the most common case of scrolling text at fontsizes of 10-12.

At some point, we discussed enabling the developer to animate the strength of the snapping of the text (so that text smoothly shapes into the final rendering), but the API design and underlying codepath were out of reach for this version.

Please let us know if you (the designer/developer) want control over “strength of Pixel Snapping” in vNext via the ladybug website.

WPF HW Acceleration of Layered Windows for RTM

[this post was originally posted on http://blogs.msdn.com/seema on Sept 18, 2006. It is reposted here as I’m moving hosting sites -Seema]

To obtain GPU-accelerated rendering, Windows Presentation Foundation (WPF) normally renders and presents graphical content through the DirectX pipeline – including the composition of the scene geometry and presentation of the results.

Since before Windows XP, Win32 has supported an alternative window presentation mechanism called “layered windows”.  Layered windows allow for top-level window transparency effects when composed with the desktop; on WPF, this feature is available by setting Window.AllowsTransparency=”true”.  For example, to get a dropshadow effect, WPF composes Menus and Popups as layered windows.

There are two aspects to rendering: composing the scene, and presenting the surface. The Windows Vista D3D 9.0 graphics API provides support for rendering to surfaces with an alpha channel, but does not directly support a mechanism to present that surface to the desktop and retain alpha information. On Windows Vista, WPF renders via hardware accelerated DX, acquires the surface via IDirect3DSurface9::GetDC, and presents to the screen via GDI.

DirectX 9.0C, the version of DX available on Windows XP, does not support IDirect3DSurface9::GetDC of a surface with an alpha channel. As such, support for hardware accelerated rendering when using Window.AllowsTransparency=”true” will not be available on Windows XP from the RC1 and RTM releases of WPF.