Do not await what does not need to be awaited
As the usage of await
seeps more and more into general C# code, I’m finding some small “leaks” that make me sad sometimes. This one is pretty simple. Looks like that every time somebody uses XxxAsync
method, he or she also awaits it. Makes sense, or does it?
The problem
Obviously if you need the result of an asynchronous action you’d use await
. But if you’re just passing the value up, the await
is just a complication for the compiler, because more code is generated, and for JIT/runtime, because more code is executed. Let me show some example.
Granted there’s some interface (because that’s where I see it often).
interface ISomething
{
Task<string> GetSomethingAsync();
}
And you implement it (i.e. using some service).
public class Something : ISomething
{
readonly IService _service;
public Something(IService service)
{
_service = service;
}
public async Task<string> GetSomethingAsync() => await _service.FetchSomethingAsync();
}
Nothing wrong, right?
Absolutely. The code is fine and works as expected. But that doesn’t mean something cannot be improved. The await _service.FetchSomethingAsync()
expression forces compiler to create the state machine that’s behind this feature. And this code is not exactly small (see below for numbers).
The solution
In these cases, it’s easier to just pass the task higher directly. So, the class can look like this then.
public class Something : ISomething
{
readonly IService _service;
public Something(IService service)
{
_service = service;
}
public Task<string> GetSomethingAsync() => _service.FetchSomethingAsync();
}
With this nothing is generated and it’s just a regular method call.
The gotchas
There’s at least one gotcha I know people are surprised when they think about this for the first time. Take the following method.
public async Task FooBar()
{
using (var foo = new Foo())
{
await SomethingAsync(foo);
}
}
Can this be rewritten the way I showed before?
No! When you swap the async
/await
for plain return
, the code will go over the finally
block (from using
) and will dispose the Foo
, but the SomethingAsync
might be still running and using it. Queue dramatic explosion.
Some numbers
All the above is a theory which can or not translate into some real hard cold numbers in the runtime. Benchmark time (using BenchmarkDotNet).
First, let’s check code size of resulting binary. Bare metal netcoreapp2.0
console application having just one extra class with one asynchronous method is 6144
vs 5632
bytes on disk. OK, I agree not exactly a deal breaker, especially considering today’s storage prices. Where does the difference come from? Well, it’s from the generated state machine.
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
private struct <Do>d__0 : IAsyncStateMachine
{
public int <>1__state;
public AsyncTaskMethodBuilder <>t__builder;
private ConfiguredTaskAwaitable.ConfiguredTaskAwaiter <>u__1;
void IAsyncStateMachine.MoveNext()
{
int num = this.<>1__state;
try
{
ConfiguredTaskAwaitable.ConfiguredTaskAwaiter awaiter;
if (num != 0)
{
awaiter = Task.Delay(1000).ConfigureAwait(false).GetAwaiter();
if (!awaiter.get_IsCompleted())
{
this.<>1__state = 0;
this.<>u__1 = awaiter;
this.<>t__builder.AwaitUnsafeOnCompleted<ConfiguredTaskAwaitable.ConfiguredTaskAwaiter, Test.<Do>d__0>(ref awaiter, ref this);
return;
}
}
else
{
awaiter = this.<>u__1;
this.<>u__1 = default(ConfiguredTaskAwaitable.ConfiguredTaskAwaiter);
this.<>1__state = -1;
}
awaiter.GetResult();
}
catch (Exception exception)
{
this.<>1__state = -2;
this.<>t__builder.SetException(exception);
return;
}
this.<>1__state = -2;
this.<>t__builder.SetResult();
}
[DebuggerHidden]
void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
{
this.<>t__builder.SetStateMachine(stateMachine);
}
}
And as my trusty ILSpy shows // Code size 156 (0x9c)
. Does this affect the speed of execution or allocations?
I first tested fast path scenario, where the awaited task has already completed, using Task.CompletedTask
.
Method | Mean | Error | StdDev | Scaled | ScaledSD | Allocated |
---|---|---|---|---|---|---|
WithoutAwaitFastPath | 1.418 ns | 0.0237 ns | 0.0221 ns | 1.00 | 0.00 | 0 B |
WithAwaitFastPath | 23.269 ns | 0.0503 ns | 0.0471 ns | 16.42 | 0.25 | 0 B |
Well, that’s quite a difference. Going through the await
and state machine in this fast path scenario isn’t cheap.
Next, I tested a call where some callback really happens. I used Task.Delay(1)
, thus don’t look at absolute numbers, but only at the difference.
Method | Mean | Error | StdDev | Scaled | Allocated |
---|---|---|---|---|---|
WithoutAwait | 15.62 ms | 0.0085 ms | 0.0079 ms | 1.00 | 312 B |
WithAwait | 15.62 ms | 0.0096 ms | 0.0089 ms | 1.00 | 528 B |
The execution speed here is the same because the execution itself is fully overruled by the duration of the asynchronous method itself (in this case the timer callback), as it should be. But there’s a 216
bytes difference in allocation (.NET Core 2.0.3, 64-bit RyuJIT). Does it matter? In normal code probably not. But why to give something out when the “fix” is super easy and clean.
Summary
Was this just my rambling about a problem that in real world doesn’t matter that much? Yes, it was. Although sometimes I really need to connect the smallest dots and every clue matters.
And you don’t write array[index + 0]
, why would you do the + 0
with an extra await
…