Yesterday I participated in a discussion about how to test the following class (it was actually an interview question). Here’s the class under test:
public class ClassUnderTest
{
    private SomeService service;
    public ClassUnderTest(SomeService service)
    {
        this.service = service;
    }
    public void Foo(int fooInput)
    {
        if (fooInput > 0)
            this.service.DoSomething();
    }
}
What SomeService does is not relevant. Just something. As you can see, the ClassUnderTest is not very cooperative for the purpose of testing its functionality: it does not expose any properties, and the only method it contains does not return anything. How can we test it?
It looks like the only validation option left is to check that “something” is done, and the only way to check that is through interaction testing. Here are a couple of tests that achieve this (I used the “method-condition-outcome” naming convention, but the names could be rephrased in BDD style):
[Test]
public void Foo_PositiveInput_SomethingIsDone()
{
    // Arrange
    var service = MockRepository.GenerateMock<SomeService>();
    var cut = new ClassUnderTest(service);
    // Act
    cut.Foo(1);
    // Assert
    service.AssertWasCalled(s => s.DoSomething());
}
[Test]
public void Foo_NonPositiveInput_SomethingIsNotDone()
{
    // Arrange
    var service = MockRepository.GenerateMock<SomeService>();
    var cut = new ClassUnderTest(service);
    // Act
    cut.Foo(0);
    // Assert
    service.AssertWasNotCalled(s => s.DoSomething());
}
During the discussion I expressed my lack of enthusiasm for this solution, noting that I tend to use state-based testing whenever possible. But is it possible in the example above? And what is the disadvantage of interaction-based tests in this case?
Let’s have a closer look. First, the above tests essentially duplicate the internal logic of the Foo implementation. If we merged the two tests into one, the logic of the assert part would reflect pretty much what’s inside the implementation of Foo: “for positive input service.DoSomething should be called, otherwise not”.
Since the logic of the tests follows the logic of the code, and both were probably written by the same developer at the same time, there is very little in the tests that validates the actual functionality. If I believe that Foo should call DoSomething for positive input, I will most likely get it right in the test that validates that Foo calls DoSomething for positive input. But what if my assumption is wrong? What if DoSomething should be called for input bigger than 100, and for other input values Foo should call DoSomethingElse? No, these tests can’t validate that. I’d say they can’t validate it by design, because they are designed to reflect the internal logic of the method they validate. So the correctness of the Foo implementation must be verified by other means, and these tests will only ensure that the Foo implementation stays correct, assuming the original algorithm is right.
And this becomes the main value of such tests: they serve as regression tests protecting correct code from being improperly changed (perhaps by another developer who had to extend the original code and didn’t get it right). However, this sort of protection is not refactoring friendly. Let’s see why.
Imagine that the SomeService class is refactored: DoSomething is still there, but its semantics have changed and it no longer fits the implementation of Foo. Instead, Foo is supposed to call a new method, DoSomethingElse:
public void Foo(int fooInput)
{
    if (fooInput > 0)
        this.service.DoSomethingElse();
}
The system with the new change works properly, but one of the tests for ClassUnderTest now fails! Of course, the test has no knowledge of what is semantically right or wrong; it only cares that a certain method is called. Depending on the system’s complexity, tracking down the cause of the failure can take some time, but even if it’s an easy fix, we have a test that fails for the wrong reason.
In their article “Mock Roles, not Objects”, Steve Freeman et al. warn about mirroring the target code’s logic in tests: “Some uses of Mock Objects set up behaviour that shadows the target code exactly, which makes the tests brittle. This is particularly common in tests that mock third-party libraries. The problem here is that the mock objects are not being used to drive the design, but to work with someone else’s. At some level, mock objects should shadow a scenario for the target code, but only because the design of that code should be driven by the test. Complex mock setup for a test is actually a hint that there is a missing object in the design.”
So the interaction-based tests can be more brittle, as we saw in our example. But what are the choices for that example? If we don’t verify calling DoSomething for positive Foo input, then what else can we verify?
To try to answer this question, let’s first figure out what kind of testing we are dealing with: unit or integration. For black box unit testing I am afraid there is not much to validate. No properties are exposed, and the only method is void. If you are a code coverage junkie, you can maybe write a couple of tests to validate that the class can be instantiated and used without raising unexpected exceptions, but the value of such work is outside the scope of this blog post. Let’s focus instead on integration testing. How can we validate that the class does what it should?
As soon as we leave the idealistic constraints of black box unit testing and focus on overall functionality validation, we no longer need to express the validation goal in terms of ClassUnderTest methods, let alone their internal implementation. We focus on the outcome of Foo in terms of what it does to the rest of the system. Does it change data in a database, send an email, draw a graph on a screen?
In most cases business logic operations result in some (persistent or in-memory) state change. We can then strive to catch these changes and use them in the test assertions. This helps increase test trustworthiness and tolerance to refactoring. However, determining the state changes that correspond to a certain action is not always easy, and even when it is, it may result in higher test complexity: you will have to insert additional code that retrieves the state that is expected to change. Apart from this, there are still scenarios in which the result of an action is not easily converted to state that test code can validate. Nat Pryce, in an old blog post, gave an example of such a scenario: tests for a graphical simulator. Really, how would you verify that a cell was drawn on a display using state-based testing? Other scenarios that may fall into this category include calling external services, sending messages, etc. They may (and usually do) leave traces that can be used for state-based testing, but the tests will become complex and overloaded with state retrieval details. Clearly, interaction-based tests may have their place here.
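To make the state-based alternative concrete, here is a minimal sketch. It assumes, purely for illustration, that DoSomething records an order in an in-memory store; the OrderStore type, the SomeService constructor and the StateBasedExample helper are hypothetical and not part of the original interview question.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory store standing in for whatever state DoSomething changes.
public class OrderStore
{
    public List<string> Orders { get; } = new List<string>();
}

// Hypothetical implementation: the post deliberately does not say what SomeService does.
public class SomeService
{
    private readonly OrderStore store;
    public SomeService(OrderStore store) { this.store = store; }
    public void DoSomething() { store.Orders.Add("order"); }
}

public class ClassUnderTest
{
    private readonly SomeService service;
    public ClassUnderTest(SomeService service) { this.service = service; }
    public void Foo(int fooInput)
    {
        if (fooInput > 0)
            service.DoSomething();
    }
}

public static class StateBasedExample
{
    // Runs Foo against a real SomeService and returns the resulting state,
    // so a test can assert on the outcome rather than on which method was called.
    public static int OrdersAfterFoo(int input)
    {
        var store = new OrderStore();
        var cut = new ClassUnderTest(new SomeService(store));
        cut.Foo(input);
        return store.Orders.Count;
    }
}
```

A test built on this would assert that OrdersAfterFoo(1) is 1 and OrdersAfterFoo(0) is 0. It keeps passing if Foo is later changed to call a differently named method, as long as the observable state change stays the same.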
So what can we test with interaction-based tests? Mark Seemann, in his forthcoming book “Dependency Injection in .NET”, classifies dependencies as either stable or volatile. According to him, stable dependencies “are already there, tend to be very backwards compatible and invoking them have deterministic outcomes”. An example of a stable dependency is the .NET BCL. “Other examples may include specialized libraries that encapsulate algorithms relevant to your application. If you are developing an application that deals with chemistry, you may reference a third-party library that contains chemistry-specific functionality.” Volatile dependencies, on the other hand, do not provide a sufficient foundation for applications.
I believe separating things into stable and volatile is also helpful when deciding what to test using interaction-based testing; we only need to replace the word “dependency” with “expectation”. We can use interaction-based testing as long as our expectations are stable; otherwise the test becomes brittle. So, coming back to the original tests for Foo, we should ask ourselves a question: is DoSomething a stable expectation for the given scenario, or is it only valid for the current implementation and may not survive refactoring? In the latter case the expectation is volatile. Then we should inspect the call graph, find a stable expectation and use it in interaction-based tests. Examples of stable expectations are calls to external services, external APIs and output to a display. As long as the feature is unchanged, the system will be making the same outbound calls and drawing the same pictures on a screen. These activities are stable and can become foundations for interaction-based tests.
So how should the original Foo test look if we find that DoSomething internally sends an email? It could look like this:
[Test]
public void Foo_Should_Send_Notification_Mail_On_Positive_Input()
{
    // Arrange
    var mailServer = MockRepository.GenerateMock<MailServer>();
    // Assumption: SomeService receives its MailServer through the constructor
    var cut = new ClassUnderTest(new SomeService(mailServer));
    // Act
    cut.Foo(1);
    // Assert
    mailServer.AssertWasCalled(s => s.SendMail());
}
Note that we no longer care about the exact implementation of Foo. Its implementation is volatile. We only expect something that is stable within the scope of the given feature, like sending mail. (More accurate test code could verify that the mail was sent to the right person with the right subject, but we should be careful not to overspecify the expectations, otherwise they will no longer be stable.)
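The same idea can be sketched without a mock framework, using a hand-written fake at the mail boundary. Everything here is an assumption made for illustration: the IMailServer interface, the SendMail signature, the recipient address and the way SomeService is wired are not taken from the original code. The sketch uses the refactored Foo that calls DoSomethingElse, to show that the expectation survives that change.

```csharp
using System.Collections.Generic;

// Hypothetical boundary interface; the post only says that a mail is sent.
public interface IMailServer
{
    void SendMail(string to, string subject);
}

// Hand-written fake that records outgoing mail instead of sending it.
public class RecordingMailServer : IMailServer
{
    public List<string> Recipients { get; } = new List<string>();
    public void SendMail(string to, string subject) => Recipients.Add(to);
}

// Hypothetical implementation: both methods end up sending a notification.
public class SomeService
{
    private readonly IMailServer mailServer;
    public SomeService(IMailServer mailServer) { this.mailServer = mailServer; }
    public void DoSomething() => mailServer.SendMail("admin@example.com", "Something happened");
    public void DoSomethingElse() => mailServer.SendMail("admin@example.com", "Something else happened");
}

public class ClassUnderTest
{
    private readonly SomeService service;
    public ClassUnderTest(SomeService service) { this.service = service; }
    public void Foo(int fooInput)
    {
        // Refactored version from the post: DoSomethingElse instead of DoSomething.
        if (fooInput > 0)
            service.DoSomethingElse();
    }
}

public static class StableExpectationExample
{
    // Returns how many mails were sent as a result of calling Foo with the given input.
    public static int MailsSentByFoo(int input)
    {
        var mailServer = new RecordingMailServer();
        var cut = new ClassUnderTest(new SomeService(mailServer));
        cut.Foo(input);
        return mailServer.Recipients.Count;
    }
}
```

A test asserting that MailsSentByFoo(1) is 1 and MailsSentByFoo(0) is 0 does not mention DoSomething or DoSomethingElse at all, so renaming or replacing those methods does not break it.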
When I was about to finish this post, I found another old blog post, by Jeremy D. Miller, where he also uses email sending as an example of an operation that is worth testing using interaction-based tests: “We could use a state-based strategy to test the email. We could run the test, then run around and ask the expected recipients to check their inbox. […] You could also check an audit trail, but that's not the real functionality being tested. The easiest approach in the case of the email tests is to use interaction testing to verify that the email was sent to the SMTP service.”
So I believe interaction-based testing can be an efficient way to validate that operations result in proper actions, as long as tests don’t focus on volatile interactions and instead select stable expectations related to the feature itself rather than its implementation details.
Getting a mock framework API right is not an easy task. Mock framework designers don’t have as much freedom as designers of other libraries: the purpose of a mock framework is to expose an arbitrary API (unknown at the time the framework is designed) and intercept it in a transparent way, so that developers can use mocked objects as if they were real instances.
This is why the APIs of modern mock frameworks are full of techniques such as extension methods and lambda expressions. In addition, these frameworks take a so-called fluent approach to their API. Unfortunately, such techniques are not easily portable to other languages, especially non-imperative ones. Being fluent in one language does not mean being fluent in another.
I decided to try various mock frameworks with F#. I realized that their C#-friendly syntax would look clumsy in F#, but I could overcome that by writing small wrappers. What concerned me more was the incompatibility of F# functions and expressions with their C# counterparts. For my tests I chose the Zoo example from an excellent series of blog posts by Richard Banks dedicated to mock framework comparison. Richard ran his comparison on free frameworks:
- Rhino.Mocks
- NSubstitute
- Moq
I’ve extended this set with two commercial products:
- Typemock Isolator
- Telerik JustMock
So far I have only tried the very basic mocking described in the first post of Richard’s series: faking a return value. As I expected, even such a simple operation became a challenge when executed from F# code. I managed to make the tests work for only two and a half frameworks. Below is a description of what I did and how it was possible to achieve half a success.
1. Rhino.Mocks: success
Rhino.Mocks is the most popular .NET mock framework. This was the test code that I wanted to port to F#:
[Test]
public void Return()
{
    var monkey = MockRepository.GenerateMock<IMonkey>();
    monkey.Stub(m => m.Name).Return("Spike");
    var actual = monkey.Name;
    Assert.AreEqual("Spike", actual);
}
As you can see, “monkey” is used both as an instance of an object that implements the IMonkey interface (so we can call monkey.Name directly) and as an object exposing the mocking API (so we can call the Stub method on it). This is achieved by using extension methods, and extension methods are language-specific: being defined in C# code, they were not usable as extensions from my F# code. But they can be called on the static class where they are originally defined:
[<Test>]
member this.Return() =
    let monkey = MockRepository.GenerateMock<IMonkey>()
    RhinoMocksExtensions.Stub<IMonkey, string>(monkey, fun m -> m.Name).Return("Spike");
    monkey.Name
    |> should equal "Spike"
The test works, but the RhinoMocksExtensions class was not meant to be called directly by users, so I put it in a little wrapper:
module FsMock.RhinoMocks
open Rhino.Mocks
type mock<'T when 'T : not struct>() =
    let instance = MockRepository.GenerateMock<'T>()
    member this.Object = instance
module Mock =
    let arrange<'T, 'R when 'T : not struct> (f : ('T -> 'R)) (r : 'R) (m : mock<'T>) =
        RhinoMocksExtensions.Stub<'T, 'R>(m.Object, Function<'T, 'R>(f)).Return(r)
Now I can rewrite the original test in an F# style:
[<Test>]
member this.Return() =
    let monkey = mock<IMonkey>()
    monkey
    |> Mock.arrange (fun m -> m.Name) "Spike"
    let monkey = monkey.Object
    monkey.Name
    |> should equal "Spike"
Both the original and rewritten tests passed.
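Calling the static class directly works because a C# extension method is compiled as an ordinary static method; the instance-style call is only compiler sugar. A small sketch with a hypothetical Shout extension (unrelated to Rhino.Mocks) shows the two equivalent call forms:

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method, used only for illustration.
    public static string Shout(this string name) => name.ToUpperInvariant() + "!";
}

public static class ExtensionCallDemo
{
    public static bool BothCallsAgree()
    {
        // Extension syntax: looks like an instance method on string.
        string viaExtensionSyntax = "Spike".Shout();
        // Plain static call: the form that any .NET language can always use.
        string viaStaticCall = StringExtensions.Shout("Spike");
        return viaExtensionSyntax == viaStaticCall;
    }
}
```

This is the same relationship as between monkey.Stub(...) in the C# test and RhinoMocksExtensions.Stub(monkey, ...) in the F# port.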
2. NSubstitute: success
NSubstitute is the new kid on the block. The framework prioritizes simplicity over coverage of all possible mocking scenarios. Unlike other frameworks, NSubstitute avoids using lambda expressions. This makes it easier to use in F#.
Here’s the C# code using NSubstitute:
[Test]
public void Return()
{
    var monkey = Substitute.For<IMonkey>();
    monkey.Name.Returns("Spike");
    var actual = monkey.Name;
    Assert.AreEqual("Spike", actual);
}
Like Rhino.Mocks, NSubstitute also uses extension methods, so porting the above test to F# results in the following code:
[<Test>]
member this.Return() =
    let monkey = Substitute.For<IMonkey>()
    SubstituteExtensions.Returns(monkey.Name, "Spike")
    monkey.Name
    |> should equal "Spike"
Again, I created a little helper to make the syntax F#-friendly:
module FsMock.NSubstitute
open NSubstitute
type mock<'T when 'T : not struct>() =
    let instance = Substitute.For<'T>()
    member this.Object = instance
module Mock =
    let arrange<'T, 'R when 'T : not struct> (f : ('T -> 'R)) (r : 'R) (m : mock<'T>) =
        SubstituteExtensions.Returns<'R>(f(m.Object), r)
Now the test code looks identical to the Rhino.Mocks example:
[<Test>]
member this.Return() =
    let monkey = mock<IMonkey>()
    monkey
    |> Mock.arrange (fun m -> m.Name) "Spike"
    let monkey = monkey.Object
    monkey.Name
    |> should equal "Spike"
And the test passed!
NB! I have deliberately unified the mocking API when creating F# wrappers for different frameworks. Since the F# API will be quite different from its C# counterpart, I didn’t want to multiply the number of API sets. Instead I came up with a common and easy-to-understand set of names (“mock”, “arrange”) that is used in each wrapper.
3. Moq: half success
Now that sounds strange. A test should either succeed or fail. So what happened to Moq? Here’s the story.
Moq is the second most popular framework after Rhino.Mocks, and it pioneered the arrange-act-assert API based on lambda expressions. This is how the C# test looks when using Moq:
[Test]
public void Return()
{
    var monkey = new Mock<IMonkey>();
    monkey.Setup(m => m.Name).Returns("Spike");
    var actual = monkey.Object.Name;
    Assert.AreEqual("Spike", actual);
}
Although the expression “m => m.Name” looks exactly as it does in Rhino.Mocks, there is a big difference behind it. Rhino.Mocks accepts a plain delegate (its Function<T, R> type, as seen in the wrapper above), whereas Moq is based on LINQ expressions. Working with such expressions in F# requires the use of so-called quotation expressions, which have to be converted to LINQ and then downcast to a generic LINQ expression. The corresponding code looks quite cryptic:
[<Test>]
member this.Return() =
    let monkey = new Mock<IMonkey>()
    let expr = (<@@ System.Func<IMonkey, string>(fun (m : IMonkey) -> m.Name) @@>).ToLinq()
    monkey.Setup(expr :?> System.Linq.Expressions.Expression<System.Func<IMonkey, string>>).Returns("Spike");
    let monkey = monkey.Object
    monkey.Name
    |> should equal "Spike"
But this does not work. Here’s the output:
Test 'MoqTests+MoqTests.Return' failed: System.NullReferenceException : Object reference not set to an instance of an object.
at Moq.MethodCall.SetFileInfo()
at Moq.MethodCall..ctor(Mock mock, Expression originalExpression, MethodInfo method, Expression[] arguments)
at Moq.MethodCallReturn..ctor(Mock mock, Expression originalExpression, MethodInfo method, Expression[] arguments)
at Moq.MethodCallReturn`2..ctor(Mock mock, Expression originalExpression, MethodInfo method, Expression[] arguments)
at Moq.Mock.<>c__DisplayClass15`2.b__14()
at Moq.PexProtector.Invoke[T](Func`1 function)
at Moq.Mock.SetupGet[T1,TProperty](Mock mock, Expression`1 expression)
at Moq.Mock.<>c__DisplayClass12`2.b__11()
at Moq.PexProtector.Invoke[T](Func`1 function)
at Moq.Mock.Setup[T1,TResult](Mock mock, Expression`1 expression)
at Moq.Mock`1.Setup[TResult](Expression`1 expression)
C:\Projects\NET\MockComparison\TempTests\MoqTests.fs(20,0): at MoqTests.MoqTests.Return()
So why half a success then? Well, if I execute the same code from the F# Interactive window, it works!
> let monkey = new Mock<IMonkey>()
let expr = (<@@ System.Func<IMonkey, string>(fun (m : IMonkey) -> m.Name) @@>).ToLinq()
monkey.Setup(expr :?> System.Linq.Expressions.Expression<System.Func<IMonkey, string>>).Returns("Spike");
printfn "%s" monkey.Object.Name;;
Spike
val monkey : Mock<IMonkey>
val expr : Expression = m => m.Name
Note the word “Spike” printed right after the line that begins with “printfn”. This is the output.
So I don’t really have a clue why the same code works in an F# Interactive session but fails when compiled into an assembly. I may need to investigate some more.
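The difference between a delegate and a LINQ expression that makes Moq harder to drive from F# can be seen in plain C#. The sketch below is only an illustration of the two parameter types; the Spike class and the helper methods are hypothetical, and it does not show how Rhino.Mocks or Moq work internally.

```csharp
using System;
using System.Linq.Expressions;

public interface IMonkey
{
    string Name { get; }
}

// Hypothetical concrete monkey, used only so the delegate has something to run against.
public class Spike : IMonkey
{
    public string Name => "Spike";
}

public static class DelegateVersusExpression
{
    // A delegate is compiled code: the receiver can only invoke it.
    public static string Invoke(Func<IMonkey, string> getter, IMonkey monkey) => getter(monkey);

    // An expression tree is data: the receiver can inspect which member the lambda refers to.
    public static string MemberName(Expression<Func<IMonkey, string>> getter)
        => ((MemberExpression)getter.Body).Member.Name;
}
```

In C# the same lambda text, m => m.Name, converts implicitly to either parameter type, so DelegateVersusExpression.Invoke(m => m.Name, new Spike()) and DelegateVersusExpression.MemberName(m => m.Name) both compile. An F# lambda converts to a delegate but not to an expression tree, which is why the quotation and the downcast are needed in the F# test.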
4. Typemock Isolator: failure
Ironically, both commercial frameworks failed to work with F#. The C# code used to test Typemock Isolator looks like this:
[Test]
public void Return()
{
    var monkey = Isolate.Fake.Instance<IMonkey>();
    Func<string> func = () => monkey.Name;
    Isolate.WhenCalled(func).WillReturn("Spike");
    var actual = monkey.Name;
    Assert.AreEqual("Spike", actual);
}
Here's the corresponding F# code:
[<Test>]
member this.Return() =
    let monkey = Isolate.Fake.Instance<IMonkey>();
    Isolate.WhenCalled(System.Func<string>(fun ignore -> monkey.Name)).WillReturn("Spike");
    monkey.Name
    |> should equal "Spike"
This test fails with the following output:
Test 'TypeMockTests+TypeMocksTests.Return' failed: TypeMock.TypeMockException :
*** Cannot call Isolate.WhenCalled() with method group fake.Invoke. Try using Isolate.WhenCalled( () => fake.Invoke()) instead
at ec.a()
at dm.b(Boolean A_0)
at im.c(Boolean A_0)
at im.a(Object A_0, Boolean A_1, Func`1 A_2, Action A_3, Action A_4, Action A_5)
at im.c(Object A_0)
at TypeMock.ArrangeActAssert.ExpectationEngine`1.a(TResult A_0)
C:\Projects\NET\MockComparison\TempTests\TypeMockTests.fs(21,0): at TypeMockTests.TypeMocksTests.Return()
I described the problem to Typemock developers and was advised to try a different approach presented by Roy Osherove in his blog post. Unfortunately the modified test still fails, although with a different exception.
5. JustMock: failure
JustMock is also a commercial product, and its API also lacked F# compatibility. This is the original C# code:
[Test]
public void Return()
{
    var monkey = Mock.Create<IMonkey>();
    Mock.Arrange(() => monkey.Name).Returns("Spike");
    var actual = monkey.Name;
    Assert.AreEqual("Spike", actual);
}
JustMock also uses LINQ expressions, so I had to use a similar trick to what I had to do with Moq:
[<Test>]
member this.Return() =
    let monkey = Mock.Create<IMonkey>();
    let expr = (<@@ System.Func<string>(fun ignore -> monkey.Name) @@>).ToLinq()
    Mock.Arrange<string>(expr :?> System.Linq.Expressions.Expression<System.Func<string>>).Returns("Spike");
    monkey.Name
    |> should equal "Spike"
The code compiles and does not throw any exception during execution. But the “Name” property is not set, so the “should equal” assertion fails.
Conclusion: no offence, just exploring possibilities
One thing I want to get straight is that these tests say absolutely nothing about the general design quality of the respective mock frameworks. As I mentioned at the beginning, the nature of such frameworks forces their developers to apply language-specific tricks to make mocking simple. So the fact that some API happens to fit another programming language is more luck than conscious design. Moreover, I only tried the simplest of the large variety of functions exposed by mock frameworks. I am sure a slightly more complicated scenario would fail with all of the frameworks.
However, as more developers express interest in F# and start using it in their projects, I believe it’s time for mock framework developers to consider this new territory and extend their products to offer full support for this exciting language.