What strategies are people using to UnitTest code that is non-deterministic?

* MockRandomizer -- Trick your random number generator into generating predictable numbers (such as 0,1,2,3...)
* DontChangeTheCodeTest -- If you absolutely can't solve the problem, there's always this resort.

Some example problems below.

----

I've been writing a MarkovChainer and would like a repeatable way to test certain results. I tried passing it a sort of MockRandomizer and controlling it externally, but the randomness is buried so deep inside the chainer that it's extremely hard for me to think about how to test the chainer. (There's also an issue of a CombinatorialExplosion, I suppose; for almost any Markov chainer there's an infinite number of possible chains you can receive.)

''Why is it difficult to mock the randomizer? Some consider it to be a CodeSmell when they encounter code that is difficult to unit test.''

I agree that under most circumstances, code that's difficult to UnitTest is a smell. But I have the sneaking suspicion that non-deterministic code is far more difficult than other types of code, even if it's well-factored. I tried mocking the randomizer, and it was really quite difficult. Maybe I'm doing something wrong:

* The code that uses a randomizer -- using java.util.Random -- makes multiple calls to "nextInt(int n)", with int n being lots of different things. To test this, I have to write a M''''''ockRandom that says arcane things like ''the first time someone calls 'nextInt(5)', return 4. The next time they call 'nextInt(5)', return 2.'' Yes, you can do this, but the resulting code is even harder to read than the code it tests, which seems like a smell to me as well.

''I think you are making this hard. Load a list of pre-determined numbers from a file. Dish them out one at a time. Verify the now deterministic result. What am I missing? -- DanilSuits''

* You have to take care not to have endless loops.
For example, with a word-based MarkovChainer that starts with the sentence "the quick brown fox jumped over the lazy dog" it's possible to accidentally write a MockRandomizer that always returns "the quick brown fox jumped over the quick brown fox jumped over the quick brown fox ..." and never ends. This is practically impossible with any genuine pseudo-random number generator, though chain generation may in fact be quite slow if you're unlucky.
* The chainer is decomposed into small, manageable chunks already, but that doesn't seem to help the problem. In particular, it stores a collection of link-relationships: "Given a prefix word like 'the', what's the next likely word?" I can test at that level to reduce the CombinatorialExplosion. But I also want to expose the entire collection to a test, because the link-relationships are innately related to one another. (In particular, the relation is that with the exception of the markers of the chain end, each prefix must also be a suffix in at least one link-relationship.)
* Furthermore, within the link-relationships, the suffixes are not given any order. They're stored as a Set, I believe, because order is entirely unimportant when you're picking something at random. Set ordering is non-deterministic as well. Now, of course I could make it a S''''''ortedSet -- which would make it more testable. But that means I would be ''adding more functionality just to make code testable'', and that feels like a smell as well.

In short, yes, you can UnitTest it. But I can't think of a way to do it that doesn't introduce more of a mess than it cleans up.

''You're trying to do too much. I think you're mixing UnitTest''''''s and AcceptanceTest''''''s. UnitTest''''''s are used to test interactions between a few (at most) classes. MockObject''''''s are used to reduce the number of complex classes interacting.
If you want to test the overall behaviour of the system, such as the final results, that's an acceptance test and you shouldn't use MockObject''''''s for that. UnitTest''''''s are tools that help you write code. A side effect is that they verify behaviour, but except at the lowest level, that is not their main purpose.''

''Example: You want to test that the state transitions occur with the proper probability. Write a M''''''ockRandom class that returns values in the range [0, 1] based on a predefined list of numbers. If the probabilities are 0.2, 0.3, and 0.5, set up the mock random to return boundary conditions 0.0, 0.19, 0.2, 0.49, 0.5, 1.0. Then assert that the proper state transition occurs in each case. If it works, you're done. Now, writing the acceptance test to see that you get the proper overall behaviour is a different beast. You'll want to keep the randomness and test things like averages and variances.''

----

Writing a hashCode() method for a Java class. Two instances of this class are equal if they match across three String fields and a Date field. (It's for eliminating duplicates in a database.) So hashCode() uses the four fields to determine a proper hashCode. But after I programmed it the simplest way I could think of -- get the hashCodes of all the values, and sum them, not worrying about integer overflow -- I realized that it would be possible for two non-equal instances to return the same hashCode, if their fields had different hashCodes that happened to come to the same sum. A proper UnitTest would come up with those two non-equal instances and test with them. But since hashCode() is defined non-deterministically, I can't imagine how I would do that.
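For the summing version specifically, one such pair is easy to construct, because integer addition ignores order: swap the values of two of the String fields and the sum comes out the same, overflow or not. A minimal sketch, assuming a hypothetical stand-in class (the name DuplicateKey and the field names a, b, c, when are invented for illustration and do not come from the original class):

```java
import java.util.Date;

// Hypothetical stand-in for the class described above:
// equality is defined over three String fields and a Date field.
final class DuplicateKey {
    final String a;
    final String b;
    final String c;
    final Date when;

    DuplicateKey(String a, String b, String c, Date when) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.when = when;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DuplicateKey)) {
            return false;
        }
        DuplicateKey k = (DuplicateKey) o;
        return a.equals(k.a) && b.equals(k.b) && c.equals(k.c) && when.equals(k.when);
    }

    // The "simplest way" from the text: sum the field hash codes,
    // ignoring integer overflow.
    @Override
    public int hashCode() {
        return a.hashCode() + b.hashCode() + c.hashCode() + when.hashCode();
    }
}

class SumHashCollision {
    public static void main(String[] args) {
        Date d = new Date(0L);
        // Same field values, but the first two Strings are swapped.
        DuplicateKey x = new DuplicateKey("fox", "dog", "the", d);
        DuplicateKey y = new DuplicateKey("dog", "fox", "the", d);

        System.out.println(x.equals(y));                  // false
        System.out.println(x.hashCode() == y.hashCode()); // true
    }
}
```

This only shows that colliding, non-equal instances exist for the summing approach; it says nothing about how often they would turn up in real data.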
''Take courage: The requirements for hashCode(), according to Sun, are not as vexing.''

: http://java.sun.com/products/jdk/1.2/docs/api/java/lang/Object.html#hashCode()

''Of particular interest: ''"It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results."'' In other words, it is not hashCode() that you need to test, but whatever uses hashCode(). And that can be tested with a mock object which returns a deterministic hashCode.''

In other words again: "equals() objects must have the same hashCode()" - that's all you need for hashCode() to be correct. So your test could just be to create some equal objects and check that they have the same hashCode()s. The idea is that if two objects have different hash codes then you know they aren't equal, but if they have the same hash code then they ''might be'' equal (use equals() to find out for sure). For hashCode() to be ''good'' it should usually return different values for non-equals() objects, but that's another kettle of fish!

''Well, then let me redefine the problem. I, as the creator of the class, want non-equal instances to return distinct hashCodes. Especially since the only reason I'm defining hashCode() in the first place is to use the class as an element in a Hash''''''Set, and I have absolutely no control over what H''''''ashSet does with the elements internally. So my requirements for hashCode() are more rigid than Sun's requirements. They're still reasonable requirements, though.''

Would it be easier to write your own Set? You may note that the lexical key of your members forms a total ordering of the class. That is given u=(s1,s2,s3,d), v=(S1,S2,S3,D), u300); assertTrue(f[3]>300); assertTrue(f[6]>300); }

----

It looks like there is a pattern here trying to crawl out. EncapsulateRandom, perhaps.

Deck.shuffle() depends on a random number generator to arrange the cards.
So we refactor the RNG into its own object, and use a mock to replace it in the UnitTest''''''s

 testShuffle () {
   deck = new Deck();
   deck.rand = new Mock''''''Rand();
   // call shuffle, and verify the results. We
   // should be able to predict the exact ordering
   // of the deck, because we can predict the exact
   // sequence generated by Mock''''''Rand
 }

Poker.getHand() depends on Deck.shuffle(), but the randomness is buried so deep inside the Deck that it's extremely hard for me to think about how to test it. So instead, mock the Deck!

 testGetHand() {
   poker = new Poker ();
   poker.deck = new Mock''''''Deck();
   // call getHand, and verify the results. Again
   // we can predict the exact results, because we
   // can predict the behavior of Deck
 }

MatryoshkaDoll, rinse, and repeat. Ward's test would end up looking like

 public void testChoosing() {
   player.board = new Board ("xx xx xx");
   player.rand = new Mock''''''Rand() ;
   int f[] = {0,0,0,0,0,0,0,0,0};
   for (int i=0; i < 1000; i++)
     f[player.choose()]++;
   // and because we know what the right answers are....
   int expected[] = {347,0,0,321,0,0,332,0,0} ;
   for (int i=0; i < 9; i++)
     assertEquals( expected[i], f[i] ) ;
 }

I prefer this approach, though it is more work, because this is a UnitTest, a safety harness to ensure that refactoring the code doesn't change its behavior.

----

Ah, this is much much more helpful, thanks. I can definitely see the value of a strategy like this. Maybe it can be stated like this:

* Isolate the non-deterministic qualities into as few classes as possible, and test those classes using one of two different methods:
  * pass it a M''''''ockRandom class, or
  * call the method a number of times, and test to make sure the distributions are close to what you expect. (Note that this will be easier to code, but there will be occasions when it returns false negatives.)
* If you have other classes that rely on those non-deterministic classes, keep the non-deterministic behavior out of their tests with liberal use of MockObject''''''s.

-- FrancisHwang

----

See UnitTestingRandomness, NonDeterministic

----

CategoryTesting