articleRandom numbers

mtn's picture

People who are used to other programming languages usually use some kind of rand, rnd or similar function to generate random numbers. This article shows how you can do it in Eiffel.

Generating a random number

The base library's RANDOM class is modeled as a sequence of random numbers. This gives you the well known interface with features such as start, forth and item which are probably also the most useful for a sequence. The following method shows how one uses this class.

new_random: INTEGER
    -- Random integer
    -- Each call returns another random number.
  do
    random_sequence.forth
    Result := random_sequence.item
  end
 
feature {NONE} -- Implementation
 
  random_sequence: RANDOM
    -- Random sequence

To get a new random number one first moves the cursor of the list to the next number by invoking forth. And then calls item to actually read a number. This is a well known pattern in Eiffel and is named the command query separation. It takes some time to get used to, but once learned, its usage through Eiffel is pretty consistent. Now the command query separation falls somewhat short when mathematical expressions rely on several random numbers. But that is not a big problem, because as can be seen above it's very easy do write a wrapper function. random_integer returns a new random number each time when it is called.

two_new_dice: TUPLE [INTEGER, INTEGER]
        -- Returns a tuple containing two unrelated random numbers between 1 and 6.
    do
        Result := [new_random \\ 6 + 1, new_random \\ 6 + 1]
    end

Here we can also see why command query separation is a good thing: You could think that random_integer does not change but that it is simply a variable containing the same value. This can in other situations be even more misleading. That is why you should whenever possible apply the command query principle so that you can rely on the fact that no state changes are done by a query.

Initializing the random generator

Very few computers support true random number generators. Usually one uses a mathematical function which produces (hopefully) uncorrelated data. Some real randomness is then introduced by connecting the so called "seed" to a hopefully random event. The seed usually an integer number which is used to initialize the algorithm. In the RANDOM class "set_seed" is available as a creation feature. The question is now, what should the seed be? If you want each time the same random sequence yielding in the exact same behavior for each program execution you can initialize with an arbitrary fixed number, like 73 for example. If you like to have different random number sequences each time one usually does this by taking some kind of time measurement as the seed. This assumes that the program start or better the creation of the time object is actually a more or less random event.

This can be done in the following way

    local
      l_time: TIME
      l_seed: INTEGER
    do
         -- This computes milliseconds since midnight.
         -- Milliseconds since 1970 would be even better.
      create l_time.make_now
      l_seed := l_time.hour
      l_seed := l_seed * 60 + l_time.minute
      l_seed := l_seed * 60 + l_time.second
      l_seed := l_seed * 1000 + l_time.milli_second
      create random_sequence.set_seed (l_seed)
    end

ArchivesSize
random.zip1.87 KB
Contents
random.ecf1.09 KB
random_numbers.e953 bytes
readme.txt422 bytes
root_class.e808 bytes

Comments

Command-Query Separation principle

colin-adams's picture

Routines `random_integer` and `two_dice' breaks CQS because they return a result and have a side effect.

The way you write them hides the side effect, but if you put an only clause in the postconditions you would see that this is the case.

This leasson should be removed.

Colin Adams

I discuss the CQS issue in the article.

mtn's picture

And I have written code which has performance problems and is unreadable if you apply CQS. For sure these cases are rare, but you should never follow a principle blindly. You might have to make trade offs because if you apply CQS rigorously in cases where the memory footprint is important you might decide otherwise to save space. What do others think? Violating the CQS principle is not really necessary for the purpose of this example. -- mTn-_-|

Command-Query Separation principle

peter_gummer's picture

We all agree that the `random_integer' function violates CQS. To my mind, it does so unnecessarily. As an old-time C programmer, its use in the lesson feels ok to me; but reason and experience tell me it's a bad thing to do, so I would not do it. We can rewrite the `two_dice' function to adhere to CQS, eliminating the side-effect confusion, while keeping it reasonably concise:

	two_dice: TUPLE [INTEGER, INTEGER]
			-- A new tuple containing two unrelated random numbers between 1 and 6.
		do
			create Result
			random_sequence.forth
			Result [1] := random_sequence.item \\ 6 + 1
			random_sequence.forth
			Result [2] := random_sequence.item \\ 6 + 1
		end

There's a bit of code duplication here, unfortunately. We can get rid of the duplication with a really funky in-line agent:

	two_dice: TUPLE [INTEGER, INTEGER]
			-- A new tuple containing two unrelated random numbers between 1 and 6.
		do
			create Result
 
			(1 |..| 2).do_all (agent (dice: like two_dice; index: INTEGER)
				do
					random_sequence.forth
					dice [index] := random_sequence.item \\ 6 + 1
				end (Result?))
		end

Hmmm, the cure might be worse than the code duplication it was intended to fix! But note that the agent is a command, so unlike the lesson's `random_integer' function it doesn't violate CQS. We could extract the inline agent to a separate routine:

	two_dice: TUPLE [INTEGER, INTEGER]
			-- A new tuple containing two unrelated random numbers between 1 and 6.
		do
			create Result
			add_die (Result, 1)
			add_die (Result, 2)
		end
 
	add_die (dice: like two_dice; index: INTEGER)
			-- Add a random number between 1 and 6 to `dice' at `index'.
		do
			random_sequence.forth
			dice [index] := random_sequence.item \\ 6 + 1
		end

This is nicer.

But the `two_dice' function still has a side-effect on `random_sequence'. Is this a violation of CQS? It depends on whether `random_sequence' is part of the abstract state of this class. The lesson places it in the implementation, so I think `two_dice' is ok: it's a concrete side-effect, not an abstract one, so it doesn't violate CQS.

Shaking the dice

colin-adams's picture

Peter, you are right to say that the cure is worse than the problem. That is always the case with an inline agent, without exception (at least, I've never seen one yet).

As you say, the side effect is concrete, not abstract, so it DOES violate CQS.

It also does not model the real-world situation accurately.

In the real world, to get a score from two six-sided dice, we place the two dice in a cup, and shake them. That's a lot of side effects to get a number, so it does not seem appropriate to try to model this with a mathmatical function.

random_number_generator: RANGED_RANDOM
  -- Random number generator
 
cup: DS_CELL [INTEGER]
  -- Cup for shaking two dice
 
shake_two_dice (a_cup: DS_CELL [INTEGER]; a_generator: RANGED_RANDOM) is
  -- Roll two dice making the result available in `cup'
 require
  a_cup_not_void: a_cup /= Void
  a_cup_empty: a_cup.item = Void
  random_number_generator_not_void: a_generator /= Void
 local
   l_first, l_second: INTEGER
 do
  l_first := random_sequence.next_item_in_range (1, 6)
  l_second := random_sequence.next_item_in_range (1, 6)
  a_cup.put (l_first + l_second)
 ensure
  dice_roll_result_in_cup: a_cup.item /= Void
 end
 
 print_dice_roll is
   -- Roll two six-sided dice and print result.
  do
   create cup.make (Void)
   create random_number_generator.make_default
   shake_two_dice (cup, random_number_generator)
   print (cup.item.out + "%N")
  end
Colin Adams

Command-Query Separation principle

peter_gummer's picture

Colin, you wrote, "the side effect is concrete, not abstract, so it DOES violate CQS." The definition of the Command-Query Separation principle on page 751 of OOSC2 states, "Functions should not produce abstract side effects." If we agree that the side effect in `two_dice' is concrete, not abstract, then it doesn't violate OOSC2's definition of CQS.

Concrete and abstract

colin-adams's picture

Sorry, I forgot the OOSC2 terminiology. In OOSC2 terms, the side effect is both concrete and abstract. Note that in OOSC2 terminiology, all abstract side effects are also concrete, by definition.

This one is abstract too. That it is, you can observe the difference by calling any of the functions a second time - you will not see the same result, except by lucky coincidence. Colin Adams

Concrete and abstract

peter_gummer's picture

I don't see any abstract side effect, Colin. See OOSC2, page 752, under the heading "Functions that create objects". The `two_dice' function creates, initialises and returns a new TUPLE object. Returning a new object is not a side-effect, concrete or abstract: see "Definition: concrete side effect" on page 749 of OOSC2. The `two_dice' function does not attach the new TUPLE to an attribute, it simply returns it.

Or is this being too sneaky? The fact that two_dice.is_equal (two_dice) will return false (well, it will usually return false) seems to violate referential transparency. (See OOSC2 page 750.)

Personally I have no qualms about `two_dice'. Given the random nature of `two_dice', a perfectly reasonable postcondition for it might be:

		ensure
			no_referential_transparency_because_it_is_random: (<<False, True>>).has (Result.is_equal (two_dice))

Changing state of Current

colin-adams's picture

Calling two_dice changes the observable state of Current - proof - if you call it again, then you (usually) get a different value. Colin Adams

Abstract Side Effect

maverick's picture

It seems a little vague to speak of abstract side effect without specifying a model. In this case, I think there is a model and a justification for each positions. In OOSC, Meyer introduce the random number generator as a traversable object. In this case, the corresponding model is composed of a sequence of numbers (about which the invariant may state that the numbers have certain probabilistic distribution) and an integer indicating what is the next number to be returned to the user. In this case, generating a random number is a side effect and, according to CQS principle, should be implemented as a command.

But another possible model may be only a set of possible numbers or a range of numbers. In this case, the specification of a routine that generate a random numbers may be non deterministic without ever changing the abstract state of an object.

An operation which returns a non-deterministic result is a common thing in B but one could argue that B does not have CQS and does not have a need for it since no operation invocation can be performed in expressions. Back to Eiffel, it certainly is arguable whether non deterministic specification is the right thing for a query but it certainly is legal even according to CQS.

Cheers!

Simon

RANGED_RANDOM

juliant's picture

It's interesting that you use a helper class RANGED_RANDOM with a feature 'next_item_in_range' which clearly violates the CQS. That's exactly what Martin meant when he said that sometimes it is easier to write/use a helper function which does not bother about CQS...

Ah yes, you are right

colin-adams's picture

I didn't notice, which is odd. But that's exactly the problem - if you write classes that violate CQS, then the clients of those classes may not see that they are changing state. It is very often easier to violate CQS - that does not make it right to do so.

So my example needs changing to use RANDOM and to do the multiplication. Colin Adams

It's not really a problem

juliant's picture

It's not really a problem that you did not notice since you knew that the state changed because the feature was named 'next_item' which implies that it is not the current item in the sequence but the next. Although the CQS principle was violated, you could use the feature wihtout any problems. The same goes with 'random_integer' which actually states that the call to this query is random, but maybe it would better be named 'next_random_integer' to be even more obvious.

I'm not against CQS, but I think it can be justified to ignore it if the code gets more readable or if there are other reasons like performance or interfacing with low level devices.

Not for beginners (or experts)

colin-adams's picture

Neither performance nor interfacing with low level devices occurs here. In any case, we shouldn't be teaching beginners to ignore CQS. This article needs removing (then re-written from scratch, perhaps). Colin Adams

What is the question?

The main principle of the CQS is: Asking a question should not change the answer. But: we're talking about a random number generator! If I ask for a random number, I WANT a different answer every time I ask.

Let's look at the whole thing a little differently: Is it wrong to write a class that gives me a random number every time I ask for one? Certainly not. So my class looks as follows:

deferred
  MY_RANDOM
 
feature
  new_random: DOUBLE is
      -- returns a random number every time you ask for one
    deferred
    end

Now I can have different implementations for MY_RANDOM. One might use the current time to give me a random, one might do something else. And: one might even use RANDOM to implement new_random. Of course, by doing that, it violates CQS (like in the article). So what? The point is that nobody has access to the implementation. That's the same in the article: Nobody has access to random_sequence. So should I not be able to use RANDOM to implement MY_RANDOM? I think it's wrong to tell me I shouldn't be able to use RANDOM, just because it's violating CQS. And I see no difference between implementing MY_RANDOM using RANDOM and the feature described in the article.

Another reason I like the approach: With MY_RANDOM, you can easily call a feature which takes four random values:

r: MY_RANDOM
...
initialize (r.new_random, r.new_random, r.new_random, r.new_random)

How would this feature be called using RANDOM without violating CQS? One could store four randoms in four local variables, or store them in an array using a loop for example. Certainly, that's possible. But readability would suffer extremely from that IMO. And how is one going to change his code to use another random number generator, one that doesn't use a sequence to produce its random numbers? One would have to change code in lots of places, instead of just one. When I want a random number, I don't care if the random number is generated by a sequence or not, I just want a random number. That's exactly what the feature in the article or the class MY_RANDOM provides me.

So, in my opinion, the article does not needed to be removed. Together with the comments, readers will have a good sense of what the problems with the approach might be, and in which cases they should not follow CQS blindly.

I made small modifications

mtn's picture

As an outcome of this discussion I slightly adapted the article: `random_integer' is now called `new_random' and `two_dice' is renamed into `two_new_dice'. All the other content remained unchanged since the first publication.

After all I think there is a reason why it is called the CQS principle and not the CQS law. I know that some of you feel like this is not enough, but the comment sequence will stay online and the reader can make his own picture.

Thanks for all the comments!

-- mTn-_-|

Selective quoting of OOSC2

colin-adams's picture

I am somewhat bemused by the apparent appeal to OOSC2 (as if it were holy scripture) in order to support the case for making random numbers an exception to the CQS principal. This is highly selective quoting.

For a start, on p. 758, Professor Meyer says: "every object-oriented developer should apply this principle without exception".

Even more bemusing is that back on page 754-755, he explicitly gives the example of random numbers.

A principle that must be followed without exception is, in effect, a law.

Colin Adams

My summary

Together with my comment "What is the question?", I think the whole discussion comes down to the question if you agree with the following statement or not:

"A random number does not mean much by itself; it must be understood in relation to its predecessors and successors in the sequence" (Meyer, OOSC2, p. 755).

If you agree with the statement, then you should apply CQS also for random numbers like it is described in OOSC2, using

random: RANDOM
...
random.forth
l_item := random.item

If you disagree, then you should be able to write a feature next_random. And as soon as you have that, it doesn't matter how you implement it (therefore you can also implement it using

RANDOM<e> or change your implementation later).
 
I for my part disagree with Meyer's statement, that's why I'm using <e>next_random
. If you agree with the statement, you shouldn't use the feature.

Quote from Colin: A principle that must be followed without exception is, in effect, a law I agree, but the quote you gave says "every object-oriented developer SHOULD apply this principle without exception", not "every object-oriented developer MUST apply this principle without exception" :)

Re: My summary

"A random number does not mean much by itself; it must be understood in relation to its predecessors and successors in the sequence" (Meyer, OOSC2, p. 755). Well, that's my laugh for the day.

Of course a random number doesn't mean anything, that's pretty much the definition of a random number, but it doesn't mean any more in relation to its predecessors or successors. There are no absolute (non-probabilistic) rules governing what a sequence of random numbers will be like. If there were any such rules, then the sequence in question would not be random. And any probabilistic 'rules' that we might apply would actually be in an attempt to check that the sequence was not meaningful, or adhering to any rules in any obvious way (and such 'rules' would be somewhat arbitrary, in that they would merely be checking for the absence of any particular patterns that we might consider obvious). Meyer is absolutely 100% wrong to apply CQS to random numbers in the way he does, and it makes CQS look bad.

Syndicate content