eiffelroom

articleRandom numbers

mtn's picture

People who are used to other programming languages usually use some kind of rand, rnd or similar function to generate random numbers. This article shows how you can do it in Eiffel.

Generating a random number

The base library's RANDOM class is modeled as a sequence of random numbers. This gives you the well known interface with features such as start, forth and item which are probably also the most useful for a sequence. The following method shows how one uses this class.
new_random: INTEGER
    -- Random integer
    -- Each call returns another random number.
  do
    random_sequence.forth
    Result := random_sequence.item
  end

feature {NONE} -- Implementation

  random_sequence: RANDOM
    -- Random sequence

To get a new random number one first moves the cursor of the list to the next number by invoking forth. And then calls item to actually read a number. This is a well known pattern in Eiffel and is named the command query separation. It takes some time to get used to, but once learned, its usage through Eiffel is pretty consistent. Now the command query separation falls somewhat short when mathematical expressions rely on several random numbers. But that is not a big problem, because as can be seen above it's very easy do write a wrapper function. random_integer returns a new random number each time when it is called.

two_new_dice: TUPLE [INTEGER, INTEGER]
        -- Returns a tuple containing two unrelated random numbers between 1 and 6.
    do
        Result := [new_random \\ 6 + 1, new_random \\ 6 + 1]
    end

Here we can also see why command query separation is a good thing: You could think that random_integer does not change but that it is simply a variable containing the same value. This can in other situations be even more misleading. That is why you should whenever possible apply the command query principle so that you can rely on the fact that no state changes are done by a query.

Initializing the random generator

Very few computers support true random number generators. Usually one uses a mathematical function which produces (hopefully) uncorrelated data. Some real randomness is then introduced by connecting the so called "seed" to a hopefully random event. The seed usually an integer number which is used to initialize the algorithm. In the RANDOM class "set_seed" is available as a creation feature. The question is now, what should the seed be? If you want each time the same random sequence yielding in the exact same behavior for each program execution you can initialize with an arbitrary fixed number, like 73 for example. If you like to have different random number sequences each time one usually does this by taking some kind of time measurement as the seed. This assumes that the program start or better the creation of the time object is actually a more or less random event.

This can be done in the following way

local
      l_time: TIME
      l_seed: INTEGER
    do
         -- This computes milliseconds since midnight.
         -- Milliseconds since 1970 would be even better.
      create l_time.make_now
      l_seed := l_time.hour
      l_seed := l_seed * 60 + l_time.minute
      l_seed := l_seed * 60 + l_time.second
      l_seed := l_seed * 1000 + l_time.milli_second
      create random_sequence.set_seed (l_seed)
    end

ArchivesSize
random.zip1.87 KB
Contents
random.ecf1.09 KB
random_numbers.e953 bytes
readme.txt422 bytes
root_class.e808 bytes
colin-adams's picture

Command-Query Separation principle

Routines `random_integer` and `two_dice' breaks CQS because they return a result and have a side effect.

The way you write them hides the side effect, but if you put an only clause in the postconditions you would see that this is the case.

This leasson should be removed.

Colin Adams

mtn's picture

I discuss the CQS issue in the article.

And I have written code which has performance problems and is unreadable if you apply CQS. For sure these cases are rare, but you should never follow a principle blindly. You might have to make trade offs because if you apply CQS rigorously in cases where the memory footprint is important you might decide otherwise to save space. What do others think? Violating the CQS principle is not really necessary for the purpose of this example. -- mTn-_-|

peter_gummer's picture

Command-Query Separation principle

We all agree that the `random_integer' function violates CQS. To my mind, it does so unnecessarily. As an old-time C programmer, its use in the lesson feels ok to me; but reason and experience tell me it's a bad thing to do, so I would not do it. We can rewrite the `two_dice' function to adhere to CQS, eliminating the side-effect confusion, while keeping it reasonably concise:

two_dice: TUPLE [INTEGER, INTEGER]
            -- A new tuple containing two unrelated random numbers between 1 and 6.
        do
            create Result
            random_sequence.forth
            Result [1] := random_sequence.item \\ 6 + 1
            random_sequence.forth
            Result [2] := random_sequence.item \\ 6 + 1
        end

There's a bit of code duplication here, unfortunately. We can get rid of the duplication with a really funky in-line agent:

two_dice: TUPLE [INTEGER, INTEGER]
            -- A new tuple containing two unrelated random numbers between 1 and 6.
        do
            create Result

            (1 |..| 2).do_all (agent (dice: like two_dice; index: INTEGER)
                do
                    random_sequence.forth
                    dice [index] := random_sequence.item \\ 6 + 1
                end (Result, ?))
        end

Hmmm, the cure might be worse than the code duplication it was intended to fix! But note that the agent is a command, so unlike the lesson's `random_integer' function it doesn't violate CQS. We could extract the inline agent to a separate routine:

two_dice: TUPLE [INTEGER, INTEGER]
            -- A new tuple containing two unrelated random numbers between 1 and 6.
        do
            create Result
            add_die (Result, 1)
            add_die (Result, 2)
        end

    add_die (dice: like two_dice; index: INTEGER)
            -- Add a random number between 1 and 6 to `dice' at `index'.
        do
            random_sequence.forth
            dice [index] := random_sequence.item \\ 6 + 1
        end

This is nicer.

But the `two_dice' function still has a side-effect on `random_sequence'. Is this a violation of CQS? It depends on whether `random_sequence' is part of the abstract state of this class. The lesson places it in the implementation, so I think `two_dice' is ok: it's a concrete side-effect, not an abstract one, so it doesn't violate CQS.

colin-adams's picture

Shaking the dice

Peter, you are right to say that the cure is worse than the problem. That is always the case with an inline agent, without exception (at least, I've never seen one yet).

As you say, the side effect is concrete, not abstract, so it DOES violate CQS.

It also does not model the real-world situation accurately.

In the real world, to get a score from two six-sided dice, we place the two dice in a cup, and shake them. That's a lot of side effects to get a number, so it does not seem appropriate to try to model this with a mathmatical function.

random_number_generator: RANGED_RANDOM
  -- Random number generator

cup: DS_CELL [INTEGER]
  -- Cup for shaking two dice

shake_two_dice (a_cup: DS_CELL [INTEGER]; a_generator: RANGED_RANDOM) is
  -- Roll two dice making the result available in `cup'
 require
  a_cup_not_void: a_cup /= Void
  a_cup_empty: a_cup.item = Void
  random_number_generator_not_void: a_generator /= Void
 local
   l_first, l_second: INTEGER
 do
  l_first := random_sequence.next_item_in_range (1, 6)
  l_second := random_sequence.next_item_in_range (1, 6)
  a_cup.put (l_first + l_second)
 ensure
  dice_roll_result_in_cup: a_cup.item /= Void
 end

 print_dice_roll is
   -- Roll two six-sided dice and print result.
  do
   create cup.make (Void)
   create random_number_generator.make_default
   shake_two_dice (cup, random_number_generator)
   print (cup.item.out + "%N")
  end
Colin Adams

peter_gummer's picture

Command-Query Separation principle

Colin, you wrote, "the side effect is concrete, not abstract, so it DOES violate CQS." The definition of the Command-Query Separation principle on page 751 of OOSC2 states, "Functions should not produce abstract side effects." If we agree that the side effect in `two_dice' is concrete, not abstract, then it doesn't violate OOSC2's definition of CQS.

colin-adams's picture

Concrete and abstract

Sorry, I forgot the OOSC2 terminiology. In OOSC2 terms, the side effect is both concrete and abstract. Note that in OOSC2 terminiology, all abstract side effects are also concrete, by definition.

This one is abstract too. That it is, you can observe the difference by calling any of the functions a second time - you will not see the same result, except by lucky coincidence. Colin Adams

peter_gummer's picture

Concrete and abstract

I don't see any abstract side effect, Colin. See OOSC2, page 752, under the heading "Functions that create objects". The `two_dice' function creates, initialises and returns a new TUPLE object. Returning a new object is not a side-effect, concrete or abstract: see "Definition: concrete side effect" on page 749 of OOSC2. The `two_dice' function does not attach the new TUPLE to an attribute, it simply returns it.

Or is this being too sneaky? The fact that two_dice.is_equal (two_dice) will return false (well, it will usually return false) seems to violate referential transparency. (See OOSC2 page 750.)

Personally I have no qualms about `two_dice'. Given the random nature of `two_dice', a perfectly reasonable postcondition for it might be:

ensure
            no_referential_transparency_because_it_is_random: (<<False, True>>).has (Result.is_equal (two_dice))

colin-adams's picture

Changing state of Current

Calling two_dice changes the observable state of Current - proof - if you call it again, then you (usually) get a different value. Colin Adams

maverick's picture

Abstract Side Effect

It seems a little vague to speak of abstract side effect without specifying a model. In this case, I think there is a model and a justification for each positions. In OOSC, Meyer introduce the random number generator as a traversable object. In this case, the corresponding model is composed of a sequence of numbers (about which the invariant may state that the numbers have certain probabilistic distribution) and an integer indicating what is the next number to be returned to the user. In this case, generating a random number is a side effect and, according to CQS principle, should be implemented as a command.

But another possible model may be only a set of possible numbers or a range of numbers. In this case, the specification of a routine that generate a random numbers may be non deterministic without ever changing the abstract state of an object.

An operation which returns a non-deterministic result is a common thing in B but one could argue that B does not have CQS and does not have a need for it since no operation invocation can be performed in expressions. Back to Eiffel, it certainly is arguable whether non deterministic specification is the right thing for a query but it certainly is legal even according to CQS.

Cheers!

Simon

juliant's picture

RANGED_RANDOM

It's interesting that you use a helper class RANGED_RANDOM with a feature 'next_item_in_range' which clearly violates the CQS. That's exactly what Martin meant when he said that sometimes it is easier to write/use a helper function which does not bother about CQS...

colin-adams's picture

Ah yes, you are right

I didn't notice, which is odd. But that's exactly the problem - if you write classes that violate CQS, then the clients of those classes may not see that they are changing state. It is very often easier to violate CQS - that does not make it right to do so.

So my example needs changing to use RANDOM and to do the multiplication. Colin Adams

juliant's picture

It's not really a problem

It's not really a problem that you did not notice since you knew that the state changed because the feature was named 'next_item' which implies that it is not the current item in the sequence but the next. Although the CQS principle was violated, you could use the feature wihtout any problems. The same goes with 'random_integer' which actually states that the call to this query is random, but maybe it would better be named 'next_random_integer' to be even more obvious.

I'm not against CQS, but I think it can be justified to ignore it if the code gets more readable or if there are other reasons like performance or interfacing with low level devices.

colin-adams's picture

Not for beginners (or experts)

Neither performance nor interfacing with low level devices occurs here. In any case, we shouldn't be teaching beginners to ignore CQS. This article needs removing (then re-written from scratch, perhaps). Colin Adams

What is the question?

The main principle of the CQS is: Asking a question should not change the answer. But: we're talking about a random number generator! If I ask for a random number, I WANT a different answer every time I ask.

Let's look at the whole thing a little differently: Is it wrong to write a class that gives me a random number every time I ask for one? Certainly not. So my class looks as follows:

deferred
  MY_RANDOM

feature
  new_random: DOUBLE is
      -- returns a random number every time you ask for one
    deferred
    end

Now I can have different implementations for MY_RANDOM. One might use the current time to give me a random, one might do something else. And: one might even use RANDOM to implement new_random. Of course, by doing that, it violates CQS (like in the article). So what? The point is that nobody has access to the implementation. That's the same in the article: Nobody has access to random_sequence.
So should I not be able to use RANDOM to implement MY_RANDOM? I think it's wrong to tell me I shouldn't be able to use RANDOM, just because it's violating CQS. And I see no difference between implementing MY_RANDOM using RANDOM and the feature described in the article.

Another reason I like the approach: With MY_RANDOM, you can easily call a feature which takes four random values:

r: MY_RANDOM
...
initialize (r.new_random, r.new_random, r.new_random, r.new_random)

How would this feature be called using RANDOM without violating CQS? One could store four randoms in four local variables, or store them in an array using a loop for example. Certainly, that's possible. But readability would suffer extremely from that IMO. And how is one going to change his code to use another random number generator, one that doesn't use a sequence to produce its random numbers? One would have to change code in lots of places, instead of just one. When I want a random number, I don't care if the random number is generated by a sequence or not, I just want a random number. That's exactly what the feature in the article or the class MY_RANDOM provides me.

So, in my opinion, the article does not needed to be removed. Together with the comments, readers will have a good sense of what the problems with the approach might be, and in which cases they should not follow CQS blindly.

mtn's picture

I made small modifications

As an outcome of this discussion I slightly adapted the article: `random_integer' is now called `new_random' and `two_dice' is renamed into `two_new_dice'. All the other content remained unchanged since the first publication.

After all I think there is a reason why it is called the CQS principle and not the CQS law. I know that some of you feel like this is not enough, but the comment sequence will stay online and the reader can make his own picture.

Thanks for all the comments!

-- mTn-_-|

colin-adams's picture

Selective quoting of OOSC2

I am somewhat bemused by the apparent appeal to OOSC2 (as if it were holy scripture) in order to support the case for making random numbers an exception to the CQS principal. This is highly selective quoting.

For a start, on p. 758, Professor Meyer says: "every object-oriented developer should apply this principle without exception".

Even more bemusing is that back on page 754-755, he explicitly gives the example of random numbers.

A principle that must be followed without exception is, in effect, a law.

Colin Adams

My summary

Together with my comment "What is the question?", I think the whole discussion comes down to the question if you agree with the following statement or not:

"A random number does not mean much by itself; it must be understood in relation to its predecessors and successors in the sequence" (Meyer, OOSC2, p. 755).

If you agree with the statement, then you should apply CQS also for random numbers like it is described in OOSC2, using

random: RANDOM
...
random.forth
l_item := random.item

If you disagree, then you should be able to write a feature next_random. And as soon as you have that, it doesn't matter how you implement it (therefore you can also implement it using RANDOM or change your implementation later). I for my part disagree with Meyer's statement, that's why I'm using next_random. If you agree with the statement, you shouldn't use the feature.

Quote from Colin:
A principle that must be followed without exception is, in effect, a law
I agree, but the quote you gave says "every object-oriented developer SHOULD apply this principle without exception", not "every object-oriented developer MUST apply this principle without exception" :)

about - contact