|

Replication schemes

Posted on Sunday, 16 January 2011 at 16:48.

A new package ReplicateEffects is available, exposing module Control.Replicate. The source code is available on GitHub. In this post I will explain what it does and how to use it.

Module Control.Applicative not only defines the Applicative and Alternative type classes, it also offers some useful combinators to express how often an action should be run: many takes an action and runs it zero or more times, collecting the results in a list. Function some does the same but performs the action at least once. Finally, there is optional which performs its argument action zero or one time, returning a Maybe value.

Module Control.Replicate separates such replication schemes from the actual action. It, too, defines a many, some, and opt, but to actually run an action x that many times, we give the scheme and the action in question to the run operator *! (read: times) like so: many *! x, some *! x, opt *! x.

Why is this useful? Well, it turns out that these replication schemes are highly composable, in standard ways. They themselves are instances of Applicative, Category and Alternative. With these combinators, we can sum them, multiply them, and indicate choice, respectively. Let’s look at some examples.

Atomic building blocks

The primitive, atomic building blocks are zero and one. If we pass these to *!, the action is not run at all, or run exactly once. Their types are:

zero :: b -> Replicate a b
one  :: Replicate a a
(*!) :: Alternative f => Replicate a b -> f a -> f b

The schemes are represented by type constructor Replicate. If we look at the type of *!, we can see that Replicate‘s first type parameter indicates the result type of the action, while its second type parameter indicates the type of the result after running that action so many times. In the case of one, the two arguments are identical. In the case of zero, the scheme will not run the action at all but it still needs to produce a b. This is why zero takes an argument of type b.

Summing schemes

Schemes are Applicative. We can create two and three, the schemes that run an action exactly two and three times, using one as building block:

two :: Replicate a (a, a)
two = (,) <$> one <*> one

three :: Replicate a (a, a, a)
three = (,,) <$> one <*> one <*> one

Look at their result types: the tuples indicate precisely how many times the action is run. You can read <*> as plus: 2 = 1 + 1, 3 = 1 + 1 + 1.

Of course pure is also defined for schemes. The identity element of addition is 0, and this is exactly what pure means. It is a synonym for the zero we saw earlier.

Multiplying schemes

Schemes also form a Category. We can use it to multiply them. Here are two examples:

twiceThree :: Replicate a ((a, a, a), (a, a, a))
twiceThree = two . three

thriceTwo :: Replicate a ((a, a), (a, a), (a, a))
thriceTwo = three . two

In both cases an action is run six times, but their results are nested differently. We will see another multiplication example in a moment.

The identity element for multiplication is 1. Scheme one exactly matches the type of function id in the Category type class.

Choice

Until now the examples have only seen schemes for running an action exactly so many times. But schemes are Alternative and can encode multiple frequencies. This is how opt, the scheme that runs an action zero or one times, is defined:

opt :: Replicate a (Maybe a)
opt = zero Nothing <|> Just <$> one

Schemes many and some also use choice:

many :: Replicate a [a]
many = zero [] <|> some

some :: Replicate a [a]
some = (:) <$> one <*> many

We now have many ways to combine replication schemes, and if we use choice together with sums or products, it’s not always immediately clear what the resulting scheme means. That’s why the module also exposes a function sizes which lists the frequencies a scheme allows:

> sizes one
[1]
> sizes two
[2]
> sizes opt
[0,1]
> take 10 (sizes many)
[0,1,2,3,4,5,6,7,8,9]
> take 10 (sizes some)
[1,2,3,4,5,6,7,8,9,10]

In this sense, the schemes encode sets of Peano literals, and <|> computes the union of two sets.

Now it’s also clear what the empty scheme is: the scheme that doesn’t allow an action to occur with any frequency; not even zero times.

> sizes empty
[]

As promised, another example that uses multiplication:

even :: Replicate a [(a, a)]
even = many . two

> take 10 (sizes even)
[0,2,4,6,8,10,12,14,16,18]

This scheme allows all even occurrences of an action, and its type reflects that exactly: there is no way to capture an odd number of as in [(a, a)].

Dice

Another combinator available in the module is between :: Int -> Int -> Replicate a [a], which limits the frequency of an action to a lower and upper bound:

> sizes (between 5 10)
[5,6,7,8,9,10]

What frequencies does (,) <$> between 3 5 <*> two allow? Let’s check:

> sizes ((,) <$> between 3 5 <*> two)
[5,6,7]

This makes sense: if we run an action 3, 4 or 5 times and then another two times, we’ve run it 5, 6 or 7 times.

What does between 7 9 . three mean? What about three . between 7 9?

> sizes (between 7 9 . three)
[21,24,27]
> sizes (three . between 7 9)
[21,22,23,24,25,26,27]

If the schemes become a bit more involved, it can be helpful to think about them as dice throws. Then between 7 9 . three means: throw a die with 7, 8 and 9 eyes on it, and use the outcome to decide how many times to throw a die with exactly 3 eyes. This has possible outcomes [9,12,15].

In the other case, we throw die between 7 9 three times, ending up with the full range 21-27 as possible outcomes.

Open questions

Currently if you try to evaluate sizes (many . opt), the program hangs. This is true not for just opt but for any scheme that allows frequency zero. Is there a bug in the definitions of the combinators, or is it unreasonable to expect the library to produce output in this case?

Another problem is that sizes (id . r) takes longer than just r for no apparent good reason. (Try r = exactly 1000.) Perhaps some profiling will show what the problem here is.

Comments

By Twan van Laarhoven on Monday, 17 January 2011 at 00:02:

It seems that `sizes (exactly n)` already needs time quadratic in n. I think this is caused by the call to fmap, there will be n calls to fmap, each of which runs over the entire list of n things. Other things exhibit even worse scaling, such as `sizes (some . some)`

It might be better to do fmap lazily somehow, for example by storing in `Cons` a function that will be applied to the final result. Something like

data Replicate a b = Nil | forall c. Cons (c -> b) (Maybe c) (Replicate a (a -> c))

I have not yet tried whether this will work.

Also note that a value of type `Replicate a b` is equivalent to a subset of the naturals, plus for each element of this set a function of that many as to b. Your type already tries to capture this, but maybe the function and the subset could be stored separately?

By Dean Herington on Wednesday, 02 February 2011 at 20:13:

The following, from your blog entry:

> sizes (between 7 9 . three)
[9,12,15]

didn’t seem right. Sure enough, with your package:

> sizes (between 7 9 Control.Category.. three)
[21,24,27]

By Martijn on Monday, 23 May 2011 at 12:34:

Thanks, Dean! Fixed it.

By Martijn on Monday, 23 May 2011 at 12:35:

Thank you Twan, that does seem to work. :-)

Leave a comment!

Martijn loves to receive comments! Add yours by filling out the fields below.