Hello everyone. I am Jan, and this talk will be about the package called fipp that we recently released. It has to do with Bayesian mixture modeling in clustering applications. So, if you are not into Bayesian modeling or clustering, this might not be the talk for you.

Now, Bayesian mixture modeling usually involves three steps.

The first step is the probabilistic generation of random partitions, which can come from a probability distribution or from a stochastic process and its discretization.

The second step is the allocation of observations into these blocks.

And the third step is the learning of the parameters. Here, the parameters can be those that are specific to each cluster, as well as those that are common across clusters and guide the probabilistic generation of random partitions in step 1.
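To make these three steps concrete, here is a small generative Python sketch (purely illustrative, not the fipp package's code) that uses the Chinese restaurant process, the partition distribution of the Dirichlet process mixture, as the random-partition prior; the Gaussian cluster means and all function names are my own choices:

```python
import random
random.seed(1)

def crp_partition(n, alpha):
    """Step 1: generate a random partition of n observations via the CRP."""
    blocks = []                        # each block is a list of observation indices
    for i in range(n):
        # an observation joins an existing block with weight = block size,
        # or opens a new block with weight = alpha
        weights = [len(b) for b in blocks] + [alpha]
        choice = random.choices(range(len(blocks) + 1), weights=weights)[0]
        if choice == len(blocks):
            blocks.append([i])         # open a new block
        else:
            blocks[choice].append(i)   # Step 2: allocate the observation to a block
    return blocks

blocks = crp_partition(6, alpha=1.0)
# Step 3 concerns learning the parameters; in the generative direction this
# corresponds to drawing one cluster-specific parameter per block:
means = [random.gauss(0, 5) for _ in blocks]
print(blocks, means)
```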

And each partition has a unique Young diagram representation. So, this partition corresponds to this Young diagram, and the way to read it is that each row corresponds to a block, and the number of boxes within each row corresponds to the number of observations within that block. That's why this and this are the same thing.

And a prior on partitions essentially takes such a Young diagram and returns a probability, for all possible Young diagrams of sample size N (in this case N = 6).
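These Young diagrams can be enumerated directly; a minimal Python sketch (just to illustrate the combinatorics, not part of the package) confirms that there are 11 of them for N = 6:

```python
# Enumerate all Young diagrams (integer partitions) of n as non-increasing
# tuples of row lengths; a prior on partitions assigns a probability to each.
def young_diagrams(n, largest=None):
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in young_diagrams(n - first, first):
            yield (first,) + rest

diagrams = list(young_diagrams(6))
print(len(diagrams))   # 11 possible Young diagrams for N = 6
print(diagrams)        # (6,), (5, 1), ..., (1, 1, 1, 1, 1, 1)
```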

Now, let's say that we know a priori that observations Y1 and Y2, and Y3 and Y4, should likely belong to the same cluster.

And then, let's say the MAP, the maximum a posteriori allocation, looks like this. Did we learn this from the data? Well, it comes from the posterior, so we did learn something from the data.

But the role played by the data may not be that large in some cases. This chosen partition has a Young diagram representation as follows, and the other potential partitions are this, this, and this. The reason is that Y1 and Y2, and Y3 and Y4, have to be in the same blocks, so you get these squares.

Now, depending on the model and the hyperparameters you choose, it could be that you are assigning a much higher prior probability to this partition than to these. That would mean that, a priori, you are placing a very high weight on this one compared to these.

Then, even if you try various allocations, even a posteriori your likelihood might be overwhelmed by your prior, and you simply end up giving too high a posterior probability to this partition.

What you are doing then is comparing this allocation to this allocation, with all the others staying the same, and picking this allocation as the maximum a posteriori estimate. So, you are not really using the data that much; you are only using the data to compare this to this. You don't want to end up in a situation like this.

In order to prevent this from happening, you can generate all possible Young diagrams and compute their prior probabilities. This is feasible if you only have six observations or so, because there are only 11 possible Young diagrams. But if you have, say, 100 observations, which shouldn't be that unusual, then there would be this many Young diagrams, so we can't take this approach.
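To see how quickly enumeration breaks down, the number of Young diagrams of n is the partition function p(n), which a short dynamic program computes (again just an illustration in Python, not the package's code):

```python
# Count Young diagrams of n via the partition function p(n), using a simple
# dynamic program that adds one allowed part size at a time.
def partition_count(n):
    counts = [1] + [0] * n          # counts[m] = number of partitions of m so far
    for part in range(1, n + 1):
        for m in range(part, n + 1):
            counts[m] += counts[m - part]
    return counts[n]

print(partition_count(6))    # 11: enumerable by hand
print(partition_count(100))  # 190569292: far too many to enumerate
```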

And this is where the fipp package comes in handy, because we take an alternative approach: we compute a symmetric and additive functional over all possible prior partitions.

One functional that we find particularly useful is the relative entropy, which quantifies the evenness of the partition sizes. Partitions that are even get the value 1, which is the maximum, and partitions that are uneven get values closer to 0, like this and this.
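Concretely, the relative entropy of a partition can be read as the Shannon entropy of the block-size proportions, normalised by log K so that a perfectly even partition scores 1. This Python sketch (my reading of the functional, not the package's code) evaluates it for the three Kplus = 3 diagrams of N = 6:

```python
import math

# Relative entropy of a partition's block sizes: Shannon entropy of the
# proportions n_j / N, normalised by log(K) so the maximum is 1.
def relative_entropy(blocks):
    n, k = sum(blocks), len(blocks)
    if k == 1:
        return 0.0
    h = -sum((b / n) * math.log(b / n) for b in blocks)
    return h / math.log(k)

for shape in [(2, 2, 2), (3, 2, 1), (4, 1, 1)]:
    print(shape, round(relative_entropy(shape), 2))
# (2, 2, 2) -> 1.0, (3, 2, 1) -> 0.92, (4, 1, 1) -> 0.79
```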

And this is how the fipp package can be used. N is the sample size; here we plug in 6, so we don't really need the fipp package for this. Kplus is the constraint that we have put on: it is the length of the partition, which corresponds to the number of rows of the Young diagram. Here we fix it to 3, which means that we are restricting attention to these three partitions.

Then we also supply the type, which is the model. Here we supply the Dirichlet process mixture, and alpha is the concentration parameter of this model, which we set to 1. The mean we then get is 0.87.
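This 0.87 can be re-derived by hand for such a small case. Under the DPM, the exchangeable partition probability function gives each set partition with block sizes (n_1, ..., n_K) prior weight proportional to alpha^K times the product of (n_j - 1)!, and each Young diagram is shared by a computable number of set partitions. The following Python sketch (an illustrative re-derivation, not the fipp package's own code) reproduces the reported value:

```python
import math
from collections import Counter

# Prior mean of the relative entropy under a DPM with alpha = 1, conditional
# on Kplus = 3 blocks and N = 6 observations.

def relative_entropy(blocks):
    n, k = sum(blocks), len(blocks)
    h = -sum((b / n) * math.log(b / n) for b in blocks)
    return h / math.log(k)

def n_set_partitions(shape):
    """Number of set partitions of sum(shape) items with these block sizes:
    a multinomial coefficient, divided by permutations of equal-sized rows."""
    count = math.factorial(sum(shape))
    for b in shape:
        count //= math.factorial(b)
    for mult in Counter(shape).values():
        count //= math.factorial(mult)
    return count

alpha = 1.0
shapes = [(4, 1, 1), (3, 2, 1), (2, 2, 2)]     # the Kplus = 3 diagrams for N = 6
weights = [alpha ** len(s)
           * math.prod(math.factorial(b - 1) for b in s)
           * n_set_partitions(s)
           for s in shapes]                    # EPPF normalising constant cancels
mean = sum(w * relative_entropy(s) for w, s in zip(weights, shapes)) / sum(weights)
print(round(mean, 2))   # 0.87, matching the value reported in the talk
```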

And this is a bit lower than the arithmetic mean of this, this, and this, which means that the DPM with alpha equal to 1 assigns a slightly higher prior weight to this partition as opposed to these, but not too much. So, at least we are not likely to be in this situation.

And we can do more advanced things with the fipp package, like comparing the DPM with its discretized versions, that is, finite mixture models with different prior distributions on the number of clusters and partitions. But this is somewhat advanced, and I don't have the time to introduce it here. So, if you are interested, feel free to contact us and we can provide more details.

Thank you for listening.