York University MATH 4430/6602 3.0AF (Stochastic Processes)
Class Project, due December 2, 2020

MATH 4430 students should form groups of 3-4 people. MATH 6602 students should work individually. Try to make sure that somebody in each group has some experience coding. Everybody in each group is expected to understand everything the group does, so you are expected to talk to each other and explain your contributions to each other. Each group should select one of the following three projects.

You will need to carry out Monte Carlo simulations using a modern programming language. I suggest you use R, which is a versatile and widely-used programming language designed for statistics and simulation. It is open source and free to download, either as R or R-Studio. The built-in documentation is good, and there are many other introductions available on-line. If your group has a really strong preference for using some other programming language (eg Python, C++, Matlab) then you may do so. But even if you are proficient in one of those languages, this would still be a good opportunity to try out R.

References: There is a bit of information about the Metropolis Algorithm and the Gibbs Sampler in Chapter 2.6 of Rosenthal (though he chooses a rather different special case of the Gibbs sampler than I do). Much more detail (and an introduction to R) can be found in Introducing Monte Carlo Methods with R by Robert and Casella (Springer 2010), which is available electronically through the York Library. I personally like Neal Madras's book Lectures on Monte Carlo Methods (AMS and Fields Institute, 2001), but you may not be able to find that in electronic form.

1. Metropolis Algorithm for expectations

1.1. Introduction. The Metropolis algorithm gives a way of producing samples from a target probability distribution µ on a discrete state space U with all µi > 0. It is used in situations where it is difficult or impossible to directly sample from µ (the way you would if you were generating random numbers from a normal distribution). It uses an irreducible transition matrix Q on U that is symmetric (ie Qij = Qji for each i, j). Any such Q will work, but the art of using Metropolis lies in choosing a Q that is likely to make the algorithm work quickly. We define new transition probabilities as follows:

    pij = qij min(1, µj/µi)   for j ≠ i,
    pii = 1 − Σ_{k ≠ i} qik min(1, µk/µi).

This looks complicated, so let me describe it in a dynamic way: starting from i, use Q to propose a state j to move to. If µj ≥ µi then accept that move. If µj < µi then flip a biased coin so that we accept the move with probability µj/µi < 1 and reject the move otherwise (and just stay at state i). Or, what amounts to the same thing, generate a uniform random number U on [0, 1] and accept the move if U ≤ µj/µi.

For example, if all the µi's are equal, then the target distribution is uniform on U. In that case, P = Q and there is no rejection. Because Q is symmetric, it is reversible with respect to the uniform distribution, so µ is an invariant distribution. You may find it strange to think that sampling from a uniform distribution could be hard, but if U is a large complicated set (so large that we can't count its elements easily), this may in fact be the case.

Note that we don't actually need to know the values of µi, but just the ratios µj/µi. So in the uniform case, we know these ratios equal 1, even if we can't count the size of U precisely in order to find µi = 1/#U.
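To make the recipe above concrete, here is a minimal R sketch of the Metropolis chain on a finite state space, assuming the states are labelled 1, ..., K, that Q is supplied as a K x K symmetric matrix, and that mu is a vector of (possibly unnormalized) target weights; only the ratios mu[j]/mu[i] are ever used. The function and variable names (metropolis_chain, n_steps, and so on) are illustrative and not part of the assignment.

```r
## Minimal sketch of the Metropolis algorithm on a finite state space {1, ..., K}.
## Q is a K x K symmetric proposal matrix; mu is a vector of (possibly
## unnormalized) target weights -- only the ratios mu[j]/mu[i] are used.
metropolis_chain <- function(Q, mu, x0, n_steps) {
  K <- length(mu)
  x <- integer(n_steps + 1)
  x[1] <- x0
  for (t in 1:n_steps) {
    i <- x[t]
    j <- sample.int(K, size = 1, prob = Q[i, ])        # propose j from Q(i, .)
    accept_prob <- min(1, mu[j] / mu[i])               # Metropolis acceptance ratio
    x[t + 1] <- if (runif(1) <= accept_prob) j else i  # accept, or stay at i
  }
  x
}

## Small example: 3 states, uniform symmetric proposal, target proportional to (1, 2, 3).
Q  <- matrix(1 / 3, nrow = 3, ncol = 3)
mu <- c(1, 2, 3)
chain <- metropolis_chain(Q, mu, x0 = 1, n_steps = 20000)
table(chain[5001:20001]) / length(chain[5001:20001])   # should be close to mu / sum(mu)
```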
The following tasks refer to the general Metropolis algorithm described above.

First task: Write out a proof that P is reversible with respect to µ. Conclude that µ is an invariant probability distribution for P.

Second task: Write out a proof that P is irreducible, and that if µ isn't perfectly uniform then P is aperiodic. [Hint: Show that if i → j under Q then i → j under P. For aperiodicity, consider a site where a transition could be rejected.]

Therefore if you pick any state to start in and run the chain for a long time n, then the random variable Xn will be a sample from approximately the distribution µ. Going further, Xn, Xn+1, ..., Xn+m will all have distributions approximately µ. They won't be independent, but if m is also large, then we can still show that

    Σ_i f(i) µi ≈ (1/(m+1)) Σ_{k=0}^{m} f(X_{n+k}).

So expectations with respect to the distribution µ can be worked out this way. This often works well in practice, even though it is sometimes hard to get rigorous bounds on how large n and m must be for the answers to be reliable.

1.2. Travelling Salesman Problem. Suppose we have N cities located at the vertices of a graph. The graph has an edge between cities if it is possible to travel between them, and we label the edge with the travel time. A (Hamiltonian) circuit is a sequence of cities that starts and ends at some city, visits that city only at the beginning and end, and visits every other city exactly once. The classic Travelling Salesman Problem is to find the shortest such circuit. When N is large, this is hard. Our problem will be a bit different: I want you to find the average travel time of circuits. In other words, the expected travel time for a circuit chosen uniformly from all circuits. Use Metropolis with µ = 1 (so no rejection).

If every pair of cities is connected by an edge, we wouldn't actually require MCMC, because sampling uniformly from Hamiltonian circuits in a complete graph isn't hard. But if the graph is not complete, then that is no longer true. So what I want you to do is the following:

(1) Label 20 cities as 1, 2, . . . , 20.
(2) Pick 5 pairs of cities {i, j} for which direct travel between i and j will be forbidden (eg because of border restrictions or because there are no flights between those cities).
(3) Generate random positive distances between the other 185 pairs of cities {i, j} with i ≠ j (and explain why it is 185). You should generate these using random sampling from some distribution on the positive reals, with a density.
(4) Choose some large n and m and use Metropolis to sample uniformly from the circuits, and compute the average travel time (so f applied to a circuit is the travel time).
(5) Check if your choice of n and m is robust enough, by trying this with 2n and 2m instead, and seeing if your answers change much. If they do, try again with larger values of n or m. Make sure you use the same travel distances as before.

Of course, you need to start with a Q. I suggest that you use the following. Take your state space to be sequences of 20 cities without any repetitions. If i is such a sequence, then choose two of those cities at random and interchange them (unless that would make us travel between one of the 5 forbidden pairs). For example, if you pick cities 7 and 9, and they occur 3rd and 6th in the list respectively, then the new proposed state will have city 7 as the 6th and city 9 as the 3rd. You will have to check that this doesn't introduce direct travel between one of the 5 forbidden pairs. If it does then you will reject the interchange, ie Q will leave the sequence unchanged this step.
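By way of illustration, here is a rough R sketch of this swap proposal and of the averaging in steps (4)-(5). It assumes (only for this sketch) that the travel times are stored in a symmetric 20 x 20 matrix dist_mat with NA marking the 5 forbidden pairs, and that the starting circuit avoids those pairs. All names here are illustrative; your own implementation may look quite different.

```r
## Sketch only: a state is a permutation (vector) of the cities 1, ..., 20;
## dist_mat is a symmetric 20 x 20 matrix of travel times, with NA marking
## the 5 forbidden pairs (an assumption of this sketch, not of the handout).

circuit_time <- function(circ, dist_mat) {
  ## total time of the circuit circ[1] -> circ[2] -> ... -> circ[20] -> circ[1]
  edges <- cbind(circ, c(circ[-1], circ[1]))
  sum(dist_mat[edges])            # NA if the circuit uses a forbidden edge
}

propose_swap <- function(circ, dist_mat) {
  ## Q: pick two positions (equivalently, two cities) at random and interchange
  ## them; if the new circuit would use a forbidden edge, reject and stay put
  pos <- sample(length(circ), 2)
  new_circ <- circ
  new_circ[pos] <- circ[rev(pos)]
  if (is.na(circuit_time(new_circ, dist_mat))) circ else new_circ
}

## With the uniform target (all mu_i equal) there is no further Metropolis
## rejection, so P = Q: burn in for n steps, then average f = travel time over
## the next m + 1 states.  The start circuit should avoid the forbidden pairs.
average_circuit_time <- function(start, dist_mat, n, m) {
  circ <- start
  for (t in 1:n) circ <- propose_swap(circ, dist_mat)
  total <- 0
  for (t in 0:m) {
    total <- total + circuit_time(circ, dist_mat)
    circ <- propose_swap(circ, dist_mat)
  }
  total / (m + 1)
}
```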
Third task: Check that this Q is symmetric.

There is a problem here, namely whether Q is irreducible or not. On the complete graph this chain would be irreducible, and the algorithm works well. On some graphs it isn't (for example, this approach fails badly if the graph is a lattice). In practice it is hard to confirm irreducibility one way or the other. I think it is unlikely that you'll run into problems with the numbers I've chosen, but one never knows. So just try it and hope for the best!

Fourth task: Code this up, carry out the above steps, and then write up a short report (3-5 pages). Explain the problem, answer the questions I posed as the first three tasks, explain what you did, and summarize your results, including what parameters you settled on and how long the computations took. Include your code as an appendix. Then e-mail me the report.

2. Metropolis Algorithm for Optimization

2.1. Introduction: This is the same as in the previous section, so read that and carry out the first two tasks.

2.2. Knapsack Problem. This is another classic optimization problem. You have N objects. Object i is worth vi and weighs wi. You must select which ones of these to put into a knapsack. You want to maximize the value of the selected objects, but there is a constraint, namely that the total weight cannot exceed some number W. In other words, we want

    max{ Σ_{i ∈ I} vi | I ⊂ {1, . . . , N}, Σ_{i ∈ I} wi ≤ W }.

If N is large, this is hard. In particular, if N is so large that you can't just compute all the possibilities and pick the best, you need to do something different. We'll try instead to compute a random list of possibilities to try, that stands a reasonable chance of finding the best value, or at least, of coming close to it. The answer you'll report will be the best choice among the ones you've seen.

Your states will be N-tuples of 0's and 1's (z1, z2, . . . , zN). zi = 0 means that the ith object isn't in the knapsack; zi = 1 means it is. Let S be the set of states satisfying the weight constraint, ie such that Σ_i zi wi ≤ W. The Q chain will pick i uniformly from {1, . . . , N} and will try changing zi to 1 − zi (ie if the ith object is in, we take it out, and otherwise we put it in). If this violates the constraint on total weight (ie if it tries to move outside S) then the Q chain rejects the move, and the state stays unchanged for that step.

Third task: Check that Q is symmetric and irreducible as a Markov chain on S.

Of course, not every way of choosing random samples is going to work out in a reasonable amount of time. There are reasons to believe that sampling from the following distribution is often a good choice:

    µi = Ce