# The Lecturers Problem

HelgaCon is coming up, the house convention I organize, where I and 25 gaming pals hole up for a weekend and play games until our eyes bleed. Scheduling this thing is also something of a passion project for me. I’ve written and rewritten the software for it, and recently it’s struck me to try and productize the thing. Or at least make it usable by someone other than just myself.

Fair warning – this post is going to be math heavy, so buckle up.

The way I run registration for HelgaCon is that I have my GMs submit their games with all the details — title, description, min and max players — but I they don’t specify *when* the game will run. We divvy up the weekend into five four-hour time slots, and then everyone ranks all the games in preference order. Your favorite game is rank 1, the next is rank 2, etc. Then I run the whole thing through an algorithm I wrote to schedule who plays what and when, and it works pretty well.

The problem is interesting enough that I started using it as an interview question for AI engineers. Here’s the boiled down version, which I re-branded as professors giving lectures rather than DMs running games, for the sake of simplicity:

The Lecture Schedule ProblemSix professors have been invited to give lectures at an upcoming seminar. The event has access to two lecture halls, so can host two simultaneous lectures during each of the morning, afternoon, and evening sessions, for a total of six lectures.

Each professor will give one lecture, and then sit as a guest of honor at two other lectures. Thus, any given lecture should contain one lecturer and two guests of honor. The professors are each asked to rank the five other lectures in the order they would like to attend as a guest of honor.

Create an algorithm that schedules when and where each lecture will take place, and which professors will be guests of honor at each lecture, that ensures each professor attends the highest priority lectures possible according to his submitted preference list.

As I mentioned, I’ve written the code to solve this problem, so it makes a great discussion problem during an interview. I personally solved it using a genetic algorithm, after first attempting a couple of other techniques. It’s always interesting to see how a new person tackles it.

Now I’m translating my code from a django project to a ruby on rails project, chiefly because I write rails code every day and it’s just a bit more comfortable for me at this point. I did all the data collection and web site rendering, and now I’m on to the crunchy bit – the scheduling algorithm itself. I decided it was finally time to consider something that’s been nagging at me for a while — can I just brute force it?

Most good AI engineers ask that question pretty quickly, and if you do the math on the above example it seems pretty reasonable. In fact, let’s go ahead and do the math. Each lecture has two guests of honor, which must be selected from a group of five professors (obviously we must exclude the lecturer himself). This is a straight forward application of the **combination formula**. Where n is the number of options and r is the number of selections, C(n, r) is the total number of unique combinations possible. In this case, C(5, 2) gives us 10 possible combinations of different guests of honor for each lecture.

It would appear you could use the same formula for the number of combinations of side-by-side lectures, C(6, 2). What the specific times are doesn’t really matter, just which two lectures get paired to run simultaneously. However, the real data is a bit more complicated. At my convention, I don’t assume that there are the same number of games in every time slot. Delta runs his Outdoor Spoliation game for 10+ players, while some GMs of smaller indy games (The Quiet Year, Fall of Magic) may only want 1-3 players in addition to themselves. I also let GMs select “preferred” times, so they can try and nudge the system into placing their spooky Cthulhu game late at night for example. It gets complicated pretty quickly, suffice it to say that not all time slots can really be treated equally.

For this reason I take a more pessimistic view of placing the games into time slots, and just calculate it using the rule of product. If each of the six lectures can be assigned to any of three times, then there are 6^{3} or 216 total possible permutations of lectures to time slots. I know this includes a lot of totally invalid options, such as putting all six lectures in the morning, but it’s easy to reject those when it comes to scoring. Sure, in our simplified case it would be easy to pre-reject them, but again there are so many more variables in the real use case. Time to put the brute into brute force.

So 216 possible scheduled games times 10 variants each gives us 2,160 total unique schedules. The last step then is to walk through each option, score it, and then pick the best score. Scoring schedules is actually pretty easy, for the most part it’s just the sum of the ranks of all the players. It’s like golf – the lowest score is the best schedule. We can detect the broken schedules (double-booked professors, empty lecture halls, etc) and juice them up with a big penalty score modifier.

Here’s a bit of sample code I wrote in python that sets up some test data and calculates the time it takes to score a schedule. It’s grossly simplified from the real scoring algorithm, as again this simplified version lacks a lot of the variables in the full data set, so it probably under-estimates the time it really takes. Still, we can at least use it as a baseline for comparison. When I run this on my computer I get a time of 0.000009 seconds per score. If we multiply this by our 2,160 unique schedules we find we can brute force it in about 2 hundredths of a second (0.01944 seconds). OK, so we can brute force the easy version really quickly, but what about the real data?

Well, we can start by scaling up the numbers to those of the actual convention. Typically I have around 15 games running in 5 time slots for 25 players. If we assume we’ll always do exactly 3 games per time slot, we get 15^{5} or 759,375 possible variations. And if you think that number got big quickly, consider that since most games have a wide range of number of players, we have to sum all the combination formula results for every possible number of players. If the average game seats 4-6 players, then we calculate C(25, 4) + C(25, 5) + C(25, 6) for a total of 30,297,995 possible player configurations for each game!

Finally we can figure out how long it will take to calculate all those scores by multiplying 30,297,995 by 759,375 (23,007,539,953,125) by the time per score (0.000009 seconds) for 207,067,859.578125 seconds, or roughly six and a half years.

Did I get any of the math wrong up there? Quite possibly — I’m sure you guys will let me know. But the end result is that I’m pleased to have gone down the fancy AI algorithm route for solving this problem rather than banging my head against the brute force method. I’m still pretty sure I came out ahead.

Smashing! And your current system works magically well in practice for the last several years.

Funny thing, in the discrete math textbook I’m currently teaching out of (Rosen) there is an example specifically on “Scheduling Talks” (Sec. 3.1, Ex. 7). It’s a very different set of assumptions: Talks come with a pre-specified start and end time, and you’re trying to fit as many as you can in a day into one lecture hall. Turns out, the greedy algorithm that just sequentially adds feasible talks with the earliest end time is optimal.

You’ve thought about this so much longer than I have. If I were an interviewee, I might at least explore trying a greedy algorithm where you can initially give each individual talk a desirability score. Schedule the best talk in the first slot. Then parallel to that, go through the list…. [thinks] … naw, this isn’t going to be optimal. May reflect more on that in the near term. It’s a tough problem!

I was just talking with some co-workers about this at lunch. Here’s a really fascinating tidbit: my first attempt at solving this was basically a greedy algorithm. When I switched to a genetic algorithm, I used the original greedy algorithm to populate the first generation. I was worried however that this was creating a local maxima, so I switched to a first generation created by pure random values. This came out much worse.

Eventually I discovered the best results came from a mix of the two. The system now creates the initial generation with what looks like a greedy algorithm, but with random noise injected at certain decision points. It favors reasonable choices, but there’s just enough chaos in there to allow it to discover paths to the global maxima.

Not on my best day, Paul! LOL

Will you ever talk about the mass combat rules you use for d&d battles and storming the castle and stuff?

That’s really Dan’s area. You might search his blog for “Book of War”, which are his custom rules for such. I’m not sure if he ever wrote anything specifically for castle storming though. I used those rules for the Battle of Bridgefaire in my own campaign.

Do you base scoring on the raw sum of ranks, or do you square or otherwise adjust to that very bad ranks are unlikely? I.e., are 10 1-rank shifts equivalent to 1 10-rank shift?

I use the raw sum. A square is an interesting idea. I did once try weighting the scores to favor the top 3, under the assumption that players would vastly prefer games 1, 2, 8, 11 than 4, 5, 6, 7. As I recall, it did not significantly change the final results.