Friday, November 13, 2009

Failure

The second of five performance contests was this Wednesday at math team. This one went a lot worse than the last one, which I think is evident from the score distribution:



First, nobody got a zero! Why? Well, it turns out that question 2 was flawed. The triangle with side lengths 5, 15, and 16 does not have an area of 72 and does not have an inradius of 4. This was brought to my attention during the contest, and I decided to give everyone credit for the problem. After all, every triangle satisfying the problem statement has a side of length 1, and of 2, and of 56, and... :) (Hint: there are no triangles satisfying the problem statement.)
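A quick numeric check with Heron's formula shows just how far off the stated triangle is (this is only an illustrative sketch of the sanity check, not the original problem):

```python
import math

# Heron's formula check for the triangle with sides 5, 15, 16:
# the flawed problem claimed area 72 and inradius 4, but neither holds.
a, b, c = 5, 15, 16
s = (a + b + c) / 2                        # semiperimeter = 18
area = math.sqrt(s * (s - a) * (s - b) * (s - c))
inradius = area / s                        # r = area / semiperimeter
print(round(area, 2), round(inradius, 2))  # 37.47 2.08 -- not 72 and 4
```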

Second, 2 and 4 have about the same number of people! That's pretty easy to explain. The contests are set up so that there will be three tiers. The top tier consists of A teamers who are capable of solving all of these problems. They are expected to always get 1 and 2, usually get 3 and 4, and then sometimes get 5 or 6. The second tier consists of B teamers who are capable of doing well, but either don't have the speed or the knowledge to complete the entire contest in the time allotted. They are expected to always get 1 and 2 and then get some of 3 and 4. Finally, there are the lower teams who are expected to be working on 1 and 2 the entire time.

So what does this mean? Essentially, A team has a 6-question contest, B team has a 4-question contest, and the rest of the team has a 2-question contest. When #2 turned out to be broken, the A team's contest became 5 questions, the B team's 3 questions, and the rest of the team's 1 question. The impact on the A and B contests is negligible, especially since those teams are supposed to always solve the first two problems. However, the contest for the majority of the team was reduced from two problems to one, which resulted in the massive clump at 2 and 4.

Let's look at the test in more detail (you can see it on the wiki page linked to in the first paragraph).

The first problem is straightforward, provided that you can list the primes up to 43 and count them correctly.

The second problem is screwed up, as I said, but you can see what I had intended on the solutions page. (Maybe I should've solved for the true side lengths that would make it work so that you can also solve it with e.g. Heron's formula)

The third problem has an amusing story. It originated from a dinner at IOI when we were eating with the Canadians. One of them proposed a few problems to us. The first was a geometry problem which I can't remember, but Travis eliminated it rather quickly. Then he said "Here's a problem that took me a while to solve. See if you can do it. Find if ." To this, I responded, "Wait, isn't that utterly trivial? It's just the difference of two geometric series..." Their response? "Damn! Why didn't I think of that?"

The fourth problem has a somewhat different history. I first saw this kind of problem at MathCamp 2006 in a class titled "Calculus Without Calculus". It was a nice problem, but I basically never saw it again until last year, when it appeared on an ARML practice. However, the lines in the ARML problem were given to be perpendicular (angle BAD was right), and so most of the people who solved the problem simply calculus-bashed it. After the calculus solution was presented, I went up to the board, drew the circle, and explained that not only could the problem be solved with just power of a point, the solution was independent of the fact that BAD was a right angle.

The next week, I wrote a TJML and included a similar problem, but changed the angle to something obnoxious: 82.5 degrees. The result was that one person solved the problem and nobody else, and many people didn't even realize that I had presented the exact same solution at the previous week's ARML practice.

Fast forward to the summer when I was writing problems for performance contests. I remembered this failing in the team and decided it was about time that they learned to listen to and remember solutions, so I put the problem in my database and aggressively classified it as a medium level problem.

Problem five is a classic use of Hensel Lifting. Many people commented after the contest "I have never heard of Hensel's Lemma." That was intended, but you can solve the problem without knowing Hensel's Lemma if you think of first solving the equation mod 5, then "lifting" it to mod 25, and then mod 125. This was a problem that was simply meant as a "here is a useful technique that you should remember if you want to do well" problem.
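The "solve mod 5, then lift to mod 25, then mod 125" idea can be sketched in a few lines. This is a minimal illustration using an assumed stand-in equation (x² ≡ −1 mod 125), since the actual contest problem isn't reproduced here:

```python
def lift_solutions(f, p, k):
    """Find all x with f(x) = 0 (mod p**k) by solving mod p first,
    then lifting each root one power of p at a time."""
    sols = [x for x in range(p) if f(x) % p == 0]
    mod = p
    for _ in range(k - 1):
        base, mod = mod, mod * p
        # A root x mod base can only lift to x + t*base mod (base*p).
        sols = [x + t * base for x in sols for t in range(p)
                if f(x + t * base) % mod == 0]
    return sols

# Assumed stand-in equation: x^2 + 1 = 0 (mod 5^3), i.e. x^2 = -1 (mod 125).
print(lift_solutions(lambda x: x * x + 1, 5, 3))  # [57, 68]
```

Hensel's Lemma guarantees this lifting succeeds (and is unique at each step) whenever f'(x) is nonzero mod p at the root, which is why the brute-force version above finds exactly one lift per root here.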

Finally, we get to problem six. This one was thrust into the performance contests because of some team contest last year (I can't remember which one) in which there was a similar problem and none of the other members of my team knew how to do it (meaning none of the top members of last year's math team). Again, this problem was meant as a wake-up call, so that people would learn how to approach problems like this, since they have appeared on various contests in the past (I think there was one at Duke last year). In another sense, it was a problem that said "This is something that is useful to know if you want to win." Essentially, some of my problems are problems that nobody will get, and if you know how to do them you have a huge advantage. An obvious example of this is HMMT 2009 Calculus #10. TJ A essentially got 24 free points because we knew complex analysis and nobody else did (it should have been 40 free points, but Kee Young and I were pretty silly during the test).

So how would I have liked this to turn out? Well, barring the screwup on number 2, I'd expect from our current team:

All of the top 15 to solve #1
All of the top 15 to solve #2
Most of the top 15 to solve #3
Some of the top 15 to solve #4
Few of the top 15 to solve #5
Few of the top 15 to solve #6

and I'd want:

All of the top 15 to solve #1
All of the top 15 to solve #2
Most of the top 15 to solve #3
Most of the top 15 to solve #4
Some of the top 15 to solve #5
Few of the top 15 to solve #6

What I wanted actually pretty much happened for problems 1, 2, 5, and 6. Problem 5 is supposed to get a few more solvers than #6, but still not many. The problem is the midrange problems. However, as bad as the distribution is, I strongly disagree with making the middle section of the contest easier. The fact is, our A team is not up to par. I'm actually not sure why this is the case. I said in an earlier post that this might be because the class of 2010 formed a block that rose to the surface as a single chunk as the pieces of the team above it disappeared. As long as nobody else in that block was working, anyone in the block would see their ranking go up for nothing, so why would they work?

Other people (namely Dan) have suggested that the problem is inherent in the ranking system itself. If rankings were kept private, as schools such as Exeter do, then unless one is vastly superior to the rest of the team, there is no magic website to go to that will tell one whether or not he will be on the A team or the B team or on any team at all. TJ has such a magic webpage, and given the fact that these are TJ people, it's inevitable that somebody will notice that no matter what happens, they will be on A team even if they skip the last practice. And then some people might go further and actually skip the practice.

I'm not going to say that Dan's theory doesn't hold water. It very well could. In fact, there are some people who I think would be likely to fall into such mental traps. However, I do have issues with keeping rankings private. The first is somewhat obvious: TJ kids will figure it out anyway. Even if we radically change how scores are calculated, unofficial results will become commonplace. This has been seen in the world of informatics. USACO releases all scores on a month-to-month basis. While their specific parameters are not released, our senior computer team keeps its own rankings page, which does a simple average and generally correlates well with the camp selections. Additionally, during the International Olympiad in Informatics, results from day 1 appear on Russian sites even before day 2 starts. The International Mathematical Olympiad does it differently: since all grading happens after both days are over, everyone's score on all but one problem becomes public officially as coordination proceeds. That is much more similar to what we do at math team. The difference is that when 5 of the 6 IMO problem scores are posted, you might know that you definitely made the gold cutoff, but you have already sat for problem 6. When 13 of the 14 math team practices are posted, you might know that you definitely made A team, but you are still able to skip the last practice, and some people might just do that.

So maybe Dan's idea has some merit, but I think all of this is fundamentally a problem of the idea that results are what matter. Many people take this so far as to believe in the "big fish in a small pond" theory, which says that being valedictorian at a small, no-name school is better than being simply an above average student at a large, prestigious school. By being at TJ, I think most of you have realized that there are some things more important than being valedictorian. But have you realized that there are more important things than being on our A team?

My worry with the block of 2010ers was that they might try to turn TJ into one of the lesser schools so that they could in turn rank higher within their own school and put something more impressive on their college applications. That would make admissions officers think they are better than they actually are, since the TJ name carries a large reputation, and changes in that reputation won't propagate very fast. And with the large block of lazy 2010ers, this was actually possible.

At this point, I think I've broken the block. Our B team contains almost no seniors and the seniors on A team definitely deserve it, since most of them have performed well on at least one of my performance contests. Looking down at the lower spectrum of the rankings, I can see that many of the seniors that I felt were being too lazy to do as well as they were doing are in fact dropping like flies. But most importantly, the seniors are no longer a huge block floating to the surface. And while there might be a huge group of seniors at the top, in all honesty the gap between the A team and the B team is the smallest it's been in years. If the B team improves just a little bit, I am completely confident that they can continue the TJ legacy in the coming years.

However, some people, who will remain nameless here but probably aren't hiding it very much, believe the big fish in a small pond theory to an unacceptable extent. They think that winning the B division of ARML is better than being a random team in A division because they get more prizes. That's a problem.

Some people, when they read this post, probably thought that when I said our A team isn't up to par, I meant that they can't keep up with the other teams. That's not at all what I meant. Actually, I think that perhaps other than AAST and Exeter, we have one of the strongest A teams in the nation. And with the gap between A and B as small as it is, that also goes for the ARML A team. However, we're still not up to par. We're lacking in the mathematical thought process and in how to attack a problem that has never been seen before.

I always wondered: how was it possible that some people made our ARML A team and then performed substantially worse at the competition than members of our B team? Obviously something was wrong with the selection process, but what? Well, I believe the answer was that our contests were too formulaic; they could be mastered by just doing math team for a few years and then recognizing the problems. But what happens then when ARML comes up with a new problem type? Those who are good at problem solving but not as fast because they don't have the problem types memorized are on our B team and get the problem, while those who were fast from just problem recognition flounder. That was the fundamental problem of our selection process.

In fact, 99% of all contest problems are very similar to a previous contest problem. So this formulaic method works to a very large extent, and China is able to use it to great success at IMO. And the results of this method are extremely clear: China dominated problems 1-5, while Japan dominated problem 6. Quite simply, had Japan gotten a perfect score on problem 4 and even a single more point on problem 3, China would not have won the IMO. Why did Japan dominate problem 6? Because nobody had ever seen that kind of problem before. Why did China dominate problems 1-5? Because they had all seen those same types of problems before.

I've had several arguments with Richard Peng, who coached the USA IOI team the past few years, about what the correct training method is. He insists that the Chinese method is the correct one because it brings in the most gold medals, while I say that the correct method is to pretend that the Chinese method doesn't work, and instead work on how to solve new problems.

TJ's method tries to emulate the Chinese method of training by providing a thorough overview of all of the types of contest math problems. The first error is that it only covers the types of problems that we do during eighth period; the problems written by ARML, HMMT, and PUMaC are completely out there, and our eighth period practices don't cover the correct material. The second error is that there simply isn't enough time to provide thorough coverage of all of contest math. We are in school for about 40 weeks per year, of which about 30 are used for math team. If you do 12 problems every week in eighth period, that is only 360 problems. Can 360 problems cover all of contest math? Not at all.

So back to my performance tests. They are my way of pretending that the Chinese method of training doesn't exist. Instead of having the problems be archetypical problems that are likely to appear on a contest, my problems are the kinds of problems that are challenging and "out there". They're meant to have people exercise the thinking process that they use when they don't know how to do a problem, so when they invariably are stuck on a problem at PUMaC or HMMT, they have had practice with dealing with that situation before. In fact, since Arvind is reviewing the performance contests, I'll say that it's not unlikely that HMMT will specifically dodge the types of questions that I have given you. That should not detract from their usefulness if I have done my job right.

Will it be enough to win? In truth, it doesn't matter. Winning is great fun when it happens, but it shouldn't go to your head and you should be focusing on learning first, winning second (if even that). I have made the math team wiki public for that reason: I would rather have our competitors know much of what we know and give us a good contest than have a trophy on my shelf.

10 comments:

  1. This post just went into my bookmarks, more than anything else because of the last paragraph.

    ReplyDelete
  2. "I always wondered: how was it possible that some people made our ARML A team and then performed substantially worse at the competition than members of our B team?"

    Part of it is that sometimes people randomly do well or poorly on a contest due to luck or some other factor; it isn't necessarily that A team was chosen incorrectly. I don't recall any people that consistently did poorly on team selection and then consistently well in real contests, or vice versa.

    ReplyDelete
  3. the NC A team always has people that perhaps shouldn't be there, but did well on AMC/AIME/practices (or maybe there was no one else to fill up the spot...)

    ReplyDelete
  4. @Chris: While chance does sometimes play a role,

1. You can't measure correlation between doing badly on actual contests and well on practices, or vice versa, because, frankly, there are so few actual contests that we take during high school.

    2. It's what Brian said about ignoring the Chinese training method all over again. We don't want a model based on rote training or, in your argument, an acceptance that luck will screw with the system, so we ignore it.

    ReplyDelete
  5. On the topic of B team doing better than A team... Brian, do you think that it could be due to the fact that A team students have such confidence (in themselves and in each other) that they don't bother to double-check solutions as much as B team students would?

    ReplyDelete
  6. ``Holy Fail'' would have been a much better title :P

    ReplyDelete
  7. @Chris: Yes, luck is a factor. However, I don't think that it's so much of a factor that members of the A team get (for example) a 4 on ARML, while members of the B team get a 7. If people had bad days regularly, then those would show up in ARML practices. Otherwise, it's something else. Perhaps it's the stress of the competition, but I'm reluctant to put too much blame on that. Stress is certainly something that people have to be used to, but a reduction from 7 to 4 is pretty large, and I would be hesitant to say that it can be entirely explained by stress.

    @Lily: That's an interesting point. I think it's definitely a factor to consider in competitions like PUMaC and HMMT, but with the style of giving two problems at a time at ARML and Duke, there's not much to do but check your work. I know that when I take these contests, especially ARML and Duke, I'm so paranoid about missing an easy question that I check my work about five times before being satisfied (and even then I have something bugging me because I might have done something wrong). But again, I don't see this explaining the difference between a 4 and a 7. Plus, it's completely incompatible with stress. People who are "overly confident" can hardly be stressed at the same time about the pressure of the competition. If they were feeling pressured they would probably check their work more rather than less.

    ReplyDelete
  8. Your words really pinpoint the problem with "sucky" math contests in general. This blog really enlightens me every week.

    I am ashamed to say this, but I think I am one of those people who became more intent on memorizing how to solve problems rather than actually solve them.

    I believe MathCounts is one of those "sucky" contests. Before 7th grade, I studied math without pressure to get problems right under some time limit. I enjoyed math puzzles more than anything, and that was my incentive to become involved in MathCounts. But... what I learned in MathCounts was troubling. People who simply "know" formulaic methods to solve problem are so much better in MathCounts. (A good example is the McNugget numbers problem. Many just plug in numbers into AB-A-B and get the right answer. Not many I asked actually knew why this formula worked)

    Under pressure to get into a better private boarding school or TJHSST, I became convinced that I must also solve a lot of problems and memorize how to solve them. This sort of "Chinese" training has carried me to USAMO qualification in 8th grade, but my USAMO score (4 or 5 I think) was poor.

    I found that it was hard to get rid of the MathCounts mindset even when I was in high school. Choate (private school I was in for freshman year) only participated in "sucky" contests that I could easily dominate.

    Unfortunately, I learned that things were a little bit different in TJ. Mandelbrot (at least the last problems) is definitely a contest that requires some thinking. However, most contests can still be solved by rote memorization, especially the ARML practices.

Although I am not doing well in performance contests, I believe that they are wonderful. My bad scores in performance contests simply tell me that I must change my "sucky" method of learning math that has been indoctrinated in me since 7th grade.

    This post explains why Mitchell always does very well in actual contests even though he does not do well in NYCMLs (which almost determined performance ranking last year). He may take longer to solve problems, but he is able to solve new kinds of problems that show up in actual contests.

    ReplyDelete
  9. When I was at TJ, NYCIML and ARML were pretty much solved by having seen the problems before. Mandelbrot usually required creativity, though, so my personal assessment of people's skill level was largely determined by their performance on Mandelbrot, even if this didn't show up on the performance rankings.

    ReplyDelete
  10. zzzz had no idea what mathcounts was until soph year when i started math team lololol

    and dammit brian
    stop bashing us x_____X
    non-tj ppl read this blog D:!!!

    and i think a lot of us fail cuz of stupid mistakes/misreading >< but yeah we fail -.-

    ReplyDelete