Friday, June 25, 2010

The Proxy Conundrum

As many of you Yahoo users out there know, Pidgin 2.7.0 and 2.7.1 introduced a challenge for you. In certain proxied environments, many of you can't connect. Quite frankly, (mostly) it's my fault. Let's start by explaining what I've done.

In Pidgin 2.7.0 I made some seemingly innocent changes to the Yahoo protocol plugin that I thought would magically make some problems go away. These changes make us mimic the official Yahoo client better. Very early in the connection process (in fact, it's the first step), the official clients request a specific URL from a Yahoo web server. The server's response tells the client what messaging server to connect to. The client reads that address and connects to it, then goes along its merry way. So I made our Yahoo plugins do this, once I determined the correct URL's for both Yahoo and Yahoo Japan. Everything was great when I tested.

But there's my mistake--I tested it. I don't run a proxy server--I'm an IT nerd, but I'm not that much of an IT nerd. So, I allowed this to be released, not knowing there were problems in environments with restrictive proxies. It came back to bite me hard, as I backported this stuff to Adium's build of libpurple 2.6.something for one of their beta releases and several users started complaining about being unable to connect. Then we got a couple tickets of our own. At first, I was stumped, until a user who was willing to almost bend over backwards to assist us came along.

This user worked in an environment where two separate proxy servers were in use. One handled HTTP and HTTPS traffic and the other handled IM protocols. It took me quite a while to finally wrap my head around his problem, but once I did (with the help of my fellow developer Etan Reisner), I was able to start solving the problem. It turns out I actually had two problems on my hands, but at first it seemed like none of it was the fault of our code.

The first problem is that Pidgin allows configuring proxies in two places--the "Proxy" tab shown when editing an account and the "Proxy" tab in Preferences--but does not allow specifying what ports or services are handled by each proxy. This ordinarily isn't such a big issue, except that I went to the extreme of creating an option for Yahoo accounts to help some users with weird proxies that don't handle SSL at all. This option is used to determine whether to send an HTTPS request (like Yahoo uses for login) to the account's configured proxy or to the global proxy configured in Preferences. When I did this for 2.6.0, I created a new utility function for URL requests that allowed specifying the account the request was for. If the magic value NULL is passed to this function, the request will use the global proxy instead of the account proxy. This allows our proxy code to route the request to the appropriate proxy. Being the idiot I can be on occasion, I set all URL requests in the Yahoo plugins to use the account proxy server no matter what. This is where the user's problem came in--his IM proxy doesn't do HTTP at all. Once Etan and I agreed on how to fix it, I changed the SSL option I mentioned earlier to act on both HTTP and HTTPS requests. I supplied our user the test build. It still didn't work, but the debug log showed some interesting new info.

Again, it took some head-bashing to understand what was going on, and I still wasn't the one who figured it out. In the meantime, I decided to implement a quick hack to make users at least be able to connect temporarily while we figured out what was really broken in our proxy code. The quick hack was to implement a worst-case, last-resort connection to a previously known messaging server. Unfortunately, I could implement this only for Yahoo, not for Yahoo Japan, as the Japan network appears to no longer has a single canonical hostname by which to refer to the messaging servers. I am, however, positive that even on the regular Yahoo network, this hack will eventually break, which is why I resisted it in the first place. After a couple quick bugfixes brought to light by the user's further testing, we had a working fallback mechanism.

Once this was done, Daniel Atallah, one of our co-lead developers, realized that the second problem our helpful user had was our own proxy code's fault--for some stupid reason, we would never authenticate to an HTTP proxy if we were making an HTTP request on the standard HTTP port (80). Daniel implemented some quick fixes to make our URL requesting code behave better in these proxy situations, thus (we hope) finally solving this problem completely.

All these fixes will end up in Pidgin 2.7.2. We're not sure yet when we're going to release that, but we're hoping it's soon, as we have some fixes for other annoying bugs, such as the famous MSN direct connection crash.

Now, after reading all this, some of you might think, "Why don't they just use libproxy?" Truthfully, that's probably a good idea. There are two reasons we haven't done this yet. First, we looked at this over two years ago. At that time, we were led to believe libproxy didn't handle all the proxy types we did. We were corrected on that, but I, at least, missed that correction being posted. (Sorry, sometimes I'm blind, especially if something like that is buried in a flood of other ticket mails.) I suspect I'm not the only developer who did. Secondly, we're all creatures of habit. As the old saying goes, "if it ain't broke, don't fix it." Well, quite honestly, it's not broken for us as developers! Mainly that's because either we don't use proxies or any proxies in our way don't do any of the crazy stuff we ran into in this case. If it's not broken for us, we don't have much incentive to go hacking stuff up to support new libraries that solve problems we don't experience. I know that sounds like a lame excuse to a lot of you, and I can understand how it seems that way. But if we don't experience the problem, it's hard to test that we've fixed it, and we could in fact be making matters far worse for everyone else.

All this said, we probably should support libproxy. I've been thinking about how we could do this in a quick and effective manner, and so far I've come up with a few ideas that don't feel like they work very well. I think this is a perfect case where code should talk, or at least provide an example. So, I invite people who would like to see us support libproxy to write patches. Actively engage us on our development mailing list (devel@pidgin.im), talk with us in #pidgin on irc.freenode.net, or drop by our XMPP MUC (devel@conference.pidgin.im). Or, do some combination of the three. I can't promise we'll be the most valuable resources during the process, but we do at least try.

Now supporting libproxy is one thing. Actually being able to do useful stuff with the information libproxy gives us is another thing entirely. Our actual proxy connection code is, shall we say, a bit sub-par. Quite frankly, it sucks, even ignoring the crappy configuration portions. While supporting libproxy is nice and may solve a lot of our proxy problems, I guarantee it won't solve them all. Ideas and patches to fix this are greatly appreciated too.

I look forward to hearing from those of you inclined to help!