Don’t waste your iterators!

2011/08/28 1 comment

Hey all. I kind of put everything in this blog. I hope much of it will be useful to somebody, but most people will probably only care about some of what I write here. Today, I’m writing about python programming. Some days, I use this space as my workout log. If you are using a feed reader, and you only want to see certain kinds of content, you can actually subscribe to individual categories within my blog, by clicking on the category name on the right –> and then using that URL in your feed reader.

If you’re only here for the fitness stuff, feel free to move on.

There’s a pattern I see fairly often in code where someone uses a function that returns a sequence of some sort, filters it, and then wants to use the first result that matches the filter. It looks something like this:

    return [x for x in foo if len(x) > 4][0]

This works, and looks “pythonic” (it uses list comprehensions after all!), but it’s actually a fairly slow and wasteful way to get the results we want.

The actual example I saw which prompted me to post this was from a fun post by Jeff Elmore which explained creating a wu-name generator in six lines of python.

import urllib
from lxml.html import fromstring
def get_wu_name(first_name, last_name):
    """
    >>> get_wu_name("Jeff", "Elmore")
    'Ultra-Chronic Monstah'
    """

    w = urllib.urlopen("http://www.recordstore.com/wuname/wuname.pl",
                       urllib.urlencode({'fname':first_name, 'sname': last_name}))
    doc = fromstring(w.read())
    return [d for d in doc.iter('span') if d.get('class') == 'newname'][0].text

What this will do is find every span in the document, and check to see if its class is ‘newname’. In order to do this, it has to scan the entire document, which may contain a significant amount of unwanted material.

We don’t need a comprehensive list of matching spans. We just need one, and then we can take it and move on. With a list comprehension, we can’t even ask for the first one until the whole document has been processed.

We’re actually better off going through this the old-fashioned way, by using an if nested in a for-loop, and returning the result.

   for d in doc.iter('span'):
       if d.get('class') == 'newname':
           return d.text

But we like our brevity, and python provides us with generator expressions, which look like list comprehensions, but don’t do any actual work when they get created. With them, we can ask for the first object before we even begin scanning the document, so python knows to stop processing as soon as it finds the right one. A generator also doesn’t hold its results in memory; it passes them back as they are retrieved, one at a time. We both save the memory it would take to build up a list of spans *and* get to stop searching the moment we find a span matching our conditional.
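If you want to see the difference for yourself, here’s a minimal sketch. The watched() helper and the word list are made up for illustration; all they do is record how many items get examined in each case.

```python
examined = []

def watched(items):
    """Yield each item, recording that it was looked at (illustration only)."""
    for item in items:
        examined.append(item)
        yield item

words = ["ant", "bee", "hornet", "wasp", "dragonfly"]

# List comprehension: every word is examined before we can take element [0].
first_from_list = [w for w in watched(words) if len(w) > 4][0]
examined_by_list = len(examined)

del examined[:]  # reset the record

# Generator expression: creating it examines nothing at all.
lazy = (w for w in watched(words) if len(w) > 4)
examined_at_creation = len(examined)

# Pulling a single value stops the scan at the first match.
for first_from_gen in lazy:
    break
examined_by_gen = len(examined)
```

Both versions come up with the same word, but the list comprehension looks at all five words, while the generator only looks at the first three.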

The bad news is that we can’t just write:

   return (d for d in doc.iter('span') if d.get('class') == 'newname')[0].text

If we do, we get an exception:

TypeError: 'generator' object is not subscriptable

We’re trying to index into a generator, but the iterator protocol doesn’t support indexing to a particular member. All you can do is start at the beginning and work through it one at a time.
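Here’s a small sketch of what that means in practice, using a throwaway generator of squares rather than the spans from the example. The exact wording of the error message varies between python versions, so the sketch only captures it.

```python
squares = (n * n for n in range(10))

try:
    squares[3]
except TypeError as error:
    # Generators have no notion of position, so indexing fails.
    message = str(error)

# Reaching the fourth value means stepping past the first three.
for position, value in enumerate(squares):
    if position == 3:
        fourth = value
        break
```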

The good news is that since we want the first one anyway, all we have to do is start iterating through the generator, and stop after grabbing the first item.

   generator = (d for d in doc.iter('span') if d.get('class') == 'newname')
   for each in generator:
       # The function gets returned on the first pass through
       # the loop, forestalling any further processing
       return each.text

But now we’ve lost the terseness of our list comprehension again. All we’ve done is move the if clause out of the for loop and into the generator. Not much of an improvement.

If you understand how generators do their job, though, you can actually maintain the terseness of the original list comprehension version, while enjoying the improved performance of the generator version. Each time a for loop needs another value from a generator, it gets it by calling the generator’s .next() method. So rather than relying on a for loop to go through our iterator for us, we can step through it manually using this method.

   return (d for d in doc.iter('span') if d.get('class') == 'newname').next().text

In python 3 the method is called .__next__() instead of .next(), so this approach isn’t quite compatible across python versions. For python 3, we could use:

   return (d for d in doc.iter('span') if d.get('class') == 'newname').__next__().text

If cross-python compatibility is important to you, or if you would rather not muck around with dunder methods, there is a builtin next() function, added in python 2.6, which takes an iterator and calls the appropriate .next() or .__next__() method on it, returning the result. So now we can write:

   return next(d for d in doc.iter('span') if d.get('class') == 'newname').text

We’ve gotten the results we wanted quickly and efficiently, with no appreciable loss of code clarity.
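One caveat: if nothing matches, next() raises StopIteration, where the list comprehension version raised IndexError. The builtin takes an optional second argument to return instead. Here’s a minimal sketch, with plain dicts standing in for the lxml spans so it runs on its own; first_text() and the sample data are hypothetical.

```python
# Stand-in data: plain dicts play the role of the lxml span elements.
spans = [
    {"class": "header", "text": "Wu-Name Generator"},
    {"class": "newname", "text": "Ultra-Chronic Monstah"},
    {"class": "footer", "text": "Try again"},
]

def first_text(spans, wanted_class):
    """Return the text of the first span with the given class, or None."""
    # The second argument to next() is returned if the generator is empty.
    match = next((s for s in spans if s.get("class") == wanted_class), None)
    return match["text"] if match is not None else None

found = first_text(spans, "newname")
missing = first_text(spans, "nosuchclass")
```

Note the extra parentheses: a generator expression needs its own pair when it isn’t the only argument to a function.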

Now, in this particular case, it’s not a huge deal one way or another. Our bottleneck is going to be pulling the document down from the internet, and the page is fairly short, so we’re not going to be wasting too much time scanning over it. But there are times when this trick can save you quite a bit of time. Imagine scanning through a logfile that hasn’t been placed under log rotation, and has hundreds of megabytes of data in it. Imagine if the processing we were doing on each item in the list comprehension took ten minutes. Or imagine if we were calculating wu-names for a million users, where every extra hundredth of a second spent getting a wu-name translates to nearly three more hours of running time. In any of these cases, knowing when to use iterators, and how to use them effectively, can make a big difference.
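The logfile case might look something like this rough sketch. An io.StringIO object stands in for the huge file here (a real one would come from open()), and the log lines are invented.

```python
import io

# Stand-in for a huge logfile; a real one would come from open(path).
log = io.StringIO(u"INFO boot\nERROR disk full\nINFO retry\nERROR disk full\n")

# File-like objects hand out lines lazily, so next() stops at the first hit.
first_error = next(
    (line.rstrip("\n") for line in log if line.startswith("ERROR")),
    None,
)

# Whatever follows the first match has not been consumed yet.
unread = log.read()
```

With hundreds of megabytes in the file and the first error near the top, almost none of it ever gets scanned.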

Categories: Code

Stalled on Overhead Press

2011/08/24 2 comments

Wednesday, August 24, 2011

Squat: 290 lbs: 5/5/5
Press: 137.5 lbs: 4/4/3
Deadlift: 295 lbs: 1 set of 5.

This morning I stalled on Overhead press for the first time. The weight started feeling really heavy a few workouts back, and my form wasn’t quite on today. When I looked at the video of my third set, it looked like I was pressing the bar about two inches in front of my face, which disadvantages me, leverage-wise. I think I’ll try the same weight again next time, and try to keep going with it.

Squats are still going up, but every set is a struggle. I know I can get squats up to 300 lbs; I’m hoping to pass 315 before I stall. Hell, I’m willing to be surprised by getting to 375 pounds if I can keep going with it. I’ll just take it one squat at a time, and see what happens.

Current numbers for last fully successful set in the progression:

Squat: 290 lbs
Overhead Press: 135 lbs
Bench Press: 175 lbs
Deadlift: 295 lbs
Clean and Jerk: 190 lbs
Snatch: 130 lbs

Press and Deadlift are still 5 and 10 lbs behind my PRs, respectively. My Squat has passed my old PR by about 55 lbs.

I’m thinking of switching up my routine a bit, moving to four days a week, and devoting two of those days to the oly lifts, and the other two to the slow lifts, as recommended by Tamara Cohen.

I’m also thinking of switching out bench press for weighted ring dips. Why? Partly so I can do my workouts even when I don’t have a spotter. Partly because I’m an anti-bench press bigot, I guess.

I’m a little disappointed with how slowly deadlifts are going up. I’m actually thinking of doing them more frequently. I’ve been cautioned about recovery, but so far that hasn’t been an issue. I can always scale back later if it starts fatiguing me too much.

So overall, it would look something like:

Monday

Snatch
Clean and Jerk
Front Squat

Wednesday

Squat
Press
Deadlift

Thursday

Snatch
Clean and Jerk

Saturday

Squat
Ring dips (3x max reps, add weight when I get to 3×10 or 3×12)
Deadlift

Thoughts?

Categories: Fitness

Oly lifting setback and breakthrough.

I’ve been working on my olympic lifts for the last couple months. In July, I set a goal for myself to get my snatch up to 135 lbs. I had one day where I got 125 lbs, and worked pretty hard on 130 lbs, but just couldn’t catch it overhead. Since then, I haven’t been able to break 115. It’s been a little frustrating. It’s frustrating not to be able to hit my PR. It’s frustrating to regress to where I can power snatch more than I can squat snatch.

I’ve been doing a linear progression lifting program (which I’ll detail soon), and watching my squats go up into uncharted territory. My previous max was 235 lbs, and this morning I was moving 255×5 for three sets. My clean and jerk is also moving upward. A couple weeks ago, I struggled to hit 180 for a PR. In competition, I power cleaned 85kg (187), but pressed out the jerk. A week later, I cleaned and jerked 190 lbs.

But my squat snatch is stagnant at 115, even though I can power snatch 120. Today, I was even struggling to get 115. The 8:30 was finishing up their warm ups, and I was still working on it. Brad walked by and said “you’re swinging the bar.” What!? I’d worked on that. I’d gotten a good linear bar path. That was fixed, right? Wrong. The bar kept falling down in front of me, and I couldn’t land it in a comfy overhead squat.

Before now, the main critique I’ve gotten was that I need to be more explosive, so I’d been working on moving faster when I pick up the bar. I think this “correction” was causing me to rush the first pull, and get myself into a bad position for the second pull. So I slowed it all down, got the bar past my knees, and then drove hard from there. On the second try, the bar landed smoothly in the squat, and I drove it up. At that point, I had to leave to catch my bus, so I wasn’t able to really drill it home, but I think it might be the fix to break through my plateau.

Dave also made the very sensible suggestion that I might spend some time drilling assistance lifts. Some snatch pulls might help me work on keeping a vertical bar path, and getting into the right position for the second pull. We also talked about snatch balances, but when I get in the right position, I feel pretty comfortable catching the weight, so for now, I’m going to stick with the pulls.

Whatever I do, I need to solidify this correction, so I can get back to moving the weight up. I can’t wait to get those 45s on the bar.

Categories: Fitness

Notes on Olympic Lifting at PDX Weightlifting

This past week, my beloved and I were given the opportunity to train with Nick Horton and the folks at PDX Weightlifting. Nick was gracious enough to let us lift with them while we were in town, and even spent quite a bit of time teaching us the lifts, and coaching us as we progressed.

He began with the snatch, and on the first day, we did nothing but the snatch for the majority of the time. Once our form started degrading from fatigue, we switched over to working up to a heavy single front squat, and then called it a day.

We started with PVC pipes, doing hang power snatches from the hip. Once we had worked on the hip explosion for a little while, we moved down to the knee, and practiced going back and forth from the hip (first position) to the knee (second position), and back up. Between the two positions, we were told to keep the knee angle the same, and just push our hips back, and our ankles to the point where our shins were vertical. This actually created quite a bit of stretch in my hamstrings. I’ve known that my hamstrings tend to be tight, but I may need to do some mobility work there. Or maybe I just need to keep doing snatches with this technique.

For third position, Nick wanted us to keep our butts down, unlike with deadlifts. As I understood it, the reason was that starting with the butt up (and back closer to flat) made it more likely that the hip explosion would pop the bar forward instead of up, swinging it over a lifter’s head and behind. Edit: Nick tells me that the main reason for this starting position is to make it easier to stay back on the heels throughout the pull.

Once we were lifting from the ground, we moved on to the bar, starting with an empty bar. We moved up in weight slowly, doing several sets of triples at each weight, until Nick “graduated” us to the next weight. If the weight got too heavy, we would back the weight down again, and start working back up at the same methodical pace. I worked up to 50 kg (110 lbs, which is 10 lbs less than my max), and did several singles at that weight.

I didn’t actually back off the weight at any point, but Jayme did. I don’t remember how high she got, but she started psyching herself out, when she realized she was lifting her PR, so Nick told her to take ten pounds off, get her confidence back, and then work back up. After she did that, she was able to lift pretty comfortably at her previous max.

At one point, Jayme asked why Nick never talked about what we’re supposed to be doing with our arms. He said it was a pedagogical strategy. Since you’re supposed to keep your arms relaxed, telling people how to use their arms was often counterproductive, as it just got people focused on their arms, which made them tense them up.

In contrast to the way we do strength training at CFD, it felt very relaxed, and reactive to our performance. Rather than hitting a certain number of reps with rising weight, you just spend some time with each weight until you’re ready to move on. It feels less like you’re trying to get somewhere particular, and more like you’re just spending time with the weights.

Categories: Fitness

Gwibber is…

2010/11/24 3 comments

Just had a minor annoyance this morning, trying to post a status with gwibber. I was dual-posting to Twitter and Facebook. On Twitter, the post read, simply enough:

DC Tweed Ride 2010. I need to step up my game: http://readysetdc.com/2010/11/video-dc-tweed-ride-2010/

On Facebook, however, Gwibber felt the need to add the word “is” to the beginning of my post. “Cliff Dyer is DC Tweed Ride 2010” makes no sense whatsoever.

Read more…

Categories: Code

Submitting links via POST using jquery

Often, when designing interactive websites, you want to have elements that look like links, but submit POST requests, because they modify some data on your website. Maybe it’s a delete link. Maybe it’s a “Like” button. Today, I just stumbled across a dead easy way to do this using jQuery.
Read more…

Categories: Code

DJing at Hot Club of Durham, September 2, 2010

I DJed at Hot Club of Durham last night. More and more people keep coming out to the dances. This was one of the best turn-outs I’ve seen. We had a good sized balboa contingent, and a fair number of blues dancers as well, which made for some interesting challenges DJing. A couple dancers who just moved to North Carolina came in from Greensboro for the night. I really had to keep an eye on the floor the whole time, so I didn’t get to dance much until after my set. I hope I gave everybody enough of what they were looking for.

The Soul mini-set was a complete flop, so I killed it after one song. People filled the floor dancing (lindy hop, no less!) to songs as slow as 73 BPM and the bal dancers were out as fast as 218 BPM. I was surprised to see Billie Holiday’s “Them There Eyes” clear the floor entirely, but the recording was crappier than I realized when I previewed it yesterday. The beat was kind of muddled beyond recognition. Lesson learned. It seemed like everyone was having a fairly good time.

Set list:

Sidney Bechet—Perdido Street Blues (149 BPM)
Andy Kirk—Walkin’ and Swingin’
Artie Shaw—What is this Thing Called Love (193 BPM)
Edgar Hayes—Swingin’ in the Promised Land (218 BPM)
Slim Gaillard—Look-a-There (175 BPM)
Erskine Hawkins—Tuxedo Junction (148 BPM)
Ella Fitzgerald—Let’s Do It (Let’s Fall in Love) (73 BPM)
Sam Cooke—Frankie and Johnny (115 BPM)
King Cole Trio—Straighten Up and Fly Right (160 BPM)
Chick Webb—Who Ya Hunchin’ (187 BPM)
Hot Club de Norvege—J’Attendrai (181 BPM)
Fats Waller—Hold Tight (I Want Some Seafood, Mama) (189 BPM)
Jelly Roll Morton—Georgia Swing
Ella Fitzgerald—Clap Yo’ Hands (134 BPM)
Benny Goodman—Christopher Columbus (160 BPM)
Slim Gaillard—Potatoe Chips (139 BPM)
Wynonie Harris—Keep on Churnin’ (142 BPM)
Louis Armstrong—A Kiss to Build a Dream On (85 BPM)
The Isley Brothers—This Old Heart of Mine (Is Weak for You)
Big Three Trio—I Feel Like Steppin’ Out (151 BPM)
Fats Waller—Serenade for a Wealthy Widow (208 BPM)
(announcements)
Sidney Bechet—Muskrat Ramble (180 BPM)
Lu Watters—Pork and Beans (175 BPM)
Jimmie Lunceford—Le Jazz Hot (144 BPM)
Benny Goodman—Sometimes I’m Happy (132 BPM)
Boilermaker Jazz Band—Sunday (158 BPM)
Billie Holiday—Them There Eyes (194 BPM)
Andy Kirk—A Mellow Bit of Rhythm
Mills Blue Rhythm Band—Keep the Rhythm Going (175 BPM)
Duke Ellington—It Don’t Mean a Thing (If It Ain’t Got That Swing) (211 BPM)
Artie Shaw—Oh! Lady Be Good (209 BPM)
Boilermaker Jazz Band—Minor Swing (203 BPM)

Categories: Dance

My cycling priorities

2010/03/19 1 comment

Here are my priorities as a cyclist:

1. Stay safe.
2. Be courteous.
3. Obey the law.
4. Get where I’m going.

I think 3 and 4 sometimes switch places.

I will try to be courteous to drivers, unless doing so makes me unsafe. This means that at intersections where drivers are likely to turn right, I will pull out and take the lane when possible, even if it slows you down. I don’t mean to be rude, but I’ve had drivers right hook me too many times (no actual collisions yet, thanks be to God).

Sometimes being safe and being courteous means riding through a red light to get out of the way, or to avoid putting myself in a dangerous situation on the other side of the light. I will do this on occasion, but only if absolutely nobody is coming the other way.

Getting where I’m going in a timely manner and obeying traffic laws are important to me, but not as important as the other two. I will not ride in a manner that disrespects or frightens other people, just to get where I’m going.

I haven’t yet figured out my priorities between obeying the law and getting where I’m going. I will sometimes run red lights in empty intersections, but I always look first. I do think it’s important for cyclists to take their responsibilities on the road seriously, as part of claiming our place as a normal part of traffic. My actions and values are sometimes at odds, it seems. Does anyone else wrestle with this dilemma? How do you address it?

Postscript: I will also almost never ride on a sidewalk, because that’s (ironically) an easy way to get run over by cars, who may be pulling out of driveways or parking lots. It also frequently violates the courtesy rule with respect to pedestrians, who always have the right of way.

Categories: Cycling, Sustainability

Psycopg2 has a web site? Sweet!

It just came to my attention that psycopg2, the python driver for the PostgreSQL database, has a website again! For several months (years?) the site was unavailable, except for a plaintext rant about the author’s ire toward Trac. Now it has several pages of nice-looking sphinx documentation. I haven’t delved into it yet, so I don’t know how good it is, but it’s nice to see a professional looking website providing the public presence for the driver that lets me get at my data. It may be technically irrelevant, but it gives me a little more confidence that the software isn’t as hackish as the site used to be.

If you haven’t seen it yet, I recommend hopping on over to their site, http://initd.org/. It’s well worth a look.

Welcome back psycopg2!

Tricks with iterators.

2010/02/25 1 comment

This post is a follow up to Iterators and Iterables Clarified. If you’re not sure how iterables and iterators differ, how to create them, or why you’d care, start there.

OK, so files are iterators which can be exhausted. Once you’ve looped over a file, it’s done. But let’s say I want to implement a file object that can be restarted. There are a couple of things you can do. Probably the simplest technique is to create a wrapper around a file object that rewinds the file each time it gets iterated over:

A rewinding file iterable


    #!/usr/bin/env python

    class RewindingFileIterable(file):
        # python 2 only: subclasses the builtin file type
        def __iter__(self):
            # Rewind to the start every time a new loop begins
            self.seek(0)
            return self

    >>> f = RewindingFileIterable('names.txt')
    >>> for __ in xrange(2):
    ...     for line in f:
    ...         print line,
    ...
    Tom
    Dick
    Muhammad
    Tom
    Dick
    Muhammad
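The file builtin is gone in python 3, so the class above only runs on python 2. Here’s a rough sketch of the same idea for python 3, assuming a wrapper class around any seekable file-like object. RewindingLines is a hypothetical name, and io.StringIO stands in for names.txt.

```python
import io

class RewindingLines(object):
    """Hypothetical wrapper: rewinds a seekable file whenever a loop starts."""

    def __init__(self, fileobj):
        self._fileobj = fileobj

    def __iter__(self):
        # Go back to the beginning, then let the file iterate as usual.
        self._fileobj.seek(0)
        return iter(self._fileobj)

names = io.StringIO(u"Tom\nDick\nMuhammad\n")  # stand-in for open('names.txt')
f = RewindingLines(names)

first_pass = [line.strip() for line in f]
second_pass = [line.strip() for line in f]
```

Both passes see all three names, because each new loop starts by seeking back to the top.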

Unfortunately, this technique has two problems.
Read more…

Categories: Uncategorized