The Very First Full 1,042-Member Bachelor Nation Family Tree

I was reeled into the Bachelor universe in the winter of 2010. As a graduate student paying non-Nebraska(!) rent for the first time, I definitely couldn’t fit cable into my budget. Instead, I had a crappy second-hand antenna that could only get the local ABC channel.

When it was time to unwind, I would turn my brain off, adjust my precarious antenna, and let whatever was on ABC wash over me. My escapes from the stress of school were my daily Jeopardy, my Grey’s Anatomy Thursdays, and my Bachelor Mondays.

I had never watched a Bachelor show before, even though it’d been in the cultural zeitgeist since 2002. When I finally tuned in eight years later that fateful January, I had the pleasure of watching 25 ladies (of varying mental stability) compete for the love of airline pilot Jake Pavelka. In a series of questionable decisions, Jake proposed to Vienna in the taped final rose ceremony, only to have a bitter, uncomfortable, on-air post-breakup blowout in the live finale. (BREATHTAKING.) And then, because the franchise upcycles previous contestants to become the next lead, I was instantly invested in the next Bachelorette’s second-chance storyline. It’s at the same time a vicious cycle and a riveting pop culture machine.

How I’ve made it this far as a data enthusiast without doing a Bachelor analysis is a mystery. I think I was intimidated by the amazing analyses already out there from the likes of FiveThirtyEight and The Ringer. What could I possibly add?

As I started roping in Bachelor newbies to drink wine, eat cheese, and enjoy Bachelor Mondays with me, I found myself trying to explain the origin story of each character — whose season they were on, how long they lasted, their off-camera instagram feuds, what crazy occupation they purported to have. And I realized, while there are a number of great meta-analyses and deep dives, there aren’t a lot of person-level interactives that let you explore the entire franchise alumni network, from the first contestant out of the limo to the last final rose.

So, for the first time that I’m aware of, here is an infographic of every single Bachelor cast member of all time, interactive and hover-able for detailed deep-diving. This is what I’ve dubbed, The Bachelor Family Tree. Disclaimer: with a franchise that’s 17 years old and counting, it’ll take a few scrolls.

Creating this, I found myself smiling at the “careers” attributed to the contestants. The word former is often prefixed, presumably because the person quit their job to pursue reality TV or was conveniently already unemployed. With my new Bachelor database, I could explore my curiosity. I decided to break down what career types we see most often, in the interactive below. Go ahead! Type in your job, and see what previous contestants had it. (Only 1 contestant had the word “data” in their occupation. Hello, John Wolfner from Emily Maynard’s season, ya fellow datahead!)

Here, you can definitely see the gender-normative jobs for women (child care, teaching) as well as men (construction, engineering). But there is some unexpected parity. For instance, there is a fairly equal number of male/female lawyers and personal trainers. It’s also interesting to see, overall, what job types are common or uncommon. A lot of people in Sales, which makes sense if you think of a stereotypically outgoing (read: camera ready) salesperson. Not a lot of people in politics, probably because of the exposure downside; hope your tequila company works out, Luke Stone. Younger women are “Students” (because they’re on the show before starting a career), older men are Doctors (because it takes a post-graduate education). It’s a fun albeit skewed sample of the millennial job pool. Funeral director. Jumbotron operator. Italian prince. Not so different from your LinkedIn network, right?

The last thing I thought folks would want to explore is age. Our most recent Bachelorette, Hannah Brown, is the youngest lead ever at age 24. She picked the youngest ever male suitor (Jed Wyatt, age 25). Our oldest ever lead, Byron Velvick (age 40) in a surprising good-for-you move had the oldest cast (average of 31 years) and the oldest female winner (Mary Delgado, age 35). So I thought I’d not only visualize the age distribution of Bachelor Nation, but the ages of contestants relative to their lead for some matchmaking context. 

As I expected, every single Bachelor had a cast on average younger than him. All but three Bachelorettes had a cast older than her. Except for Byron’s season, the average age is predictably between 24 and 29 for both genders of contestants. While this is interesting from a data perspective, it also makes a certain amount of common sense. The number of beautiful, compelling, eligible people (who are also willing to compete on a reality dating show) likely drops off with age.

I could make a dozen more interactives with this data. And I might, later. Much later. This was probably my most intensive project thus far for the blog. Creating a database of all these contestants, organizing a timeline, reading decade-old articles; I felt like Michelle McNamara researching the Golden State Killer but waaaaay more pointless. But I know I learned a lot (one of the O’Connell brothers was the Bachelor, guys! nope, not the one you know!), and I hope my beloved readers enjoy the fruits of my nonsensical labors.

Happy Bach-ing, dataheads.

Dog DNA Data – Comparing the Top Two Testing Companies (a.k.a. my thinly-veiled excuse to write about my pup)

Happy Holidays from a finely decorated Murphy boy.

This January, I adopted Murphy. He is a sweet, spastic, adorable, aggravating, two-year-old mutt.

When I spotted him on the rescue agency’s “adoptable dogs” page (which I had been obsessively monitoring for months), I fell in love with his long spindly spiderlegs; his enormous baby deer eyes; his gorgeous auburn coat. Also, they mentioned that he was found in a field way out in San Bernardino County. My heart both broke and melted.

The agency gave him the name Murphy, and I thought it suited the little guy perfectly. I didn’t overthink it. He’s Murphy. And now he’s mine.

But the agency also gave me something that I just couldn’t sit with unchallenged. They guessed his breed to be an Italian Greyhound / Manchester Terrier mix. But I kept staring at his spindle legs, his doe eyes, his ginger fur. Thinking, “Is that right? Really??” He’s long like a Dachshund. He’s red like a Vizsla. He certainly is fast and skinny like an Italian Greyhound – but I knew that couldn’t be the whole story.

These are Italian Greyhounds (left) and a Manchester Terrier (right) from previous Westminster dog shows. It’s not a bad guess for Murphy, right?

After spending weeks speculating, Google image searching, and trying to stare deep into Murphy’s soul, I decided it needed to be settled once and for all. Enough wheel-spinning. I needed to get his DNA analyzed.

I hear that veterinarians balk at dog DNA tests. They say they’re not reliable, not accurate, not relevant. But to reword the famous quote from statistician George Box: all data is wrong, but some is useful. Having some objective insight is better than none.

But me being me, I couldn’t just stop at one answer. I, for some harebrained and probably wine-driven rationale, decided to buy two dog DNA tests. Even if they’re not reliable, I could see if they at least corroborated each other.

So, after two mouth swabs, two eagerly awaited notification emails, and two emotional roller coasters (yes, I cried), I finally got the Murphy information I had craved since his adoption. Behold below: Murphy’s ancestry!

Turns out, Italian Greyhound was confidently ruled out. Sorry, Murph, you’re just not that fancy. The agency’s original Manchester Terrier guess wasn’t bad, though, since it’s a breed that is related to one of his top breeds, the Miniature Pinscher.

The biggest shocker shouldn’t have shocked me. Of course this dude is a Chihuahua!! I live in Southern California. Chihuahuas make up 30%-40% of all shelter dogs in the state. By pure statistical chance, he’s more likely to be a Chihuahua than anything else. I’m guessing there’s some “adoption gaming” going on with the shelters and rescue agencies. If you can pitch that a dog is unique, he stands out to prospective adopters. If people are reluctant to get “a yippy Chihuahua,” it’s better to present your rescue pup as a more amiable, more exotic breed. I get it. I don’t hate the game. It’s frankly how Murphman caught my eye.

Were these tests worth it? For me, 100%. I thrive on data. I flourish on learning. The DNA test didn’t change anything about Murphy, but it helped explain his various behaviors and instincts (so. much. defensive. barking.). But from a more visceral perspective? I think a lot about what Murphy’s life was like before I got him. Alone and stray in a field. Unneutered at one year old. Found with bite marks on his legs. Based on his reactivity when seeing strange dogs, he was probably neglected and poorly socialized. But these tests were a breath of fresh air. I now have some positive information about his life before me – his parents, his siblings, his genes. It just made me happy.

But assuming you’re not a data hoarder, and you only plan on buying from one company, which test is worth your money? Well, that depends what you want out of it. I put together this handy comparison of what comes in the “results package.”

Topic | Wisdom Panel | Embark
Breed Breakdown | Good if you want to know the *top* breeds. It's more conservative about what breeds it's willing to "guess." | Good if you want lots of fascinating details about other, low-genetic-percentage breeds. Which may be less accurate, but certainly more interesting.
Results Format | Percentages and family tree | Percentages and family tree
Breed Explanations | More useful copy about the breed's typical behaviors...but then again, you can Google that stuff
Community | None that I saw | You can opt to share your results, and find other dogs that have similar genetic profiles as your pup! Eeee! This feature is. the. best.

love love love the community feature in Embark. I don’t know why, but looking at dogs that look like Murphy sets off all my endorphins. I think it’s me trying to find the happiness in Murphy’s past. He has a family. He has relatives. Maybe he has siblings out there I can find. MAYBE. HE’S. A. DAD.

I’m probably (absolutely) overly preoccupied with my dog. These tests are me injecting my obsession; writing this post is me feeding off the rush. I recognize my problem. Thanks for indulging me, dear readers.

If you have questions about these tests, I’d love to share my experience. Comment below!

Here are Murphy’s family tree results from Wisdom Panel. Lots of Chihuahuas and Min Pins.


And here’s the family tree from Embark. Both pick up Chihuahua and Min Pin “branches,” but this one throws a Poodle and a Pom in the mix!


Embark shares the profiles of other dogs who have DNA like yours. And they all seem to be from Southern California, too. HI THERE GOOD DOGS.



I Viz For Beto

It’s almost Halloween, dataheads. And I came across quite the fright as I scrolled through Twitter today.

I got a promoted tweet for Texas Senate candidate Beto O’Rourke, soliciting campaign donations. And this data viz? Insert scary movie score here. Take a look.

What grinds my gears about this is that it violates a data visualization tenet: never be visually misleading. But here, it both misrepresents Beto’s position in the race and confounds his call-to-action for donations.

Why? Well, Beto’s “rainbow” takes up more real estate on the graphic than Cruz’s. It has more volume, more length, and a longer arc. My brain looks at the picture, and before it has a chance to process the labels at the ends of the arcs (46% vs 50%), I might conclude that Beto has overtaken Cruz. So not only is it an inaccurate rendering, but it might actually make me think Beto doesn’t need my ten bucks.

Yes, I see that Cruz’s arc is more “complete,” I guess, than Beto’s (and I do like, visually, stacking Beto on top of Cruz to give a sense of victory). But this is marketing disguised as data.

The point of data visualization is that it capitalizes on our brain’s “pre-attentive attributes.” If designed well, when we look at an image, our eyes go immediately to the pattern we’re supposed to recognize — or the discrepancy we’re supposed to notice.



So I played around with a couple different options for getting Beto’s message across while still representing the data truthfully and compellingly. I tried a few options with donut charts (which is what the original designer used, essentially). But differences in angle just aren’t that powerful to the eye, in my opinion. That’s why I’m loath to use pie charts.

My first attempt. Not in love with it.


But then I realized, you can do some subtle things with font sizes and colors to make Beto stick out, without distorting the chart itself. In fact, simplifying the chart could actually make it more powerful!

Getting rid of the unnecessary curvature effect actually makes this better. And it gave me more options for playing around with the label size and colors. My eye can go to Beto (because he’s the important one in this ad), but my eye can also see that gap in the polling numbers.

Any other thoughts on ways to improve Beto’s message? Anyone else want to toss $10 Beto’s way? Comment away, dataheads.


My Former Role Model is a Fraud. What Does That Make Me?

[Editor’s Note: This post deviates from my typical form. It’s purely an essay, no data viz. Thanks for reading as I try out something new.]

I first saw Elizabeth Holmes at a small health data conference in 2013. I went to these kinds of events often for work, and the typical entrepreneurs I encountered were “health app” developers. They gamified your health goals, used your camera to track food, synced your music to your running pace. Most were flashy software products that, to me, didn’t do anything.

Theranos, the biotech company Elizabeth dropped out of Stanford to found at age 19, seemed totally different.

I first encountered Elizabeth Holmes, founder of Theranos, when she spoke at the pictured data conference I attended in 2013. Photo by Christopher Farber, WIRED.

Theranos wasn’t just 0’s and 1’s on a smartphone. It was a company creating actual physical machines – a miniature laboratory device that could operate off of just a drop of blood from your fingertip (rather than a typical blood draw, with a needle in your vein). This technology promised to significantly reduce costs by automating previously human-intensive lab work, and had the potential to improve access to critical test results in places where laboratory testing facilities weren’t easy to find. Elizabeth even talked about having these tiny units in people’s homes, where a person could log longitudinal blood samples to predict disease onset and prevent adverse trajectories (be still, my data-loving heart!).

I was attending the conference on behalf of the cancer hospital I worked at, and as I sat in the audience, I was thinking about the dozens of blood draws cancer patients endure during their course of treatment. All the trips to the central laboratory while suffering from illness. The potential for these tests to detect cancer sooner. Theranos could completely revolutionize the patient experience. I could see it. The conference attendees, myself included, threw around the word “disruption” with so much certainty that it was basically in the past tense.

Unfortunately, it was all a sham. But we didn’t know that yet.

A Recovering Impostor Myself

I was 26 at the time – just three years younger than Elizabeth herself. I was earnest and optimistic and uncynical. Medical devices and laboratory testing weren’t areas I was knowledgeable about, but I was still captivated by Elizabeth.

Here was a woman who was roughly my age, shared my gender, and frankly kind of looked like me, putting forward such confidence in her ideas. She had made a name for herself as a healthcare entrepreneur, in a Silicon Valley world composed mostly of men who were mostly designing iPhone games. Honestly, if she had taken my business card that day and called me with a job offer, I probably would have moved cross-country to take it.

I suppose I was vulnerable to a leader like Elizabeth. I was living in New York City, 1200 miles from my home state of Nebraska. The day I moved, I had a meltdown alone in a U-haul, stuck in post-Yankee game traffic, unable to find a parking spot (doy, self). I felt in over my head in all aspects of life. Navigating the city, learning a new job, forging an identity for myself outside of being a student. When you’re in school, especially when you have math-heavy coursework, you always get clear feedback. You have the right answer, or you don’t. In the real world, I felt doubt and uncertainty about whether I was doing a good job.

One morning on my subway commute, I was reading Tina Fey’s book, Bossypants. That’s when I learned a new term: “impostor syndrome.” Essentially, a person experiencing impostor syndrome has deeply ingrained insecurity. She believes she’s a “fraud” despite her actual competence or past successes. These beliefs persist even for a woman with a proven track record, because she tells herself that past successes were lucky, that competence on a future task is uncertain. Impostor syndrome is most often described in a professional context, but it can extend to all sorts of environments and events.

I was blown away that Tina Fey – this beautiful, hilarious, successful, universally-admired woman – had persistent self-doubts. When I gathered the courage to talk about my own insecurities with female friends and colleagues, almost everyone related. Impostor syndrome was ubiquitous, even among women I knew personally as self-composed and successful.

With time, experience, and lots of happy hours venting with female confidants, I became a person who could psych myself up with little pep talks. I would try to remember that I wasn’t the only one who felt like I was just guessing what to do next, that others aren’t doubting me like I doubted myself, that I shouldn’t self-deprecate when I earned a success. Now at 30-something years old, I can say simple sentences like, “Yes, I’m excellent with Tableau software,” and not feel like a total fraud or an egomaniac.

I followed Elizabeth over the years, admiring her success and self-composure – especially as a fellow woman, my age, in my field.

Despite my own progress, I still felt frustrated that my male counterparts rarely succumbed to impostor syndrome – rather, if they did feel similar inner doubts, men were more likely to still exude confidence and strength. (Yeah, all that Sheryl Sandberg Lean In goodness.) To this day, I go to data conferences and hear executives talk about how amazing their “data guys” are, which completely stokes my insecurity as a female in a male-dominated field.

In those moments, I turned to female role models who had the confidence and strength and thick skin that I strove to find within myself, especially females making headway in typically “male” professions. Tina Fey is one of those people. So is Sheryl Sandberg. And up until about 18 months ago, Elizabeth Holmes was another.

The True Theranos

Flash forward to today. I’m one-third of the way through John Carreyrou’s book Bad Blood: Secrets and Lies in a Silicon Valley Startup, about the toxic workplace culture and (probably criminal) recklessness that Theranos undertook with Elizabeth at its helm. Hell, I’m not even to 2013 yet in the book’s timeline – the year I saw Elizabeth – and I am totally overwhelmed by the stories of Theranos overpromising results, mistreating employees, and fostering a paranoid and secretive culture.

But it’s not just that Theranos was talking a big talk. They moved forward with actual shoddy implementations: using the faulty device in pharmaceutical clinical trials, collecting samples from patients suffering from SARS in third world countries, providing blood test results in actual clinics. The devices would frequently return an error message due to malfunction, and were much more finicky than the investor demo suggested. It’s like HQ Trivia having technical difficulties during a live game, except it’s your blood going to waste and your medical results being delayed.

But that admiration is long gone. Last week, the federal government indicted Elizabeth for fraud.

But even when it could generate a result, the Theranos device was untrustworthy. Theranos never validated that their tests were accurate. I want to pause on this fact. This isn’t just regulatory red tape or methodological hand-wringing. Wrong results affect people’s lives, and affect the medical decisions they make. I just read a section of the book where a man received results from Theranos indicating that he almost certainly had prostate cancer. But the results were totally wrong; a second test done by an outside lab revealed he was, in fact, at very low risk. That kind of error went unexamined and unfixed at Theranos. It’s sickening.

And I’m not even to 2013 in the book, the year Elizabeth was headlining major conferences around the world.

While I haven’t gotten to the later chapters yet, I already know some “spoilers” from following Theranos over the years. In 2016, the federal government enacted sanctions against Elizabeth, banning her from owning or operating any clinical laboratory for two years for failing multiple lab inspections. Last week, Elizabeth was indicted in U.S. federal court for fraud and could face jail time.

My Gut Reaction

I’ve been absorbed in Theranos news now more than ever. But I’m having difficulty putting my finger on what is so gut-wrenching to me about these recent charges.

Maybe I feel relief because it seems like such a close call, like I could have been wrapped up in it myself. Maybe it’s because I thought I admired Elizabeth, and it sucks watching a hero turn out to be a con. For a minute I worried the un-nameable feeling was schadenfreude – but honestly, every bone in my body wanted to see Theranos succeed and to see Elizabeth rise as a healthcare cult hero.

Well then, what is this feeling I’m experiencing? I think it may be rooted in my attempts to put impostor syndrome behind me. Yes, in an effort to overcome self-doubt, I’ve definitely portrayed more confidence than I truly felt inside. Yes, I’ve promised I could deliver something that wasn’t strictly at my fingertips, but I took a risk that I could figure it out. I’ve stood up for my ideas, even when I was scared I’d get shot down.  

The news about Theranos has made me feel a bit like … if Elizabeth was actually an impostor, then maybe I’m actually an impostor. Elizabeth portrayed more confidence about the product than reality, she promised results that weren’t ready, she defended her ideas against critics. And she is literally being charged with fraud. Are we so different?

Embracing My Inner Impostor Voice

Earlier this year, I was sending out a pretty boring internal monthly report. (By far the least glamorous part of my day job.) But I messed up. I made a pretty glaring mistake in my code, and I didn’t catch it before sending it along to physician leads at each of our 14 hospitals. That night, after I had already gone home and settled on my couch to watch an episode of Grey’s Anatomy, an email from one of the report recipients made me realize my error.

I felt like total garbage. I felt like I had disappointed my clients and my boss. I wanted to crawl under a desk and hide. In reality, all I had to do was reissue the reports the next morning (with an explanation, an apology, and a heads up to my boss), but I was so afraid that people would see through me.

Sure enough, that voice was back. “You’re not a real analyst. Even the most basic programmers would have double-checked the output before sending. Good luck getting people to trust your work.”

Maybe the fact that my impostor voice persists, even after all these years, is what separates me from the Elizabeths of the world. I don’t love that this voice makes me feel doubt and insecurity. But maybe, when wielded in small and constructive doses, it can be a force for good. As Richard Webber’s character in Grey’s once advised to a cold-footed colleague on her wedding day: “Overwhelming doubt is a problem. A little doubt is the sign of an intelligent adult.” And maybe no doubt at all is problematic, too.

Here’s what I’m thinking. Instead of letting my voice tell me, “You’re not qualified to do this big important project,” I can respond to it with, “Ah, this project is really big and important, let’s see what I can do and how I can contribute.” Instead of steamrolling with a false confidence, maybe my voice will remind me, “This is going to affect other people, are you sure you don’t want to talk about the caveats, or ask for help?” Maybe when I get too focused on my failures, I tell that voice, “People are OK if you can’t deliver perfection 100% of the time – but you do have to admit your mistakes and learn from them.”

To completely squelch the voice of doubt would make me a sociopath. To live symbiotically with the voice would probably make me more mature and effective.

Letting Elizabeth go as a role model also makes me wonder if I should revisit my role model roster altogether. Instead of looking up to celebrities, I should look up to my brilliant coworker who is thoughtful, creative, and killing it in her department. Instead of admiring a stranger, I should admire my friend who is a fearless negotiator. I should be grateful for my mentors in previous jobs, who took chances on me by giving me responsibilities that I didn’t even know I could handle. That is, in moving away from the celebrity role models of my 20’s, I can make room for these everyday heroes I’m lucky to know.

But I swear, if Tina Fey turns out to be a felon? I might snap.

A Complete Taxonomy of Lifetime Movies

I’ve gone down the Lifetime Movie rabbit hole. In a manner not unlike so many Lifetime female protagonists obsessed with solving the case, uncovering a conman (or conwoman!), and/or surviving being trapped under a pool cover for an entire holiday weekend.

Honestly, I’ve always enjoyed the plot audacity and general digestibility of a Lifetime movie. They’re not even a guilty pleasure, because I don’t feel guilty for watching. Then, a few weeks ago, my friend told me he worked on a recently-premiered Lifetime movie (A Dangerous Date) — AND, he and two other friends are pitching ideas to the network for new original movies. This got my data juices flowing. The rabbit hole was dug.

I wanted to use data to help my friends become successful Lifetime movie creators. (And I fully expect to be credited in the end-scroll.) Primarily, I wondered:

  1. What are the general “kinds” of Lifetime movies? Is the network interested in expanding existing genres, or putting a new spin into an under-explored area?
  2. What Lifetime movies are considered “successes”? Is it a property that’s generally well-reviewed? Or a property that’s notorious, regardless of what critics say?
  3. Can math help me decide what Lifetime movie to watch this weekend?

Enter, my Complete Taxonomy of Lifetime Movies, based on every single property in the March 2018 Lifetime Movie Club streaming catalog:

This is a network diagram that’s inspired by a poster I’ve seen by Wine Folly, that diagrams all the major types of wine. I like that it’s basically a qualitative arrangement of the topics, the simple purpose being visual entertainment. That is, as opposed to “true” network diagrams that use statistics to determine the size of each circle (“node”), the relationship/distance between circles (“tie”), and all that good nerdy stuff. (Here’s an awesome overview from a Violence Reduction Network presentation about using network analysis to quantify gang dynamics, which crime-fighters can apply to intervene on high-risk and high-influence nodes [based on their “centrality” to the network]. I AM GEEKING OUT HARD. Note that my Lifetime work is not even close to that level of statistical rigor or subject matter importance.)

I like this Taxonomy as an exploratory tool. An introduction to the Lifetime universe. One of my objectives of this analysis was to help my friends identify opportunities. So let’s see. It seems like stalking movies are a big area of interest, but that there aren’t many female-being-stalked-by-an-unknown-person stories (usually the stalker is known to the protagonist). Or, it seems like stories about Traumatic Violence, like wrongly-accused-child-abuse and victim-escaping-domestic-violence, are actually pretty well-received; it’s a heavier subject area than, say, a woman being stalked by the man who saved her from a shark attack. But a good writing team could handle it.

While the Taxonomy is a good way to organize all these individual movies into one big picture, I wanted to play around with some other ways to represent this information. Mostly because I needed some guidance on which movie to stream this weekend. But also, it could help my friend pitch to the business folks. Let me show you what I mean:

Here’s professional me at the Lifetime pitch. “I realize stalking movies are your biggest genre. Among these movies, regardless of how well each was reviewed, only 300 people on average are generating reviews. Whereas Fight-For-Survival movies by far get the most viewers, even though it’s a smaller genre and generally less well-reviewed. I want to combine the strength of stalking movies with the draw of survival movies with [I’m making this up but I love it] our new movie where a woman and her female stalker neighbor must fight together for survival.”

Here’s lazy me on Sunday afternoon. “Oooh, the most-viewed Teen Drama movie stars Jenna Dewan Tatum? I’m in for Fab Five: The Texas Cheerleading Scandal.” [Cue me on the couch with a bloody mary for the next 90 minutes.]

Right now, this analysis only includes the 105 movies in the Lifetime Movie Club streaming catalog. Lifetime has A TON more properties than what’s listed here. So let’s consider the streaming catalog my “training set” for the Taxonomy, and hopefully down the road I can get access to additional Lifetime movie data; the idea is that the categories that emerged from the “training set” should persist even when new datapoints are added. For example, the Lifetime movie we watched in high school health ed class would fall into Mental Health-Compassionate Drama-Eating Disorder-Mother Helps Child Overcome; my friend’s movie, A Dangerous Date, would fall into Crime-Deceit-Protagonist Victim. I hope the taxonomy holds up, but I also would fully welcome a genre-defying datapoint.

Enjoy the shows, Dataheads.


College Football Bowls: Vizzed & Power Ranked

I usually love College Football Bowl season. It’s a magical time at the end of the regular season where strange inter-conference match-ups happen, where talented athletes actually get to rest for a couple weeks so they really shine, and where sponsors insist on longer- and longer-named bowl titles (one of this year’s longest and most ridiculous: the Cheribundi Tart Cherry Boca Raton Bowl … it’s apparently a juice?). There is even a bowl that proclaims it has become sentient, with an amazing Twitter account and Reddit presence; I applaud you and miss you, dear sweet Belk Bowl.

This year, however, was bittersweet. Nebraska’s losing record made us bowl-ineligible for only the third time in 48 years. (Whoa, I just realized that stat corresponds to our season record of 4-8; damn you, former head coach Mike Riley, have you been trolling us?). Most of us Husker fans entertained ourselves by bandwagoning onto our newly-hired head coach’s current team. But you know what else gave me holiday cheer?

Data visualizing, baby! This past week, I created my own dataset about each of the 40 bowl games that have occurred since December 16th; the only one left, by the way, is the Championship Game between Georgia and Alabama in two days. (GO DWAGS.)

I wanted to represent every game in an easy-to-digest, one-page snapshot. As part of this, I thought it would help to sort the bowls so the “most entertaining” were first. Of course one cannot simply eyeball the games to make this determination. There’s no math (and, therefore, no fun) in that. So, I made a simple algorithm to calculate what I call the Entertainment Value Ranking. More on that later.

You dataheads out there are probably wondering what goes into the Entertainment Value Ranking (henceforth: EVR) that I calculated. I chose four metrics that I could calculate based on the scoring data I collected. The factors I included were:

  • Total Scoring = Team 1 Final Score + Team 2 Final Score. All things being equal, I enjoy higher-scoring games more.
  • Final Score Difference = |Team 1 Final Score – Team 2 Final Score|, i.e., the margin of victory regardless of which team won. The closer the final score, the more entertaining it is. This also helps to downweight games like the Cheribundi Tart Cherry Boca Raton Bowl, where FAU laid down a solid 50 in a blowout against Akron (who only scored 3).
  • Lead Change Events: Number of Times the Leader Status Changes. If one team is ahead, and the other team ties it up, that’s one event. If one team takes the lead from the other, that’s one event. I just counted these and summed ’em up.
  • Lead Lost: If Losing Team Was Ever Ahead, Amount of That Lead. It’s fun to see a come-from-behind situation. So this helps upweight those games.
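For the code-curious, here is a minimal Python sketch of how these four factors could be computed from a game's running score. It is not necessarily how the original dataset was built: the `game_metrics` function, the timeline format, and the sample scores are all hypothetical. It also assumes that the first score of the game (which breaks the opening 0-0 tie) does not count as a lead change event, while every later change in who is ahead or tied does.

```python
def leader(score1, score2):
    """Return which side is ahead: 'team1', 'team2', or 'tie'."""
    if score1 > score2:
        return "team1"
    if score2 > score1:
        return "team2"
    return "tie"


def game_metrics(timeline):
    """Compute the four EVR inputs from a running score.

    timeline: list of (team1_score, team2_score) tuples, one per scoring
    play, in game order. This input format is an assumption.
    """
    final1, final2 = timeline[-1]
    total_scoring = final1 + final2
    score_difference = abs(final1 - final2)

    # Count changes in leader status (team1 ahead / team2 ahead / tie).
    # Assumption: the first score, which breaks the 0-0 tie, is not counted.
    lead_change_events = 0
    status = "tie"
    first_lead_taken = False
    for score1, score2 in timeline:
        new_status = leader(score1, score2)
        if new_status != status:
            if first_lead_taken:
                lead_change_events += 1
            first_lead_taken = True
            status = new_status

    # Lead lost: the biggest lead the eventual loser ever held (0 if none).
    if final1 == final2:
        lead_lost = 0
    elif final1 > final2:
        lead_lost = max(0, max(s2 - s1 for s1, s2 in timeline))
    else:
        lead_lost = max(0, max(s1 - s2 for s1, s2 in timeline))

    return {
        "total_scoring": total_scoring,
        "score_difference": score_difference,
        "lead_change_events": lead_change_events,
        "lead_lost": lead_lost,
    }


# Made-up game: team1 scores, team2 ties, team2 leads, team1 ties, team1 wins.
example = [(7, 0), (7, 7), (7, 14), (14, 14), (21, 14)]
print(game_metrics(example))
# {'total_scoring': 35, 'score_difference': 7, 'lead_change_events': 4, 'lead_lost': 7}
```

If you would rather count the opening score as an event too, start `first_lead_taken` at `True`.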

Because these are each on different scales, I normalized each of the raw values (i.e., put them all onto a 0–1 scale), added them up, then ranked those sums. And, based on the handful of games that I watched, I think it actually does a good job representing which games were competitive and fun to watch. The Rose Bowl, which was a double-overtime shootout, should of course be #1. I swear it’s just a happy coincidence that my sweetheart Belk Bowl came in at a solid #2. I was a little surprised to see the Peach Bowl come in at a low-seeming #10; adding in a Scott Frost “bump” was tempting, but ultimately seemed self-pandering.
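The normalize-sum-rank step can be sketched in a few lines of Python. Treat this as one plausible reading rather than the exact calculation: it assumes min-max scaling, and it assumes the score difference is flipped after scaling so that closer games earn more credit. The `evr_ranking` function and the three placeholder games are hypothetical and are not real bowl results.

```python
def min_max(values):
    """Scale a list of numbers onto 0-1 (all zeros if every value is equal)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def evr_ranking(games):
    """Return (names ranked best-first, score per game).

    games: dict of name -> dict with the four raw factors.
    """
    names = list(games)
    total = min_max([games[n]["total_scoring"] for n in names])
    # Assumption: a smaller margin is better, so invert the scaled value.
    closeness = [1.0 - m for m in min_max([games[n]["score_difference"] for n in names])]
    changes = min_max([games[n]["lead_change_events"] for n in names])
    lost = min_max([games[n]["lead_lost"] for n in names])

    scores = {
        name: total[i] + closeness[i] + changes[i] + lost[i]
        for i, name in enumerate(names)
    }
    ranked = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranked, scores


# Placeholder games with invented numbers, for illustration only.
games = {
    "Bowl A": {"total_scoring": 70, "score_difference": 3, "lead_change_events": 6, "lead_lost": 10},
    "Bowl B": {"total_scoring": 53, "score_difference": 47, "lead_change_events": 0, "lead_lost": 0},
    "Bowl C": {"total_scoring": 45, "score_difference": 7, "lead_change_events": 2, "lead_lost": 14},
}
ranked, scores = evr_ranking(games)
print(ranked)  # ['Bowl A', 'Bowl C', 'Bowl B']
```

Because every factor lands on the same 0-1 scale, each one carries equal weight in the sum; weighting one factor more heavily would just mean multiplying its scaled values before adding.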

In the spirit of transparency, here’s a table showing how each bowl’s EVR was derived:

Last, I need to give a shout-out to Tableau blogger Chris Demartini, whose viz on golf tournaments taught me how to create step lines, and inspired the “small multiples” chart layout. Awesome work, dude.

What do you think, dataheads? Any other factors you would incorporate into an EVR? Any thoughts on bowl game sentience? Comment away.


Game of Throwin’s: An Attempt to Optimize My Fantasy Football Roster

For being such a self-proclaimed data geek, I am surprisingly inept at fantasy football. Since 2012, I’ve been in the same ESPN Fantasy League, named “Game of Throwin’s” (applaud our commissioner Megan on this amazing pun). My cumulative record is an abysmal 24-43-3.

As a believer in statistics, I’ve always used the projections to optimize my roster – i.e., for drafting and setting my weekly lineup. I start the season off with a decent team. My starters generally outscore my bench. But I realized that, basically, once my starters get injured or underperform, I’m scraping the bottom of the waiver wire barrel to fill my roster. Ergo, my team always gets worse as the season progresses.

This week, I’m trying something new. I’m going to attempt a trade!!!! I’ve never actually initiated one before. Never? It’s true! I blame this on the ESPN app’s subpar information design. Fantasy apps in general do not do a dang thing to suggest trades that might be beneficial. You just have these lists of names and projection numbers. There’s no way to look at all the information, all at once, in a meaningful way. It’s impossible for me to (1) see which of my positions are strong vs. weak, and (2) simultaneously see which of my league-mates have complementary strengths/weaknesses.

So I took our Game of Throwin’s rosters and joined them with publicly available fantasy football statistics, using data as of Week 4. My hope was that “seeing” it all in one place would help me more easily evaluate potential win-wins (and, ultimately, propose a trade that my league-mate would actually accept).
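A join like this can be sketched in plain Python. This is only an illustration under stated assumptions: the field names (`team`, `player`, `position`, `points_per_game`), the `join_rosters_with_stats` helper, and the placeholder rows are all hypothetical, and it assumes player names match exactly between the two sources.

```python
def join_rosters_with_stats(rosters, stats):
    """Attach stats to every roster row, keyed on player name.

    rosters: list of dicts with 'team' and 'player'.
    stats:   dict of player name -> dict with 'position' and 'points_per_game'.
    Players with no matching stats are kept, with None in the stat fields,
    so that gaps in the data stay visible.
    """
    joined = []
    for row in rosters:
        player_stats = stats.get(row["player"], {})
        joined.append({
            "team": row["team"],
            "player": row["player"],
            "position": player_stats.get("position"),
            "points_per_game": player_stats.get("points_per_game"),
        })
    return joined


# Placeholder data, for illustration only.
rosters = [
    {"team": "Team A", "player": "Player 1"},
    {"team": "Team A", "player": "Player 2"},
    {"team": "Team B", "player": "Player 3"},
]
stats = {
    "Player 1": {"position": "QB", "points_per_game": 18.5},
    "Player 3": {"position": "RB", "points_per_game": 12.0},
}
for row in join_rosters_with_stats(rosters, stats):
    print(row)
```

Keeping unmatched players (rather than dropping them) makes it easy to spot name mismatches between the league export and the stats source.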

But this viz started to make me curious. How is it possible that Breitbart Starr has the highest-performing roster in the league, but hasn’t won a matchup yet? Is it poor starting lineup management? How come Zeke Is Free looks average, but is undefeated? (And is en route to beating me this week…) Perhaps it’s more important that you have high-scoring stars who drive your starting lineup, versus a stockpile of better-than-average players, one-third of whom just sit on the bench anyway?

I played around with the data a bit more. I took all of the players who appear on our rosters and, by points-per-game, assigned each to a “scoring quartile.” So, at one end, the top 25% of players with the highest points-per-game will be in the Upper Quartile; at the other end, the lowest 25% will be in the Lower Quartile. That’s what you see here:

Then, I looked to see how each team stacked up in terms of how many high-performing players they had. Maybe this is a better predictor of our league standings?
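For anyone who wants to try the same thing, here is a rough Python sketch of the quartile assignment and the per-team tally. It assumes a simple rank-based split (sort by points-per-game, then cut the list into four equal-as-possible chunks), which may differ from how a tool like Tableau computes quartiles, and ties are broken by sort order. The names for the two middle quartiles, the function names, and the placeholder players are all made up for illustration.

```python
from collections import Counter

LABELS = [
    "Upper Quartile",
    "Upper-Middle Quartile",   # hypothetical label for the second quarter
    "Lower-Middle Quartile",   # hypothetical label for the third quarter
    "Lower Quartile",
]


def assign_quartiles(players):
    """Map each player name to a scoring quartile by points-per-game rank."""
    ranked = sorted(players, key=lambda p: p["points_per_game"], reverse=True)
    n = len(ranked)
    return {p["player"]: LABELS[i * 4 // n] for i, p in enumerate(ranked)}


def upper_quartile_counts(players):
    """Count how many Upper Quartile players each team rosters."""
    quartiles = assign_quartiles(players)
    return Counter(
        p["team"] for p in players if quartiles[p["player"]] == "Upper Quartile"
    )


# Placeholder league of eight players, for illustration only.
players = [
    {"team": "Team X", "player": "P1", "points_per_game": 20},
    {"team": "Team X", "player": "P2", "points_per_game": 18},
    {"team": "Team Y", "player": "P3", "points_per_game": 15},
    {"team": "Team X", "player": "P4", "points_per_game": 12},
    {"team": "Team Y", "player": "P5", "points_per_game": 10},
    {"team": "Team Y", "player": "P6", "points_per_game": 8},
    {"team": "Team X", "player": "P7", "points_per_game": 5},
    {"team": "Team Y", "player": "P8", "points_per_game": 3},
]
print(upper_quartile_counts(players))  # Counter({'Team X': 2})
```

Comparing these counts against the league standings is then just a matter of lining up two columns.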

Welp. Inconclusive, huh?

OK, I do think I can make sense out of these analyses. I see the trend. Fantasy football performance is a freakin’ crapshoot. Maybe record and effort are unrelated. (Hey, Dana. Rationalize much?)

Readers, I promise to venture into the wild west of trade proposals. ESPN’s app didn’t make this easy, so I hope my data viz efforts pay off.

Any other ideas how I can use data and visualization to improve my performance? ANYTHING? PLEASE??? Comment below, y’all.


The Grog Log Blog

Ever been to the Tonga Hut in North Hollywood? It’s the perfect tiki-bar-meets-dive-bar. The bartenders are always amazing. There’s a guy who sells tacos out of a tent in the parking lot out back.

But the absolute best thing about the Tonga Hut is the Grog Log. Here’s the story behind it. There’s this classic recipe book of tiki drinks called “Beachbum Berry’s Grog Log” (I highly recommend buying it – both for the recipes and the amazing historical sidebars). Based on this book, the Tonga Hut selected its own list of 78 classic tiki drinks, and issued to its patrons a challenge: anyone can start their own Grog Log, and whoever can finish all 78 drinks within one year gets to put up a plaque. Forever memorialized in the San Fernando Valley.

Obviously, I have my own Grog Log – which, to give you a visual, is just a sheet of paper, with my name on it, where the bartenders check off the drink when you order it. It’s kept in a plastic sheet protector, in a plastic file bin, on the bartop. Check it out, in all its glory (right).

But the one thing that’s been bugging me about the Grog Log – it only lists the name of the drink. Not what’s in it, or how boozy it is, or whether it will come to me on fire, or other meaningful questions in my decision analysis. While it’s fun to roll the dice – suuuuuure – I decided to create a Tableau data viz to help me explore the unknown.

So, I purchased the Beachbum Berry book, and turned this beautiful Grog Log into beautiful DATA!! Note that the viz works a little better on a desktop because of its size and its interactivity – but mobile can get the job done if that’s what ya got.


I like this explorer view for deciding whether or not I want any individual drink. But it doesn’t necessarily help in the decision-making process when faced with a list of potential options.

So I made a second data viz, with some Q&A features to help my fellow Grog Loggers narrow down their choices. My favorite feature is the Patrick Baker Button. This is a special shoutout to my buddy, who not only is the person who ushered in the Grog Log to our circle of friends, and not only is the person farthest along in his Grog Log, but who is also the birthday boy!!! He turns some indeterminate age today, and the Patrick Baker Button is my gift. (People like getting data for gifts, right? That’s a thing?)

One thing to note: You can see from these vizzes, roughly, the proportion of each ingredient. Sure, you can get out a ruler and measure the diameters of each circle to get the ratios. But out of respect for the Beachbum Berry author, I didn’t publish the value of the actual amounts. And I didn’t include any info on how the drink is prepared (e.g., whether it’s blended, what kind of glass to serve in, and so on), partly to respect his hard research but also partly because that seemed really time-consuming and I have a full-time job, people. So for real, buy this tiki book. And buy yourself a volcano bowl. And throw your friends a party because, at some point, you won’t be able to afford going out because you’ve spent all your money at the Tonga Hut. Oh, and invite me!

Enjoy loggin’ them grogs, dataheads.


P.S. Grog Loggers who beat the challenge get to hang a plaque “of your own making.” If I ever make it there, I’m about 90% sure I’ll label it as “Bob Loblaw’s Grog Log.” Any other suggestions? Or someone willing to talk me out of it?


Quasquicentennial Makeover

You’ve probably never heard of Lynch, Nebraska, but it’s a special place. For those readers not intimately familiar with rural Midwestern geography, Lynch is a farm town in northeastern Nebraska with a population of 234. That’s a metropolis compared to the next-closest town – Monowi, Neb., population 1. (That “1” is an amazing lady named Elsie who, naturally, runs the town bar. Monowi has actually received a surprising volume of popular media coverage.)

Reason #1 To Love Lynch: My dad is from there. And Papa Barnes rocks.

Reason #2: While Lynch High School alumni are now scattered all across the world – Nebraska and beyond – they’ve maintained such palpable pride in their roots. The town explodes every June for the annual Lynch High Alumni Weekend. I believe my dad graduated in a class of fewer than 10 people. And almost all come back, with their families in tow, for the yearly festivities.

Alumni Weekend typically entails a golf tournament, a town hall dinner, plentiful pool time, and ample cheap domestic beer. But 2017 is gonna be a doozy. This year is the town’s quasquicentennial. Lynch turns 125! In a truly commendable branding move, the town has dubbed it the (much more pronounceable) Lynch Q125.

The Q125 organizers created a Facebook page, where someone had posted a flyer with the weekend’s itinerary. The flyer is flush with thorough, detailed information, with the event meticulously laid out in an eye-catching and shareable format. I imagine it took the author a tremendous amount of time to research, confirm, and format all the items. Truly, kudos to the author.

But in my information designer heart, I knew something important was missing. This is such hard-earned, highly valuable information. I knew a different design could truly do it justice.

Here’s the makeover I came up with:

An ideal information design would effectively communicate key Q125 info – and maybe even increase attendance. I want to see Lynch packed to the proverbial gills this Father’s Day weekend (though now that I think about it, the added competition in the horseshoe tourney is not ideal for my prize-winning aspirations…).

You might be interested in a little insight on my thought process and the design tradeoffs I made. Here goes:

Text
  • Original: Font and color variations, meant to grab attention and convey a sense of spiritedness, make it difficult to process the key event information quickly...
  • DataDana Makeover: ...whereas one font and a simple color scheme (Lynch High orange and black - Go Eagles!) improve readability.

Graphics
  • Original: The graphics are fun and help break up text...
  • DataDana Makeover: ...but reducing and simplifying the graphics make the flyer cleaner, and make the Lynch logo (which in my opinion is gorgeous) more prominent.

Layout
  • Original: Two-column format, similar to a newspaper or magazine, is good for narrative text, but...
  • DataDana Makeover: ...a calendar format lends itself better as a visual guide, conveying events that cross multiple days as well as the sequence within a day.

Social Media
  • Original: The original flyer was posted on the Facebook page...
  • DataDana Makeover: ...but adding a footer with the Facebook info could help improve traffic to the page, as folks share and print this flyer outside of the Facebook platform.

Reppin' Local Bands
  • Original: The original includes the logos for two local bands playing Friday and Saturday. I wavered a lot on whether to keep these, because (1) I'm sure the bands appreciate (or maybe even asked for) the advertising, and (2) it's more likely to catch the eye of people who are fans of the band...
  • DataDana Makeover: ...but tradeoffs are sometimes needed. I prioritized event information readability and comprehension, targeted to a wide audience. But if these bands are keynote events, or if there was a strong need for the band-specific banners, then I'd have to take another whack at how to incorporate this.

I’m also willing to share a couple of my handy tricks. First off, I did this all in PowerPoint (seriously!), so don’t feel like you need fancy software to produce something cool. Second, the simple graphics come from my go-to logo library, (Attribution time! Motorcycle by Edward Boatman, Golf by Hopkins, Flags by Aldric Rodriguez)

My final handy trick? I think Lynch is great, and my heart yearns for this Q125 to be the best ever. So putting your heart into these things is always a major plus.

Before (Original)

After (DataDana Redesign)

Who else is coming? In which events would you like to challenge me? Any other design ideas or event poster inspiration? Use the comments below and tell me what ya think.


The Meat Photo: An Interactive

Back when I was a lowly college sophomore, my dormmates and I had an innocent courtyard grill-out. The evidence, beautifully captured by Facebook photo auteur Tad. Each of us who partook was tagged with our respective offerings. Mine, a veggie skewer a.k.a. a Dana-kebab.

Faster than we could drop my skewer through the grill bars (why didn’t we grill perpendicular?!), this photo thread got out of control. Since this fateful day – May 6, 2007 – The Meat Photo has continually haunted my Facebook notifications and newsfeed.

On its first couple days online, somehow (hypothesis: I think it was finals week), we went bananas on the comments. In the ensuing weeks, literally hundreds of comments were added. Mostly uncouth euphemisms that I won’t repeat here.

I should put this in perspective. Today, getting hundreds of interactions on something you post isn’t uncommon. But this totally wasn’t the norm at the time. To illustrate, let me tell you what Facebook was like in May 2007:

  • My URL was still bookmarked to
  • Whenever I wrote a status, it had to follow the syntax “Dana is…”
  • There were no “like” buttons
  • Even just a few months prior, the homepage used to only display your friends’ birthdays and your outstanding pokes
  • I still received an email alert whenever anyone wrote on my wall or commented on a photo of me


…OK, let me level with you. I pretend to be irritated by this photo. But the truth is, I love it the most. Because to this day, every few months, someone re-ups the thread; which unleashes a flurry of comments from all involved; then it goes dormant for another six months until the cycle repeats. This has gone on for nearly a decade. This photo is practically folklore.

The most recent comment storm called out some ground-shaking information. As the commenter pointed out, this week is the ten-year anniversary of this stupid Facebook post!!!! And there’s even talk of a reunion, bringing together people from Nebraska to California to both Washingtons. You guys. This would rule.

So, in honor of this momentous occasion, I bring to you: a too-deep deep dive into the ridiculousness that is The Meat Photo.

This data raises important questions. First, we need to talk about why nobody posts at 2:00 p.m., even though presumably we’re all awake; are you all hanging out without me? Second, did George Clinton really comment on this thread in October 2012?

And most importantly…can we hit 1,000 posts before this pic’s tenth birthday? Whoever posts the 1,000th post, you’re getting an entire pitcher of Elk Creek Water, DataDana’s treat.