Goodbye to Google+

03/20/2019 Leave a comment

Google+ is officially shutting down for consumers on 4/2/2019, and I have surprisingly mixed feelings about it as a product and a codebase, having worked to re-write it as an engineer at Google in 2014.


“Looks like you’ve reached the end” – Indeed we have, G+

I’d never been an active G+ user: I found the Circles concept confusing and clunky and the feature set too large.  I’ve always viewed Twitter as the ideal, time-based news feed.  When Polar was acquired by Google and we joined G+, we began an exciting project to overhaul G+ with product focus and a delightful, mobile-first newsfeed experience.

Internally, we re-wrote a Java monolith using a new isomorphic (client/server-side) web component framework and micro-services.  We designed and built mobile-first creating fast, responsive experiences that worked on both small phones and desktop displays.  We built on the success of Google Photos to embed, analyze, and render content in the news feed beautifully.

Re-writing an application from scratch is generally not the right solution to your problems and is often fraught with peril.  Our thesis was to make a faster, mobile-first implementation using a new internal framework and to simplify the product by cutting features.  Both of these were essential prerequisites to executing a successful re-write.  We also had the infrastructure to build and validate our prototype and then migrate the application URL path-by-path from the legacy monolith into the new system.  We let early adopters beta test the new system.  This was a technical success, though not without toil and setbacks.

I’ll never forget the full, public launch in the small hours of the night when we ramped production traffic to 100% from the legacy version of G+ to the new one, only to have our backends overloaded and latency spike.  After 24 hours of urgent debugging, we found a browser toolbar sending high volumes of traffic from all over the world to load a formerly minimal page, just to scrape the unread post count for a user.  In the re-write, this page contained an expensive, fully rendered news feed.  With the query cost for that page dramatically higher, the traffic overloaded our newsfeed backends.  We blocked the traffic and re-launched successfully.

Another time, we experienced a production outage in which users were served a blank, white page.  The feature flag/experiment system had a bug that served users the HTML markup for one variant and the CSS for another, leaving no visible content on the page.  Everything looked healthy as far as our automation was concerned, and it took manual bug reports to realize what had happened.

I will always remember fondly the teams building G+ and the expertise, idealism, and excitement of the staff in the Social product area.

I won’t miss Google+ in my life, but I am proud of what we built and how we built it.

Categories: Uncategorized

A Cargo Cult of Personality

03/31/2016 Comments off

Leaders and entrepreneurs often justify bad behavior and tyrannical management practices by invoking the mythology of Steve Jobs.  This is a “Cargo Cult of Personality”, where people emulate the mythologized and unconnected personality traits of successful people.

Richard Feynman spoke about Cargo Cults:

In the South Seas there is a cargo cult of people. During the war they saw airplanes land with lots of good materials, and they want the same thing to happen now. So they’ve arranged to imitate things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas—he’s the controller—and they wait for the airplanes to land.

They’re doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn’t work. No airplanes land. So I call these things cargo cults because they follow all the apparent precepts and forms, but they’re missing something essential, because the planes don’t land.

It’s easier to learn from failure than from success when it comes to management, business, or technology.  When someone fails publicly, it’s far easier to trace the causality from what they did or didn’t do back to important principles.  When people succeed publicly, we see only the highlight reel of their longer struggle and journey, not the behind-the-scenes footage.  It’s easy to misconstrue the most prominent features and myths about people or companies in the glow of their success.

When it comes to Steve Jobs, it seems he learned to be a better manager and person slowly over his career, only putting the pieces together at the end, when he rejoined Apple as CEO.  Steve Jobs was successful in spite of being an asshole, not because of it.

I see so many entrepreneurs and business leaders thinking that a tyrannical style of micromanagement or ruthless pursuit of products over people will deliver success.  The most recent stories about Tony Fadell only seem to show this happening once again.  It’s not how you run a business or manage people; it’s a cargo cult of personality.

Categories: Management, Startups

Building Timelapse Aquarium Video

02/26/2014 Comments off

Aquarium Dropcam

My wife Kathleen and I keep a 29 gallon mixed tropical reef aquarium. For her birthday, I bought her a Dropcam webcam so we could both watch our reef when we’re away from home. Dropcam provides a live, hosted video stream and also has APIs for other data, like still snapshots.

Tropical reefs are teeming with life, though much of it moves at dramatically slower timescales than we do. For example, when a coral captures food it can take 30 minutes to open its mouth, swallow the food, and digest a single bite. Snails, sea urchins, and other invertebrates slowly crawl the sand and glass in our tank.

Looking at a single picture of a tropical reef, it’s easy to miss how active the reef is. To show the activity, I wanted to create timelapse video of our aquarium.

Aquarium Timelapse for January 2014

This video distills the entire month of January from 2014 into a few minutes of video. At this scale, you can see our reef is teeming with life.

Dropcam Tools hacking

I started by writing a toolkit that captures a snapshot from our Dropcam every 10 minutes as a still image. Then, I wrote tools to stitch these stills together into timelapse video using ffmpeg.

You can find the source code on Github:

https://github.com/wpeterson/dropcam-tools
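In outline, the capture-and-stitch pipeline looks something like the sketch below. The snapshot URL is a placeholder and the helper names are mine, not the toolkit’s; it only illustrates the approach of saving numbered stills and handing them to ffmpeg.

```ruby
require 'open-uri'
require 'fileutils'

SNAPSHOT_URL = 'https://example.com/dropcam/snapshot.jpg' # placeholder, not a real endpoint
FRAME_DIR    = 'frames'

# Save one still with a zero-padded sequence number so ffmpeg can
# consume the frames in order.
def capture_snapshot(index)
  FileUtils.mkdir_p(FRAME_DIR)
  File.binwrite(File.join(FRAME_DIR, format('frame-%06d.jpg', index)),
                URI.open(SNAPSHOT_URL).read)
end

# Build the ffmpeg invocation that stitches the numbered stills into video.
def stitch_command(fps: 30, output: 'timelapse.mp4')
  "ffmpeg -framerate #{fps} -i #{FRAME_DIR}/frame-%06d.jpg " \
  "-c:v libx264 -pix_fmt yuv420p #{output}"
end
```

Run the capture helper on a 10-minute cron, then execute the stitch command once you have a month of frames.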

Categories: Aquarium, Open Source

Emoji Gem 1.0 Release

02/12/2014 Comments off

Emoji Gem:

Earlier this week, we were proud to release the complete 1.0 version of the emoji gem on RubyGems.org for public use.

What is Emoji?

Emoji are cartoonish icon characters, distributed as a UTF-8 font, like this heart: ❤. Emoji usage began in Japan on mobile phones, but has grown in popularity throughout the internet over the last 3-4 years. Unfortunately, emoji character support is incomplete in many places (like the Chrome browser). Users of Polar Polls easily add emoji on their iPhones, and we wanted to support them. However, we needed a way to render emoji on web platforms without native support.

Enter the emoji gem. The emoji gem bundles a comprehensive emoji index and a complete emoji image library from Phantom Open Emoji. This gem provides a fast and simple Ruby translation between UTF-8, moji characters, and image representations. Additionally, this gem implements a Rails engine that serves up the Emoji image library when replacing characters with images. This gem works in all Ruby interpreters, but will load a native-optimized string engine if it’s available (everywhere but JRuby).
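The core translation is easy to sketch in plain Ruby. This is an illustration of the approach, not the gem’s actual API: a hypothetical two-entry index stands in for the full one, and a substitution pass swaps known characters for image tags like those served by the Rails engine.

```ruby
# A tiny stand-in for the gem's emoji index: name => UTF-8 character.
EMOJI_INDEX = {
  'heart' => "\u2764",
  'smile' => "\u{1F604}",
}

# Replace each known emoji character with an <img> tag pointing at the
# bundled image library (the /emoji path is an assumed mount point).
def replace_moji_with_images(string, image_path: '/emoji')
  EMOJI_INDEX.reduce(string) do |str, (name, moji)|
    str.gsub(moji, %(<img class="emoji" src="#{image_path}/#{name}.png" alt="#{name}">))
  end
end
```

Strings without any known emoji pass through unchanged, so the substitution can run on all rendered text.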

Backstory

This project began after-hours at the Burlington Ruby Conference in August of 2013. I was having dinner with Steve Klabnik and a handful of other developers at Farmhouse, when Steve mentioned his plans for building an emoji gem. We had struggled with web support for emoji at Polar Polls after enabling emoji support in our iOS app. I exchanged information with Steve and hoped to work together on building a solution.

After the conference, I worked for several weeks putting the first working version together. Steve and I roughed out an outline for the architecture and tools. Shortly after, we opened up the gem to review/contribution from other folks. We were lucky to get several pull requests and I polished a version that Polar Polls could use in production. Over the last 6 months, we’ve collected a handful of fixes and performance tuning to prepare for our 1.0 release.

Thank You!

I’ve been a huge beneficiary of open source software and the community supporting Ruby/Rails. I’m glad to have a chance to build something useful and give back to the community. None of this would have been possible without the following folks:

  • @steveklabnik – for the idea, securing rights to the “emoji” gem name, guidance, and publicity for the project
  • @ryan-orr – for transferring the emoji gem account for our new project
  • @mikowitz – for contributing code and adding Ruby String extensions.
  • @semanticart – for contributing code and expanding Ruby version support.
  • @parndt – for improving our docs and README.

ActiveModel Type Coercion and API Validation

02/03/2013 1 comment

ActiveModel Type Coercion

ORMs built on ActiveModel, like ActiveRecord, look up the data types for model values and apply automatic type coercion and casting to user data. Data coming from web requests as JSON or query parameters is usually cast from Strings into other datatypes, implicitly, during assignment. This is a helpful feature 99% of the time, but it can be frustrating when invalid values are silently discarded, especially if you’re trying to use model-level validations for API validation.

Optional API Value Validation

As we built the API for App Cloud, we put together rules for handling JSON input and validating data. For optional user input values, we decided that if a user provided invalid data, we should return a validation error vs. silently discarding invalid values. For example, if there was a “duration” value that was an Integer number of days, a value of “foo” should produce an error. However, the default behavior is to silently discard a value like “foo” assigned to a numeric field.

We know this has happened when the coerced value is nil but there is some user data present before the coercion. There’s an example Validator below implementing this logic.

A Coerced Value Validator

Digging into the source code, ActiveModel first assigns the value for a field ‘duration’ into an attribute ‘duration_before_type_cast’, which preserves the raw value before coercing the input based on the type of the attribute. Using this, we can mix in validations for models in our API that will mark the record invalid if data is silently discarded during type coercion. Before this, providing ‘foo’ as an integer value resulted in nil but no errors. Now, we’ll capture the error on this input.

class CoercedValueValidator < ActiveModel::Validator
  def validate(record)
    # Find the attributes whose declared types get numeric/date coercion.
    coercible_attrs = record.keys.select do |_name, key|
      [Integer, Float, Date, Time].include?(key.type)
    end.map(&:first)

    # If raw input was present but the coerced value is nil, the input
    # was silently discarded -- mark the record invalid instead.
    coercible_attrs.each do |attr|
      if record.send("#{attr}_before_type_cast").present? && record.send(attr).nil?
        record.errors.add(attr, 'is invalid')
      end
    end
  end
end
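To see the before/after semantics without a full Rails app, here is a self-contained sketch with a hypothetical stand-in record: the raw value survives as ‘duration_before_type_cast’, coercion failure leaves the typed reader nil, and the validator’s check flags exactly that combination.

```ruby
# Hypothetical stand-in for an ActiveModel record with one Integer field.
class FakeRecord
  attr_reader :duration, :duration_before_type_cast, :errors

  def initialize(raw_duration)
    @duration_before_type_cast = raw_duration
    # Mimic type coercion: non-numeric input is silently discarded as nil.
    @duration = begin
      Integer(raw_duration)
    rescue ArgumentError, TypeError
      nil
    end
    @errors = []
  end

  # The same check as the validator: raw input present, coerced value nil.
  def validate_coerced_values
    if !duration_before_type_cast.to_s.empty? && duration.nil?
      @errors << 'duration is invalid'
    end
    self
  end
end
```

FakeRecord.new('foo') coerces to nil but keeps ‘foo’ in duration_before_type_cast, so validation records an error; FakeRecord.new('30') passes clean.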
Categories: Database, Rails

Monitoring Redis Replication in Nagios

12/13/2012 Comments off

I’ve put together the following Nagios plugin for monitoring Redis slave server replication to ensure replication is successful and the lag is within a reasonable time limit:

#!/usr/bin/env ruby
require 'optparse'
options  = {}
required = [:warning, :critical, :host]

parser   = OptionParser.new do |opts|
  opts.banner = "Usage: check_redis_replication [options]"
  opts.on("-h", "--host redishost", "The hostname of the redis slave") do |h|
    options[:host] = h
  end
  opts.on("-w", "--warning percentage", "Warning threshold") do |w|
    options[:warning] = w
  end
  opts.on("-c", "--critical critical", "Critical threshold") do |c|
    options[:critical] = c
  end
end
parser.parse!
abort parser.to_s if !required.all? { |k| options.has_key?(k) }

# Query the slave named by --host; an empty grep result means replication
# info is missing entirely, which we report as -1 (ERROR).
raw = `redis-cli -h #{options[:host]} info | grep master_last_io_seconds_ago`
master_last_io_seconds_ago = raw.empty? ? -1 : raw.split(':').last.to_i

status = :ok
if master_last_io_seconds_ago < 0 || master_last_io_seconds_ago >= options[:critical].to_i
  status = :critical
elsif master_last_io_seconds_ago >= options[:warning].to_i
  status = :warning
end

status_detail = master_last_io_seconds_ago == -1 ? 'ERROR' : "#{master_last_io_seconds_ago}s"
puts "#{status.to_s.upcase} - replication lag: #{status_detail}"

if status == :critical
  exit(2)
elsif status == :warning
  exit(1)
end

This can be wired up in Nagios with an NRPE remote execution check:

define service {
    name                            redis_replication
    register                        1
    check_command                   check_nrpe!check_redis_replication!$HOSTNAME$ 100 250
    service_description             Redis Replication
    hostgroup_name                  redis_slave
}
Categories: Operations

MongoDB Indexing, count(), and unique validations

11/10/2012 Comments off

Slow Queries in MongoDB

I rebuilt the database tier powering App Cloud earlier this week and uncovered some performance problems caused by slow queries. As usual, two or three were caused by missing indexes and were easily fixed by adding index coverage. MongoDB has decent index functionality for most use cases.

Investigating Slow count() queries

Unfortunately, I noticed a variety of slow, complex count() queries like:

{ 
  count: "users", 
  query: { 
    email: "bob@company.com", 
    _id: { $ne: ObjectId('509e83e132a5752f5f000001') }
  }, 
  fields: null 
}

Investigating our users collection, I saw a proper index on _id and email. Unfortunately, MongoDB can’t use indexes properly for count() operations. That’s a serious drawback, but not one I can change.

Where were these odd looking queries coming from? Why would we be looking for a user with a given email but NOT a given id?

The culprit was the uniqueness validation on the email key of the User document (and many other models). Whenever a User is created or updated, ActiveModel verifies there are no other Users with the given email:

class User
  include MongoMapper::Document

  key :email, String, unique: true
end

Use the Source!

Why is a unique validation triggering this type of count() query? Within Rails 3.x, this functionality is handled by the UniquenessValidator#validate_each implementation, which checks for records using the model’s exists?() query:

  finder_class.unscoped.where(relation).exists?

The exists?() method is a convention in both ActiveRecord and MongoMapper, checking for any records within the given scope. MongoMapper delegates its querying capability to the Plucky gem, where we can find the exists?() implementation using count():

  def exists?(query_options={})
    !count(query_options).zero?
  end

Root Cause and a Patch to work-around MongoMapper/Plucky

In SQL, using count() is a nice way to check for the existence of records. Unfortunately, since MongoDB won’t use indexes properly for count(), this incurs a big performance hit on large collections.

I added a MongoMapper patch to work-around the issue. We can patch the exists?() method to use find_one() without any fields instead of the expensive count() path:

module MongoMapper
  module Plugins
    module Querying
      module ClassMethods
        # Performance Hack: count() operations can't use indexes properly.
        # Use find() instead of count() for faster queries via indexes.
        def exists?(query_options={})
          !!only(:_id).find_one(query_options)
        end
      end
    end
  end 
end