New Features: Alert Management Upgrades

Over the past few months, we’ve been steadily upgrading Instrumental’s alert management features! We’ve improved the overall user interface to make working with alerts faster and easier and added new features, such as:

Real-Time Alert Badges

Throughout the application, both in the menus and on the alert pages, alerts and the red open-alerts badges now update in real time. Read More

Monitoring AWS Lambda with Instrumental

We recently launched our CloudWatch integration, which brings your CloudWatch data into the same monitoring environment as your application metrics, service metrics, and uptime monitoring. Most of our infrastructure is in AWS, so this is a great feature for us; hopefully you’ll like it too!

While developing the CloudWatch integration, we found that AWS Lambda was a great fit for a specific engineering challenge.¹ To quote an earlier post on the Expected Behavior blog about adding Lambda to an existing application:

What we really needed was a system to work many thousands of jobs concurrently, but one that only costs us money when we’re actually using it.

Read More

New Feature: Super-Fast CloudWatch Integration

While Instrumental offers broad support and integrations for application, server, service, and custom monitoring, certain AWS data is only available within the AWS CloudWatch service. Over the past few months, we’ve been testing a deep integration with CloudWatch and are excited to release it to all users.

Like the rest of Instrumental, our CloudWatch integration is designed to be simple, configurable, and lightning-fast. Read More

Amazon Kinesis: the best event queue you’re not using

Instrumental receives a lot of raw data, upwards of 1,000,000 metrics per second. Because of this, we’ve always used an event queue to aggregate the data before we permanently store it.

Before we switched to AWS Kinesis, aggregation worked like this: many processes wrote to AWS Simple Queue Service (SQS), a single one-at-a-time reader aggregated the data and pushed it into a second SQS queue, and multiple readers then stored the data in MongoDB.
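To make the aggregation step concrete, here is a minimal sketch of what a single reader might do with a batch of raw events before handing the result to the next queue. It is not Instrumental's actual code: the event shape (metric name, Unix timestamp, value), the 60-second resolution, and the function name `aggregate` are all assumptions made for illustration, and an in-memory list stands in for the SQS queue.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (metric_name, unix_timestamp_seconds, value).
Event = Tuple[str, int, float]


def aggregate(events: Iterable[Event], resolution: int = 60) -> List[Dict]:
    """Collapse raw events into one record per metric per time bucket.

    `resolution` is the bucket width in seconds (an assumed default of 60).
    """
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for name, timestamp, value in events:
        # Round the timestamp down to the start of its bucket.
        bucket_start = timestamp - (timestamp % resolution)
        bucket = buckets[(name, bucket_start)]
        bucket["count"] += 1
        bucket["sum"] += value
    # Emit records in a stable order: by metric name, then by bucket start.
    return [
        {"metric": name, "time": start, "count": b["count"], "sum": b["sum"]}
        for (name, start), b in sorted(buckets.items())
    ]


# Stand-in for a batch of messages read from the first queue.
raw_events = [
    ("web.requests", 120, 1.0),
    ("web.requests", 150, 2.0),
    ("web.requests", 185, 4.0),
    ("db.queries", 121, 3.0),
]
aggregated = aggregate(raw_events)
```

In this sketch, four raw events become three aggregated records, which is the point of aggregating before storage: the downstream writers handle far fewer items than the readers receive.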

Read More

What to Expect When You’re Expecting Failure

Instrumental is a key piece of infrastructure for many businesses, including Instrumental! We put significant effort into making sure that Instrumental customers can rely on us to be accurate, available, and consistent, but no system is perfect. There are two key components of our approach to reliability:

  • Make it hard to do the wrong thing
  • Assume that everything is going to fail

Example Incident

With that approach in mind, let’s talk about what happened on the 16th of November. Read More

Server & Application Monitoring Pricing Comparison: Instrumental, New Relic, Datadog, Librato, and SignalFx

Confused about the price of an application and server monitoring tool? So were we! Every tool is priced differently, and there are a lot of nuances. We’ll walk you through important terminology differences and the pros and cons of different plans, then discuss the pricing details for Librato, New Relic, SignalFx, Datadog, and, of course, Instrumental. Read More

Best Practices for Deprecating and Removing an API

Why removing an API the right way is important

A few weeks ago, our alert klaxons started blaring (alert notifications – we don’t really have klaxons, but maybe we should). We had a massive spike in failed background jobs. With the scale of data processing at Instrumental, a few minutes of background jobs is potentially hundreds of millions of data points. Read More

Monitoring for Docker, MongoDB, Redis and more!

Today, we’re launching InstrumentalD as a major upgrade to, and replacement for, Instrumental Tools. Since 2011, Instrumental Tools has provided a system metrics daemon and a powerful plugin framework for writing custom service-monitoring scripts.

While the ability to write fully custom service monitoring in a language of your choice is an important feature (and we’re keeping the plugin framework!), InstrumentalD includes out-of-the-box service monitoring for the following:

  1. Docker
  2. MySQL
  3. Memcached
  4. MongoDB
  5. Nginx
  6. PostgreSQL
  7. Redis
  8. (and more to come)

For each service, we’ve selected the critical metrics everyone should be monitoring, and we list each metric sent in the service documentation page. Read More