Category: Application Monitoring

Best Practices for Deprecating and Removing an API

Why removing an API the right way is important

A few weeks ago, our alert klaxons started blaring (alert notifications – we don’t really have klaxons, but maybe we should). We had a massive spike in failed background jobs. With the scale of data processing at Instrumental, a few minutes of background jobs is potentially hundreds of millions of data points.

Why your Monitoring must be Cloud-First

As “The Cloud” continues to assert itself as the primary way companies manage their technology infrastructure, it’s worth asking if the software you need to monitor that infrastructure should change to match. In most cases the answer is yes, but maybe not in the way you’re expecting.

The biggest difference between traditional data center infrastructure and the cloud is flexibility.

How Time Series Data Can Serve More Than One Purpose

Here’s a pro-tip for when you record metrics on Instrumental: anything that can be a gauge should be a gauge.

Why is that? You can use the Instrumental Query Language to get increment data, because we record the number of times we receive your gauge calls.

Let’s say you’re recording the time it takes to complete one request on your server when a user signs up, but you also want to know how often that is happening.
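To make the idea concrete, here is a minimal sketch of why one gauge can answer both questions. It is a toy in-memory model, not Instrumental's agent or query language: the `GaugeStore` class, its methods, the 60-second resolution, and the metric name `signup.request_time` are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GaugeBucket:
    """Everything recorded for one metric in one time window."""
    total: float = 0.0  # sum of the gauge values received
    count: int = 0      # number of gauge calls received


class GaugeStore:
    """Hypothetical store that keeps both the values and the call count."""

    def __init__(self, resolution_seconds=60):
        self.resolution = resolution_seconds
        self.buckets = defaultdict(GaugeBucket)

    def gauge(self, name, value, timestamp):
        # Round the timestamp down to the start of its time window.
        bucket_start = int(timestamp // self.resolution) * self.resolution
        bucket = self.buckets[(name, bucket_start)]
        bucket.total += value
        bucket.count += 1

    def average(self, name, bucket_start):
        # The "gauge" view: how long did a signup request take?
        bucket = self.buckets.get((name, bucket_start))
        if bucket is None or bucket.count == 0:
            return None
        return bucket.total / bucket.count

    def count(self, name, bucket_start):
        # The "increment" view: how many signups happened?
        bucket = self.buckets.get((name, bucket_start))
        return bucket.count if bucket else 0


store = GaugeStore()
store.gauge("signup.request_time", 0.25, timestamp=0)
store.gauge("signup.request_time", 0.75, timestamp=10)
store.gauge("signup.request_time", 0.5, timestamp=70)

print(store.average("signup.request_time", 0))  # 0.5
print(store.count("signup.request_time", 0))    # 2
```

Because every gauge call is counted as well as recorded, a separate increment metric for "signups happened" would only duplicate information the gauge already carries.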

The Super-Fast Quick-Start Guide to Monitoring the Right Things in Your Application

Deciding what to measure is hard, and even daunting at first. There’s a ton of code in your project and you don’t want to just slap a gaggle of useless metrics in there. Your measurement should mean something, dammit! On the other hand, it would be nice if it didn’t take forever to get started :)

Don’t worry – getting started doesn’t have to take forever.

Application Monitoring Is The New Unit Testing

Once upon a time, automated testing was not a popular idea. It was too expensive. It was too time-consuming. At best, it was a nice-to-have.

The prevailing idea was that if you were a good and careful software developer, regressions weren’t a problem. When a regression did happen (rarely, of course), the good and careful software developer that you are would carefully consider why it had happened and make a correction to prevent it from happening again.

Who’s Hammering Your System?

Do you remember that time one user with a rogue script saturated your workers, causing a ton of problems for your system (and your other users!) without even realizing he’d caused a problem?

We ran into a similar issue a while back with our HTML to PDF product, DocRaptor. One of our users was generating way more documents than normal – his test document creation alone accounted for 75% of all documents being generated!

Proving What You Know

It happened again. You were loading the front page of your app and it took 27 seconds. You’ve seen it before, you think, on every second Tuesday, Arbor Day, and right after every new deploy. You’ve looked at the web server log files, application server log files, and your database slow query logs.