Durations critical to the application's success.
DB Operation Time
The amount of time that has passed between when the application code issued a query to the database and when it received a result.
Database drivers typically provide a callback mechanism that lets you "hook" into query execution. Otherwise, measure wherever your code interacts directly with the database.
Database health, schema and query efficiency, system health
MySQL Insert Query Time
Postgres Select Query Time
Mongo Upsert Document Time
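When the driver offers no timing hook, wrapping the call site is one option. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the `timed` helper, the `timings` dictionary, and the metric names are hypothetical and would be replaced by whatever metrics client you use.

```python
import sqlite3
import time
from contextlib import contextmanager

# Hypothetical in-memory sink: metric name -> list of durations in seconds.
timings = {}

@contextmanager
def timed(metric_name):
    """Record how long the enclosed block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(metric_name, []).append(time.perf_counter() - start)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Time from issuing the query until the result (or commit) comes back.
with timed("sqlite.insert_query_time"):
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    conn.commit()

with timed("sqlite.select_query_time"):
    rows = conn.execute("SELECT name FROM users").fetchall()
```

Keeping one metric per operation type (insert, select, upsert) makes it easier to tell a slow query apart from a generally unhealthy database.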
External Service Response Time
The amount of time required to interact with an external service (Facebook, Twitter, Twilio, Mandrill, etc.)
At the points of interaction with those services; "find your friends", "send an email", "page this number", etc.
External system reliability, third party induced bottlenecks
Facebook API Friend Fetch Time
Twilio Phone Call Time
Twitter Make Post Time
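A decorator is one way to time every point of interaction with a third party without touching the calling code. This sketch makes no real network call: `fetch_friends` is a hypothetical stand-in that sleeps briefly, and `measure` and `recorded` are illustrative names, not part of any SDK.

```python
import functools
import time

# Hypothetical sink: (metric name, duration in seconds, succeeded).
recorded = []

def measure(metric_name):
    """Decorator that records the wall-clock duration of each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            ok = False
            try:
                result = func(*args, **kwargs)
                ok = True
                return result
            finally:
                # Failed calls are recorded too; slow failures matter.
                recorded.append((metric_name, time.perf_counter() - start, ok))
        return wrapper
    return decorator

@measure("facebook.friend_fetch_time")
def fetch_friends(user_id):
    # Stand-in for a real HTTP request to the external service.
    time.sleep(0.01)
    return ["friend-1", "friend-2"]

friends = fetch_friends(42)
```

Recording the success flag alongside the duration helps separate "the service is slow" from "the service is timing out".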
Full Request Fulfillment Time
How long it takes your service to fulfill one user's request. This measurement represents the total user-perceived time for a piece of data to be sent to your system and acknowledged.
At the edge of user interaction with your service. For a website, this would generally be in the browser; for an API service where you're not able to capture at the call site, you could capture this at the forward edge of your service (load balancer, Apache / Nginx).
System health, quality of service fulfillment
Overall System Latency
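If neither the browser nor the load balancer is available for instrumentation, the nearest point you control is the application's own entry point. The sketch below assumes a WSGI application; `RequestTimer`, `hello_app`, and `samples` are hypothetical names. It only sees server-side time, so it underestimates what the user perceives.

```python
import time

class RequestTimer:
    """WSGI middleware that reports how long each request took to serve."""

    def __init__(self, app, record):
        self.app = app
        self.record = record  # callable(path, seconds)

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            # Materialize the body so the timing covers response generation.
            # This buffers the response, which is acceptable for a sketch.
            return list(self.app(environ, start_response))
        finally:
            self.record(environ.get("PATH_INFO", ""), time.perf_counter() - start)

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

samples = []
timed_app = RequestTimer(hello_app, lambda path, secs: samples.append((path, secs)))

# Simulate one request without starting a server.
body = timed_app({"PATH_INFO": "/hello"}, lambda status, headers: None)
```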
Queue Processing Time
The amount of time that a particular item (job, data to be written, etc.) has spent in a system queue.
The queue itself, if it supports such instrumentation; alternatively, measure it at the output / queue processor point.
Worker efficiency, queue capacities
Time to Process an Email Job from the queue
Time to Convert an MPEG from the incoming movie queue
Time load balancer waits to connect to application server
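When the queue itself exposes no instrumentation, measuring at the processor usually means stamping each item on the way in. This is a minimal sketch using Python's in-process `queue.Queue`; `enqueue`, `process_next`, and `wait_times` are hypothetical names. `time.monotonic()` is only comparable within one process, so a queue shared across machines would need a wall-clock timestamp instead, with the clock-skew caveats that implies.

```python
import queue
import time

jobs = queue.Queue()
wait_times = []  # seconds each processed job spent waiting in the queue

def enqueue(payload):
    # Stamp the item with its enqueue time so the consumer can compute the wait.
    jobs.put((time.monotonic(), payload))

def process_next():
    enqueued_at, payload = jobs.get()
    wait_times.append(time.monotonic() - enqueued_at)
    jobs.task_done()
    return payload

enqueue("send welcome email")
enqueue("convert movie")
first = process_next()
```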
Time to Receive Message
The amount of time that passed between the system issuing a message (email, notification, etc.) and when the recipient acknowledged its receipt.
If you control both the sender and receiver (for example, iOS apps), you can measure this at the receiver. Alternatively, you can send test messages through this pipeline to a receiver you control to estimate overall time to receive a message.
Minimum expected user response time to communication, message subsystem efficiency, third party reliability
Time until a sent email was received
Time to receive an iOS Notification
Time to receive an Android Notification
Time to Write to Disk
The amount of time required to write some payload to disk (including fsync / confirmed write time).
After the fsync / flush / disk confirmation call
Contention for the disk resource, disk performance over time
Time to write an SQLite cache db
Time to write a dictionary file
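To include the confirmed-write time, the clock has to stop after the fsync call rather than after the write call returns. A minimal sketch follows, assuming a POSIX-like filesystem where `os.fsync` asks the OS to commit the file to the device; `timed_durable_write` and the file name are hypothetical.

```python
import os
import tempfile
import time

def timed_durable_write(path, payload):
    """Write payload to path and return seconds elapsed through fsync."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the data to the device
        # Stop the clock only after the disk confirmation call returns.
        elapsed = time.perf_counter() - start
    return elapsed

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "dictionary.dat")
    elapsed = timed_durable_write(target, b"x" * 4096)
    size = os.path.getsize(target)
```

Stopping the timer before `fsync` would mostly measure the speed of the OS page cache rather than the disk.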
Lock Acquisition Time
How long a given process waits before it can acquire a lock on a shared resource.
Around the locking call (whether that be a local lock, e.g. @synchronized, or a distributed lock, e.g. ZooKeeper)
Lock efficiency, potential gains of work splitting, effects of contention
Time to acquire a lock on a user's Mailbox
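Measuring "around the locking call" means timing only the wait to acquire, not the work done while holding the lock. The sketch below uses a local `threading.Lock`; `mailbox_lock`, `with_mailbox_lock`, and `acquire_times` are hypothetical names, and a distributed lock would be wrapped the same way around its acquire call.

```python
import threading
import time

mailbox_lock = threading.Lock()
acquire_times = []  # seconds each caller waited for the lock

def with_mailbox_lock(work):
    start = time.perf_counter()
    mailbox_lock.acquire()
    # Record only the wait, not the time spent holding the lock.
    acquire_times.append(time.perf_counter() - start)
    try:
        return work()
    finally:
        mailbox_lock.release()

def hold_lock_briefly():
    with_mailbox_lock(lambda: time.sleep(0.05))

# A second thread holds the lock so the main thread is likely to contend for it.
holder = threading.Thread(target=hold_lock_briefly)
holder.start()
time.sleep(0.01)  # give the holder a chance to take the lock first
result = with_mailbox_lock(lambda: "delivered")
holder.join()
```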
The amount of time between issuing a new deploy and when the deploy is successfully running in production
Deploy scripts (Capistrano, Chef) or CI server
Minimum crisis response time
Time to deploy
Library Interaction Time
The amount of time required to interact with an external library
Calls to external libraries (GZip, FFMPEG, PhantomJS)
Library efficiency, cost of external framework integration
Time to compress a payload as GZip
Time to stitch together multiple images using ImageMagick
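Timing a library call is usually a matter of bracketing the call itself. A minimal sketch with Python's standard `gzip` module follows; the payload is made-up sample data and `elapsed` is where a real metrics client would be fed.

```python
import gzip
import time

# Made-up, highly repetitive sample payload.
payload = b"metrics " * 10_000

start = time.perf_counter()
compressed = gzip.compress(payload)
elapsed = time.perf_counter() - start  # seconds spent inside the library call
```

Recording the input size alongside the duration makes the numbers comparable across payloads of different sizes.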
Time To Load
The amount of time to go from a cold state (unloaded) to a fully responsive state
Web pages can measure this via DOM events (window onload, DOMContentLoaded); native applications can use view lifecycle methods to capture these points.
Overhead of included resources and initialization
Time until the window onload event occurred
Time until the DOMContentLoaded event occurred
Time until an iOS view's viewDidLoad message was received
The age of a given piece of data in the system. Usually such objects are ephemeral, such as a session, message or job.
Typically requires calling into your queue system (for queue messages) or into the relevant cache or DB
Efficiency of objects being processed or cleaned up
Age of the "next" job in a queue
Age of the oldest active session
Average age of current open sessions
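Ages are computed from a stored creation timestamp rather than from a running timer. The sketch below assumes the session store can return creation times as Unix timestamps; `session_ages` and the sample data are hypothetical.

```python
import time

def session_ages(created_at_by_session, now=None):
    """Return the oldest and average age, in seconds, of the given sessions."""
    now = time.time() if now is None else now
    ages = [now - created for created in created_at_by_session.values()]
    if not ages:
        return {"oldest": 0.0, "average": 0.0}
    return {"oldest": max(ages), "average": sum(ages) / len(ages)}

# Sample data: session id -> creation time (Unix seconds).
sessions = {"a": 1000.0, "b": 1040.0, "c": 1050.0}
stats = session_ages(sessions, now=1060.0)
# stats == {"oldest": 60.0, "average": 30.0}
```

Passing `now` explicitly keeps the calculation testable and ensures every age in one report uses the same reference time.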