http://carlos.bueno.org/optimization/
'The Mature Optimization Handbook' is about monitoring and profiling continuous systems. Reasonable ideas but not particularly dense.
The performance problem definition must be falsifiable
Use performance measurements to try to falsify the theory, not just confirm it
A measurement is a number obtained during some profiling event
Metadata are attributes of the system or profiling event
A sample is a collection of measurements and metadata relating to a single event
A metric is a statement about a set of samples, typically an aggregation
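
The four definitions above can be sketched as a small data model. This is a minimal illustration, not the handbook's own code: the `Sample` class, the `median_metric` helper, and the `walltime_us` and `endpoint` field names are all made up for the example.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Sample:
    """One profiling event: its measurements (numbers) and metadata (attributes)."""
    measurements: dict[str, int]                             # e.g. {"walltime_us": 1200}
    metadata: dict[str, str] = field(default_factory=dict)   # e.g. {"endpoint": "/home"}


def median_metric(samples, name):
    """A metric: an aggregate statement about a set of samples (here, a median)."""
    return median(s.measurements[name] for s in samples)


# Hypothetical samples, one per profiled request.
samples = [
    Sample({"walltime_us": 1200}, {"endpoint": "/home"}),
    Sample({"walltime_us": 900}, {"endpoint": "/home"}),
    Sample({"walltime_us": 4100}, {"endpoint": "/search"}),
]
median_walltime = median_metric(samples, "walltime_us")
```

The median is only one possible aggregation; a count, mean or percentile would fit the same shape.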
Continuous systems need to be measured in production
Store recent samples in a flat table in RAM, so we can ask unexpected questions
Store old samples as aggregated metrics, to save space
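
A rough sketch of the two storage tiers above, assuming samples are plain dicts and that old samples are rolled up per minute and per endpoint. The `SampleStore` class, its field names and the choice of rollup key are assumptions made for illustration only.

```python
from collections import deque


class SampleStore:
    """Keeps recent raw samples in a flat in-memory table; rolls older ones up."""

    def __init__(self, max_recent):
        self.max_recent = max_recent
        self.recent = deque()   # flat table: one dict per sample
        self.rollups = {}       # (minute, endpoint) -> aggregated metrics

    def add(self, sample):
        self.recent.append(sample)
        # Evict the oldest raw samples into aggregates once the table is full.
        while len(self.recent) > self.max_recent:
            self._roll_up(self.recent.popleft())

    def _roll_up(self, s):
        key = (s["timestamp_us"] // 60_000_000, s["endpoint"])
        agg = self.rollups.setdefault(key, {"count": 0, "sum_us": 0, "max_us": 0})
        agg["count"] += 1
        agg["sum_us"] += s["walltime_us"]
        agg["max_us"] = max(agg["max_us"], s["walltime_us"])

    def query(self, predicate):
        """Ask an ad hoc question of the raw recent samples."""
        return [s for s in self.recent if predicate(s)]


store = SampleStore(max_recent=2)
store.add({"timestamp_us": 1_000_000, "endpoint": "/home", "walltime_us": 1200})
store.add({"timestamp_us": 2_000_000, "endpoint": "/home", "walltime_us": 900})
store.add({"timestamp_us": 3_000_000, "endpoint": "/search", "walltime_us": 4100})
slow = store.query(lambda s: s["walltime_us"] > 1000)
```

The raw table can answer any question that a predicate can express, while the rollups can only answer the questions their aggregates were designed for. That asymmetry is the reason to keep raw samples for as long as memory allows.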
Measurement systems need to be tested by sanity checking and independent confirmation
The dimensions of performance measurements are usually time, space and instructions
Record time measurements in u64 microseconds
Record space measurements in u64 bytes
Record instruction measurements in u64 kilo-instructions (a CPU can execute more than 1000 instructions per microsecond, so raw instruction counts might overflow)
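
To get a feel for how much range these units give, the arithmetic below works out what an unsigned 64-bit counter can hold for the time and space units. It is only a back-of-the-envelope check; the variable names are made up, and a year is assumed to be 365 days.

```python
U64_MAX = 2**64 - 1

# Time: how many years of microseconds fit in a u64?
US_PER_YEAR = 365 * 24 * 60 * 60 * 1_000_000
years_of_microseconds = U64_MAX / US_PER_YEAR

# Space: how many exbibytes (2**60 bytes) fit in a u64 byte count?
exbibytes = U64_MAX / 2**60

print(f"u64 microseconds cover about {years_of_microseconds:,.0f} years")
print(f"u64 bytes cover about {exbibytes:.0f} EiB")
```

Both ranges are far larger than any single measurement, so one integer type and one unit per dimension avoids conversion mistakes without giving up precision.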
The main visualization needs for monitoring are raw data (in a table), time-series, histogram and scatter
Any visualization a human ever looked at is probably important enough for a permalink
Design monitoring dashboards by completing the sentence: "While the system is operating normally, the ___ graph should never ___"
Design diagnosis tools by completing the sentence: "When the ___ is operating abnormally, the ___ graph can eliminate ___ as a possible cause"
Localize performance anomalies by recursively subdividing on different dimensions
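
One way to picture recursive subdivision is a greedy drill-down: split the samples on each remaining dimension, step into the worst-looking group, and repeat. The sketch below is an assumption-heavy illustration, not the handbook's method. The `localize` function, the dict-shaped samples, and the use of the highest mean as "worst" are all choices made for the example.

```python
from collections import defaultdict
from statistics import mean


def localize(samples, dimensions, measure="walltime_us"):
    """Greedy drill-down: repeatedly narrow to the (dimension, value) group
    with the highest mean measurement, using each dimension at most once."""
    path = []
    remaining = list(dimensions)
    while remaining and len(samples) > 1:
        best = None  # (group mean, dimension, value, subset)
        for dim in remaining:
            groups = defaultdict(list)
            for s in samples:
                groups[s[dim]].append(s)
            for value, subset in groups.items():
                m = mean(x[measure] for x in subset)
                if best is None or m > best[0]:
                    best = (m, dim, value, subset)
        _, dim, value, samples = best
        path.append((dim, value))
        remaining.remove(dim)
    return path


# Hypothetical samples: only /search on host b is slow.
samples = [
    {"host": "a", "endpoint": "/home", "walltime_us": 100},
    {"host": "a", "endpoint": "/search", "walltime_us": 120},
    {"host": "b", "endpoint": "/home", "walltime_us": 110},
    {"host": "b", "endpoint": "/search", "walltime_us": 900},
]
culprit_path = localize(samples, ["host", "endpoint"])
```

A real investigation would compare each group against its own baseline rather than against raw means, and would have a human choose the next dimension, but the narrowing loop has the same shape.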
Choose alarm thresholds by comparing against historical data
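
As a sketch of deriving a threshold from history, the snippet below takes a high percentile of past values and adds some headroom. The `alarm_threshold` and `should_alarm` helpers, the 99th percentile and the 1.2 headroom factor are arbitrary choices for illustration, not recommendations from the handbook.

```python
from statistics import quantiles


def alarm_threshold(history, percentile=99, headroom=1.2):
    """Threshold = chosen percentile of historical values, times a headroom factor."""
    cuts = quantiles(history, n=100, method="inclusive")  # 99 cut points
    return cuts[percentile - 1] * headroom


def should_alarm(value, history):
    """True when a new value exceeds the threshold derived from history."""
    return value > alarm_threshold(history)


# Hypothetical history: 100 past latency readings in microseconds.
history = list(range(100, 200))
threshold = alarm_threshold(history)
```

Because the threshold is recomputed from data, it can follow the system as normal behavior drifts, which a hand-picked constant cannot do.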