Profiling
The profiling/ directory hosts utilities for measuring and profiling ORQ’s performance.
Contents:
stopwatch.h – Lightweight wall-clock timer.
thread_profiling.h – Thread-local CPU usage and timing utilities.
utils.h – Miscellaneous helpers shared across benchmarks.
-
namespace orq
-
namespace benchmarking
-
namespace stopwatch
Typedefs
-
using sec = std::chrono::duration<float, std::chrono::seconds::period>
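As a rough illustration of how this alias behaves: because sec uses a floating-point representation, any std::chrono duration converts to it without an explicit duration_cast. The helper name to_seconds below is hypothetical and is not part of ORQ.

```cpp
#include <chrono>

// Same alias as in the listing above: seconds stored as a float.
using sec = std::chrono::duration<float, std::chrono::seconds::period>;

// Hypothetical helper (not part of ORQ): convert any chrono duration
// to fractional seconds.
template <typename Rep, typename Period>
float to_seconds(std::chrono::duration<Rep, Period> d) {
    // Conversion to a floating-point duration needs no duration_cast.
    return sec(d).count();
}
```

For example, passing std::chrono::milliseconds(1500) yields 1.5.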
Functions
-
void timepoint(std::string label)
Mark a timepoint with the given label and output the elapsed time on Party 0. If this is the first time timepoint has been called, just print the label and start the clock. Otherwise, output the time since the last timepoint.
- Parameters:
label –
-
float get_elapsed()
Get the elapsed time without printing. Still registers intervals in the manner of timepoint. This means that interleaved calls to timepoint and get_elapsed will return the time elapsed since either was last called.
- Returns:
float
-
void done()
Output the time elapsed since the first call to timepoint or get_elapsed. This is useful to call at the end of a program to see how long the entire execution takes. done can be called multiple times; it will always output the elapsed time since the same initial timepoint.
-
void profile_init()
Initialize the profiler.
ORQ provides a primitive profiling utility based on the stopwatch which simply aggregates elapsed times registered under each label. The profiler should not be used when high-accuracy measurements are needed, but it is sufficient for simple tests and benchmarks.
-
void profile_timepoint(std::string label)
Register a profile timepoint under label. Semantics are similar to timepoint, but nothing is printed.
- Parameters:
label –
-
void profile_preprocessing(std::optional<std::string> label = {})
Register a preprocessing timepoint with an optional label. If no label is provided, update the last timepoint but do not measure the elapsed time. If a label is provided, measure the elapsed time for both this label and the special PREPROCESSING symbol.
- Parameters:
label –
-
void profile_comm(std::string label, double t)
Register a given interval for the given communication category. Unlike the other profile functions, the caller supplies the interval, because compute and communication are measured differently in the architecture. TODO: this should probably be updated.
- Parameters:
label –
t – time in seconds
-
void profile_done()
Complete profiling and output a profiling report. Prints each category of aggregated times and separates out preprocessing. Also prints out a breakdown of offline versus online time.
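The aggregation idea behind the profiler (elapsed times summed per label, with communication intervals supplied by the caller) might look roughly like the sketch below. MiniProfiler and its members are hypothetical names chosen for illustration; preprocessing handling and the report format are left out because the listing above does not specify them.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical miniature of a label-aggregating profiler; not ORQ's code.
struct MiniProfiler {
    using steady = std::chrono::steady_clock;

    std::map<std::string, double> times;       // seconds accumulated per label
    std::map<std::string, double> comm_times;  // caller-supplied intervals
    steady::time_point last = steady::now();   // set at construction ("init")

    // Charge the time since the previous timepoint to `label`.
    void timepoint(const std::string& label) {
        auto now = steady::now();
        times[label] += std::chrono::duration<double>(now - last).count();
        last = now;
    }

    // Add an already-measured interval (in seconds) to a communication category.
    void comm(const std::string& label, double t) { comm_times[label] += t; }

    // Sum over all compute labels, as a final report might print.
    double total() const {
        double sum = 0.0;
        for (const auto& entry : times) sum += entry.second;
        return sum;
    }
};
```

Repeated timepoints under the same label accumulate into a single entry, which is why this approach suits coarse breakdowns rather than high-accuracy measurement.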
Variables
-
int partyID = 0
-
std::chrono::steady_clock::time_point _tp_first
-
static std::map<std::string, double> profile_times
-
static std::map<std::string, double> preproc_times
-
static std::map<std::string, double> comm_times
-
static std::chrono::steady_clock::time_point profile_last
-
static std::chrono::steady_clock::time_point preproc_last
-
using sec = std::chrono::duration<float, std::chrono::seconds::period>
-
namespace stopwatch
-
namespace benchmarking
Defines
-
HOST_NAME_MAX
Functions
-
std::string exec(const char *cmd)
Execute the command cmd and return its output.
- Parameters:
cmd – command to run
- Returns:
std::string stdout from the command
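One common way to write such a helper on POSIX systems is with popen. The sketch below is an assumption about the general technique, not ORQ's actual implementation; the name run_command is hypothetical.

```cpp
#include <array>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical sketch of a command runner (POSIX only); not ORQ's code.
std::string run_command(const char* cmd) {
    // popen starts a shell; the custom deleter closes the pipe on scope exit.
    std::unique_ptr<FILE, int (*)(FILE*)> pipe(popen(cmd, "r"), pclose);
    if (!pipe) throw std::runtime_error("popen failed");

    std::string out;
    std::array<char, 256> buf{};
    // fgets reads at most buf.size() - 1 bytes per call and NUL-terminates.
    while (fgets(buf.data(), static_cast<int>(buf.size()), pipe.get()) != nullptr) {
        out += buf.data();
    }
    return out;
}
```

Only standard output is captured here; anything the command writes to standard error goes to the caller's terminal.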
-
std::string prependHash(const std::string &str)
Prepend a hash (#) to each line of the input str and return the new string. Does not modify its input.
- Parameters:
str –
- Returns:
std::string
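A minimal sketch of this kind of line-prefixing follows. The name prepend_hash is hypothetical, and two details are assumptions the listing does not state: a space after the hash, and a newline terminating every output line.

```cpp
#include <sstream>
#include <string>

// Hypothetical re-implementation for illustration; not ORQ's code.
// Assumptions: "# " (hash plus space) as the prefix, and a trailing
// newline after every output line.
std::string prepend_hash(const std::string& str) {
    std::istringstream in(str);
    std::string line, out;
    // getline splits on '\n' and drops it, so it is added back per line.
    while (std::getline(in, line)) {
        out += "# " + line + "\n";
    }
    return out;
}
```

Taking the input by const reference and building a new string matches the note that the input is not modified.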
-
std::string hostname()
Get the hostname of this machine. Returns only the first HOST_NAME_MAX characters of the host name. This value defaults to 256 but can be increased with a compile-time define.
- Returns:
std::string
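The overridable limit described above is typically expressed with an #ifndef guard around the define. The sketch below assumes a POSIX system and uses the hypothetical name host_name; it is not ORQ's code, and on platforms whose system headers already define HOST_NAME_MAX the fallback value is not used.

```cpp
#include <string>
#include <unistd.h>  // gethostname (POSIX)

// Fallback used only if neither the platform nor the build defines it,
// e.g. the build can pass -DHOST_NAME_MAX=512.
#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 256
#endif

// Hypothetical sketch; not ORQ's code.
std::string host_name() {
    // One extra zeroed byte guarantees termination even if the name is truncated.
    char buf[HOST_NAME_MAX + 1] = {0};
    if (gethostname(buf, HOST_NAME_MAX) != 0) return "";
    return std::string(buf);
}
```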
-
namespace orq
-
namespace instrumentation
-
namespace thread_stopwatch
Functions
-
uint64_t get_now_ns()
Get the current time of steady_clock in nanoseconds.
- Returns:
uint64_t
-
double get_aggregate_comm(int pid = 0)
Return the sum of all measured timing events. TODO: this is incorrectly named; does not apply only to communication.
- Parameters:
pid –
- Returns:
double
-
void init_map(std::thread::id tid)
Initialize the timing map for thread id tid.
- Parameters:
tid –
Variables
-
std::map<int64_t, std::atomic_uint64_t> timing
Map from thread ID to timing information. Switching to atomic u64 for communication timing purposes only.
TODO: revert this to fix write
-
class InstrumentBlock
- #include <thread_profiling.h>
A utility class to measure the time taken in a given C++ code block. Instantiate a named instance of this class at, or near, the start of a block. The constructor of this class will record the current time, and the destructor (which fires when the block completes) computes the elapsed time and stores it in the timing map.
InstrumentBlock records the time between its construction and the end of the block. Thus it need not time an entire block.
The string meta passed in the constructor labels a block. This information is saved to the output file and can be used by later analysis scripts.
It may be possible to use InstrumentBlock in other, non-block, contexts, by taking advantage of C++ scoping rules. This behavior has not been tested.
WARNING: without an object name, the compiler will immediately destroy the object, and no timing information will be available. That is, InstrumentBlock _ib{"wait"} works, but InstrumentBlock{"wait"} will not.
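The scoping behavior in the warning above can be demonstrated with a small RAII timer. This is a single-threaded stand-in and not ORQ's class: ScopedTimer, block_ns, and blocks_finished are hypothetical names, and the map is keyed by label here rather than by thread ID.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical, single-threaded stand-in for InstrumentBlock; not ORQ's code.
std::map<std::string, std::uint64_t> block_ns;  // label -> accumulated nanoseconds
int blocks_finished = 0;                        // number of destructors that ran

class ScopedTimer {
    std::string meta_;
    std::chrono::steady_clock::time_point start_;

public:
    // Record the start time when the object is created.
    explicit ScopedTimer(std::string meta)
        : meta_(std::move(meta)), start_(std::chrono::steady_clock::now()) {}

    // Store the elapsed time when the object goes out of scope.
    ~ScopedTimer() {
        auto dt = std::chrono::steady_clock::now() - start_;
        block_ns[meta_] += static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(dt).count());
        ++blocks_finished;
    }
};

int work() {
    int sum = 0;
    {
        ScopedTimer _t{"loop"};  // named: lives until the closing brace
        for (int i = 0; i < 1000; ++i) sum += i;
    }                            // destructor records the loop's time here
    ScopedTimer{"unnamed"};      // temporary: destroyed at this semicolon
    return sum;
}
```

Both destructors run, but the unnamed temporary measures only the time between its own construction and the end of that statement, which is effectively nothing.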
Public Functions
-
template<typename ...T>
inline InstrumentBlock(T... args)
Default constructor when INSTRUMENT_THREADS is not defined. This constructor is provided so that ORQ can be compiled without thread profiling, but without having to remove InstrumentBlocks sprinkled throughout the codebase. When profiling is turned off, the constructor does nothing and we use the default destructor.
- Template Parameters:
T –
- Parameters:
args –
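The compile-time switch described above might be structured roughly like this. Block and instrumented_add are hypothetical names, and the bodies of the profiling branch are elided; only the no-op variadic constructor pattern is the point of the sketch.

```cpp
#include <string>

// Hypothetical sketch of the compile-time switch; not ORQ's code.
#ifdef INSTRUMENT_THREADS
class Block {
public:
    explicit Block(std::string meta) { (void)meta; /* record start time */ }
    ~Block() { /* compute elapsed time and store it */ }
};
#else
class Block {
public:
    // Accept and ignore any arguments so call sites compile unchanged.
    template <typename... T>
    Block(T...) {}
    // No destructor is declared: the implicit default one is used.
};
#endif

int instrumented_add(int a, int b) {
    Block _b{"add"};  // compiles the same way in both configurations
    return a + b;
}
```

With this pattern the call site needs no #ifdef of its own, and in the no-op build the object does no work in its constructor or destructor.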
-
uint64_t get_now_ns()
-
namespace thread_stopwatch
-
namespace instrumentation
Defines
-
PROFILE(EXPR, NAME)
Typedefs
-
using sec = std::chrono::duration<float, std::chrono::seconds::period>