quanta/
lib.rs

//! Performant cross-platform timing with goodies.
//!
//! `quanta` provides a simple and fast API for measuring the current time and the duration between
//! events.  It does this by providing a thin layer on top of native OS timing functions, or, if
//! available, using the Time Stamp Counter feature found on modern CPUs.
//!
//! # Design
//!
//! Internally, `quanta` maintains the concept of two potential clock sources: a reference clock and
//! a source clock.
//!
//! The reference clock is provided by the OS, and always available.  It is equivalent to what is
//! provided by the standard library in terms of the underlying system calls being made.  As it uses
//! the native timing facilities provided by the operating system, we ultimately depend on the OS
//! itself to give us a stable and correct value.
//!
//! The source clock is a potential clock source based on the [Time Stamp Counter][tsc] feature
//! found on modern CPUs.  If the TSC feature is not present or is not reliable enough, `quanta`
//! will transparently utilize the reference clock instead.
//!
//! Depending on the underlying processor(s) in the system, `quanta` will figure out the most
//! accurate/efficient way to calibrate the source clock to the reference clock in order to provide
//! measurements scaled to wall clock time.
//!
//! TSC support and calibration are detailed below.
//!
//! # Features
//!
//! Beyond simply taking measurements of the current time, `quanta` provides features for more
//! easily working with clocks, as well as being able to enhance performance further:
//! - `Clock` can be mocked for testing
//! - globally accessible "recent" time with amortized overhead
//!
//! ## Mocked time
//!
//! For any code that uses a `Clock`, a mocked version can be substituted.  This allows
//! application authors to control the time in tests, simulating not only the normal passage of
//! time but also warping time forwards and backwards in order to test corner cases in logic,
//! etc.  Creating a mocked clock can be achieved with [`Clock::mock`], and
//! [`Mock`] contains more details on mock usage.
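/*!
As a rough illustration of the idea behind mocked time, the sketch below shows a clock that reads
either a real monotonic source or a shared counter that a test moves by hand.  It uses only `std`,
and `SketchClock` and `SketchMock` are hypothetical stand-ins rather than `quanta`'s own types, so
treat it as a picture of the pattern and not as the actual implementation:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Instant;

/// Hypothetical mock time source: a shared nanosecond counter that tests move by hand.
#[derive(Debug, Default)]
pub struct SketchMock {
    nanos: AtomicU64,
}

impl SketchMock {
    /// Warps time forwards by `delta` nanoseconds.
    pub fn increment(&self, delta: u64) {
        self.nanos.fetch_add(delta, Ordering::SeqCst);
    }

    /// Warps time backwards by `delta` nanoseconds (wraps around on underflow).
    pub fn decrement(&self, delta: u64) {
        self.nanos.fetch_sub(delta, Ordering::SeqCst);
    }

    /// Reads the current mocked time.
    pub fn value(&self) -> u64 {
        self.nanos.load(Ordering::SeqCst)
    }
}

/// Hypothetical clock that is backed either by real time or by a mock.
pub enum SketchClock {
    /// Nanoseconds elapsed since the stored starting point.
    Real(Instant),
    /// Whatever value the shared mock currently holds.
    Mock(Arc<SketchMock>),
}

impl SketchClock {
    /// Creates a mocked clock along with the handle used to control it.
    pub fn mock() -> (SketchClock, Arc<SketchMock>) {
        let mock = Arc::new(SketchMock::default());
        (SketchClock::Mock(Arc::clone(&mock)), mock)
    }

    /// Returns the current time in nanoseconds.
    pub fn now(&self) -> u64 {
        match self {
            SketchClock::Real(start) => start.elapsed().as_nanos() as u64,
            SketchClock::Mock(mock) => mock.value(),
        }
    }
}
```

In a test, `SketchClock::mock()` hands back the clock plus its control handle: calling
`increment(42)` on the handle makes the next `now()` on the clock return `42`, and `decrement`
warps time backwards in the same way.
*/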
//!
//! ## Coarsely-updated, or recent, time
//!
//! `quanta` also provides a "recent" time feature, which allows a slightly-delayed version of time
//! to be provided to callers, trading accuracy for speed of access.  An upkeep thread is spawned,
//! which is responsible for taking measurements and updating the global recent time.  Callers can
//! then access the cached value by calling `Clock::recent`.  This interface can be 4-10x faster than
//! directly calling `Clock::now`, even when TSC support is available.  As the upkeep thread is the
//! only code updating the recent time, the accuracy of the value given to callers is limited by how
//! often the upkeep thread updates the time, hence the trade-off between accuracy and speed of
//! access.
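/*!
The mechanism described above can be pictured with a small `std`-only sketch: one thread
periodically stores the current time into an atomic, and readers pay only for an atomic load.
`RecentSketch` and `spawn_upkeep` are hypothetical names used for illustration, and the real
upkeep machinery in this crate may well differ in its details:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical clock with a cached, coarsely-updated reading.
pub struct RecentSketch {
    start: Instant,
    cached_nanos: AtomicU64,
}

impl RecentSketch {
    /// Creates a shareable clock whose cached value starts at zero.
    pub fn new() -> Arc<RecentSketch> {
        Arc::new(RecentSketch {
            start: Instant::now(),
            cached_nanos: AtomicU64::new(0),
        })
    }

    /// The precise path: queries the underlying clock on every call.
    pub fn now(&self) -> u64 {
        self.start.elapsed().as_nanos() as u64
    }

    /// The cheap path: a single atomic load of the last stored reading.
    pub fn recent(&self) -> u64 {
        self.cached_nanos.load(Ordering::Relaxed)
    }

    /// One upkeep step: take a precise reading and publish it.
    pub fn refresh(&self) {
        self.cached_nanos.store(self.now(), Ordering::Relaxed);
    }
}

/// Runs `rounds` upkeep steps on a background thread, sleeping `interval` between steps.
pub fn spawn_upkeep(
    clock: Arc<RecentSketch>,
    interval: Duration,
    rounds: u32,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for _ in 0..rounds {
            clock.refresh();
            thread::sleep(interval);
        }
    })
}
```

The staleness of `recent()` is bounded by the upkeep interval, which is the accuracy being traded
away; a shorter interval narrows the gap at the cost of more frequent wake-ups.
*/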
//!
//! # Feature Flags
//!
//! `quanta` comes with feature flags that enable convenient conversions to time types in other
//! popular crates, such as:
//! - `prost` - provides an implementation into [`Timestamp`][prost_types_timestamp] from
//!   `prost_types`
//!
//! # Platform Support
//!
//! At a high level, `quanta` carries support for most major operating systems out of the box:
//! - Windows ([`QueryPerformanceCounter`][QueryPerformanceCounter])
//! - macOS/OS X/iOS ([`mach_absolute_time`][mach_absolute_time])
//! - Linux/*BSD/Solaris ([`clock_gettime`][clock_gettime])
//!
//! These platforms are supported in the "reference" clock sense.  Support for using the Time
//! Stamp Counter as a clocksource is more subtle, and is explained below.
//!
//! ## WASM support
//!
//! This library can be built for WASM targets, but in this case the resolution and accuracy of
//! measurements can be limited by the WASM environment.  In particular, when running on the
//! `wasm32-unknown-unknown` target in browsers, `quanta` will use [window.performance.now] as a
//! clock.  This means the accuracy is limited to milliseconds instead of the usual nanoseconds on
//! other targets.  When running within a WASI environment (target `wasm32-wasi`), the accuracy of
//! the clock depends on the VM implementation.
//!
//! # TSC Support
//!
//! Accessing the TSC requires being on the `x86_64` architecture, with access to SSE2.
//! Additionally, the processor must support either constant or nonstop/invariant TSC.  This ensures
//! that the TSC ticks at a constant rate which can be easily scaled.
//!
//! A caveat is that "constant" TSC doesn't account for all possible power states (levels of power
//! down or sleep that a CPU can enter to save power under light load, etc.) and so a constant TSC
//! can lead to drift in measurements over time, after they've been scaled to reference time.
//!
//! This is a limitation of the TSC mode, as well as the nature of `quanta` not being able to know,
//! as the OS would, when a power state transition has happened, and thus compensate with a
//! recalibration.  Nonstop/invariant TSC does not have this limitation and is stable over long
//! periods of time.
//!
//! Roughly speaking, the following list gives the earliest model/generation of processors where
//! you should be able to expect invariant TSC support:
//! - Intel Nehalem and newer for server-grade
//! - Intel Skylake and newer for desktop-grade
//! - VIA Centaur Nano and newer (circumstantial evidence here)
//! - AMD Phenom and newer
//!
//! Ultimately, `quanta` will query CPUID information to determine if the processor has the required
//! features to use the TSC.
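/*!
To make the CPUID check more concrete, here is a minimal sketch of testing for an invariant TSC
with `std::arch`.  The flag it reads (bit 8 of `EDX` in extended leaf `0x8000_0007`) is the
documented invariant TSC bit, but `has_invariant_tsc` is a hypothetical helper, and which leaves
and bits `quanta`'s own detection consults is an assumption not confirmed by this file:

```rust
/// Returns `true` if CPUID advertises an invariant TSC.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
pub fn has_invariant_tsc() -> bool {
    use std::arch::x86_64::__cpuid;

    // SAFETY: the CPUID instruction is available on every x86_64 processor.
    unsafe {
        // Leaf 0x8000_0000 reports the highest extended leaf the processor supports.
        let max_extended_leaf = __cpuid(0x8000_0000).eax;
        if max_extended_leaf < 0x8000_0007 {
            return false;
        }

        // Bit 8 of EDX in leaf 0x8000_0007 is the invariant TSC flag.
        (__cpuid(0x8000_0007).edx & (1 << 8)) != 0
    }
}

/// On other architectures there is no TSC to speak of, so report `false`.
#[cfg(not(target_arch = "x86_64"))]
pub fn has_invariant_tsc() -> bool {
    false
}
```

A caller would fall back to the reference clock whenever this returns `false`, which matches the
transparent fallback described in the design section.
*/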
//!
//! # Calibration
//!
//! As the TSC doesn't necessarily tick at reference scale -- i.e. one tick isn't always one
//! nanosecond -- we have to apply a scaling factor when converting from the source time scale to
//! the reference time scale.  We acquire this scaling factor by repeatedly taking measurements from
//! both the reference and source clocks, until we have a statistically significant measure of the
//! average scaling factor.  We do some additional work to express this scaling factor as a ratio
//! with a power-of-two denominator, which allows us to optimize the code, and thus reduce the
//! generated instructions required to scale a TSC value.
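/*!
The power-of-two trick can be sketched in isolation.  The two helpers below, `po2_scale` and
`scale_ticks`, are hypothetical names for a standalone approximation of the approach described
above; they assume a non-zero tick delta, and the tick counts used to discuss them are made up:

```rust
/// Turns a measured (reference delta, source delta) pair into a multiplier and a shift, so
/// that `ticks * multiplier >> shift` approximates `ticks * ref_delta_ns / src_delta_ticks`.
pub fn po2_scale(ref_delta_ns: u64, src_delta_ticks: u64) -> (u64, u32) {
    // Round the tick delta up to a power of two so it can act as the denominator.
    let src_po2 = src_delta_ticks
        .checked_next_power_of_two()
        .unwrap_or(1 << 63);

    // Stretch the numerator by the same ratio so the overall fraction stays the same.
    let ratio = src_po2 as f64 / src_delta_ticks as f64;
    let multiplier = (ref_delta_ns as f64 * ratio) as u64;

    // Dividing by a power of two is a right shift by its number of trailing zeros.
    (multiplier, src_po2.trailing_zeros())
}

/// Scales a tick delta to nanoseconds with one widening multiply and one shift.
pub fn scale_ticks(ticks: u64, multiplier: u64, shift: u32) -> u64 {
    ((u128::from(ticks) * u128::from(multiplier)) >> shift) as u64
}
```

For a hypothetical counter that advances 3000 ticks over 1000 nanoseconds, `po2_scale(1000, 3000)`
gives a multiplier of `1365` and a shift of `12`, and `scale_ticks(3000, 1365, 12)` comes back as
`999`.  The missing nanosecond is truncation error, which shrinks as the measured deltas grow.
*/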
//!
//! This calibration is stored globally and reused.  However, the first `Clock` that is created in
//! an application will block for a small period of time as it runs this calibration loop.  The time
//! spent in the calibration loop is limited to 200ms overall.  In practice, `quanta` will reach a
//! stable calibration quickly (usually 10-20ms, if not less) and so this deadline is unlikely to be
//! reached.
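/*!
One way to picture how a calibration loop can decide that it is "stable" is a running mean with a
standard error, updated one sample at a time.  The sketch below uses Welford's online algorithm;
`RunningStats` is a hypothetical type, and whether the crate's internal statistics follow exactly
this formulation is an assumption:

```rust
/// Hypothetical running statistics over the error between scaled source and reference time.
#[derive(Debug, Default)]
pub struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64,
}

impl RunningStats {
    /// Folds one sample into the running mean and sum of squared deviations.
    pub fn add(&mut self, sample: f64) {
        self.count += 1;
        let delta = sample - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (sample - self.mean);
    }

    /// Mean of all samples seen so far.
    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Standard error of the mean; infinite until at least two samples exist.
    pub fn standard_error(&self) -> f64 {
        if self.count < 2 {
            return f64::INFINITY;
        }
        let variance = self.m2 / (self.count - 1) as f64;
        (variance / self.count as f64).sqrt()
    }

    /// True once enough samples exist and the mean plus its error is within the target.
    pub fn is_stable(&self, min_samples: u64, max_error_ns: f64) -> bool {
        self.count > min_samples && self.mean.abs() + self.standard_error() < max_error_ns
    }
}
```

A loop built on this would stop as soon as `is_stable` returns `true` or the deadline passes,
whichever comes first.  The constants later in this file suggest thresholds of 500 rounds and
10 nanoseconds for such a check.
*/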
//!
//! # Caveats
//!
//! Utilizing the TSC can be a tricky affair, and so here is a list of caveats that may or may not
//! apply, and is in no way exhaustive:
//! - CPU hotplug behavior is undefined
//! - raw values may time warp
//! - measurements from the TSC may drift past or behind the comparable reference clock
//!
//! Another important caveat is that `quanta` does not track time across system suspends.  Simply
//! put, if a time measurement (such as using [`Instant::now`][crate::Instant::now]) is taken, and
//! then the system is suspended, and then another measurement is taken, the difference between
//! the two would not include the time the system was in suspend.
//!
//! [tsc]: https://en.wikipedia.org/wiki/Time_Stamp_Counter
//! [QueryPerformanceCounter]: https://msdn.microsoft.com/en-us/library/ms644904(v=VS.85).aspx
//! [mach_absolute_time]: https://developer.apple.com/documentation/kernel/1462446-mach_absolute_time
//! [clock_gettime]: https://linux.die.net/man/3/clock_gettime
//! [prost_types_timestamp]: https://docs.rs/prost-types/0.7.0/prost_types/struct.Timestamp.html
//! [window.performance.now]: https://developer.mozilla.org/en-US/docs/Web/API/Performance/now
#![deny(missing_docs)]
#![deny(clippy::all)]
#![allow(clippy::must_use_candidate)]

use crossbeam_utils::atomic::AtomicCell;
use std::time::Duration;
use std::{cell::RefCell, sync::Arc};

use once_cell::sync::OnceCell;

mod clocks;
use self::clocks::{Counter, Monotonic};
mod detection;
mod mock;
pub use self::mock::{IntoNanoseconds, Mock};
mod instant;
pub use self::instant::Instant;
mod upkeep;
pub use self::upkeep::{Error, Handle, Upkeep};
mod stats;
use self::stats::Variance;

// Global clock, used by `Instant::now`.
static GLOBAL_CLOCK: OnceCell<Clock> = OnceCell::new();

// Global recent measurement, used by `Clock::recent` and `Instant::recent`.
static GLOBAL_RECENT: AtomicCell<u64> = AtomicCell::new(0);

// Global calibration, shared by all clocks.
static GLOBAL_CALIBRATION: OnceCell<Calibration> = OnceCell::new();

// Per-thread clock override, used by `quanta::with_clock`, `Instant::now`, and sometimes `Instant::recent`.
thread_local! {
    static CLOCK_OVERRIDE: RefCell<Option<Clock>> = RefCell::new(None);
}

// Run 500 rounds of calibration before we start actually seeing what the numbers look like.
const MINIMUM_CAL_ROUNDS: u64 = 500;

// We want our maximum error to be 10 nanoseconds.
const MAXIMUM_CAL_ERROR_NS: u64 = 10;

// Don't run the calibration loop for longer than 200ms of wall time.
const MAXIMUM_CAL_TIME_NS: u64 = 200 * 1000 * 1000;

#[derive(Debug)]
enum ClockType {
    Monotonic(Monotonic),
    Counter(Monotonic, Counter, Calibration),
    Mock(Arc<Mock>),
}

#[derive(Debug, Copy, Clone)]
pub(crate) struct Calibration {
    ref_time: u64,
    src_time: u64,
    scale_factor: u64,
    scale_shift: u32,
}

impl Calibration {
    fn new() -> Calibration {
        Calibration {
            ref_time: 0,
            src_time: 0,
            scale_factor: 1,
            scale_shift: 0,
        }
    }

    fn reset_timebases(&mut self, reference: Monotonic, source: &Counter) {
        self.ref_time = reference.now();
        self.src_time = source.now();
    }

    fn scale_src_to_ref(&self, src_raw: u64) -> u64 {
        let delta = src_raw.saturating_sub(self.src_time);
        let scaled = mul_div_po2_u64(delta, self.scale_factor, self.scale_shift);
        scaled + self.ref_time
    }

    fn calibrate(&mut self, reference: Monotonic, source: &Counter) {
        let mut variance = Variance::default();
        let deadline = reference.now() + MAXIMUM_CAL_TIME_NS;

        self.reset_timebases(reference, source);

        // Each busy loop should spin for 1 microsecond. (1000 nanoseconds)
        let loop_delta = 1000;
        loop {
            // Busy loop to burn some time.
            let mut last = reference.now();
            let target = last + loop_delta;
            while last < target {
                last = reference.now();
            }

            // We put an upper bound on how long we run calibration to provide a predictable
            // overhead to the calibration process.  In practice, even if we hit the calibration
            // deadline, we should still have run a sufficient number of rounds to get an accurate
            // calibration.
            if last >= deadline {
                break;
            }

            // Adjust our calibration before we take our measurement.
            self.adjust_cal_ratio(reference, source);

            let r_time = reference.now();
            let s_raw = source.now();
            let s_time = self.scale_src_to_ref(s_raw);
            variance.add(s_time as f64 - r_time as f64);

            // If we've collected enough samples, check what the mean and mean error are.  If we're
            // already within the target bounds, we can break out of the calibration loop early.
            if variance.has_significant_result() {
                let mean = variance.mean().abs();
                let mean_error = variance.mean_error().abs();
                let mwe = variance.mean_with_error();
                let samples = variance.samples();

                if samples > MINIMUM_CAL_ROUNDS
                    && mwe < MAXIMUM_CAL_ERROR_NS as f64
                    && mean_error / mean <= 1.0
                {
                    break;
                }
            }
        }
    }

    fn adjust_cal_ratio(&mut self, reference: Monotonic, source: &Counter) {
        // Overall algorithm: measure the delta between our ref/src_time values and "now" versions
        // of them, calculate the ratio between the deltas, and then find a numerator and
        // denominator to express that ratio such that the denominator is always a power of two.
        //
        // In practice, this means we take the "source" delta, and find the next biggest number that
        // is a power of two.  We then figure out the ratio that describes the difference between
        // _those_ two values, and multiply the "reference" delta by that much, which becomes our
        // numerator while the power-of-two "source" delta becomes our denominator.
        //
        // Then, conversion from a raw value simply becomes a multiply and a bit shift instead of a
        // multiply and full-blown divide.
        let ref_end = reference.now();
        let src_end = source.now();

        let ref_d = ref_end.wrapping_sub(self.ref_time);
        let src_d = src_end.wrapping_sub(self.src_time);

        let src_d_po2 = src_d
            .checked_next_power_of_two()
            .unwrap_or_else(|| 2_u64.pow(63));

        // TODO: lossy conversion back and forth just to get an approximate value, can we do better
        // with integer math? not sure
        let po2_ratio = src_d_po2 as f64 / src_d as f64;
        self.scale_factor = (ref_d as f64 * po2_ratio) as u64;
        self.scale_shift = src_d_po2.trailing_zeros();
    }
}

impl Default for Calibration {
    fn default() -> Self {
        Self::new()
    }
}

/// Unified clock for taking measurements.
#[derive(Debug, Clone)]
pub struct Clock {
    inner: ClockType,
}

impl Clock {
    /// Creates a new clock with the optimal reference and source clocks.
    ///
    /// Support for the TSC, etc., is checked at the time of creation, not at compile time.
    pub fn new() -> Clock {
        let reference = Monotonic::default();
        let inner = if detection::has_counter_support() {
            let source = Counter;
            let calibration = GLOBAL_CALIBRATION.get_or_init(|| {
                let mut calibration = Calibration::new();
                calibration.calibrate(reference, &source);
                calibration
            });
            ClockType::Counter(reference, source, *calibration)
        } else {
            ClockType::Monotonic(reference)
        };

        Clock { inner }
    }

    /// Creates a new clock that is mocked for controlling the underlying time.
    ///
    /// Returns a [`Clock`] instance and a handle to the underlying [`Mock`] source so that the
    /// caller can control the passage of time.
    pub fn mock() -> (Clock, Arc<Mock>) {
        let mock = Arc::new(Mock::new());
        let clock = Clock {
            inner: ClockType::Mock(mock.clone()),
        };

        (clock, mock)
    }

    /// Gets the current time, scaled to reference time.
    ///
    /// This method is the spiritual equivalent of [`std::time::Instant::now`].  It is guaranteed
    /// to return a monotonically increasing value between calls to the same `Clock` instance.
    ///
    /// Returns an [`Instant`].
    pub fn now(&self) -> Instant {
        match &self.inner {
            ClockType::Monotonic(monotonic) => Instant(monotonic.now()),
            ClockType::Counter(_, counter, _) => self.scaled(counter.now()),
            ClockType::Mock(mock) => Instant(mock.value()),
        }
    }

    /// Gets the underlying time from the fastest available clock source.
    ///
    /// As the clock source may or may not be the TSC, the value is not guaranteed to be in
    /// nanoseconds or to be monotonic.  The value can be scaled to reference time by calling
    /// either [`scaled`] or [`delta`].
    ///
    /// [`scaled`]: Clock::scaled
    /// [`delta`]: Clock::delta
    pub fn raw(&self) -> u64 {
        match &self.inner {
            ClockType::Monotonic(monotonic) => monotonic.now(),
            ClockType::Counter(_, counter, _) => counter.now(),
            ClockType::Mock(mock) => mock.value(),
        }
    }

    /// Scales a raw measurement to reference time.
    ///
    /// You must scale raw measurements to ensure your result is in nanoseconds.  The raw
    /// measurement is not guaranteed to be in nanoseconds and may vary.  It is only OK to avoid
    /// scaling raw measurements if you don't need actual nanoseconds.
    ///
    /// Returns an [`Instant`].
    pub fn scaled(&self, value: u64) -> Instant {
        let scaled = match &self.inner {
            ClockType::Counter(_, _, calibration) => calibration.scale_src_to_ref(value),
            _ => value,
        };

        Instant(scaled)
    }

    /// Calculates the delta, in nanoseconds, between two raw measurements.
    ///
    /// This method is very similar to [`delta`] but reduces overhead
    /// for high-frequency measurements that work with nanosecond
    /// counts internally, as it avoids the conversion of the delta
    /// into [`Duration`].
    ///
    /// [`delta`]: Clock::delta
    pub fn delta_as_nanos(&self, start: u64, end: u64) -> u64 {
        // Safety: we want wrapping_sub on the end/start delta calculation so that two measurements
        // split across a rollover boundary still return the right result.  However, we also know
        // the TSC could potentially give us different values between cores/sockets, so we're just
        // doing our due diligence here to make sure we're not about to create some wacky duration.
        if end <= start {
            return 0;
        }

        let delta = end.wrapping_sub(start);
        match &self.inner {
            ClockType::Counter(_, _, calibration) => {
                mul_div_po2_u64(delta, calibration.scale_factor, calibration.scale_shift)
            }
            _ => delta,
        }
    }

    /// Calculates the delta between two raw measurements.
    ///
    /// This method is slightly faster when you know you need the delta between two raw
    /// measurements, or a start/end measurement, than using [`scaled`] for both conversions.
    ///
    /// In code that simply needs access to the whole number of nanoseconds
    /// between the two measurements, consider [`Clock::delta_as_nanos`]
    /// instead, which is slightly faster than having to call both this method
    /// and [`Duration::as_nanos`].
    ///
    /// [`scaled`]: Clock::scaled
    /// [`delta_as_nanos`]: Clock::delta_as_nanos
    pub fn delta(&self, start: u64, end: u64) -> Duration {
        Duration::from_nanos(self.delta_as_nanos(start, end))
    }

    /// Gets the most recent current time, scaled to reference time.
    ///
    /// This method provides ultra-low-overhead access to a slightly-delayed version of the current
    /// time.  Instead of querying the underlying source clock directly, a shared, global value is
    /// read directly without the need to scale to reference time.
    ///
    /// The upkeep thread must be started in order to update the time.  You can read the
    /// documentation for [`Upkeep`] for more information on starting the upkeep thread, as
    /// well as the details of the "current time" mechanism.
    ///
    /// If the upkeep thread has not been started, the return value will be `0`.
    ///
    /// Returns an [`Instant`].
    pub fn recent(&self) -> Instant {
        match &self.inner {
            ClockType::Mock(mock) => Instant(mock.value()),
            _ => Instant(GLOBAL_RECENT.load()),
        }
    }

    #[cfg(test)]
    #[allow(dead_code)]
    fn reset_timebase(&mut self) -> bool {
        match &mut self.inner {
            ClockType::Counter(reference, source, calibration) => {
                calibration.reset_timebases(*reference, source);
                true
            }
            _ => false,
        }
    }
}

impl Default for Clock {
    fn default() -> Clock {
        Clock::new()
    }
}

// A manual `Clone` impl is required because `atomic_shim`'s `AtomicU64` is not `Clone`.
impl Clone for ClockType {
    fn clone(&self) -> Self {
        match self {
            ClockType::Mock(mock) => ClockType::Mock(mock.clone()),
            ClockType::Monotonic(monotonic) => ClockType::Monotonic(*monotonic),
            ClockType::Counter(monotonic, counter, calibration) => {
                ClockType::Counter(*monotonic, counter.clone(), *calibration)
            }
        }
    }
}

/// Sets this clock as the default for the duration of a closure.
///
/// This will only affect calls made against [`Instant`].  [`Clock`] is always self-contained.
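/**
The scoped-override pattern used here can be sketched with nothing but `std`.  In the example
below, `with_fixed_time` and `current_time` are hypothetical helpers that stand in for this
function and for [`Instant`]'s lookup of the active clock; they illustrate the save, run, and
restore shape rather than reproduce the crate's code:

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical per-thread override; `None` means "use the real source".
    static OVERRIDE_NANOS: Cell<Option<u64>> = Cell::new(None);
}

/// Runs `f` with the override installed, then restores whatever was there before.
pub fn with_fixed_time<T>(nanos: u64, f: impl FnOnce() -> T) -> T {
    OVERRIDE_NANOS.with(|current| {
        let old = current.replace(Some(nanos));
        let result = f();
        current.set(old);
        result
    })
}

/// Reads the override if one is installed, otherwise falls back to `fallback`.
pub fn current_time(fallback: u64) -> u64 {
    OVERRIDE_NANOS
        .with(|current| current.get())
        .unwrap_or(fallback)
}
```

Because the previous value is saved and put back, nested calls compose: an inner
`with_fixed_time` temporarily shadows the outer one, and other threads never see either
override.  As in this sketch, a panic inside the closure would skip the restore step.
*/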
pub fn with_clock<T>(clock: &Clock, f: impl FnOnce() -> T) -> T {
    CLOCK_OVERRIDE.with(|current| {
        let old = current.replace(Some(clock.clone()));
        let result = f();
        current.replace(old);
        result
    })
}

/// Sets the global recent time.
///
/// While callers should typically prefer to use [`Upkeep`] to establish a background thread in
/// order to drive the global recent time, this function allows callers to customize how the global
/// recent time is updated.  For example, programs using an asynchronous runtime may prefer to
/// schedule a task that does the updating, avoiding an extra thread.
pub fn set_recent(instant: Instant) {
    GLOBAL_RECENT.store(instant.0);
}

#[inline]
pub(crate) fn get_now() -> Instant {
    if let Some(instant) = CLOCK_OVERRIDE.with(|clock| clock.borrow().as_ref().map(Clock::now)) {
        instant
    } else {
        GLOBAL_CLOCK.get_or_init(Clock::new).now()
    }
}

#[inline]
pub(crate) fn get_recent() -> Instant {
    // We make a small trade-off here where if the global recent time isn't zero, we use that,
    // regardless of whether or not there's a thread-specific clock override.  Otherwise, we would
    // blow our performance budget.
    //
    // Given that global recent time shouldn't ever be getting _actually_ updated in tests, this
    // should be a reasonable trade-off.
    let recent = GLOBAL_RECENT.load();
    if recent == 0 {
        get_now()
    } else {
        Instant(recent)
    }
}

#[inline]
fn mul_div_po2_u64(value: u64, numer: u64, denom: u32) -> u64 {
    // Modified muldiv routine where the denominator has to be a power of two. `denom` is expected
    // to be the number of bits to shift, not the actual decimal value.
    let mut v = u128::from(value);
    v *= u128::from(numer);
    v >>= denom;
    v as u64
}

#[cfg(test)]
mod tests {
    use super::Clock;

    #[cfg(not(target_arch = "wasm32"))]
    use super::{Counter, Monotonic};

    #[cfg(not(target_arch = "wasm32"))]
    use average::{Merge as _, Variance};

    #[cfg(not(target_arch = "wasm32"))]
    use std::time::{Duration, Instant};

    #[test]
    #[cfg_attr(
        all(target_arch = "wasm32", target_os = "unknown"),
        wasm_bindgen_test::wasm_bindgen_test
    )]
    fn test_mock() {
        let (clock, mock) = Clock::mock();
        assert_eq!(clock.now().0, 0);
        mock.increment(42);
        assert_eq!(clock.now().0, 42);
    }

    #[test]
    #[cfg_attr(
        all(target_arch = "wasm32", target_os = "unknown"),
        wasm_bindgen_test::wasm_bindgen_test
    )]
    fn test_now() {
        let clock = Clock::new();
        assert!(clock.now().0 > 0);
    }

    #[test]
    #[cfg_attr(
        all(target_arch = "wasm32", target_os = "unknown"),
        wasm_bindgen_test::wasm_bindgen_test
    )]
    fn test_raw() {
        let clock = Clock::new();
        assert!(clock.raw() > 0);
    }

    #[test]
    #[cfg_attr(
        all(target_arch = "wasm32", target_os = "unknown"),
        wasm_bindgen_test::wasm_bindgen_test
    )]
    fn test_scaled() {
        let clock = Clock::new();
        let raw = clock.raw();
        let scaled = clock.scaled(raw);
        assert!(scaled.0 > 0);
    }

    #[cfg(not(target_arch = "wasm32"))]
    #[test]
    #[cfg_attr(not(feature = "flaky_tests"), ignore)]
    fn test_reference_source_calibration() {
        let mut clock = Clock::new();
        let reference = Monotonic::default();

        let loops = 10000;

        let mut overall = Variance::new();
        let mut src_samples = [0u64; 1024];
        let mut ref_samples = [0u64; 1024];

        for _ in 0..loops {
            // We have to reset the "timebase" of the clock/calibration when testing in this way.
            //
            // Since `quanta` is designed around mimicking `Instant`, we care about measuring the _passage_ of time, but
            // not matching our calculation of wall-clock time to the system's calculation of wall-clock time, in terms
            // of their absolute values.
            //
            // As the system adjusts its clocks over time, whether due to NTP skew, or delays in updating the derived
            // monotonic time, and so on, our original measurement base from the reference source -- which we use to
            // anchor how we convert our scaled source measurement into the same reference timebase -- can skew further
            // away from the current reference time in terms of the rate at which it ticks forward.
            //
            // Essentially, what we're saying here is that we want to test the scaling ratio that we generated in
            // calibration, but not necessarily that the resulting value -- which is meant to be in the same timebase as
            // the reference -- is locked to the reference itself. For example, if the reference is in nanoseconds, we
            // want our source to be scaled to nanoseconds, too. We don't care if the system shoves the reference back
            // and forth via NTP skew, etc... we just need to do enough source-to-reference calibration loops to figure
            // out what the right amount is to scale the TSC -- since we require an invariant/nonstop TSC -- to get it
            // to nanoseconds.
            //
            // At the risk of saying _too much_, while the delta between `Clock::now` and `Monotonic::now` may grow over
            // time if the timebases are not reset, we can readily observe in this test that the delta between the
            // first/last measurement loop for both source/reference are independently close i.e. the ratio by which we
            // scale the source measurements gets it close, and stays close, to the reference measurements in terms of
            // the _passage_ of time.
            clock.reset_timebase();

            for i in 0..1024 {
                src_samples[i] = clock.now().0;
                ref_samples[i] = reference.now();
            }

            let is_src_monotonic = src_samples
                .iter()
                .map(Some)
                .reduce(|last, current| last.and_then(|lv| current.filter(|cv| *cv >= lv)))
                .flatten()
                .copied();
            assert_eq!(is_src_monotonic, Some(src_samples[1023]));

            let is_ref_monotonic = ref_samples
                .iter()
                .map(Some)
                .reduce(|last, current| last.and_then(|lv| current.filter(|cv| *cv >= lv)))
                .flatten()
                .copied();
            assert_eq!(is_ref_monotonic, Some(ref_samples[1023]));

            let local = src_samples
                .iter()
                .zip(ref_samples.iter())
                .map(|(s, r)| *s as f64 - *r as f64)
                .map(|f| f.abs())
                .collect::<Variance>();

            overall.merge(&local);
        }

        println!(
            "reference/source delta: mean={} error={} mean-var={} samples={}",
            overall.mean(),
            overall.error(),
            overall.variance_of_mean(),
            overall.len(),
        );

        // If things are out of sync more than 1000ns, something is likely scaled wrong.
        assert!(overall.mean() < 1000.0);
    }

    #[cfg(not(target_arch = "wasm32"))]
    #[test]
    #[cfg_attr(not(feature = "flaky_tests"), ignore)]
    fn measure_source_reference_self_timing() {
        let source = Counter::default();
        let reference = Monotonic::default();

        let loops = 10000;

        let mut src_deltas = Vec::new();
        let mut src_samples = [0u64; 100];

        for _ in 0..loops {
            let start = Instant::now();
            for i in 0..100 {
                src_samples[i] = source.now();
            }

            src_deltas.push(start.elapsed().as_secs_f64());
        }

        let mut ref_deltas = Vec::new();
        let mut ref_samples = [0u64; 100];

        for _ in 0..loops {
            let start = Instant::now();
            for i in 0..100 {
                ref_samples[i] = reference.now();
            }

            ref_deltas.push(start.elapsed().as_secs_f64());
        }

        let src_variance = src_deltas.into_iter().collect::<Variance>();
        let ref_variance = ref_deltas.into_iter().collect::<Variance>();

        let src_variance_ns = Duration::from_secs_f64(src_variance.mean() / 100.0);
        let ref_variance_ns = Duration::from_secs_f64(ref_variance.mean() / 100.0);

        println!(
            "source call average: {:?}, reference call average: {:?}",
            src_variance_ns, ref_variance_ns
        );
    }
}