Fastest Way To Get The Time

So, for my 6502/6510 emulator project I’ve been trying to figure out the best (read that, “fastest”) way to get the time with nanosecond resolution. I need this because the North American version of the Commodore 64 ran at 1.02MHz (1,022,727Hz) and so that means the clock “ticks” every 977.8 nanoseconds. And that’s just for the Commodore 64. If I want to emulate the Commodore 128’s 2MHz mode then the timer “ticks” every 488.9 nanoseconds.
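The tick lengths above follow directly from the clock rates. Here is a minimal sketch of that arithmetic. The function names are hypothetical, and the C128 rate is an assumption: exactly twice the C64 rate, which matches the 488.9 nanosecond figure. Because a tick is not a whole number of nanoseconds, the sketch derives each cycle's due time from the cycle count instead of repeatedly adding a rounded period, so the rounding error never accumulates.

```swift
import Foundation

// The C64 rate comes from the text. The C128 rate is an assumption
// (twice the C64 rate) chosen to match the 488.9 ns quoted above.
let c64Hz:  UInt64 = 1_022_727
let c128Hz: UInt64 = c64Hz * 2

/// Length of one clock cycle in nanoseconds.
func cyclePeriodNs(_ hz: UInt64) -> Double {
    return 1_000_000_000.0 / Double(hz)
}

/// Nanoseconds after emulator start at which cycle number `cycle` is due,
/// rounded down. Whole seconds and the leftover cycles are handled
/// separately so the multiplication cannot overflow on long runs.
func cycleOffsetNs(cycle: UInt64, hz: UInt64) -> UInt64 {
    let wholeSeconds = cycle / hz
    let leftover     = cycle % hz
    return wholeSeconds * 1_000_000_000 + (leftover * 1_000_000_000) / hz
}

print(String(format: "C64:  %.1f ns per cycle", cyclePeriodNs(c64Hz)))   // 977.8
print(String(format: "C128: %.1f ns per cycle", cyclePeriodNs(c128Hz)))  // 488.9
```

With this approach, cycle 1,022,727 on the C64 clock comes out due at exactly 1,000,000,000 nanoseconds, whereas adding a truncated 977 nanoseconds per cycle would already be about 0.8 milliseconds short after one emulated second.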

Now, on a Unix-like system (Linux, macOS, iOS, etc.) the only reliable, portable way (across different CPUs at least) to get the time with nanosecond resolution is with the system function clock_gettime(…). This function fills in a timespec structure holding a seconds field and a nanoseconds field. For CLOCK_REALTIME that value is the time elapsed since midnight on January 1, 1970 UTC; the monotonic clocks count from an unspecified starting point instead, which is fine when all you need is the difference between two readings. The concern is that a system call normally means an expensive round trip into the kernel, although Linux (through the vDSO) and macOS can service some clocks without leaving user space. So I played around with one of the parameters for clock_gettime(…), the clock ID, which tells it which system clock to read. Some clock IDs result in faster calls than others because of how the time is obtained, any calculations that are involved, etc.
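A comparison between clock IDs can be sketched roughly as follows. This is not the test program used for the numbers in this post, only an illustration: the helper names are hypothetical, the three clock IDs and the sample count are arbitrary choices, and the measured average includes a little loop overhead.

```swift
import Foundation

/// Reads the given clock and flattens the timespec into nanoseconds.
/// Returns 0 if clock_gettime reports an error.
func readClockNs(_ id: clockid_t) -> UInt64 {
    var ts = timespec(tv_sec: 0, tv_nsec: 0)
    // clock_gettime fills in ts and returns 0 on success.
    guard clock_gettime(id, &ts) == 0 else { return 0 }
    return UInt64(ts.tv_sec) * 1_000_000_000 + UInt64(ts.tv_nsec)
}

/// Average cost in nanoseconds of one read of clock `id`.
/// The stopwatch is always CLOCK_MONOTONIC_RAW so results are comparable.
func averageReadCostNs(_ id: clockid_t, samples: UInt64 = 1_000_000) -> Double {
    let start = readClockNs(CLOCK_MONOTONIC_RAW)
    for _ in 0..<samples {
        _ = readClockNs(id)
    }
    let end = readClockNs(CLOCK_MONOTONIC_RAW)
    return Double(end - start) / Double(samples)
}

// Clock IDs chosen for illustration; all three exist on Linux and macOS.
let clocks: [(name: String, id: clockid_t)] = [
    ("CLOCK_REALTIME",      CLOCK_REALTIME),
    ("CLOCK_MONOTONIC",     CLOCK_MONOTONIC),
    ("CLOCK_MONOTONIC_RAW", CLOCK_MONOTONIC_RAW),
]

for clock in clocks {
    print("\(clock.name): \(averageReadCostNs(clock.id)) ns per call")
}
```

The absolute numbers will vary with the machine, the kernel version, and the optimization level, so only the relative ordering on a given system is meaningful.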

For both Linux and macOS I’ve found that using the clock ID CLOCK_MONOTONIC_RAW results in the fastest call time. On my MacBook Pro (3.1 GHz Quad-Core Intel Core i7) it took an average of 38 nanoseconds to get the time. On my Odroid H2+ (2.5 GHz Intel Quad-core J4115) it took an average of 29 nanoseconds. [1] I also tried playing with the process scheduling priority (using nice) but that didn’t make any difference.

Platform                                        Average Time
MacBook Pro (3.1 GHz i7)                        38ns [1]
Odroid H2+ (2.5 GHz Intel J4115)                29ns [1]
Odroid C4 (2.0 GHz ARM Cortex-A55 (S905X3))     160ns [2]
Raspberry Pi 4 (Broadcom BCM2711 Cortex-A72)    80ns [2]

Below is the test program I used to get the timings. I executed the file with “swift -Ounchecked TimerTests.swift” on all platforms.

import Foundation

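// Returns the current CLOCK_MONOTONIC_RAW reading as a single nanosecond count.
@inlinable func getSysTime() -> UInt64 {
    var ts:    timespec  = timespec(tv_sec: 0, tv_nsec: 0)
    let clkid: clockid_t = CLOCK_MONOTONIC_RAW
    clock_gettime(clkid, &ts) // fills ts with whole seconds plus leftover nanoseconds
    return (UInt64(ts.tv_sec) * 1_000_000_000 + UInt64(ts.tv_nsec))
}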

let reqRunTime: UInt64 = 20_000_000_000 // run for twenty seconds...
let startTime:  UInt64 = getSysTime()
let runTime:    UInt64 = (startTime + reqRunTime)
var endTime:    UInt64 = 0
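var calls:      UInt64 = 1 // the startTime reading above counts as the first call

// Read the clock back-to-back until the requested run time has elapsed.
repeat {
    endTime = getSysTime()
    calls += 1
}
while endTime < runTime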

let elapsedTime = (endTime - startTime)

print("Requested Run Time: \(reqRunTime)ns")
print("        Start Time: \(startTime)ns")
print("          End Time: \(endTime)ns")
print("      Elapsed Time: \(elapsedTime)ns")
print("         Overshoot: \(elapsedTime - reqRunTime)ns")
print("        Iterations: \(calls)")
print("      Average Time: \((elapsedTime) / calls)ns")

1 – The difference in times here is most likely because of kernel design. The Linux kernel is monolithic and the macOS kernel is a hybrid micro/monolithic design. My MacBook Pro also has a heck of a lot more going on.

2 – The difference in the times between the Odroid C4 and the Raspberry Pi 4 surprised me. The C4 is a faster processor than the RPi4 but, in this case, the RPi4 is twice as fast as the C4. I suspect there is a hardware reason behind this – perhaps the RPi4 is reading a hardware clock vs a software clock.
