Python time.perf_counter_ns Function
Last modified April 11, 2025
This comprehensive guide explores Python's time.perf_counter_ns function, which returns a high-resolution performance counter in nanoseconds. We'll cover high-precision timing, benchmarking, and practical examples.
Basic Definitions
The time.perf_counter_ns function returns an integer representing a performance counter in nanoseconds. It provides the highest available resolution timer for measuring short durations.
Key characteristics: nanosecond resolution, monotonic (always increases), not affected by system clock changes, and ideal for benchmarking and profiling. The counter's reference point is undefined (only differences matter).
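A quick sketch of these properties: the raw counter value is arbitrary and varies by platform, but the difference between two readings is always non-negative.

import time

# The raw counter value is arbitrary; it is NOT wall-clock time.
print(time.perf_counter_ns())

# Only the difference between two readings is meaningful,
# and it is never negative because the counter is monotonic.
t1 = time.perf_counter_ns()
t2 = time.perf_counter_ns()
print(t2 - t1)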
Basic Performance Measurement
The simplest use of time.perf_counter_ns measures code execution time with nanosecond precision. This example shows basic usage.
import time

def calculate_sum(n):
    return sum(range(n))

# Start timer
start = time.perf_counter_ns()

# Execute function
result = calculate_sum(1000000)

# End timer
end = time.perf_counter_ns()

# Calculate duration in nanoseconds
duration = end - start
print(f"Calculation took {duration} ns")
print(f"Result: {result}")
This example demonstrates how to measure the execution time of a function with nanosecond precision. The duration is calculated by subtracting the start timestamp from the end timestamp.
Note that the absolute values returned by perf_counter_ns are meaningless; only the differences between measurements are useful.
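Because the result is a plain integer count of nanoseconds, converting to coarser units is simple division. A minimal sketch:

import time

start = time.perf_counter_ns()
sum(range(1_000_000))
end = time.perf_counter_ns()

elapsed_ns = end - start
# Integer nanoseconds convert to coarser units by dividing.
print(f"{elapsed_ns} ns")
print(f"{elapsed_ns / 1_000:.3f} µs")
print(f"{elapsed_ns / 1_000_000:.3f} ms")
print(f"{elapsed_ns / 1_000_000_000:.6f} s")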
Comparing perf_counter_ns with perf_counter
This example compares perf_counter_ns with its floating-point counterpart perf_counter to show precision differences.
import time

def empty_loop(n):
    for _ in range(n):
        pass

iterations = 1000000

# Using perf_counter (float seconds)
start = time.perf_counter()
empty_loop(iterations)
end = time.perf_counter()
float_duration = end - start

# Using perf_counter_ns (integer nanoseconds)
start_ns = time.perf_counter_ns()
empty_loop(iterations)
end_ns = time.perf_counter_ns()
ns_duration = end_ns - start_ns

print(f"perf_counter: {float_duration:.9f} sec")
print(f"perf_counter_ns: {ns_duration} ns")
print(f"Converted ns: {ns_duration / 1e9:.9f} sec")
perf_counter_ns provides integer nanoseconds while perf_counter returns float seconds. The ns version avoids floating-point precision issues for very short intervals.
For most timing needs, either works well, but perf_counter_ns is better for extremely precise measurements.
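One way to see the floating-point limitation concretely is to inspect the spacing between representable float values near the current counter value; the larger the counter grows, the coarser the float timeline becomes, while integer nanoseconds never lose precision. A sketch, assuming Python 3.9+ for math.ulp:

import math
import time

t = time.perf_counter()  # float seconds since an arbitrary origin

# math.ulp(t) is the gap between t and the next representable float.
# The larger the counter value, the coarser the float timeline.
print(f"Counter value:      {t:.6f} s")
print(f"Float spacing here: {math.ulp(t) * 1e9:.3f} ns")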
Measuring Function Call Overhead
perf_counter_ns can measure very small time intervals, such as function call overhead. This example demonstrates that capability.
import time

def empty_function():
    pass

# Measure single call overhead
start = time.perf_counter_ns()
empty_function()
end = time.perf_counter_ns()
single_call = end - start

# Measure average overhead over many calls
trials = 1000000
start = time.perf_counter_ns()
for _ in range(trials):
    empty_function()
end = time.perf_counter_ns()
avg_call = (end - start) / trials

print(f"Single call overhead: {single_call} ns")
print(f"Average call overhead: {avg_call:.2f} ns")
This measures the time taken to call an empty function. The single call measurement shows raw overhead while the averaged version reduces noise.
Such precise measurements are useful when optimizing performance-critical code where every nanosecond matters.
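At this scale, the measurement loop itself contributes to the reading. A common refinement, shown as a sketch here rather than part of the original example, is to time an empty loop first and subtract it as a baseline:

import time

trials = 1_000_000

# Baseline: how long does the bare loop itself take per iteration?
start = time.perf_counter_ns()
for _ in range(trials):
    pass
end = time.perf_counter_ns()
baseline = (end - start) / trials

def tiny_task():
    return 3 * 3

# Measure the task inside the same loop structure.
start = time.perf_counter_ns()
for _ in range(trials):
    tiny_task()
end = time.perf_counter_ns()
raw = (end - start) / trials

print(f"Raw:      {raw:.2f} ns per call")
print(f"Adjusted: {raw - baseline:.2f} ns per call (loop overhead removed)")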
Benchmarking Different Implementations
This example uses perf_counter_ns to compare the performance of different Python implementations of the same algorithm.
import time

def sum_with_for(n):
    total = 0
    for i in range(n):
        total += i
    return total

def sum_with_builtin(n):
    return sum(range(n))

def measure(func, n, trials=100):
    start = time.perf_counter_ns()
    for _ in range(trials):
        func(n)
    end = time.perf_counter_ns()
    return (end - start) / trials

n = 10000
trials = 100

for_loop_time = measure(sum_with_for, n, trials)
builtin_time = measure(sum_with_builtin, n, trials)

print(f"For loop sum: {for_loop_time:.0f} ns per call")
print(f"Builtin sum: {builtin_time:.0f} ns per call")
print(f"Builtin is {for_loop_time/builtin_time:.1f}x faster")
The example measures two ways to sum numbers: using a manual for loop versus Python's built-in sum. Results show their relative performance.
Averaging over multiple trials reduces measurement noise and provides more reliable comparisons between implementations.
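An alternative to the mean is taking the minimum of many samples; the standard library's timeit documentation recommends this, since higher readings are usually caused by other processes interfering rather than by the code under test. A sketch:

import time

def time_once(func, *args):
    start = time.perf_counter_ns()
    func(*args)
    return time.perf_counter_ns() - start

samples = [time_once(sum, range(10_000)) for _ in range(200)]

# Higher samples are usually OS noise, not the code itself,
# so the minimum is often the most repeatable statistic.
print(f"min:  {min(samples)} ns")
print(f"mean: {sum(samples) / len(samples):.0f} ns")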
Precision Limitations and Clock Resolution
This example demonstrates the practical resolution limits of perf_counter_ns by measuring the smallest detectable interval.
import time

def measure_resolution():
    # Measure smallest detectable time difference
    min_diff = float('inf')
    for _ in range(1000):
        t1 = time.perf_counter_ns()
        t2 = time.perf_counter_ns()
        if t2 > t1:
            diff = t2 - t1
            if diff < min_diff:
                min_diff = diff
    return min_diff

resolution = measure_resolution()
print(f"Smallest detectable interval: {resolution} ns")

# Verify with time.sleep(0)
sleep_start = time.perf_counter_ns()
time.sleep(0)
sleep_end = time.perf_counter_ns()
print(f"time.sleep(0) duration: {sleep_end - sleep_start} ns")
The first part measures the smallest time difference the timer can detect. The second part shows that even sleep(0) takes measurable time due to Python's overhead.
Actual resolution depends on hardware and OS. Modern systems typically expose nanosecond-resolution timers, but the overhead of the timer call itself limits the smallest interval you can meaningfully measure.
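You can also ask the interpreter directly what it knows about the underlying clock. The standard time.get_clock_info function reports the implementation and its advertised resolution (in seconds):

import time

# Query what the interpreter reports about the perf_counter clock.
info = time.get_clock_info("perf_counter")
print(f"Implementation: {info.implementation}")
print(f"Monotonic:      {info.monotonic}")
print(f"Resolution:     {info.resolution} s")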
Microbenchmarking with perf_counter_ns
This example shows how to create a microbenchmark decorator using perf_counter_ns for precise function timing.
import time
from functools import wraps

def benchmark(n_trials=1000, warmup=100):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Warmup phase (avoid JIT/cache effects)
            for _ in range(warmup):
                func(*args, **kwargs)

            # Measurement phase
            total_ns = 0
            for _ in range(n_trials):
                start = time.perf_counter_ns()
                func(*args, **kwargs)
                end = time.perf_counter_ns()
                total_ns += (end - start)

            avg_ns = total_ns / n_trials
            print(f"{func.__name__}: {avg_ns:.1f} ns per call "
                  f"(over {n_trials} trials)")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@benchmark(n_trials=10000, warmup=1000)
def list_comprehension(n):
    return [i*i for i in range(n)]

@benchmark(n_trials=10000, warmup=1000)
def manual_loop(n):
    result = []
    for i in range(n):
        result.append(i*i)
    return result

list_comprehension(100)
manual_loop(100)
The decorator handles warmup runs to avoid startup effects, then measures average execution time over many trials. This provides reliable microbenchmarks.
The example compares list comprehension versus manual loop performance, showing how to properly benchmark small code differences.
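One trade-off in the decorator above is that it reads the timer twice per trial, which adds overhead when the measured function is very small. A sketch of an alternative (benchmark_block is a hypothetical name, not part of the original example) that times the whole loop once instead:

import time
from functools import wraps

def benchmark_block(n_trials=1000, warmup=100):
    # Variant: time the whole measurement loop once rather than each call,
    # so the two timer reads per trial don't inflate tiny measurements.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(warmup):
                func(*args, **kwargs)
            start = time.perf_counter_ns()
            for _ in range(n_trials):
                func(*args, **kwargs)
            end = time.perf_counter_ns()
            print(f"{func.__name__}: {(end - start) / n_trials:.1f} ns per call")
            return func(*args, **kwargs)
        return wrapper
    return decorator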
Best Practices
- Precision needs: Use perf_counter_ns when you need nanosecond-precision timing
- Monotonicity: The counter never goes backwards, making it ideal for measuring intervals
- Averaging: For small operations, average over many runs to reduce noise
- Warmup: Include warmup runs to avoid startup overhead
- Clock resolution: Be aware of your system's actual timer resolution