# Changelog

## [4.4] - 2026-04-20

### Features

- #1553: Differential Flame Graphs

### Improvements

- #1705: `memlimit` option to limit size of the call trace storage
- #1706: Extend syntax of `-j` option to truncate deep stacks
- #1720: FlameGraph: Dark mode toggle
- #1672: FlameGraph: Use Ctrl+Click in addition to Alt+Click to remove stacks
- #1684: Unwind ARM64 generated stubs on JDK 26+
- #1676: Make `dwarf` stack walking mode an alias for `vm`
- #1671: An option to select TLAB based AllocTracer engine with JDK 11+
- #1670: Move converter Main class to the one.convert package
- #1660: Provide non-aggregated samples in OTLP converter
- #1701, #1682: Speed up stack walking

### Breaking changes

- #1673: Permanently remove `check` command
- #1675: Remove unsafe AsyncGetCallTrace recovery tricks along with `safemode` option
- #1677: Remove `cstack=lbr` option

### Bug fixes

- #1727: Allocation profile has wrong units in OTLP format
- #1716: Wall-clock Heatmap does not count samples correctly
- #1715: Fix Zing crash when profiling cpu+wall together
- #1708: Another fix for correct vDSO unwinding on ARM64
- #1707: Workaround for JFR shutdown race
- #1699: Allow negative keys in JFR constant pool
- #1697: Ensure remaining buffer is sufficient for event data in JfrReader
- #1657: Re-enable workaround for a long attach on JDK 8
- #1654: Prefer perf-events engine when record-cpu or target-cpu are selected
- #1585: Scale perf counters in case of multiplexing
- #1528: Add a hard-coded limit on the maximum number of jmethodIDs
- #1203: Fix "Instance field not found" when using `-Xcheck:jni` on JDK 8
- Do not walk past virtual thread continuation barriers

## [4.3] - 2026-01-20

### Features

- #1547: Native lock profiling
- #1566: Filter cpu/wall profiles by latency
- #1568: Expose async-profiler metrics in Prometheus format
- #1628: async-profiler.jar as Java agent; remote control via JMX

### Improvements

- #1140: FlameGraph improvements: legend, hot keys, new toolbar icons
- #1530: Timezone switcher between Local and UTC time in Heatmaps
- #1582: Support `--include`/`--exclude` options for JFR to Heatmap/OTLP/pprof conversion
- #1624: Compatibility with OTLP v1.9.0
- #1629: Harden crash protection in StackWalker

### Breaking changes

- #1277: New `timeSpan` field in WallClockSample events
- #1518: Deprecate `check` command
- #1590: Support compilation on modern JDKs. Drop JDK 7 support

### Bug fixes

- #1599: Workaround for the kernel PERF_EVENT_IOC_REFRESH bug
- #1596: Do not block any signals during execution of a custom crash handler
- #1584: JfrReader loops on corrupted recordings
- #1555: Parse FlameGraph title from HTML input
- #1621: `loop` and `timeout` options do not work together
- #1641: Unwind vDSO correctly on Linux-ARM64
- #1648: Fix stop sequence in Profiler::start
- #1575: Fix CodeCache memory leak in lock profiling while looping
- #1558: Fix record-cpu bug when kernel stacks are not available
- #1651: Do not record CPU frame for non-perf samples
- #1614, #1615, #1617, #1623: Fix races related to VM termination

## [4.2.1] - 2025-11-22

### Bug fixes

- #1599: Workaround for the kernel PERF_EVENT_IOC_REFRESH bug
- #1596: Do not block any signals during execution of a custom crash handler

## [4.2] - 2025-10-20

### Features

- Java Method Tracing and Latency Profiling
  * #1421: Latency profiling
  * #1435: Allow wildcards in Instrument profiling engine
  * #1499: `--trace` option with per-method latency threshold
- System-wide process sampling on Linux
  * #1411: `--proc` option to record `profiler.ProcessSample` events
- VMStructs stack walker by default
  * #1539: Use VMStructs stack walking mode by default
  * #1537: Support `comptask` and `vtable` features
  * #1517: Use JavaFrameAnchor to find top Java frame
  * #1449: Special handling of prologue and epilogue of compiled methods

### Improvements

- #1475: Add `CPUTimeSample` event support to jfrconv
- #1414: Per-thread flamegraph option in JFR heatmap converter
- #1526: Expose JfrReader
dictionary that maps osThreadId to javaThreadId
- #1448: Thread name in OpenTelemetry output
- #1413: Add `time_nanos` and `duration_nanos` to OTLP profiles
- #1450: Unwind dylib stubs as empty frames on macOS
- #1416: Add synthetic symbols for Mach-O stubs/trampolines
- Allow cross-compilation for 32-bit platforms

### Bug fixes

- #1515: Fix UnsatisfiedLinkError when tmpdir is set to a relative path
- #1500: Detect if `calloc` calls `malloc` for nativemem profiling
- #1427: Re-implement SafeAccess crash protection
- #1417: Two wall-clock profilers interfere with each other

### Project Infrastructure

- #1527: GHA: replace macos-13 with macos-15-intel
- #1510: Add option to retry tests
- #1508: Add more GHA jobs to cover JDK versions on ARM
- #1502: Fix job dependencies between integration tests and builds
- #1466: Add Liberica JDK on Alpaquita Linux to the CI
- Made integration tests more stable overall

## [4.1] - 2025-07-21

### Features

- Experimental support for the OpenTelemetry profiling signal
  * #1188: OTLP output format and `dumpOtlp` Java API
  * #1336: JFR to OTLP converter
- JDK 25 support
  * #1222: Update VMStructs for JDK 25
- Productize native memory profiling
  * #1193: Full `nativemem` support on macOS
  * #1254: Fixed Nativemem tests on Alpine
  * #1269: Native memory profiling now works with `jemalloc`
  * #1323: `nativemem` shows allocations inside async-profiler itself

### Improvements

- #1174: Detect JVM in non-Java application and attach to it
- #1223: Native API to add custom events in JFR recording
- #1259: `--all` option to collect all possible events simultaneously
- #1286: Record which CPU a sample was taken on
- #1299: Skip last 10% allocations for leak detection
- #1300: Allow profiling kprobes/uprobes with `--fdtransfer`
- #1366: Rewrite `jfrconv` executable to shell
- #1400: Unwind checksum and digest intrinsics on ARM64
- #1357, #1389: VMStructs-based stack unwinding for `alloc` and `nativemem` profiling

### Bug fixes

- #1251: `--ttsp` option does
not work on Alpine
- #1264: Guard hook installation with dlopen/dlclose
- #1319: SIGSEGV in PerfEvents::walk
- #1350: Disable JFR OldObjectSample event in jfrsync mode
- #1358: Do not dereference jmethodIDs on JDK 26
- #1374: Correctly check if profiler is preloaded
- #1380: Workaround clang type promotion bug
- #1387: JFR writer crashes when using cstack=vmx
- #1393: Improve stack walking termination logic: no endless `unknown` frames
- Stack unwinding fixes for ARM64

### Project Infrastructure

- #1129: Command-line option to filter tests
- #1262: Include `asprof.h` in async-profiler release package
- #1271: Release additional binaries with debug symbols
- #1274: Add Corretto 8 to the test matrix
- #1246, #1226: Run tests on Amazon Linux and Alpine Linux
- #1360: Auto-generated clang-tidy review comments
- #1373: Save all generated test logs for debug purposes
- Fixed flaky tests (#1282, #1307, #1376)

## [4.0] - 2025-04-08

### Features

- #895, #905: `jfrconv` binary and numerous converter enhancements
- #944: Interactive Heatmap
- #1064: Native memory leak profiler
- #1002: An option to display instruction addresses
- #1007: Optimize wall clock profiling
- #1073: Productize VMStructs-based stack walker: `--cstack vm/vmx`
- #1169: C API for accessing thread-local profiling context

### Improvements

- #923: Support JDK 23+
- #952: Solve musl and glibc compatibility issues; link `libstdc++` statically
- #955: `--libpath` option to specify path to `libasyncProfiler.so` in a container
- #1018: `--grain` converter option to coarsen flame graphs
- #1046: `--nostop` option to continue profiling outside `--begin`/`--end` window
- #1178: `--inverted` option to flip flame graphs vertically
- #1009: Allows collecting allocation and live object traces at the same time
- #925: An option to accumulate JFR events in memory instead of flushing to a file
- #929: Load symbols from debuginfod cache
- #982: Sample contended locks by overflowing interval bucket
- #993: Filter native
frames in allocation profile
- #896: FlameGraph: `Alt+Click` to remove stacks
- #1097: FlameGraph: `N`/`Shift+N` to navigate through search results
- #1182: Retain by-thread grouping when reversing FlameGraph
- #1167: Log when no samples are collected
- #1044: Fall back to `ctimer` for CPU profiling when perf_events are unavailable
- #1068: Count missed samples when estimating total CPU time in `ctimer` mode
- #1142: Use counter-timer register for timestamps on ARM64
- #1123: Support `clock=tsc` without a JVM
- #1070: Demangle Rust v0 symbols
- #1007: Use `ExecutionSample` event for CPU profiling and `WallClockSample` for Wall clock profiling
- #1011: Obtain `can_generate_sampled_object_alloc_events` JVMTI capability only when needed
- #1013: Intercept java.util.concurrent locks more efficiently
- #759: Discover available profiling signal automatically
- #884: Record event timestamps early
- #885: Print error message if JVM fails to load libasyncProfiler
- #892: Resolve tracepoint id in `asprof`
- Suppress dynamic attach warning on JDK 21+

### Bug fixes

- #1143: Crash on macOS when using thread filter
- #1125: Fixed parsing concurrently loaded libraries
- #1095: jfr print fails when a recording has empty pools
- #1084: Fixed Logging related races
- #1074: Parse both .rela.dyn and .rela.plt sections
- #1003: Support both tracefs and debugfs for kernel tracepoints
- #986: Profiling output respects loglevel
- #981: Avoid JVM crash by deleting JNI refs after `GetMethodDeclaringClass`
- #934: Fix crash on Zing in a native thread
- #843: Fix race between parsing and concurrent unloading of shared libraries
- #1147, #1151: Deadlocks with jemalloc and tcmalloc profilers
- Stack walking fixes for ARM64
- Converter fixes for `jfrsync` profiles
- Fixed parsing non-PIC executables and shared objects with non-standard section layout
- Fixed recursion in `pthread_create` when using native profiling API
- Fixed crashes on Alpine when profiling native apps
- Fixed warnings with
`-Xcheck:jni`
- Fixed "Unsupported JVM" on OpenJ9 JDK 21
- Fixed DefineClass crash on OpenJ9
- JfrReader should handle custom events properly
- Handle truncated JFRs

### Project Infrastructure

- Restructure and update documentation
- Implement test framework; add new integration tests
- Unit test framework for C++ code
- Run CI on all supported platforms
- Test multiple JDK versions in CI
- Add GHA to validate license headers
- Add Markdown checker and formatter
- Add Issue and Pull Request templates
- Add Contributing Guidelines and Code of Conduct
- Run static analyzer and fix found issues (#1034, #1039, #1049, #1051, #1098)
- Provide Dockerfile for building async-profiler release packages
- Publish nightly builds automatically

## [3.0] - 2024-01-20

### Features

- #724: Binary launcher `asprof`
- #751: Profile non-Java processes
- #795: AsyncGetCallTrace replacement
- #719: Classify execution samples into categories in JFR converter
- #855: `ctimer` mode for accurate profiling without perf_events
- #740: Profile CPU + Wall clock together
- #736: Show targets of vtable/itable calls
- #777: Show JIT compilation task
- #644: RISC-V port
- #770: LoongArch64 port

### Improvements

- #733: Make the same `libasyncProfiler` work with both glibc and musl
- #734: Support raw PMU event descriptors
- #759: Configure alternative profiling signal
- #761: Parse dynamic linking structures
- #723: `--clock` option to select JFR timestamp source
- #750: `--jfrsync` may specify a list of JFR events
- #849: Parse concatenated multi-chunk JFRs
- #833: Time-to-safepoint JFR event
- #832: Normalize names of hidden classes / lambdas
- #864: Reduce size of HTML Flame Graph
- #783: Shutdown asprof gracefully on SIGTERM
- Better demangling of C++ and Rust symbols
- DWARF unwinding for ARM64
- `JfrReader` can parse in-memory buffer
- Support custom events in `JfrReader`
- An option to read JFR file by chunks
- Record `GCHeapSummary` events in JFR

### Bug fixes

- Workaround macOS crashes in
SafeFetch
- Fixed attach to OpenJ9 on macOS
- Support `UseCompressedObjectHeaders` aka Lilliput
- Fixed allocation profiling on JDK 20.0.x
- Fixed context-switches profiling
- Prefer ObjectSampler to TLAB hooks for allocation profiling
- Improved accuracy of ObjectSampler in `--total` mode
- Make Flame Graph status line and search results always visible
- `loop` and `timeout` options did not work in some modes
- Restart interrupted poll/epoll_wait syscalls
- Fixed stack unwinding issues on ARM64
- Workaround for stale jmethodIDs
- Calculate ELF base address correctly
- Do not dump redundant threads in a JFR chunk
- `check` action prints result to a file
- Annotate JFR unit types with `@ContentType`

## [2.9] - 2022-11-27

### Features

- Java Heap leak profiler
- `meminfo` command to print profiler's memory usage
- Profiler API with embedded agent as a Maven artifact

### Improvements

- `--include`/`--exclude` options in the FlameGraph converter
- `--simple` and `--dot` options in jfr2flame converter
- An option for aggressive recovery of `[unknown_Java]` stack traces
- Do not truncate signatures in collapsed format
- Display inlined frames under a runtime stub

### Bug fixes

- Profiler did not work with Homebrew JDK
- Fixed allocation profiling on Zing
- Various `jfrsync` fixes
- Symbol parsing fixes
- Attaching to a container on Linux 3.x could fail

## [2.8.3] - 2022-07-16

### Improvements

- Support virtualized ARM64 macOS
- A switch to generate auxiliary events by async-profiler or FlightRecorder in jfrsync mode

### Bug fixes

- Could not recreate perf_events after the first failure
- Handle different versions of Zing properly
- Do not call System.loadLibrary, when libasyncProfiler is preloaded

## [2.8.2] - 2022-07-13

### Bug fixes

- The same .so works with glibc and musl
- dlopen hook did not work on Arch Linux
- Fixed JDK 7 crash
- Fixed CPU profiling on Zing

### Changes

- Mark interpreted frames with `_[0]` in collapsed output
- Double click selects a method name on
a flame graph

## [2.8.1] - 2022-06-10

### Improvements

- JFR to pprof converter (contributed by @NeQuissimus)
- JFR converter improvements: time range, collapsed output, pattern highlighting
- `%n` pattern in file names; limit number of output files
- `--lib` to customize profiler library path in a container
- `profiler.sh list` command now works without PID

### Bug fixes

- Fixed crashes related to continuous profiling
- Fixed Alpine/musl compatibility issues
- Fixed incomplete collapsed output due to weird locale settings
- Workaround for JDK-8185348

## [2.8] - 2022-05-09

### Features

- Mark top methods as interpreted, compiled (C1/C2), or inlined
- JVM TI based allocation profiling for JDK 11+
- Embedded HTTP management server

### Improvements

- Re-implemented stack recovery for better reliability
- Add `loglevel` argument
- Do not mmap perf page in `--all-user` mode
- Distinguish runnable/sleeping threads in OpenJ9 wall-clock profiler
- `--cpu` converter option to extract CPU profile from the wall-clock output

## [2.7] - 2022-02-14

### Features

- Experimental support for OpenJ9 VM
- DWARF stack unwinding

### Improvements

- Better handling of VM threads (fixed missing JIT threads)
- More reliable recovery from `not_walkable` AGCT failures
- Do not accept unknown agent arguments

## [2.6] - 2022-01-09

### Features

- Continuous profiling; `loop` and `timeout` options

### Improvements

- Reliability improvements: avoid certain crashes and deadlocks
- Smaller and faster agent library
- Minor `jfr` and `jfrsync` enhancements (see the commit log)

## [2.5.1] - 2021-12-05

### Bug fixes

- Prevent early unloading of libasyncProfiler.so
- Read kernel symbols only for perf_events
- Escape backslashes in flame graphs
- Avoid duplicate categories in `jfrsync` mode
- Fixed stack overflow in RedefineClasses
- Fixed deadlock when flushing JFR

### Improvements

- Support OpenJDK C++ Interpreter (aka Zero)
- Allow reading incomplete JFR recordings

## [2.5] - 2021-10-01

### Features

- macOS/ARM64 (aka Apple M1) port
- PPC64LE port (contributed by @ghaug)
- Profile low-privileged processes with perf_events (contributed by @Jongy)
- Raw PMU events; kprobes & uprobes
- Dump results in the middle of profiling session
- Chunked JFR; support JFR files larger than 2 GB
- Integrate async-profiler events with JDK Flight Recordings

### Improvements

- Use RDTSC for JFR timestamps when possible
- Show line numbers and bci in Flame Graphs
- jfr2flame can produce Allocation and Lock flame graphs
- Flame Graph title depends on the event and `--total`
- Include profiler logs and native library list in JFR output
- Lock profiling no longer requires JVM symbols
- Better container support
- Native function profiler can count the specified argument
- An option to group threads by scheduling policy
- An option to prepend library name to native symbols

### Notes

- macOS build is provided as a fat binary that works both on x86-64 and ARM64
- 32-bit binaries are no longer shipped. It is still possible to build them from sources
- Dropped JDK 6 support (may still work though)

## [2.0] - 2021-03-14

### Features

- Profile multiple events together (cpu + alloc + lock)
- HTML 5 Flame Graphs: faster rendering, smaller size
- JFR v2 output format, compatible with FlightRecorder API
- JFR to Flame Graph converter
- Automatically turn profiling on/off at `--begin`/`--end` functions
- Time-to-safepoint profiling: `--ttsp`

### Improvements

- Unlimited frame buffer. Removed `-b` option and 64K stack traces limit
- Additional JFR events: OS, CPU, and JVM information; CPU load
- Record bytecode indices / line numbers
- Native stack traces for Java events
- Improved CLI experience
- Better error handling; an option to log warnings/errors to a dedicated stream
- Reduced the amount of unknown stack traces

### Changes

- Removed non-ASL code.
No more CDDL license

## [1.8.4] - 2021-02-24

### Improvements

- Smaller and faster agent library

### Bug fixes

- Fixed JDK 7 crash during wall-clock profiling

## [1.8.3] - 2021-01-06

### Improvements

- libasyncProfiler.dylib symlink on macOS

### Bug fixes

- Fixed possible deadlock on non-HotSpot JVMs
- Gracefully stop profiler when terminating JVM
- Fixed GetStackTrace problem after RedefineClasses

## [1.8.2] - 2020-11-02

### Improvements

- AArch64 build is now provided out of the box
- Compatibility with JDK 15 and JDK 16

### Bug fixes

- More careful native stack walking in wall-clock mode
- `resume` command is not compatible with JFR format
- Wrong allocation sizes on JDK 8u262

## [1.8.1] - 2020-09-05

### Improvements

- Possibility to specify application name instead of `pid` (contributed by @yuzawa-san)

### Bug fixes

- Fixed long attach time and slow class loading on JDK 8
- `UnsatisfiedLinkError` during Java method profiling
- Avoid reading `/proc/kallsyms` when `--all-user` is specified

## [1.8] - 2020-08-10

### Features

- Converters between different output formats:
  - JFR -> nflx (FlameScope)
  - Collapsed stacks -> HTML 5 Flame Graph

### Improvements

- `profiler.sh` no longer requires bash (contributed by @cfstras)
- Fixed long attach time and slow class loading on JDK 8
- Fixed deadlocks in wall-clock profiling mode
- Per-thread reverse Flame Graph and Call Tree
- ARM build now works with ARM and THUMB flavors of JDK

### Changes

- Release package is extracted into a separate folder

## [1.7.1] - 2020-05-14

### Features

- LBR call stack support (available since Haswell)

### Improvements

- `--filter` to profile only specified thread IDs in wall-clock mode
- `--safe-mode` to disable selected stack recovery techniques

## [1.7] - 2020-03-17

### Features

- Profile invocations of arbitrary Java methods
- Filter stack traces by the given name pattern
- Java API to filter monitored threads
- `--cstack`/`--no-cstack` option

### Improvements

- Thread names and Java
thread IDs in JFR output
- Wall clock profiler distinguishes RUNNABLE vs. SLEEPING threads
- Stable profiling interval in wall clock mode
- C++ function names as events, e.g. `-e VMThread::execute`
- `check` command to test event availability
- Allow shading of AsyncProfiler API
- Enable CPU profiling on WSL
- Enable allocation profiling on Zing
- Reduce the amount of `unknown_Java` samples

## [1.6] - 2019-09-09

### Features

- Pause/resume profiling
- Allocation profiling support for JDK 12, 13 (contributed by @rraptorr)

### Improvements

- Include all AsyncGetCallTrace failures in the profile
- Parse symbols of JNI libraries loaded in runtime
- The agent autodetects output format by the file extension
- Output file name patterns: `%p` and `%t`
- `-g` option to print method signatures
- `-j` can increase the maximum Java stack depth
- Allocation sampling rate can be adjusted with `-i`
- Improved reliability on macOS

### Changes

- `-f` file names are now relative to the current shell directory

## [1.5] - 2019-01-08

### Features

- Wall-clock profiler: `-e wall`
- `-e itimer` mode for systems that do not support perf_events
- Native stack traces on macOS
- Support for Zing runtime, except allocation profiling

### Improvements

- `--all-user` option to allow profiling with restricted `perf_event_paranoid` (contributed by @jpbempel)
- `-a` option to annotate method names
- Improved attach to containerized and chroot'ed JVMs
- Native function profiling now accepts non-public symbols
- Better mapping of Java thread names (contributed by @KirillTim)

### Changes

- Changed default profiling engine on macOS
- Fixed the order of stack frames in JFR format

## [1.4] - 2018-06-24

### Features

- Interactive Call tree and Backtrace tree in HTML format (contributed by @rpulle)
- Experimental support for Java Flight Recorder (JFR) compatible output

### Improvements

- Added units: `ms`, `us`, `s` and multipliers: `K`, `M`, `G` for interval argument
- API and command-line option `-v` for
profiler version
- Allow profiling containerized JVMs on older kernels

### Changes

- Default CPU sampling interval reduced to 10 ms
- Changed the text format of flat profile

## [1.3] - 2018-05-13

### Features

- Profiling of native functions, e.g. malloc

### Improvements

- JDK 9, 10, 11 support for heap profiling with accurate stack traces
- `root` can now profile Java processes of any user
- `-j` option for limiting Java stack depth

## [1.2] - 2018-03-05

### Features

- Produce SVG files out of the box; flamegraph.pl is no longer needed
- Profile ReentrantLock contention
- Java API

### Improvements

- Allocation and Lock profiler now works on JDK 7, too
- Faster dumping of results

### Changes

- `total` counter of allocation profiler now measures heap pressure (like JMC)

## [1.1] - 2017-12-03

### Features

- Linux Perf Events profiling: CPU cycles, cache misses, branch misses, page faults, context switches etc.
- Kernel tracepoints support
- Contended monitor (aka intrinsic lock) profiling
- Individual thread profiles

### Improvements

- Profiler can engage at JVM start and automatically dump results on exit
- `list` command-line option to list supported events
- Automatically find target process ID with `jps` tool
- An option to include counter value in `collapsed` output
- Friendly class names in allocation profile
- Split allocations in new TLAB vs. outside TLAB

### Changes

- Replaced `-m` modes with `-e` events
- Interval changed from `int` to `long`

## [1.0] - 2017-10-09

### Features

- CPU profiler without Safepoint bias
- Lightweight Allocation profiler
- Java, native and kernel stack traces
- FlameGraph compatible output