Proliferation caused by Java unified GC logging
As most of you know, GC logging is not standardized. GC log format varies by JVM vendor (Oracle, IBM, HP, Azul, …), Java version (1.4, 5, 6, 7, 8, 9), GC algorithm (Serial, Parallel, CMS, G1, Shenandoah) and GC system properties that you pass (-XX:+PrintGC, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC …). Based on these permutations and combinations, already there are 40+ different GC log formats.
What are the challenges in having different GC log formats?
If you have developed any script that parses GC logs to extract certain statistics or trigger alerts when GC time exceeds certain threshold or monitor for repeated Full GC events, those scripts must be customized to cater to different formats of GC logs.
Since there are so many different GC log formats, it’s hard to understand each one of the formats and interpret the results effectively. Most formats don’t have any documentation or literature (Of course this point can be counter argued: You can use sophisticated GC log analysis tools such as GCeasy, HPJmeter…).
Unification or Proliferation?
When we heard the news that in Java 9, GC logs are re-implemented using ‘Unified JVM logging framework (JEP 158)’ we were thrilled because we thought all these 40+ different GC log formats will be consolidated into 1 standard GC log format. Life would have become a bliss. But unfortunately, this unified GC logging framework did the contrary. It ended up creating, even more, log formats .
Unified JVM Logging framework’s goal is to introduce a common logging system for all components of JVM i.e. compiler, GC, classloader, metaspace, svc, jfr… So, in each log line, following information is printed in addition to current information that is present in the old version of GC Logs:
- Component name (compiler, GC, classloader, metaspace, svc, jfr…)
- Log Level (trace, debug, info, warning, error)
- Decorations (time, uptime, timemillis, uptimemillis, timenanos, uptimenanos, pid, tid)
On top of these additional parameters, GC log statements format has also changed. See the G1 GC log and CMS GC log between Java 8 and Java 9:
Fig 1: G1 GC log format in Java 9 unified GC Logging framework
Fig 2: G1 GC log format in Java 8
Fig 3: CMS GC log format in Java 9 unified GC Logging framework
Fig 4: CMS GC log format in Java 8
Before reading further, please pause for few seconds and review Fig 1, Fig2, Fig 3, Fig 4 closely. You will be able to clearly see the difference how GC log formats varies significantly between Java 8 and Java 9. This is just difference between Oracle Java 8 and Oracle Java 9 for G1 and CMS GC algorithms. On top of this, there are other GC algorithms (parallel, CMS, Shenandoah), other Java versions (Java 1.4, 5, 6, 7), other JVM vendors (Oracle, IBM, HP, Azul), other GC system properties (-XX:+PrintGC, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC …).When you put those combinations here, the number of GC log formats explodes.
Thus, you can see how Unified GC logging framework complicates (already complicated) GC log format space.
Conclusion
It’s easier to complain and criticize any implementation. However, we understand and respect the mandate Oracle engineering team received to migrate to JVM unified logging framework. As it’s a JVM wide global initiative, GC logging got into this complicated situation.
To help us from this complicated situation, we can leverage tools like GCeasy, HPJmeter…. which can parse most formats of GC logs.
Author bio:
Ram Lakshmanan
Every single day, millions & millions of people in North America—bank, travel, and commerce—use the applications that Ram Lakshmanan has architected. Ram is an acclaimed speaker in major conferences on scalability, availability, and performance topics. Recently, he has founded a startup, which specializes in troubleshooting performance problems.