hs_err_pid.log - 详解

April 11, 2018

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2673), pid=13016, tid=140682742126336
#
# JRE version: Java(TM) SE Runtime Environment (8.0_40-b25) (build 1.8.0_40-b25)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b25 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

第一部分主要是程序crash的原因以及相关的solution。

Current thread (0x00007ff2d8001000):  JavaThread "flink-akka.remote.default-remote-dispatcher-15" daemon [_thread_new, id=31925, stack(0x00007ff340ced000,0x00007ff340dee000)]

Stack: [0x00007ff340ced000,0x00007ff340dee000],  sp=0x00007ff340dec9a0,  free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xaaca9a]  VMError::report_and_die()+0x2ba
V  [libjvm.so+0x4f333b]  report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V  [libjvm.so+0x90e8c3]  os::Linux::commit_memory_impl(char*, unsigned long, bool)+0x103
V  [libjvm.so+0x90e98c]  os::pd_commit_memory(char*, unsigned long, bool)+0xc
V  [libjvm.so+0x90772a]  os::commit_memory(char*, unsigned long, bool)+0x2a
V  [libjvm.so+0x90c97f]  os::pd_create_stack_guard_pages(char*, unsigned long)+0x7f
V  [libjvm.so+0xa52b4e]  JavaThread::create_stack_guard_pages()+0x5e
V  [libjvm.so+0xa5c9b4]  JavaThread::run()+0x34
V  [libjvm.so+0x910ee8]  java_start(Thread*)+0x108
C  [libpthread.so.0+0x7dc5]  start_thread+0xc5

Current Thread表示crash时程序所处的线程。 线程类别: JavaThread,共有以下几种

  • JavaThread
  • VMThread(负责VM内的一些操作,包括GC)
  • CompilerThread
  • GCTaskThread
  • WatcherThread
  • ConcurrentMarkSweepThread

线程状态:_thread_new

线程状态 | 描述

  • | - _thread_uninitialized | 线程未被创建 _thread_new | 线程创建但并未执行 _thread_in_native | 线程正在运行native code _thread_in_vm | 执行VM的code _thread_in_java | 执行java的code _thread_blocked | 被block住的线程

原生堆栈信息(Native frames): 主要是一些jvm动态库的对战信息,是分析crash中比较重要的一个部分。

Java Threads: ( => current thread )
=>0x00007ff2d8001000 JavaThread "flink-akka.remote.default-remote-dispatcher-15" daemon [_thread_new, id=31925, stack(0x00007ff340ced000,0x00007ff340dee000)]
  0x00007ff2ec053800 JavaThread "Hashed wheel timer #1" daemon [_thread_blocked, id=14219, stack(0x00007ff33bdfe000,0x00007ff33beff000)]
  0x00007ff2dc130800 JavaThread "New I/O server boss #6" daemon [_thread_in_native, id=14218, stack(0x00007ff33beff000,0x00007ff33c000000)]
  0x00007ff2dc119800 JavaThread "New I/O worker #5" daemon [_thread_in_native, id=14217, stack(0x00007ff3400e3000,0x00007ff3401e4000)]
  0x00007ff2dc0b6000 JavaThread "New I/O worker #4" daemon [_thread_in_native, id=14216, stack(0x00007ff3401e4000,0x00007ff3402e5000)]
  0x00007ff2dc0bb800 JavaThread "New I/O boss #3" daemon [_thread_in_native, id=14215, stack(0x00007ff3402e5000,0x00007ff3403e6000)]
  0x00007ff2dc05c000 JavaThread "New I/O worker #2" daemon [_thread_in_native, id=14214, stack(0x00007ff3403e6000,0x00007ff3404e7000)]
  0x00007ff2dc052000 JavaThread "New I/O worker #1" daemon [_thread_in_native, id=14213, stack(0x00007ff3404e7000,0x00007ff3405e8000)]
  0x00007ff2c8009800 JavaThread "flink-akka.remote.default-remote-dispatcher-7" daemon [_thread_in_vm, id=14206, stack(0x00007ff3407e8000,0x00007ff3408e9000)]
  0x00007ff2c800c000 JavaThread "flink-akka.remote.default-remote-dispatcher-6" daemon [_thread_in_native, id=14202, stack(0x00007ff3408e9000,0x00007ff3409ea000)]
  0x00007ff2cc020000 JavaThread "flink-akka.actor.default-dispatcher-5" daemon [_thread_blocked, id=14147, stack(0x00007ff3409ea000,0x00007ff340aeb000)]
  0x00007ff36222a800 JavaThread "flink-akka.actor.default-dispatcher-4" daemon [_thread_blocked, id=14136, stack(0x00007ff340aeb000,0x00007ff340bec000)]
  0x00007ff362229000 JavaThread "flink-akka.actor.default-dispatcher-3" daemon [_thread_blocked, id=14135, stack(0x00007ff340bec000,0x00007ff340ced000)]
  0x00007ff362183000 JavaThread "flink-scheduler-1" daemon [_thread_blocked, id=14044, stack(0x00007ff340fee000,0x00007ff3410ef000)]
  0x00007ff3614b4000 JavaThread "Timer for 'phoenix' metrics system" daemon [_thread_blocked, id=14032, stack(0x00007ff342dfa000,0x00007ff342efb000)]
  0x00007ff360fc1000 JavaThread "main-EventThread" daemon [_thread_blocked, id=14031, stack(0x00007ff3431fc000,0x00007ff3432fd000)]
  0x00007ff360faf000 JavaThread "main-SendThread(cnzk2:2181)" daemon [_thread_in_native, id=14030, stack(0x00007ff3432fd000,0x00007ff3433fe000)]
  0x00007ff360471000 JavaThread "Thread-80" daemon [_thread_blocked, id=14029, stack(0x00007ff3430fb000,0x00007ff3431fc000)]
  0x00007ff3608a7000 JavaThread "IPC Parameter Sending Thread #0" daemon [_thread_blocked, id=13178, stack(0x00007ff3437fe000,0x00007ff3438ff000)]
  0x00007ff3608a4000 JavaThread "IPC Client (1508059488) connection to cnhm0/10.0.1.222:8032 from apps" daemon [_thread_blocked, id=13177, stack(0x00007ff348923000,0x00007ff348a24000)]
  0x00007ff3600df800 JavaThread "Service Thread" daemon [_thread_blocked, id=13172, stack(0x00007ff3498c5000,0x00007ff3499c6000)]
  0x00007ff3600d2000 JavaThread "C1 CompilerThread2" daemon [_thread_in_native, id=13171, stack(0x00007ff3499c6000,0x00007ff349ac7000)]
  0x00007ff3600d0000 JavaThread "C2 CompilerThread1" daemon [_thread_blocked, id=13170, stack(0x00007ff349ac7000,0x00007ff349bc8000)]
  0x00007ff3600cd000 JavaThread "C2 CompilerThread0" daemon [_thread_blocked, id=13169, stack(0x00007ff349bc8000,0x00007ff349cc9000)]
  0x00007ff3600cb000 JavaThread "JDWP Event Helper Thread" daemon [_thread_blocked, id=13168, stack(0x00007ff349cc9000,0x00007ff349dca000)]
  0x00007ff3600c7000 JavaThread "JDWP Transport Listener: dt_socket" daemon [_thread_in_native, id=13167, stack(0x00007ff349dca000,0x00007ff349ecb000)]
  0x00007ff3600b8000 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=13166, stack(0x00007ff34a0cf000,0x00007ff34a1d0000)]
  0x00007ff36008c000 JavaThread "Finalizer" daemon [_thread_blocked, id=13165, stack(0x00007ff34a1d0000,0x00007ff34a2d1000)]
  0x00007ff36008a000 JavaThread "Reference Handler" daemon [_thread_blocked, id=13164, stack(0x00007ff34a2d1000,0x00007ff34a3d2000)]
  0x00007ff360017000 JavaThread "main" [_thread_blocked, id=13158, stack(0x00007ff369c7a000,0x00007ff369d7b000)]

Other Threads:
  0x00007ff360085000 VMThread [stack: 0x00007ff34a3d2000,0x00007ff34a4d3000] [id=13163]
  0x00007ff3600e2000 WatcherThread [stack: 0x00007ff3497c4000,0x00007ff3498c5000] [id=13173]

这一部分是crash时Java代码中所有线程的状况。分析类似上一部分。

VM state:not at safepoint (normal execution)
VM Mutex/Monitor currently owned by a thread: None

VM状态

状态 | 描述

  • | :-: | not at a safepoint | 正常执行 at safepoint | 所有VM内的线程被block住,在等待某一个VM操作完成 synchronizing | 一个特殊VM操作需要执行,VM在等待所有线程block
Heap:
 PSYoungGen  total 39424K, used 31166K [0x0000000771a00000, 0x0000000774780000, 0x00000007c0000000)
  eden space 38912K, 79% used [0x0000000771a00000,0x000000077384faf0,0x0000000774000000)
  from space 512K, 25% used [0x0000000774680000,0x00000007746a0000,0x0000000774700000)
  to   space 512K, 0% used [0x0000000774700000,0x0000000774700000,0x0000000774780000)
 ParOldGen   total 124928K, used 67043K [0x00000006d4e00000, 0x00000006dc800000, 0x0000000771a00000)
  object space 124928K, 53% used [0x00000006d4e00000,0x00000006d8f78d18,0x00000006dc800000)
 Metaspace   used 54122K, capacity 54748K, committed 54912K, reserved 1097728K
  class spaceused 6562K, capacity 6716K, committed 6784K, reserved 1048576K

Card table byte_map: [0x00007ff3664db000,0x00007ff366c35000] byte_map_base: 0x00007ff362e34000

Marking Bits: (ParMarkBitMap*) 0x00007ff3691426c0
 Begin Bits: [0x00007ff324a70000, 0x00007ff328538000)
 End Bits:   [0x00007ff328538000, 0x00007ff32c000000)

Polling page: 0x00007ff369d84000

CodeCache: size=245760Kb used=17958Kb max_used=18385Kb free=227801Kb
 bounds [0x00007ff34aad9000, 0x00007ff34bd29000, 0x00007ff359ad9000]
 total_blobs=6019 nmethods=5425 adapters=516
 compilation: enabled
  • Card Table是jvm维护的一种数据结构,用于记录更改对象时的引用,方便GC。
  • CodeCache是用来保存本地代码的,不属于PermGen。
Compilation events (10 events):
GC Heap History (10 events):
Deoptimization events (10 events):
Internal exceptions (10 events):
Events (10 events):
Dynamic libraries:

这些都是表示crash时jvm最近的一些操作。举个例子说明:

Events (10 events):
Event: 2603309.010 Thread 0x00007ff2c800c000 DEOPT UNPACKING pc=0x00007ff34aaddf69 sp=0x00007ff3409e88a8 mode 2
Event: 2603310.108 Thread 0x00007ff362229000 DEOPT PACKING pc=0x00007ff34b25ce6c sp=0x00007ff340ceb660
Event: 2603310.122 Thread 0x00007ff2c8009800 Uncommon trap: trap_request=0xffffff65 fr.pc=0x00007ff34b890e40
Event: 2603310.124 Thread 0x00007ff2c8009800 DEOPT PACKING pc=0x00007ff34b890e40 sp=0x00007ff3408e7790
Event: 2603310.124 Thread 0x00007ff2c8009800 DEOPT UNPACKING pc=0x00007ff34aaddf69 sp=0x00007ff3408e7680 mode 2
Event: 2603310.125 Thread 0x00007ff2c8009800 Uncommon trap: trap_request=0xffffff65 fr.pc=0x00007ff34b850fe4
Event: 2603310.125 Thread 0x00007ff2c8009800 DEOPT PACKING pc=0x00007ff34b850fe4 sp=0x00007ff3408e7560
Event: 2603310.125 Thread 0x00007ff2c8009800 DEOPT UNPACKING pc=0x00007ff34aaddf69 sp=0x00007ff3408e72d8 mode 2
Event: 2603310.126 Thread 0x00007ff362229000 DEOPT UNPACKING pc=0x00007ff34aaddf69 sp=0x00007ff340ceb628 mode 2
Event: 2603310.935 Thread 0x00007ff2d8001000 Thread added: 0x00007ff2d8001000

上述是最近10个runtime VM的事件,拿第一条举例,2603309.010是从VM启动后的时长,单位是秒,pc是program counter,sp是stack pointer。关于 DEOPT、Uncommon trap等的解释可以看这个issue,大概就是编译的时候,编译器做了一些优化,但是在runtime时发现这个优化没有用,所以叫去优化(deoptimization)。

---------------  S Y S T E M  ---------------

OS:Amazon Linux AMI release 2016.03

uname:Linux 4.4.11-23.53.amzn1.x86_64 #1 SMP Wed Jun 1 22:22:50 UTC 2016 x86_64
libc:glibc 2.17 NPTL 2.17
rlimit: STACK 8192k, CORE 0k, NPROC 65536, NOFILE 65536, AS infinity
load average:10.84 2.76 1.17

/proc/meminfo:
MemTotal:   15403948 kB
MemFree: 2637700 kB
MemAvailable:2571132 kB
Buffers: 304 kB
Cached:13784 kB
SwapCached:0 kB
Active: 12625320 kB
Inactive:   7684 kB
Active(anon):   12619252 kB
Inactive(anon):  336 kB
Active(file):   6068 kB
Inactive(file): 7348 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal: 0 kB
SwapFree:  0 kB
Dirty:16 kB
Writeback:56 kB
AnonPages:  12606968 kB
Mapped:10896 kB
Shmem:   356 kB
Slab:  46004 kB
SReclaimable:  22384 kB
SUnreclaim:23620 kB
KernelStack:9856 kB
PageTables:38084 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit: 7701972 kB
Committed_AS:   18337652 kB
VmallocTotal:   34359738367 kB
VmallocUsed:   0 kB
VmallocChunk:  0 kB
AnonHugePages: 0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:   12288 kB
DirectMap2M:15716352 kB

系统内存情况:

名词 | 描述 | 名词 | 描述 | 名词 | 描述

  • | :-: | :-: | :-: | :-: | :- MemToal | 总内存 | MemFree | 未使用内存 | MemAvailable | 可用内存 Buffers | 文件读写Buffer | Cached | 缓存 | SwapCached | 缓存的交换空间 Active | 最近被申请过的空间 | Inactive | 最近未被申请过的空间 | Active(anon) | .. Inactive(anon) | .. | Active(file) | .. | Inactive(file) | .. Unevictable | .. | Unevictable | .. | Mlocked | .. SwapTotal | 可用交换空间 | SwapFree | 可用交换空间 | Dirty | 等待写入磁盘的空间 Writeback | 正在被写入磁盘的空间 | AnonPages | .. | Mapped | .. Shmem | .. | Slab | kernel使用 | Sreclaimable | .. SUnreclaim | .. | KernelStack | .. | PageTables | 内存页映射表空间 NFS_Unstable | .. | Bounce | .. | WritebackTmp | .. CommitLimit | .. | Committed_AS | 完成workload最坏情况的内存使用情况 | VmallocTotal | .. VmallocUsed | 虚拟空间可映射的大小 | VmallocUsed | 已使用的虚拟空间映射大小 | VmallocChunk | 虚拟空间最大的虚拟空间块 AnonHugePages | .. | HugePages_Total | .. | HugePages_Free | .. HugePages_Rsvd | .. | HugePages_Surp | .. | Hugepagesize | .. DirectMap4k | .. | DirectMap2M | ..