...
crash /boot/System.map-2.6.32.lustremaster vmlinux vmcore
--> vmlinux is located in ./BUILD/kernel-2.6.32.lustremaster/vmlinux
--> /var/crash/*/vmcore
(NOTE: If there is a version mismatch between the System.map and /BUILD/<kernel>, try again without passing the System.map.)
Collecting bits
- Download and install the kernel-debuginfo and kernel-debuginfo-common rpms.
  - If you know the gerrit review you're looking at, you can follow its links to get to the necessary rpms. The server builds have the kernel-debuginfo and kernel-debuginfo-common rpms.
  - For clients, which use the stock kernel, you'll need to get the rpms from the CentOS debuginfo site.
    - EX: CentOS 7: http://debuginfo.centos.org/7/x86_64/
- Download and install the lustre-debuginfo rpm for the appropriate build.
  - Again, for a gerrit review you can get to it through the build artifacts link on build.hpdd.intel.com.
- Instead of installing the rpms, you can extract them into your local debug directory:
  - rpm2cpio <rpm> | cpio -idmv
...
| Code Block |
|---|
# Display stack trace for crashed task
bt
# gives you stack trace for all the CPUs
bt -a
# gives you task list in condensed form
ps
# gives you more info on each call, including stack addresses
bt -f
# print back trace with line numbers
# -l might not always work if the wrong debuginfo rpms or the wrong debug symbols are loaded
bt -l
# print stack traces for all tasks
foreach bt | less
# print the stack trace for wanted task
bt [<PID> | <task pointer>]
# to examine type definitions
whatis <type name>
# EXAMPLE:
crash> whatis the_lnet
lnet_t the_lnet
crash> whatis lnet_t
typedef struct {
....
} lnet_t;
# examining global variables
crash> the_lnet
the_lnet = $1 = {
  ln_cpt_table = 0xffff883cecde6940,
  ln_cpt_number = 16,
  ln_cpt_bits = 4,
  ln_res_lock = 0xffff883ce8a8a1a0,
  ...
}
# examining local variables: print the contents of the memory
# pointed to by address in the form of the struct
<struct name> <address>
# EXAMPLE (more details below)
lnet_peer_ni <address>
# or, to avoid confusion in case you have a function with the same name as the struct
struct lnet_peer_ni <address> |
Disassembling functions to find structure pointers and print
It is often necessary to print certain structures and their values for testing. To do that we need to find the pointer to the structure in memory, which requires some understanding of AMD64 assembly and register usage:
reference: x86-64 ABI
alternative references: http://www.x86-64.org/documentation/abi.pdf, 6.s081.scripts.mit.edu/sp18/x86-64-architecture-guide.html
First, disassemble function
...
| Code Block |
|---|
%rbx: callee-saved register; optionally used as base pointer
%rdi: used to pass 1st argument to functions |
Our task is then to track the usage of %rdi and %rbx through the disassembled code.
Side Note: RHEL/CentOS kernels use %rbp as a frame pointer (pointer to start of stack frame), while SLES kernels use it as a general register. This reflects a difference in the default compiler options these distributions use when compiling the kernel. As a side effect this means that with enough experience you can tell from the disassembly whether you are dealing with RHEL/CentOS or SLES. The example given here is RHEL/CentOS.
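As a quick reference while reading disassembly, the register roles can be written down as a small lookup. This is a minimal sketch based on the System V AMD64 calling convention; the helper names `arg_register` and `survives_call` are hypothetical and exist only for illustration.

```python
# Integer/pointer argument registers in the System V AMD64 calling
# convention, in argument order. Later arguments go on the stack.
SYSV_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")

# Registers a callee must preserve, so their values survive a call.
SYSV_CALLEE_SAVED = ("rbx", "rbp", "r12", "r13", "r14", "r15")


def arg_register(index):
    """Return the register carrying the 1-based integer argument `index`,
    or None when that argument is passed on the stack."""
    if index < 1:
        raise ValueError("argument index is 1-based")
    if index <= len(SYSV_ARG_REGS):
        return "%" + SYSV_ARG_REGS[index - 1]
    return None


def survives_call(reg):
    """True if `reg` (e.g. '%r15') is callee-saved."""
    return reg.lstrip("%") in SYSV_CALLEE_SAVED


if __name__ == "__main__":
    for i in range(1, 8):
        print(i, arg_register(i))
```

A value copied from an argument register into a callee-saved register early in a function is a good candidate to recover later, either from the register dump or from where a deeper function pushed it on the stack.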
First, disassemble the code for lnet_destroy_peer_ni_locked()
...
| Code Block |
|---|
# show where the kernel memory is allocated
crash> kmem -s
...
ffff88007fa809c0 idr_layer_cache          544        294       301     43     4k
ffff88007fa60980 size-4194304(DMA)    4194304          0         0      0  4096k
ffff88007fa50940 size-4194304         4194304          0         0      0  4096k
ffff88007fa40900 size-2097152(DMA)    2097152          0         0      0  2048k
ffff88007fa308c0 size-2097152         2097152          1         1      1  2048k
ffff88007fa20880 size-1048576(DMA)    1048576          0         0      0  1024k
ffff88007fa10840 size-1048576         1048576         64        64     64  1024k
ffff88007fa00800 size-524288(DMA)      524288          0         0      0   512k
ffff88007f9f07c0 size-524288           524288          0         0      0   512k
ffff88007f9e0780 size-262144(DMA)      262144          0         0      0   256k
ffff88007f9d0740 size-262144           262144         64        64     64   256k
ffff88007f9c0700 size-131072(DMA)      131072          0         0      0   128k
ffff88007f9b06c0 size-131072           131072          7         7      7   128k
ffff88007f9a0680 size-65536(DMA)        65536          0         0      0    64k
ffff88007f990640 size-65536             65536          3         3      3    64k
ffff88007f980600 size-32768(DMA)        32768          0         0      0    32k
ffff88007f9705c0 size-32768             32768         26        26     26    32k
ffff88007f960580 size-16384(DMA)        16384          0         0      0    16k
ffff88007f950540 size-16384             16384         24        26     26    16k
ffff88007f940500 size-8192(DMA)          8192          0         0      0     8k
ffff88007f9304c0 size-8192               8192        839       844    844     8k
ffff88007f920480 size-4096(DMA)          4096          0         0      0     4k
ffff88007f910440 size-4096               4096        702       735    735     4k
ffff88007f900400 size-2048(DMA)          2048          0         0      0     4k
ffff88007f8f03c0 size-2048               2048        791       862    431     4k
ffff88007f8e0380 size-1024(DMA)          1024          0         0      0     4k
ffff88007f8d0340 size-1024               1024       1966      2188    547     4k
ffff88007f8c0300 size-512(DMA)            512          0         0      0     4k
ffff88007f8b02c0 size-512                 512    2326695   2326704 290838     4k
ffff88007f8a0280 size-256(DMA)            256          0         0      0     4k
ffff88007f890240 size-256                 256    1162648   1162650  77510     4k
ffff88007f880200 size-192(DMA)            192          0         0      0     4k
ffff88007f8701c0 size-192                 192       3900      6340    317     4k
ffff88007f860180 size-128(DMA)            128          0         0      0     4k
ffff88007f850140 size-64(DMA)              64          0         0      0     4k
ffff88007f840100 size-64                   64      12403     13983    237     4k
ffff88007f8300c0 size-32(DMA)              32          0         0      0     4k
ffff88007f810080 size-128                 128     295891    295920   9864     4k
ffff88007f800040 size-32                   32    1181468   1181488  10549     4k
ffffffff81ad3620 kmem_cache             32896        240       240    240    64k
# Show all the memory blocks (slabs and their objects) for a cache
crash> kmem -S <memory address>
# example:
crash> kmem -S ffff88007f8d0340
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88007f8d0340 size-1024               1024       1966      2188    547     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff880037e52e40  ffff880059b90000      4          0     4
FREE / [ALLOCATED]
   ffff880059b90000  (shared cache)
   ffff880059b90400  (shared cache)
   ffff880059b90800  (shared cache)
   ffff880059b90c00
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff880044f9a600  ffff880044f62000      4          1     3
FREE / [ALLOCATED]
   ffff880044f62000  (shared cache)
   ffff880044f62400
  [ffff880044f62800]
   ffff880044f62c00  (shared cache)
# Each address listed is the beginning of an allocation. Potentially you can print the memory at this address to see what it contains.
# example:
crash> lnet_msg_t ffff880044f62800
# or
# print 23 64-bit words of memory starting at the given address
crash> rd -64 ffff88007f82a800 23
# You can pipe the output to 'tail' to see the tail end of the output. |
Case Study
LU-9203
LU-9203 describes a crash with the following stack trace:
| Code Block |
|---|
crash> bt
PID: 28792 TASK: ffff88004f9e0fb0 CPU: 0 COMMAND: "mdt00_003"
#0 [ffff88006a6bf6b8] machine_kexec at ffffffff81059bab
#1 [ffff88006a6bf718] __crash_kexec at ffffffff81105812
#2 [ffff88006a6bf7e8] crash_kexec at ffffffff81105900
#3 [ffff88006a6bf800] oops_end at ffffffff81690048
#4 [ffff88006a6bf828] no_context at ffffffff8167fd06
#5 [ffff88006a6bf878] __bad_area_nosemaphore at ffffffff8167fd9c
#6 [ffff88006a6bf8c0] bad_area_nosemaphore at ffffffff8167ff06
#7 [ffff88006a6bf8d0] __do_page_fault at ffffffff81692e8e
#8 [ffff88006a6bf930] do_page_fault at ffffffff81693035
#9 [ffff88006a6bf960] page_fault at ffffffff8168f248
[exception RIP: lnet_cpt_of_md+223]
RIP: ffffffffa0a8a2ff RSP: ffff88006a6bfa18 RFLAGS: 00010202
RAX: 000001040002f840 RBX: 0009000000000000 RCX: 000077ff80000000
RDX: ffffea0000000000 RSI: 0000000000000000 RDI: ffff880079600040
RBP: ffff88006a6bfa18 R8: 0000000000000009 R9: 00000000000003f8
R10: ffff88001ed52a00 R11: ffffc90000be1100 R12: ffff880005537e80
R13: 0009000000000000 R14: 0000000000000000 R15: ffff88001ed52a00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff88006a6bfa20] lnet_select_pathway at ffffffffa0a917dc [lnet]
#11 [ffff88006a6bfae0] lnet_send at ffffffffa0a93fb1 [lnet]
#12 [ffff88006a6bfb00] LNetPut at ffffffffa0a94325 [lnet]
#13 [ffff88006a6bfb58] ptl_send_buf at ffffffffa0d77c46 [ptlrpc]
#14 [ffff88006a6bfc10] ptlrpc_send_reply at ffffffffa0d7aeeb [ptlrpc]
#15 [ffff88006a6bfc88] target_send_reply_msg at ffffffffa0d3998e [ptlrpc]
#16 [ffff88006a6bfca8] target_send_reply at ffffffffa0d44476 [ptlrpc]
#17 [ffff88006a6bfd00] tgt_request_handle at ffffffffa0de2597 [ptlrpc]
#18 [ffff88006a6bfd48] ptlrpc_server_handle_request at ffffffffa0d8c1c3 [ptlrpc]
#19 [ffff88006a6bfde8] ptlrpc_main at ffffffffa0d901a0 [ptlrpc]
#20 [ffff88006a6bfec8] kthread at ffffffff810b0a4f
#21 [ffff88006a6bff50] ret_from_fork at ffffffff81697798 |
The function lnet_cpt_of_md() is as follows:
| Code Block |
|---|
int
lnet_cpt_of_md(struct lnet_libmd *md)
{
        int cpt = CFS_CPT_ANY;

        if (!md)
                return CFS_CPT_ANY;

        if ((md->md_options & LNET_MD_BULK_HANDLE) != 0 &&
            !LNetMDHandleIsInvalid(md->md_bulk_handle)) {
                md = lnet_handle2md(&md->md_bulk_handle);

                if (!md)
                        return CFS_CPT_ANY;
        }

        if ((md->md_options & LNET_MD_KIOV) != 0) {
                if (md->md_iov.kiov[0].kiov_page != NULL)
                        cpt = cfs_cpt_of_node(lnet_cpt_table(),
                                page_to_nid(md->md_iov.kiov[0].kiov_page));
        } else if (md->md_iov.iov[0].iov_base != NULL) {
                cpt = cfs_cpt_of_node(lnet_cpt_table(),
                        page_to_nid(virt_to_page(md->md_iov.iov[0].iov_base)));
        }

        return cpt;
} |
Using gdb on lnet.ko shows the location of the crash. It appears that accessing the page structure causes the crash:
| Code Block |
|---|
(gdb) l *lnet_cpt_of_md+223
0x1332f is in lnet_cpt_of_md (include/linux/mm.h:778).
773 #ifdef NODE_NOT_IN_PAGE_FLAGS
774 extern int page_to_nid(const struct page *page);
775 #else
776 static inline int page_to_nid(const struct page *page)
777 {
778 return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
779 }
780 #endif
781
782 #ifdef CONFIG_NUMA_BALANCING |
The strategy is to find the address of the md argument to the function. This way we can understand why we're getting:
| Code Block |
|---|
BUG: unable to handle kernel paging request at ffffeb040013bd80 |
First we looked at the disassembly of the lnet_cpt_of_md() function. This didn't yield an address, but it's documented here for teaching purposes:
| Code Block |
|---|
crash> dis lnet_cpt_of_md
0xffffffffa0a8a220 <lnet_cpt_of_md>: data32 data32 data32 xchg %ax,%ax [FTRACE NOP]
0xffffffffa0a8a225 <lnet_cpt_of_md+5>: test %rdi,%rdi
0xffffffffa0a8a228 <lnet_cpt_of_md+8>: je 0xffffffffa0a8a32b <lnet_cpt_of_md+267>
0xffffffffa0a8a22e <lnet_cpt_of_md+14>: push %rbp
0xffffffffa0a8a22f <lnet_cpt_of_md+15>: mov 0x4c(%rdi),%eax
0xffffffffa0a8a232 <lnet_cpt_of_md+18>: mov %rsp,%rbp
0xffffffffa0a8a235 <lnet_cpt_of_md+21>: test $0x2,%ah
0xffffffffa0a8a238 <lnet_cpt_of_md+24>: je 0xffffffffa0a8a294 <lnet_cpt_of_md+116>
0xffffffffa0a8a23a <lnet_cpt_of_md+26>: mov 0x68(%rdi),%rsi
0xffffffffa0a8a23e <lnet_cpt_of_md+30>: cmp $0xffffffffffffffff,%rsi
0xffffffffa0a8a242 <lnet_cpt_of_md+34>: je 0xffffffffa0a8a294 <lnet_cpt_of_md+116>
0xffffffffa0a8a244 <lnet_cpt_of_md+36>: mov 0x353c2(%rip),%ecx # 0xffffffffa0abf60c <the_lnet+12>
0xffffffffa0a8a24a <lnet_cpt_of_md+42>: mov $0x1,%edx
0xffffffffa0a8a24f <lnet_cpt_of_md+47>: mov %rsi,%rax
0xffffffffa0a8a252 <lnet_cpt_of_md+50>: shr $0x2,%rax
0xffffffffa0a8a256 <lnet_cpt_of_md+54>: shl %cl,%rdx
0xffffffffa0a8a259 <lnet_cpt_of_md+57>: mov 0x353a9(%rip),%ecx # 0xffffffffa0abf608 <the_lnet+8>
0xffffffffa0a8a25f <lnet_cpt_of_md+63>: sub $0x1,%edx
0xffffffffa0a8a262 <lnet_cpt_of_md+66>: and %eax,%edx
0xffffffffa0a8a264 <lnet_cpt_of_md+68>: cmp %ecx,%edx
0xffffffffa0a8a266 <lnet_cpt_of_md+70>: jae 0xffffffffa0a8a320 <lnet_cpt_of_md+256>
0xffffffffa0a8a26c <lnet_cpt_of_md+76>: mov 0x353bd(%rip),%rax # 0xffffffffa0abf630 <the_lnet+48>
0xffffffffa0a8a273 <lnet_cpt_of_md+83>: movslq %edx,%rdx
0xffffffffa0a8a276 <lnet_cpt_of_md+86>: mov (%rax,%rdx,8),%rdi
0xffffffffa0a8a27a <lnet_cpt_of_md+90>: callq 0xffffffffa0a7b900 <lnet_res_lh_lookup>
0xffffffffa0a8a27f <lnet_cpt_of_md+95>: test %rax,%rax
0xffffffffa0a8a282 <lnet_cpt_of_md+98>: je 0xffffffffa0a8a310 <lnet_cpt_of_md+240>
0xffffffffa0a8a288 <lnet_cpt_of_md+104>: sub $0x10,%rax
0xffffffffa0a8a28c <lnet_cpt_of_md+108>: mov %rax,%rdi
0xffffffffa0a8a28f <lnet_cpt_of_md+111>: je 0xffffffffa0a8a310 <lnet_cpt_of_md+240>
0xffffffffa0a8a291 <lnet_cpt_of_md+113>: mov 0x4c(%rax),%eax
0xffffffffa0a8a294 <lnet_cpt_of_md+116>: test $0x1,%ah
0xffffffffa0a8a297 <lnet_cpt_of_md+119>: je 0xffffffffa0a8a2c0 <lnet_cpt_of_md+160>
0xffffffffa0a8a299 <lnet_cpt_of_md+121>: mov 0x70(%rdi),%rax
0xffffffffa0a8a29d <lnet_cpt_of_md+125>: test %rax,%rax
0xffffffffa0a8a2a0 <lnet_cpt_of_md+128>: je 0xffffffffa0a8a310 <lnet_cpt_of_md+240>
0xffffffffa0a8a2a2 <lnet_cpt_of_md+130>: mov (%rax),%rsi
0xffffffffa0a8a2a5 <lnet_cpt_of_md+133>: mov 0x35354(%rip),%rdi # 0xffffffffa0abf600 <the_lnet>
0xffffffffa0a8a2ac <lnet_cpt_of_md+140>: shr $0x36,%rsi
0xffffffffa0a8a2b0 <lnet_cpt_of_md+144>: callq 0xffffffffa067bff0 <cfs_cpt_of_node>
0xffffffffa0a8a2b5 <lnet_cpt_of_md+149>: pop %rbp
0xffffffffa0a8a2b6 <lnet_cpt_of_md+150>: retq
0xffffffffa0a8a2b7 <lnet_cpt_of_md+151>: nopw 0x0(%rax,%rax,1)
0xffffffffa0a8a2c0 <lnet_cpt_of_md+160>: mov 0x70(%rdi),%rdx
0xffffffffa0a8a2c4 <lnet_cpt_of_md+164>: test %rdx,%rdx
0xffffffffa0a8a2c7 <lnet_cpt_of_md+167>: je 0xffffffffa0a8a310 <lnet_cpt_of_md+240>
0xffffffffa0a8a2c9 <lnet_cpt_of_md+169>: mov $0x80000000,%eax
0xffffffffa0a8a2ce <lnet_cpt_of_md+174>: movabs $0x77ff80000000,%rcx
0xffffffffa0a8a2d8 <lnet_cpt_of_md+184>: mov 0x35321(%rip),%rdi # 0xffffffffa0abf600 <the_lnet>
0xffffffffa0a8a2df <lnet_cpt_of_md+191>: add %rdx,%rax
0xffffffffa0a8a2e2 <lnet_cpt_of_md+194>: cmovb -0x1f0c52da(%rip),%rcx # 0xffffffff819c5010
0xffffffffa0a8a2ea <lnet_cpt_of_md+202>: movabs $0xffffea0000000000,%rdx
0xffffffffa0a8a2f4 <lnet_cpt_of_md+212>: add %rcx,%rax
0xffffffffa0a8a2f7 <lnet_cpt_of_md+215>: shr $0xc,%rax
0xffffffffa0a8a2fb <lnet_cpt_of_md+219>: shl $0x6,%rax
0xffffffffa0a8a2ff <lnet_cpt_of_md+223>: mov (%rax,%rdx,1),%rsi
0xffffffffa0a8a303 <lnet_cpt_of_md+227>: shr $0x36,%rsi
0xffffffffa0a8a307 <lnet_cpt_of_md+231>: callq 0xffffffffa067bff0 <cfs_cpt_of_node>
0xffffffffa0a8a30c <lnet_cpt_of_md+236>: pop %rbp
0xffffffffa0a8a30d <lnet_cpt_of_md+237>: retq
0xffffffffa0a8a30e <lnet_cpt_of_md+238>: xchg %ax,%ax
0xffffffffa0a8a310 <lnet_cpt_of_md+240>: mov $0xffffffff,%eax
0xffffffffa0a8a315 <lnet_cpt_of_md+245>: pop %rbp
0xffffffffa0a8a316 <lnet_cpt_of_md+246>: retq
0xffffffffa0a8a317 <lnet_cpt_of_md+247>: nopw 0x0(%rax,%rax,1)
0xffffffffa0a8a320 <lnet_cpt_of_md+256>: mov %edx,%eax
0xffffffffa0a8a322 <lnet_cpt_of_md+258>: xor %edx,%edx
0xffffffffa0a8a324 <lnet_cpt_of_md+260>: div %ecx
0xffffffffa0a8a326 <lnet_cpt_of_md+262>: jmpq 0xffffffffa0a8a26c <lnet_cpt_of_md+76>
0xffffffffa0a8a32b <lnet_cpt_of_md+267>: mov $0xffffffff,%eax
0xffffffffa0a8a330 <lnet_cpt_of_md+272>: retq
0xffffffffa0a8a331 <lnet_cpt_of_md+273>: data32 data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1) |
The %rdi register is used to pass the 1st argument. The question is whether %rdi still holds the address of the md at the point of the crash.
| Code Block |
|---|
# we know crash occurred on:
<lnet_cpt_of_md+223>
# based on the assembly analysis
0xffffffffa0a8a2d8 <lnet_cpt_of_md+184>: mov 0x35321(%rip),%rdi # 0xffffffffa0abf600 <the_lnet>
# it's clear that %rdi has been overwritten. So we can not rely
# on %rdi to hold the address of the md. |
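The same check can be roughed out in a script when the disassembly is long. The sketch below is a heuristic only: it scans linearly and ignores control flow, treats the last AT&T operand as the destination, ignores implicit writes and 32-bit sub-registers, and assumes any call may clobber the caller-saved registers. The function name `writes_to` and the `NO_WRITE` set are hypothetical.

```python
import re

# Matches "<func+OFFSET>: mnemonic operands" in a crash/gdb disassembly line.
LINE = re.compile(r"<\w+\+(\d+)>:\s+(\w+)\s+(\S+)")

# Mnemonics whose last operand is not written (incomplete on purpose).
NO_WRITE = {"test", "cmp", "push"}

# Registers a called function is free to overwrite (System V AMD64).
CALLER_SAVED = {"%rax", "%rcx", "%rdx", "%rsi", "%rdi",
                "%r8", "%r9", "%r10", "%r11"}


def writes_to(disassembly, reg, before_offset):
    """Return offsets of instructions before `before_offset` that may
    overwrite `reg` (e.g. '%rdi')."""
    hits = []
    for line in disassembly.splitlines():
        m = LINE.search(line)
        if m is None:
            continue
        offset, mnemonic, operands = int(m.group(1)), m.group(2), m.group(3)
        if offset >= before_offset:
            continue
        if mnemonic.startswith("call"):
            if reg in CALLER_SAVED:
                hits.append(offset)
            continue
        if mnemonic in NO_WRITE:
            continue
        # AT&T syntax: the destination is the last operand.
        if operands.rsplit(",", 1)[-1] == reg:
            hits.append(offset)
    return hits
```

Calling `writes_to(text, "%rdi", 223)` on the saved output of `dis lnet_cpt_of_md` lists candidate clobbers; each hit still has to be checked by hand against the branch that was actually taken.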
The next step is to look at the calling function, lnet_select_pathway(), to see if we can determine the address of the md. Disassembling it, we look at the first part, up to the instruction at ffffffffa0a917dc (the return address shown in the stack trace):
| Code Block |
|---|
crash> dis lnet_select_pathway
0xffffffffa0a91780 <lnet_select_pathway>: data32 data32 data32 xchg %ax,%ax [FTRACE NOP]
0xffffffffa0a91785 <lnet_select_pathway+5>: push %rbp
0xffffffffa0a91786 <lnet_select_pathway+6>: mov %rsp,%rbp
0xffffffffa0a91789 <lnet_select_pathway+9>: push %r15
0xffffffffa0a9178b <lnet_select_pathway+11>: mov %rdx,%r15
0xffffffffa0a9178e <lnet_select_pathway+14>: push %r14
0xffffffffa0a91790 <lnet_select_pathway+16>: push %r13
0xffffffffa0a91792 <lnet_select_pathway+18>: push %r12
0xffffffffa0a91794 <lnet_select_pathway+20>: push %rbx
0xffffffffa0a91795 <lnet_select_pathway+21>: mov %rsi,%rbx
0xffffffffa0a91798 <lnet_select_pathway+24>: sub $0x88,%rsp
0xffffffffa0a9179f <lnet_select_pathway+31>: mov %rdi,-0x50(%rbp)
0xffffffffa0a917a3 <lnet_select_pathway+35>: mov 0x2de56(%rip),%rdi # 0xffffffffa0abf600 <the_lnet>
0xffffffffa0a917aa <lnet_select_pathway+42>: mov %rsi,-0x38(%rbp)
0xffffffffa0a917ae <lnet_select_pathway+46>: mov $0x1,%esi
0xffffffffa0a917b3 <lnet_select_pathway+51>: mov %rcx,-0x90(%rbp)
0xffffffffa0a917ba <lnet_select_pathway+58>: callq 0xffffffffa067bfb0 <cfs_cpt_current>
0xffffffffa0a917bf <lnet_select_pathway+63>: mov 0x2deba(%rip),%rdi # 0xffffffffa0abf680 <the_lnet+128>
0xffffffffa0a917c6 <lnet_select_pathway+70>: mov %eax,%esi
0xffffffffa0a917c8 <lnet_select_pathway+72>: mov %eax,%r14d
0xffffffffa0a917cb <lnet_select_pathway+75>: mov %eax,-0x2c(%rbp)
0xffffffffa0a917ce <lnet_select_pathway+78>: callq 0xffffffffa0692770 <cfs_percpt_lock>
0xffffffffa0a917d3 <lnet_select_pathway+83>: mov 0x68(%r15),%rdi
0xffffffffa0a917d7 <lnet_select_pathway+87>: callq 0xffffffffa0a8a220 <lnet_cpt_of_md>
0xffffffffa0a917dc <lnet_select_pathway+92>: cmp $0xffffffff,%eax |
We can see the instruction that moves the address of the md into %rdi just before the call to lnet_cpt_of_md(). Examination of the disassembly of lnet_cpt_of_md() shows that %r15 is not overwritten, so the memory at %r15 + 0x68 should hold the address of the md.
The instruction below means add 0x68 to %r15, dereference and move the result into %rdi.
| Code Block |
|---|
0xffffffffa0a917d3 <lnet_select_pathway+83>: mov 0x68(%r15),%rdi |
Referring back to the stack trace we find
| Code Block |
|---|
R13: 0009000000000000 R14: 0000000000000000 R15: ffff88001ed52a00 |
So we now can do:
| Code Block |
|---|
crash> p/x 0x68+0xffff88001ed52a00
$4 = 0xffff88001ed52a68
crash> rd 0xffff88001ed52a68
ffff88001ed52a68: ffff880005537e80 |
We now know that the address of the md is: ffff880005537e80
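The pointer arithmetic above generalises to any AT&T memory operand of the form disp(base,index,scale). The following is a minimal sketch for doing it outside of crash; `parse_registers` and `effective_address` are hypothetical helper names, and the parser only understands the 16-digit `NAME: value` pairs that appear in the register dump of the stack trace.

```python
import re

MASK64 = (1 << 64) - 1

# AT&T memory operand: optional hex displacement, then (base,index,scale).
_OPERAND = re.compile(
    r"(?P<disp>-?0x[0-9a-f]+)?"
    r"\((?P<base>%\w+)?(?:,(?P<index>%\w+)(?:,(?P<scale>[1248]))?)?\)")


def parse_registers(dump):
    """Parse 'NAME: <16 hex digits>' pairs from a register dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9_]*):\s+([0-9a-f]{16})\b", dump)
    return {name.lower(): int(value, 16) for name, value in pairs}


def effective_address(operand, regs):
    """Compute disp + base + index * scale for an AT&T memory operand."""
    m = _OPERAND.fullmatch(operand)
    if m is None:
        raise ValueError("not a memory operand: %r" % operand)
    disp = int(m.group("disp"), 0) if m.group("disp") else 0
    base = regs[m.group("base")[1:]] if m.group("base") else 0
    index = regs[m.group("index")[1:]] if m.group("index") else 0
    scale = int(m.group("scale") or 1)
    return (disp + base + index * scale) & MASK64


if __name__ == "__main__":
    regs = parse_registers(
        "R13: 0009000000000000 R14: 0000000000000000 R15: ffff88001ed52a00")
    # Address read by: mov 0x68(%r15),%rdi
    print(hex(effective_address("0x68(%r15)", regs)))
```

The result is only the address being read. The value stored there (the md pointer) still has to be fetched from the dump with `rd`, as done above.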
We can now print the struct lnet_libmd at that address:
| Code Block |
|---|
crash> struct lnet_libmd ffff880005537e80
struct lnet_libmd {
md_list = {
next = 0xffff880011bd1b00,
prev = 0xffff88000450b200
},
md_lh = {
lh_hash_chain = {
next = 0xffffc90001a8cad0,
prev = 0xffffc90001a8cad0
},
lh_cookie = 28208821
},
md_me = 0x0,
md_start = 0xffffc90000be1100 "\b",
md_offset = 0,
md_length = 1016,
md_max_size = 12456192,
md_threshold = 0,
md_refcount = 1,
md_options = 0,
md_flags = 2,
md_niov = 1,
md_user_ptr = 0xffffc90000be1000,
md_eq = 0xffff8800691a51e0,
md_bulk_handle = {
cookie = 18446744073709551615
},
md_iov = {
iov = {{
iov_base = 0xffffc90000be1100,
iov_len = 1016
}, {
iov_base = 0xffff880005537980,
iov_len = 6510615555426900570
}, {
iov_base = 0x5a5a5a5a5a5a5a5a,
iov_len = 6510615555426900570
}, {
iov_base = 0x5a5a5a5a5a5a5a5a,
iov_len = 6510615555426900570
}, {
iov_base = 0x5a5a5a5a5a5a5a5a,
iov_len = 6510615555426900570
}, {
iov_base = 0x5a5a5a5a5a5a5a5a,
iov_len = 0
}, {
iov_base = 0x0,
iov_len = 0
}, {
iov_base = 0x0,
iov_len = 0
}, {
iov_base = 0x0,
iov_len = 0
}, {
iov_base = 0xacb4,
iov_len = 0
}, { |
From the dump above we see that md_options is 0, so LNET_MD_KIOV is not set and the iov branch, which calls virt_to_page() on iov_base, is taken. This is what we'd expect given where we crashed.
After a code analysis it appears that there are three ways to describe memory:
- Using lnet_kiov_t
  - This has a vector of pages
- Using struct kvec
  - This has a vector of kernel virtual addresses
  - If the kernel virtual addresses are allocated by kmalloc, they have a one-to-one correspondence with pages.
    - We can use the virt_to_page() macro to return the page structure
- Using struct kvec to describe 1 contiguous buffer allocated via vmalloc
  - vmalloc addresses are in a different memory space
  - need to use vmalloc_to_page() to get the page structure, only if is_vmalloc_addr() returns true
The same logic is implemented in o2iblnd; see kiblnd_setup_rd_iov().
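To see why virt_to_page() goes wrong for a vmalloc address, the arithmetic at lnet_cpt_of_md+169 through +223 can be replayed offline. This is a sketch under stated assumptions: the constants are copied from the disassembly above, reading them as the direct-map translation followed by an index into the struct page array (64 bytes per page, 4k pages) is an interpretation, the value loaded by the cmovb is taken to be 0, and the vmalloc range is the usual non-randomised 4-level x86-64 layout. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1

# Assumed vmalloc range (4-level paging, no address randomisation).
VMALLOC_START = 0xFFFFC90000000000
VMALLOC_END = 0xFFFFE8FFFFFFFFFF


def looks_like_vmalloc(vaddr):
    """Rough stand-in for is_vmalloc_addr() under the assumed layout."""
    return VMALLOC_START <= vaddr <= VMALLOC_END


def replay_virt_to_page(vaddr, cmovb_value=0):
    """Replay lnet_cpt_of_md+169..+223 for iov_base == vaddr.

    Returns (rax, address), where rax is the register value just before
    the faulting load and address is what that load dereferences.
    """
    total = 0x80000000 + vaddr              # mov $0x80000000,%eax; add %rdx,%rax
    carry = total > MASK64
    rax = total & MASK64
    # cmovb replaces %rcx only when the add above carried.
    rcx = cmovb_value if carry else 0x77FF80000000
    rax = (rax + rcx) & MASK64              # add %rcx,%rax
    rax = ((rax >> 12) << 6) & MASK64       # shr $0xc,%rax; shl $0x6,%rax
    return rax, (rax + 0xFFFFEA0000000000) & MASK64   # (%rax,%rdx,1)


if __name__ == "__main__":
    iov_base = 0xFFFFC90000BE1100           # md_iov.iov[0].iov_base from the dump
    rax, address = replay_virt_to_page(iov_base)
    print(looks_like_vmalloc(iov_base), hex(rax), hex(address))
```

For the iov_base in the dump, the replayed %rax comes out as 0x1040002f840, the same value as RAX in the register dump of the stack trace. That supports the reading that the buffer was vmalloc'ed and that the struct page pointer computed from it is bogus.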
Another debugging example:
| Code Block |
|---|
#### Try and find the CPT number being passed on the stack to lnet_initiate_peer_discovery()
crash> bt
PID: 8874 TASK: ffff881ff97d3f40 CPU: 7 COMMAND: "mdt_rdpg02_014"
#0 [ffff881e4a6b3660] machine_kexec at ffffffff8105d77b
#1 [ffff881e4a6b36c0] __crash_kexec at ffffffff81108742
#2 [ffff881e4a6b3790] panic at ffffffff816a863f
#3 [ffff881e4a6b3810] __warn at ffffffff8108ae7a
#4 [ffff881e4a6b3850] warn_slowpath_fmt at ffffffff8108aedf
#5 [ffff881e4a6b38b8] __list_add at ffffffff8134405c
#6 [ffff881e4a6b38e0] lnet_initiate_peer_discovery at ffffffffc0b07bc7 [lnet]
#7 [ffff881e4a6b3918] lnet_handle_find_routed_path at ffffffffc0b0b90d [lnet]
#8 [ffff881e4a6b3998] lnet_select_pathway at ffffffffc0b0c2c0 [lnet]
#9 [ffff881e4a6b3a98] lnet_send at ffffffffc0b0d115 [lnet]
#10 [ffff881e4a6b3ab8] LNetPut at ffffffffc0b0d56c [lnet]
#11 [ffff881e4a6b3b18] ptl_send_buf at ffffffffc0e00ff6 [ptlrpc]
#12 [ffff881e4a6b3bd0] ptlrpc_send_reply at ffffffffc0e043ab [ptlrpc]
#13 [ffff881e4a6b3c48] target_send_reply_msg at ffffffffc0dc335e [ptlrpc]
#14 [ffff881e4a6b3c68] target_send_reply at ffffffffc0dcd7de [ptlrpc]
#15 [ffff881e4a6b3cc0] tgt_request_handle at ffffffffc0e73d11 [ptlrpc]
#16 [ffff881e4a6b3d50] ptlrpc_server_handle_request at ffffffffc0e16c6b [ptlrpc]
#17 [ffff881e4a6b3df0] ptlrpc_main at ffffffffc0e1a63a [ptlrpc]
#18 [ffff881e4a6b3ec8] kthread at ffffffff810b4031
#19 [ffff881e4a6b3f50] ret_from_fork at ffffffff816c155d
crash> disas lnet_handle_find_routed_path
0xffffffffc0b0b740 <+0>:    nopl 0x0(%rax,%rax,1)
0xffffffffc0b0b745 <+5>:    push %rbp
0xffffffffc0b0b746 <+6>:    mov %rsp,%rbp
0xffffffffc0b0b749 <+9>:    push %r15
0xffffffffc0b0b74b <+11>:   mov %rdi,%r15 <--- %rdi, which passes the first argument (sd), gets stored in %r15
0xffffffffc0b0b74e <+14>:   push %r14
0xffffffffc0b0b750 <+16>:   push %r13
0xffffffffc0b0b752 <+18>:   push %r12
0xffffffffc0b0b754 <+20>:   push %rbx
static int
lnet_handle_find_routed_path(struct lnet_send_data *sd,
                             lnet_nid_t dst_nid,
                             struct lnet_peer_ni **gw_lpni,
                             struct lnet_peer **gw_peer)
...
        sd->sd_msg->msg_src_nid_param = sd->sd_src_nid;
        rc = lnet_initiate_peer_discovery(gwni, sd->sd_msg, sd->sd_rtr_nid,
                                          sd->sd_cpt);
crash> disas lnet_initiate_peer_discovery
Dump of assembler code for function lnet_initiate_peer_discovery:
0xffffffffc0b07aa0 <+0>:    nopl 0x0(%rax,%rax,1)
0xffffffffc0b07aa5 <+5>:    push %rbp
0xffffffffc0b07aa6 <+6>:    mov %rsp,%rbp
0xffffffffc0b07aa9 <+9>:    push %r15
0xffffffffc0b07aab <+11>:   push %r14
0xffffffffc0b07aad <+13>:   push %r13
0xffffffffc0b07aaf <+15>:   push %r12
0xffffffffc0b07ab1 <+17>:   push %rbx
# r15 is being saved on the stack by lnet_initiate_peer_discovery. So now we look there
crash> bt -f
#6 [ffff881e4a6b38e0] lnet_initiate_peer_discovery at ffffffffc0b07bc7 [lnet]
ffff881e4a6b38e8: ffff881f4be670c0 (%rbx) ffff881f4be67000 (%r12)
ffff881e4a6b38f8: ffff881f8d3b2400 (%r13) ffff881f8d33bb40 (%r14)
ffff881e4a6b3908: ffff881e4a6b39f8 (%r15) ffff881e4a6b3990 (%rbp)
ffff881e4a6b3918: ffffffffc0b0b90d (return address in caller)
crash> lnet_send_data ffff881e4a6b39f8
struct lnet_send_data {
sd_best_ni = 0xffff881f8d3b2200,
sd_best_lpni = 0xffff881f9575f600,
sd_final_dst_lpni = 0xffff881f9575f600,
sd_peer = 0xffff881ff7f4cd00,
sd_gw_peer = 0x0,
sd_gw_lpni = 0x0,
sd_peer_net = 0x0,
sd_msg = 0xffff880e4cb6f400,
sd_dst_nid = 3659191877107818,
sd_src_nid = 1407546850803763,
sd_rtr_nid = 18446744073709551615,
sd_cpt = 2,
sd_md_cpt = 0,
sd_send_case = 25
} |
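The mapping from the prologue's push order to the stack slots printed by bt -f can be worked out mechanically. A minimal sketch, assuming 8-byte pushes and that the slot holding the return address is known; `saved_register_slots` is a hypothetical helper name.

```python
def saved_register_slots(return_addr_slot, push_order, word=8):
    """Map each pushed register to the stack address holding its saved value.

    The stack grows downward, so the first register pushed in the prologue
    lands in the slot just below the return address.
    """
    slots = {}
    addr = return_addr_slot
    for reg in push_order:
        addr -= word
        slots[reg] = addr
    return slots


if __name__ == "__main__":
    # Push order from the prologue of lnet_initiate_peer_discovery;
    # the return address sits at ffff881e4a6b3918 in the bt -f output.
    slots = saved_register_slots(
        0xFFFF881E4A6B3918, ["rbp", "r15", "r14", "r13", "r12", "rbx"])
    for reg, addr in slots.items():
        print("%-4s saved at %x" % (reg, addr))
```

With these inputs %r15 lands at ffff881e4a6b3908, the slot annotated (%r15) in the bt -f output above. The value stored there is the caller's %r15, which is the sd pointer as long as the caller did not modify %r15 before making the call.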
Some Assembler Tidbits
| Code Block |
|---|
In the AT&T syntax shown by crash and gdb, most x86-64 instructions
take the source as the first operand and store the result in the
second (destination) operand.
Example:
mov $0xffffffff,%eax
moves the value 0xffffffff into the register %eax |
GDB Scripts
GDB commands can be used to create scripts to dissect the crash dump. Attached are a few scripts, courtesy of Alexey Lyashkov, to which I've added more functionality. Also attached is a program which can extract lustre logs from the dump file: Crash-tools.
Resources
Below are some resources that explain the registers and the architecture.
- SYSV ABI (calling conventions): http://wiki.osdev.org/System_V_ABI and from there https://www.uclibc.org/docs/psABI-x86_64.pdf
- page 21
- Developer info, AMD: http://developer.amd.com/resources/developer-guides-manuals/
- in particular: http://support.amd.com/TechDocs/24594.pdf ("AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions")
- Intel side: https://software.intel.com/en-us/articles/intel-sdm
- In particular https://software.intel.com/sites/default/files/managed/a4/60/325383-sdm-vol-2abcd.pdf (Intel® 64 and IA-32 architectures software developer's manual combined volumes 2A, 2B, 2C, and 2D: Instruction set reference, A-Z)
...
- Assembler cheat-sheet available.