[SASAG] Fwd: NFS server stumbling

Nyath Maxwell nyathm at gmail.com
Fri Nov 4 06:08:32 PDT 2011


How are you connected to your array? Fibre Channel, iSCSI, SCSI? 

I have some ideas as to your problem, but I need to know the array fabric. 

Thanks and best of luck,
Nyath

Sent from my iPod

On Nov 4, 2011, at 4:42 AM, Stuart Kendrick <skendric at fhcrc.org> wrote:

> I'm working with an NFS server which is stumbling.  And I'm looking for
> insights from our collective expertise.
> 
> 
> DESCRIBE PROBLEM
> The problem presents itself as slow 'ls' output -- sometimes, 'ls' can
> take seconds or even minutes (1, 5, 20! minutes) to produce a listing
> of a directory containing tens or hundreds of files.  This happens
> maybe a dozen times a day, across a handful of users on a handful of
> NFS clients (all OpenSuse 11.3) mounting their volumes from one
> particular NFS server (Solaris 10 9/10 s10x_u9wos_14a X86, assembled
> 2010-08-11), which in turn builds its LUNs from any and all of three
> back-end systems -- 3Par, Hitachi, and NetApp storage arrays.  The sole
> emacs user experiences similar hangs on opening files.  Most of the
> time, these boxes are reading/writing large files (multi-gigabyte
> output files from genome sequencing runs) and running multi-hour
> compute jobs on them; we don't know whether performance is affected
> for those jobs.
> 
> This is an infrequent issue -- the vast majority of the time,
> performance of 'ls' and 'emacs' feels reasonable.  And the majority of
> the clients (scores) and users (several hundred) don't report a problem...
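> 
> Incidentally, one way to see which system calls a slow 'ls' is stalling
> on is to run it under strace with per-call timing.  A minimal sketch
> (the path is a placeholder, and the exact syscall names -- getdents vs.
> getdents64, stat vs. lstat -- vary by distribution and by ls flags):
> 
>    # time each syscall; directory-listing stalls generally show up as
>    # slow getdents/getdents64 (READDIR/READDIRPLUS) or stat/lstat
>    # (GETATTR/LOOKUP) calls
>    strace -f -T -e trace=getdents,getdents64,stat,lstat,open ls -l /mnt/slow-volume
> 
> Any syscall that blocks for seconds there corresponds to an NFS call
> that is waiting on the server.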
> 
> 
> 
> REFERENCES
> Does anyone have recommendations for books which explain NFS, and
> ONC-RPC in particular, in detail?  Which explain how to use 'nfsstat'
> and other tools to troubleshoot?  I have these on order:
>    NFS Illustrated by Callaghan
>    Managing NFS and NIS (O'Reilly)
> 
> 
> TOOLS
> What tools do you use to shed light on NFS issues?  'nfsstat' and
> 'wireshark' ... others?
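> 
> For what it's worth, here is roughly the kind of data collection in
> play so far -- a sketch rather than a recipe (the interface name, host
> name, and output path are placeholders, and option spellings vary
> across nfsstat and tshark versions):
> 
>    # server-side RPC/NFS counters on the Solaris server
>    nfsstat -s
> 
>    # capture NFS traffic between one client and the server for offline analysis
>    tshark -i e1000g0 -w /var/tmp/nfs.pcap host client1 and port 2049
> 
> and then digging through the capture in wireshark.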
> 
> 
> 
> PACKET TRACE
> Packet traces show WRITEs and COMMITs which can take 0.0001 seconds to
> complete ... or as long as 35 minutes (!).
> 
> [BTW:  packet traces between Client and Server show no TCP pathology:
> no TCP retransmissions or RSTs.  Which is why I'm focusing on the NFS /
> storage portion of the stack.]
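> 
> [For anyone who wants to check that on their own traces, a filter along
> these lines does the job -- field names per Wireshark's TCP dissector;
> older tshark builds use -R for read filters, newer ones use -Y:
> 
>    tshark -r nfs.pcap -R 'tcp.analysis.retransmission || tcp.flags.reset == 1'
> 
> An empty result is what "no TCP pathology" means here.]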
> 
> 
> Here is a WRITE which takes ~5 minutes to complete:
> 
> filter:  rpc.xid == 0x1d207259
> 
> No.   Abs Time DeltaT Bytes Src     Dst     Info
> 55808 10:23:46 14.563 2962  Client  Server  V3 WRITE Call (Reply In 85295)
> 62624 10:25:46 14.572 2962  Client  Server  [RPC retransmission of #55808] V3 WRITE Call (Reply In 85295)
> 69434 10:26:46 14.579 2962  Client  Server  [RPC retransmission of #55808] V3 WRITE Call (Reply In 85295)
> 76235 10:28:46 14.577 2962  Client  Server  [RPC retransmission of #55808] V3 WRITE Call (Reply In 85295)
> 85295 10:29:22 0.860  206   Server  Client  V3 WRITE Reply (Call In 55808)
> 
> Looks like some WRITEs needed ~20 minutes to complete, while some
> COMMITs needed ~35 minutes.
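> 
> One way to pull out the slow calls directly, rather than chasing
> individual XIDs, is to filter on the reply's "time from request" field.
> A sketch -- the rpc.time and nfs.procedure_v3 field names are per
> Wireshark's dissectors (check 'tshark -G fields' on your version), and
> again older tshark builds use -R where newer ones use -Y:
> 
>    tshark -r nfs.pcap -R 'rpc.time > 1' -T fields \
>        -e frame.number -e rpc.xid -e nfs.procedure_v3 -e rpc.time
> 
> This lists every RPC reply that arrived more than a second after its
> call.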
> 
> ONC-RPC Server Response Time statistics for NFSv3:
> Procedure     Calls     Min SRT     Max SRT     Avg SRT
> WRITE            89     .000180     1188.59     260.127
> COMMIT            5     328.581     2106.62     1163.34
> FSSTAT           14     .000180     .000441     .000260
> ACCESS            3     .000187     .000456     .000282
> 
> ONC-RPC Program Statistics:
> Program   Version   Calls  Min SRT  Max SRT  Avg SRT
> NFS       3         111    .000180  2106.62  260.973
> 
> 
> That '2106.62' seconds under Max SRT for a COMMIT is where I pulled the
> '~35 minutes' figure.
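> 
> Those two tables are the sort of thing Wireshark's ONC-RPC service
> response time statistics produce; the command-line equivalent is along
> these lines, where 100003 is the NFS program number and 3 the NFS
> version (the exact -z syntax may differ across tshark versions):
> 
>    tshark -q -r nfs.pcap -z rpc,srt,100003,3 -z rpc,programs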
> 
> 
> NFSSTAT
> 
> Server:/home/test-user> nfsstat
> 
> Server rpc:
> Connection oriented:
> calls      badcalls nullrecv badlen xdrcall dupchecks dupreqs
> 6565848596 0        0        0      0       2161188143 128335
> 
> Now, I'm new to interpreting nfsstat output.  Anyone have enough
> confidence to speak to this?  Here's my thinking:
> 
> 'calls' are the total number of RPC calls.  'dupchecks' tells us how
> many times the NFS server ran one of those calls through cache.  And
> 'dupreqs' tells us how many retransmits, or, more precisely, how many
> times the NFS server realized that it was /already/ working on that call.
> 
> dupreqs / calls = 128335 / 6565848596 =~ 0.0000195, or about 0.002% --
> not bad at all
> 
> i.e. only a tiny fraction of all the RPC calls are being retransmitted.
> 
> But then I start to doubt my interpretation.  I may be misunderstanding
> the use of the term 'cache' as it relates to dupchecks.  Are we using
> 'cache' in the sense of "if it's in my cache, then I'll return the
> answer to you more quickly than if I had to retrieve blocks from
> storage"?  Or are we using 'cache' the way I would use 'queue' --
> meaning, if I'm already working on this request (it's in my queue),
> then I won't bother adding your most recent request to my queue?  If
> the latter, then:
> 
> dupchecks / calls = 2161188143 / 6565848596 =~ 0.33, or about 33%
> 
> Which would suggest that ~33% of all RPCs are retransmitted ... a
> different story entirely.
> 
> http://docstore.mik.ua/orelly/networking_2ndEd/nfs/appb_01.htm
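> 
> If the second reading is the right one, sampling those counters over
> time would show whether jumps in dupreqs line up with the slow-'ls'
> episodes.  A rough sketch, keyed to the "Connection oriented" layout
> shown above (field positions may differ on other Solaris releases):
> 
>    # print calls, dupchecks, dupreqs once a minute
>    while sleep 60; do
>        date
>        nfsstat -s | awk '/^Connection oriented:/ {getline; getline;
>            print "calls=" $1, "dupchecks=" $6, "dupreqs=" $7}'
>    done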
> 
> 
> 
> WHAT WE'RE DOING
> -We've put effort into analyzing health (load, CPU utilization, network
> IO, and so on) on the clients during these events; so far, we believe
> that the clients are underutilized while 'ls' is being slow.
> 
> -A wrapper around 'ls' logs information: precisely what 'ls' command
> was run, on what volume, by what user, with what flags, and how long it
> took to complete (a simplified sketch of this kind of wrapper appears
> after this list).
> 
> -We're building a table listing which volumes are involved with slow
> 'ls' commands, and which LUNs comprise those volumes.  [Looks like some
> of the volumes are built from spindles hosted on multiple storage
> systems ... using concats, not stripes, meaning some files probably live
> on a single spindle, rather than being striped across multiple spindles
> ... more investigation needed here.]
> 
> -We've started to chart IO latency on the relevant volumes
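> 
> A minimal sketch of that kind of 'ls' wrapper, assuming a bash shell
> function (the log path and logged fields are illustrative only; a real
> wrapper would also record the volume and handle log permissions):
> 
>    # define ahead of /bin/ls, e.g. in a profile script sourced by users
>    ls() {
>        local start end rc
>        start=$(date +%s)
>        command ls "$@"        # run the real ls, avoiding recursion
>        rc=$?
>        end=$(date +%s)
>        printf '%s user=%s cwd=%s args="%s" elapsed=%ss\n' \
>            "$(date '+%Y-%m-%d %H:%M:%S')" "$USER" "$PWD" "$*" "$((end - start))" \
>            >> /var/log/ls-wrapper.log
>        return $rc
>    }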
> 
> 
> --sk
> 
> Stuart Kendrick
> FHCRC

