[SASAG] ESX + Debian 6 = high I/O... conclusion

Bill Levering idbill at planx.com
Fri Jul 22 11:24:18 PDT 2011


For those of you following this, here is a little write-up about the
experience.

http://pugetworks.com/2011/07/vmware-and-ubuntu-high-load-issues/

Comments are disabled, but I'm open to receiving comments directly.

Thanks,
Bill

Bill Levering
idbill at planx.com
KFP: 0C38 4D7E 5B50 94FE 992D  406D 6C81 DE33 5459 A1AC




On Jul 18, 2011, at 9:39 PM, Justin Huff wrote:

> Looks to me like IO *is* your problem:)
>
> IOWait is pretty high, as is %util.  I'm not sure what the workload
> looks like, but IOPS don't look bad. Queue size makes sense for a
> single-threaded process. Service times don't look too bad either.
>
> If I were to guess, I think you might be expecting more IO out of this
> VM than it can do. Perhaps the provider can allocate more spindles or
> something.
>
> --Justin
>
> On Thu, Jul 14, 2011 at 12:08 PM, Bill Levering <idbill at planx.com> wrote:
>>
>> I'm working on a Debian 6 server that is acting really strange.
>>
>> The VM is as follows:
>> 3 gig RAM (is now dedicated to this VM)
>> 4 cpu (Intel(R) Xeon(R) CPU X5680  @ 3.33GHz )
>> 165 Gig drive space on a Network Appliance connected via Fibre Channel
>>
>> OS:        Linux 2.6.32-5-686
>> Debian:    6.0.1
>> Hardware:  i686
>>
>> Services running:
>> mysql 5.1.49
>> apache 2.2
>>  w/ libphp and mod_passenger
>>
>>
>> Load most of the time is between 5 and 8.
>> Overall CPU usage is between 2% and 12%
>> Free RAM is between 80meg and 125meg
>> Only 90m of swap is active
>>
>> Memory usage:
>>  - mysql  ~ 1.2gig
>>  - apache:
>>        ### Processes: 27
>>        ### Total private dirty RSS: 164.73 MB
>>        ---- Passenger processes -----
>>        ### Processes: 6
>>        ### Total private dirty RSS: 66.25 MB
>>  - php imports ~40meg
>>
>>
>> The box will spike to a load of around 20 when things start to stop
>> responding, but even then the overall CPU usage never exceeds 35%.
>>
>> Under load iostat shows stats like the following:
>> Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
>> sda          1.00  1216.00    11.00   135.00   984.00 10808.00    80.77    20.06   137.37     2.93    42.80
>> sda          0.00   137.00    65.00   973.00  1168.00  8880.00     9.68     0.95     0.92     0.76    79.20
>> sda          0.00   148.00    81.00  1414.00   704.00 12496.00     8.83     0.93     0.62     0.55    82.00
>> sda          1.00   165.00    42.00  1468.00   432.00 13064.00     8.94     1.04     0.69     0.55    83.20
>> sda          0.00   160.00    24.00  1125.00   192.00 13136.00    11.60     0.84     0.73     0.69    78.80
>> sda          0.00   148.00    56.00  1571.00   448.00 13752.00     8.73     0.71     0.44     0.44    70.80
>> sda          0.00   172.00    43.00  1327.00   376.00 11992.00     9.03     0.79     0.58     0.56    76.80
>> sda          0.00   142.00    55.00  1306.00   464.00 11584.00     8.85     0.82     0.61     0.53    72.00
>> sda          0.00   167.00    47.00  1490.00   376.00 13256.00     8.87     0.71     0.46     0.45    69.60
>>
>> mpstat shows:
>> # mpstat 1 20
>> Linux 2.6.32-5-686 (pmm)        07/13/2011      _i686_  (4 CPU)
>>
>> 01:21:45 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
>> 01:21:46 PM  all    0.27    0.00    0.53   72.27    0.00    0.00    0.00    0.00   26.93
>> 01:21:47 PM  all    0.73    0.00    0.49   72.37    0.00    0.00    0.00    0.00   26.41
>> 01:21:48 PM  all    4.49    0.00    0.47   31.68    0.00    0.00    0.00    0.00   63.36
>> 01:21:49 PM  all    8.15    0.00    0.99   42.22    0.00    0.25    0.00    0.00   48.40
>> 01:21:50 PM  all   11.86    0.00    0.77   41.49    0.00    0.26    0.00    0.00   45.62
>> 01:21:51 PM  all    7.07    0.00    0.25   38.38    0.00    0.00    0.00    0.00   54.29
>> 01:21:52 PM  all    6.93    0.00    0.80   51.20    0.00    0.00    0.00    0.00   41.07
>> 01:21:53 PM  all    6.83    0.00    0.68   35.54    0.00    0.00    0.00    0.00   56.95
>> 01:21:54 PM  all   10.39    0.00    0.00   32.21    0.00    0.00    0.00    0.00   57.40
>> 01:21:55 PM  all    6.13    0.00    0.24   37.03    0.00    0.00    0.00    0.00   56.60
>> 01:21:56 PM  all   16.14    0.00    0.49   35.21    0.00    0.00    0.00    0.00   48.17
>> 01:21:57 PM  all   11.68    0.00    0.25   39.09    0.25    0.00    0.00    0.00   48.73
>> 01:21:58 PM  all   12.29    0.00    1.18   41.37    0.00    0.00    0.00    0.00   45.15
>> 01:21:59 PM  all    9.47    0.00    0.79   51.32    0.00    0.26    0.00    0.00   38.16
>> 01:22:00 PM  all    9.52    0.00    0.71   36.43    0.00    0.00    0.00    0.00   53.33
>> 01:22:01 PM  all   14.65    0.00    0.76   35.10    0.00    0.00    0.00    0.00   49.49
>> 01:22:02 PM  all   12.05    0.00    0.48   40.96    0.00    0.00    0.00    0.00   46.51
>> 01:22:03 PM  all    7.57    0.00    0.52   33.94    0.00    0.00    0.00    0.00   57.96
>> 01:22:04 PM  all    6.45    0.00    0.46   37.10    0.00    0.00    0.00    0.00   55.99
>> 01:22:05 PM  all   16.45    0.00    0.51   35.99    0.00    0.00    0.00    0.00   47.04
>> Average:     all    8.94    0.00    0.57   41.86    0.01    0.04    0.00    0.00   48.57
>> # w
>> 13:22:05 up 48 days, 12:43,  2 users,  load average: 12.00, 11.50, 10.10
>>
>> More recently, I've tweaked mysql quite a bit and switched the mysql
>> engine from myisam to innodb, but the developer is convinced that is
>> making things worse, so we are switching it back to myisam tonight.
>>
>> The imports are not threaded, so they create quite a bit of load, but
>> the developer doesn't have the time or the impetus to fix this. (The
>> import does a select, then based on the response, does a delete, then
>> an add... based on a parsed xml file, for about 35-40 minutes as fast
>> as the single thread can go.)
>>
>> BUT, this same setup was on Slicehost (XEN) and in 2 years, we never
>> had these problems.
>>
>> I keep coming back to an i/o issue, but don't know how to troubleshoot
>> from there.
>>
>> Am I missing something?
>> Do you have any other suggestions?
>>
>> Bill
>>
>> Bill Levering
>> idbill at planx.com
>> KFP: 0C38 4D7E 5B50 94FE 992D  406D 6C81 DE33 5459 A1AC
>>
>> PS. I must confess, this is my second run-in with ESX(i) and my last
>> experience was pretty much identical. VMware says the server has room
>> to run, but the guest runs like crap and is unusable. I now dread
>> working on any project that even mentions VMware.
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Members mailing list
>> Members at lists.sasag.org
>> http://lists.sasag.org/mailman/listinfo/members
>>

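To put rough numbers on the iostat samples quoted above, here is a minimal
Python sketch. The figures are copied from that output, and iostat reports
rsec/s and wsec/s in 512-byte sectors. Skipping the first row is my
assumption: iostat's first report is normally an average since boot, and that
row does look unlike the rest. The names SAMPLES and summarize are made up
for illustration.

```python
# Rough arithmetic on the iostat samples quoted in this thread (device sda).
# SAMPLES and summarize() are illustrative names, not part of any tool.

SECTOR_BYTES = 512  # iostat counts rsec/s and wsec/s in 512-byte sectors

# One tuple per iostat row: (r/s, w/s, wsec/s, %util)
SAMPLES = [
    (11.0, 135.0, 10808.0, 42.80),   # assumed to be the since-boot report
    (65.0, 973.0, 8880.0, 79.20),
    (81.0, 1414.0, 12496.0, 82.00),
    (42.0, 1468.0, 13064.0, 83.20),
    (24.0, 1125.0, 13136.0, 78.80),
    (56.0, 1571.0, 13752.0, 70.80),
    (43.0, 1327.0, 11992.0, 76.80),
    (55.0, 1306.0, 11584.0, 72.00),
    (47.0, 1490.0, 13256.0, 69.60),
]


def summarize(samples):
    """Average write rate, write throughput, write size and %util."""
    n = len(samples)
    total_writes = sum(s[1] for s in samples)
    total_wsec = sum(s[2] for s in samples)
    return {
        "writes_per_s": total_writes / n,
        "write_mib_per_s": total_wsec / n * SECTOR_BYTES / 2**20,
        "avg_write_bytes": total_wsec / total_writes * SECTOR_BYTES,
        "util_pct": sum(s[3] for s in samples) / n,
    }


# Drop the first row (assumption above) and summarize the rest.
summary = summarize(SAMPLES[1:])
for name, value in summary.items():
    print(f"{name:>16}: {value:10.2f}")
```

On these samples the later rows work out to roughly 1,300 writes per second
of under 5 KB each, about 6 MB/s in total, with the device around 77% busy.
That is a lot of small writes rather than a lot of data, which would be
consistent with the single-threaded select/delete/add import described above.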