[SASAG] ESX + Debian 6 = high I/O
Ski Kacoroski
kacoroski at gmail.com
Mon Jul 18 13:14:22 PDT 2011
Bill,
Hmm, that makes it much more difficult. Setting the I/O scheduler to
noop can have a tremendous impact under some loads, but not all of them.
I have some machines where the change to noop has really helped.
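
For reference, a minimal sketch of checking and switching the scheduler
from inside the guest. It assumes the disk is sda (as in the iostat
output quoted below) and that the kernel exposes the usual sysfs file
/sys/block/<device>/queue/scheduler; the function names are made up for
illustration, and writing the file needs root.

```python
def parse_schedulers(text):
    """Split a line such as 'noop anticipatory deadline [cfq]' into
    (available, active). The kernel marks the active one with brackets."""
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


def scheduler_path(device="sda"):
    # Assumed device name; adjust for the disk you care about.
    return "/sys/block/%s/queue/scheduler" % device


def read_schedulers(device="sda"):
    with open(scheduler_path(device)) as handle:
        return parse_schedulers(handle.read())


def set_scheduler(device="sda", name="noop"):
    """Switch the scheduler at runtime (root only, lost on reboot).
    Returns the scheduler that is active after the write."""
    available, _ = read_schedulers(device)
    if name not in available:
        raise ValueError("%s not offered by this kernel: %s" % (name, available))
    with open(scheduler_path(device), "w") as handle:
        handle.write(name)
    return read_schedulers(device)[1]
```

Calling read_schedulers() first shows what the kernel offers before
anything is changed; set_scheduler() only lasts until the next reboot.
On kernels of this vintage the usual way to make it stick is the
elevator=noop boot parameter.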
cheers,
ski
On 07/18/2011 12:40 PM, Bill Levering wrote:
> Unfortunately, I don't have access to the host, and all the ISP has to
> say is that VMWare is not the issue.
>
> I'm not entirely surprised at their response. When we asked for a Linux
> install, they appear to have downloaded and installed a full desktop
> version of Debian.
>
> They also originally provided a single cpu instance, with VMWare bound
> to 2 cpus, when we asked for 4 cpus!
>
> As someone pointed out (and I missed) they also installed a 32-bit version.
>
> For Apache, I don't think 32-bit would be much of an issue.
> For mySQL, I assume it would just be a capacity/speed issue... but that
> wouldn't cause the problems we are seeing.
>
> I'm requesting that we create separate servers for apache and mysql so
> we can isolate the services and maybe find out which load
> characteristics are more problematic.
>
> We found this link:
> http://forums.cpanel.net/f5/extreme-load-peaks-vmware-server-158233.html
>
> Which suggested rebooting daily and setting the I/O scheduler to noop.
>
> Bill
>
> Bill Levering
> idbill at planx.com
> KFP: 0C38 4D7E 5B50 94FE 992D 406D 6C81 DE33 5459 A1AC
>
>
>
>
> On Jul 18, 2011, at 9:36 AM, Ski Kacoroski wrote:
>
>> Bill,
>>
>> Just got back from vacation and was wondering if you figured this out.
>> If not, what you need to do is run esxtop on the esx host and it will
>> tell you where the problem lies.
>>
>> cheers,
>>
>> ski
>>
> On Jul 14, 2011, at 1:11 PM, Hazen, Jeff wrote:
>
>> Have you tried running esxtop on the ESX host when the high system
>> load occurs? Not knowing what else is going on in your environment, it
>> could be that it's having trouble allocating 4 CPUs simultaneously to
>> run each 'tick' of your VM. In particular, look for %CSTP (co-stop):
>> the percentage of time that a VM cannot run ONLY because it has
>> multiple vCPUs that cannot be scheduled simultaneously because the
>> ESX host is too busy. Over 3% = potentially big problems.
>>
>> -Jeff
>> On 07/14/2011 12:08 PM, Bill Levering wrote:
>>>
>>> I'm working on a Debian 6 server that is acting really strange.
>>>
>>> The VM is as follows:
>>> 3 gig RAM (is now dedicated to this VM)
>>> 4 cpu (Intel(R) Xeon(R) CPU X5680 @ 3.33GHz )
>>> 165 Gig drive space on a Network Appliance connected via Fibre Channel
>>>
>>> OS: Linux 2.6.32-5-686
>>> Debian: 6.0.1
>>> Hardware: i686
>>>
>>> Services running:
>>> mysql 5.1.49
>>> apache 2.2
>>> w/ libphp and mod_passenger
>>>
>>>
>>> Load most of the time is between 5 and 8.
>>> Overall Cpu usage is between 2-12%
>>> Free RAM is between 80meg and 125meg
>>> Only 90m of swap is active
>>>
>>> Memory usage:
>>> - mysql ~ 1.2gig
>>> - apache:
>>> ### Processes: 27
>>> ### Total private dirty RSS: 164.73 MB
>>> ---- Passenger processes -----
>>> ### Processes: 6
>>> ### Total private dirty RSS: 66.25 MB
>>> - php imports ~40meg
>>>
>>>
>>> The box will spike to a load of around 20 when things start to stop
>>> responding, but even then the overall CPU usage never exceeds 35%.
>>>
>>> Under load iostat shows stats like the following:
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
>>> sda 1.00 1216.00 11.00 135.00 984.00 10808.00 80.77 20.06 137.37 2.93 42.80
>>> sda 0.00 137.00 65.00 973.00 1168.00 8880.00 9.68 0.95 0.92 0.76 79.20
>>> sda 0.00 148.00 81.00 1414.00 704.00 12496.00 8.83 0.93 0.62 0.55 82.00
>>> sda 1.00 165.00 42.00 1468.00 432.00 13064.00 8.94 1.04 0.69 0.55 83.20
>>> sda 0.00 160.00 24.00 1125.00 192.00 13136.00 11.60 0.84 0.73 0.69 78.80
>>> sda 0.00 148.00 56.00 1571.00 448.00 13752.00 8.73 0.71 0.44 0.44 70.80
>>> sda 0.00 172.00 43.00 1327.00 376.00 11992.00 9.03 0.79 0.58 0.56 76.80
>>> sda 0.00 142.00 55.00 1306.00 464.00 11584.00 8.85 0.82 0.61 0.53 72.00
>>> sda 0.00 167.00 47.00 1490.00 376.00 13256.00 8.87 0.71 0.46 0.45 69.60
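
One rough way to read the iostat rows above, as a sketch rather than a
diagnosis: avgrq-sz is total sectors per second divided by total
requests per second, and iostat counts sectors of 512 bytes. The
function name and the dictionary keys below are made up for
illustration; the numbers passed in are copied from the quoted output.

```python
from __future__ import division  # keeps / as true division on Python 2

SECTOR_BYTES = 512  # iostat's rsec/s and wsec/s are in 512-byte sectors


def summarize(r_s, w_s, rsec_s, wsec_s):
    """Derive request-size figures from one iostat -x row."""
    requests = r_s + w_s
    return {
        # Should reproduce the avgrq-sz column (in sectors).
        "avgrq_sz": (rsec_s + wsec_s) / requests,
        # Average size of a single write, in KiB.
        "avg_write_kib": wsec_s / w_s * SECTOR_BYTES / 1024,
        # Fraction of all requests that are writes.
        "write_share": w_s / requests,
    }


# Second row of the quoted output: r/s=65, w/s=973, rsec/s=1168, wsec/s=8880
row = summarize(65.0, 973.0, 1168.0, 8880.0)
print("avgrq-sz %.2f sectors" % row["avgrq_sz"])
print("average write %.1f KiB" % row["avg_write_kib"])
print("writes are %.0f%% of requests" % (row["write_share"] * 100))
```

For the steady-state rows this works out to writes of roughly 4 to 5
KiB each making up well over 90% of requests, i.e. a stream of many
small writes, which would fit the single-threaded
select/delete/add import described further down.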
>>>
>>> mpstat shows:
>>> # mpstat 1 20
>>> Linux 2.6.32-5-686 (pmm) 07/13/2011 _i686_ (4 CPU)
>>>
>>> 01:21:45 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
>>> 01:21:46 PM all 0.27 0.00 0.53 72.27 0.00 0.00 0.00 0.00 26.93
>>> 01:21:47 PM all 0.73 0.00 0.49 72.37 0.00 0.00 0.00 0.00 26.41
>>> 01:21:48 PM all 4.49 0.00 0.47 31.68 0.00 0.00 0.00 0.00 63.36
>>> 01:21:49 PM all 8.15 0.00 0.99 42.22 0.00 0.25 0.00 0.00 48.40
>>> 01:21:50 PM all 11.86 0.00 0.77 41.49 0.00 0.26 0.00 0.00 45.62
>>> 01:21:51 PM all 7.07 0.00 0.25 38.38 0.00 0.00 0.00 0.00 54.29
>>> 01:21:52 PM all 6.93 0.00 0.80 51.20 0.00 0.00 0.00 0.00 41.07
>>> 01:21:53 PM all 6.83 0.00 0.68 35.54 0.00 0.00 0.00 0.00 56.95
>>> 01:21:54 PM all 10.39 0.00 0.00 32.21 0.00 0.00 0.00 0.00 57.40
>>> 01:21:55 PM all 6.13 0.00 0.24 37.03 0.00 0.00 0.00 0.00 56.60
>>> 01:21:56 PM all 16.14 0.00 0.49 35.21 0.00 0.00 0.00 0.00 48.17
>>> 01:21:57 PM all 11.68 0.00 0.25 39.09 0.25 0.00 0.00 0.00 48.73
>>> 01:21:58 PM all 12.29 0.00 1.18 41.37 0.00 0.00 0.00 0.00 45.15
>>> 01:21:59 PM all 9.47 0.00 0.79 51.32 0.00 0.26 0.00 0.00 38.16
>>> 01:22:00 PM all 9.52 0.00 0.71 36.43 0.00 0.00 0.00 0.00 53.33
>>> 01:22:01 PM all 14.65 0.00 0.76 35.10 0.00 0.00 0.00 0.00 49.49
>>> 01:22:02 PM all 12.05 0.00 0.48 40.96 0.00 0.00 0.00 0.00 46.51
>>> 01:22:03 PM all 7.57 0.00 0.52 33.94 0.00 0.00 0.00 0.00 57.96
>>> 01:22:04 PM all 6.45 0.00 0.46 37.10 0.00 0.00 0.00 0.00 55.99
>>> 01:22:05 PM all 16.45 0.00 0.51 35.99 0.00 0.00 0.00 0.00 47.04
>>> Average: all 8.94 0.00 0.57 41.86 0.01 0.04 0.00 0.00 48.57
>>> # w
>>> 13:22:05 up 48 days, 12:43, 2 users, load average: 12.00, 11.50, 10.10
>>>
>>> More recently, I've tweaked mysql quite a bit and switched the mysql
>>> engine from myisam to innodb, but the developer is convinced that is
>>> making things worse, so we are switching it back to myisam tonight.
>>>
>>> The imports are not threaded, so they create quite a bit of load, but
>>> the developer doesn't have the time or the impetus to fix this. (The
>>> import does a select, then based on response, does a delete, then add...
>>> based on a parsed xml file, for about 35-40 minutes as fast as the
>>> single thread can go.)
>>>
>>> BUT, this same setup was on Slicehost (XEN) and in 2 years, we've never
>>> had these problems.
>>>
>>> I keep coming back to an i/o issue, but don't know how to troubleshoot
>>> from there.
>>>
>>> Am I missing something?
>>> Do you have any other suggestions?
>>>
>>> Bill
>>>
>>> Bill Levering
>>> idbill at planx.com
>>> KFP: 0C38 4D7E 5B50 94FE 992D 406D 6C81 DE33 5459 A1AC
>>>
>>> PS. I must confess, this is my second run-in with ESX(i) and my last
>>> experience was pretty much identical. VMWare says the server has room to
>>> run, but Guest runs like crap and is unusable. I now dread working on
>>> any project that even mentions VMWare.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Members mailing list
>>> Members at lists.sasag.org
>>> http://lists.sasag.org/mailman/listinfo/members
>>
>> --
>> "When we try to pick out anything by itself, we find it
>> connected to the entire universe" John Muir
>>
>> Chris "Ski" Kacoroski, kacoroski at gmail.com, 206-501-9803
>> or ski98033 on most IM services
>
>
>
--
"When we try to pick out anything by itself, we find it
connected to the entire universe" John Muir
Chris "Ski" Kacoroski, kacoroski at gmail.com, 206-501-9803
or ski98033 on most IM services