Performance analysis (Wikipedia article) deals with analyzing how computing resources are used for completing a specified task.
Profiling is one relevant tool.
In microkernel-based systems, there is generally a considerable RPC overhead.
In a multi-server system, it is non-trivial to implement a high-performance I/O System.
When providing POSIX compatibility (and similar interfaces) in an
environemnt that doesn't natively implement these interfaces, there may be a
severe performance degradation. For example, in this fork
system
call's case.
Unit testing can be used for tracking performance regressions.
IRC, freenode, #hurd, 2012-07-05
<braunr> the more i study the code, the more i think a lot of time is
wasted on cpu, unlike the common belief of the lack of performance being
only due to I/O
IRC, freenode, #hurd, 2012-07-23
<braunr> there are several kinds of scalability issues
<braunr> iirc, i found some big locks in core libraries like libpager and
libdiskfs
<braunr> but anyway we can live with those
<braunr> in the case i observed, ext2fs, relying on libdiskfs and libpager,
scans the entire file list to ask for writebacks, as it can't know if the
pages are dirty or not
<braunr> the mistake here is moving part of the pageout policy out of the
kernel
<braunr> so it would require the kernel to handle periodic synces of the
page cache
<antrik> braunr: as for big locks: considering that we don't have any SMP
so far, does it really matter?...
<braunr> antrik: yes
<braunr> we have multithreading
<braunr> there is no reason to block many threads while if most of them
could continue
<braunr> -while
<antrik> so that's more about latency than throughput?
<braunr> considering sleeping/waking is expensive, it's also about
throughput
<braunr> currently, everything that deals with sleepable locks (both
gnumach and the hurd) just wake every thread waiting for an event when
the event occurs (there are a few exceptions, but not many)
<antrik> ouch
id:"[email protected]"
IRC, freenode, #hurd, 2012-12-04
<damo22> why do some people think hurd is slow? i find it works well even
under heavy load inside a virtual machine
<braunr> damo22: the virtual machine actually assists the hurd a lot :p
<braunr> but even with that, the hurd is a slow system
<damo22> i would have thought it would have the potential to be very fast,
considering the model of the kernel
<braunr> the design implies by definition more overhead, but the true cause
is more than 15 years without optimization on the core components
<braunr> how so ?
<damo22> since there are less layers of code between the hardware bare
metal and the application that users run
<braunr> how so ? :)
<braunr> it's the contrary actually
<damo22> VFS -> IPC -> scheduler -> device drivers -> hardware
<damo22> that is monolithic
<braunr> well, it's not really meaningful
<braunr> and i'd say the same applies for a microkernel system
<damo22> if the application can talk directly to hardware through the
kernel its almost like plugging directly into the hardware
<braunr> you never talk directly to hardware
<braunr> you talk to servers instead of the kernel
<damo22> ah
<braunr> consider monolithic kernel systems like systems with one big
server
<braunr> the kernel
<braunr> whereas a multiserver system is a kernel and many servers
<braunr> you still need the VFS to identify your service (and thus your
server)
<braunr> you need much more IPC, since system calls are "replaced" with RPC
<braunr> the scheduler is basically the same
<damo22> okay
<braunr> device drivers are similar too, except they run in thread context
(which is usually a bit heavier)
<damo22> but you can do cool things like report when an interrupt line is
blocked
<braunr> and there are many context switches between all that
<braunr> you can do all that in a monolithic kernel too, and faster
<braunr> but it's far more elegant, and (when well done) easy to do on a
microkernel based system
<damo22> yes
<damo22> i like elegant, makes coding easier if you know the basics
<braunr> there are only two major differences between a monolilthic kernel
and a multiserver microkernel system
* damo22 listens
<braunr> 1/ independence of location (your resources could be anywhere)
<braunr> 2/ separation of address spaces (your servers have their own
addresses)
<damo22> wow
<braunr> these both imply additional layers of indirection, making the
system as a whole slower
<damo22> but it would be far more secure though i suspect
<braunr> yes
<braunr> and reliable
<braunr> that's why systems like qnx were usually adopted for critical
tasks
<damo22> security and reliability are very important, i would switch to the
hurd if it supported all the hardware i use
<braunr> so would i :)
<braunr> but performance matters too
<damo22> not to me
<braunr> it should :p
<braunr> it really does matter a lot in practice
<damo22> i mean, a 2x slowdown compared to linux would not affect me
<damo22> if it had all the benefits we mentioned above
<braunr> but the hurd is really slow for other reasons than its additional
layers of indrection unfortunately
<damo22> is it because of lack of optimisation in the core code?
<braunr> we're working on these issues, but it's not easy and takes a lot
of time :p
<damo22> like you said
<braunr> yes
<braunr> and also because of some fundamental design choices related to the
microkernel back in the 80s
<damo22> what about the darwin system
<damo22> it uses a mach kernel?
<braunr> yes
<damo22> what is stopping someone taking the MIT code from darwin and
creating a monster free OS
<braunr> what for ?
<damo22> because it already has hardware support
<damo22> and a mach kernel
<braunr> in kernel drivers ?
<damo22> it has kernel extensions
<damo22> you can do things like kextload module
<braunr> first, being a mach kernel doesn't make it compatible or even
easily usable with the hurd, the interfaces have evolved independantly
<braunr> and second, we really do want more stuff out of the kernel
<braunr> drivers in particular
<damo22> may i ask why you are very keen to have drivers out of kernel?
<braunr> for the same reason we want other system services out of the
kernel
<braunr> security, reliability, etc..
<braunr> ease of debugging
<braunr> the ability to restart drivers separately, without restarting the
kernel
<damo22> i see
IRC, freenode, #hurd, 2012-09-13
Test Driving GNU Hurd, With Benchmarks Against Linux, Phoronix, Michael Larabel, 2011-07-18.
<braunr> the phoronix benchmarks don't actually test the operating system
..
<hroi_> braunr: well, at least it tests its ability to run programs for
those particular tasks
<braunr> exactly, it tests how programs that don't make much use of the
operating system run
<braunr> well yes, we can run programs :)
<pinotree> those are just cpu-taking tasks
<hroi_> ok
<pinotree> if you do a benchmark with also i/o, you can see how it is
(quite) slower on hurd
<hroi_> perhaps they should have run 10 of those programs in parallel, that
would test the kernel multitasking I suppose
<braunr> not even I/O, simply system calls
<braunr> no, multitasking is ok on the hurd
<braunr> and it's very similar to what is done on other systems, which
hasn't changed much for a long time
<braunr> (except for multiprocessor)
<braunr> true OS benchmarks measure system calls
<hroi_> ok, so Im sensing the view that the actual OS kernel architecture
dont really make that much difference, good software does
<braunr> not at all
<braunr> i'm only saying that the phoronix benchmark results are useless
<braunr> because they didn't measure the right thing
<hroi_> ok
Optimizing Data Structure Layout
IRC, freenode, #hurd, 2014-01-02
<braunr> teythoon_: wow, digging into the vm code :)
<teythoon_> i discovered pahole and gnumach was a tempting target :)
<braunr> never heard of pahole :/
<teythoon_> it's nice
<teythoon_> braunr: try pahole -C kmem_cache /boot/gnumach
<teythoon_> on linux that is. ...
<braunr> ok
<teythoon_> braunr: http://paste.debian.net/73864/
<braunr> very nice
IRC, freenode, #hurd, 2014-01-03
<braunr> teythoon: pahole is a very handy tool :)
<teythoon> yes
<teythoon> i especially like how general it is
Measure
On some pages, we're filing information about performace measurements.
kepler.SCHWINGE
Debian GNU/Linux, x86. Running as a Xen domU, the system is not reserved exclusively for measurement purposes, so it's a best-effort service.
laplace.SCHWINGE
Debian GNU/Hurd, x86. Running as a QEMU/KVM instance, the system is not reserved exclusively for measurement purposes, so it's a best-effort service.
id:"[email protected]"
IRC, freenode, #hurd, 2014-02-27
<braunr> tschwinge: about your concern with regard to performance
measurements, you could run kvm with hugetlbfs and cpuset
<braunr> on a machine that provides nested page tables, this makes the
virtualization overhead as small as it could be considering the
implementatoin
<braunr> hugetlbs reduces the overhead of page faults, and also implies
locked memory while cpuset isolates the vm from global scheduling
<braunr> hugetlbfs*
2014-07-25, tschwinge
Support for huge pages as well as CPU sets requires special setup; not doing that at the moment.