Wikipedia:Reference desk/Archives/Computing/2013 May 25

= May 25 =

Getting rid of the daemon & speeding up D-Bus without hacking the kernel?
Someone pointed out in an e-mail on the DBus mailing list that DBus is very slow. According to Speeding up D-Bus [LWN.net], the performance issue again has something to do with the "daemon" or "server process", just like the issue the X Window System is facing. The problem, as I understand it, is that the DBus daemon and the X server are involved in too much, so frequent IPC between server and clients slows down the whole subsystem. Offloading the server's job to the clients should improve the situation somewhat.

Diagram 1 demonstrates a DBus flow where Process A sends a message to Process B. (I know very little about DBus; all diagrams here only demonstrate the idea.) When the message sent by Process A arrives at Process B, Process B calls its message handler to handle it. Because each message must pass through the DBus daemon, a message sent by a client process must undergo at least two context switches to reach the client process on the other side.



The Message handler in Diagram 1 may look like this:

void messageHandler(DBusMessage & dBusMessage) {
    if (dBusMessage == FOO_MESSAGE) {
        ... do something for FOO_MESSAGE ...
    } else if (dBusMessage == BAR_MESSAGE) {
        ... do something for BAR_MESSAGE ...
    } else if (dBusMessage == ...) {
        ...
    }
}

The message handler is executed in the context of Process B. Normally, the handler needs to access variables in the address space of Process B in order to do something useful.

The flow in Diagram 1 shows the possible high-latency problem (caused by context switches) that DBus currently has. So I came up with a simple idea to change the design like this:



Shared memory is the foundation of this design. The message handler code and all variables it accesses are shared by the message sender process (Process A) and the message receiver process (Process B), as shown in Diagram 2, where the address spaces of Process A and Process B overlap on the message handler part. The DBus daemon is no longer a message relay or router. Instead, the daemon is more like a telephone operator whose job is to bind the receiver's message handler to the sender, so that when the sender wants to send a message to the receiver, it actually calls the receiver's message handler directly. As in the original design, the message is passed as an argument to the message handler, but the handler is now called by the message sender rather than by the message receiver, which implies the handler is executed in the context of the sender rather than in the context of the receiver. The outcome is that the two context switches in the original design are completely eliminated (the latency is gone) and no kernel hacking is required (unlike the solution mentioned in the article "Speeding up D-Bus [LWN.net]" above) ... although synchronization would be needed because the sender and receiver may share some variables.
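The "telephone operator" idea above can be sketched in C++. This is only a toy model under loud assumptions: both "processes" are modelled as code living in one address space, so it does not attempt the hard part of actually mapping handler code and data into two separate processes. The names `Operator`, `ReceiverState`, `Message` and `run_demo` are hypothetical and are not part of any D-Bus API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical message type; real D-Bus messages are far richer.
struct Message {
    std::string name;
    int payload;
};

using Handler = std::function<void(const Message&)>;

// Plays the "telephone operator" role: it only hands out handlers and
// never relays the messages themselves.
class Operator {
public:
    // The receiver publishes its handler under a name.
    void publish(const std::string& receiver, Handler handler) {
        std::lock_guard<std::mutex> lock(mutex_);
        handlers_[receiver] = std::move(handler);
    }

    // The sender looks the handler up once and keeps the returned copy.
    // Throws std::out_of_range if nothing was published under that name.
    Handler bind(const std::string& receiver) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return handlers_.at(receiver);
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, Handler> handlers_;
};

// State owned by the receiver but modified from the sender's context,
// which is why the handler takes a lock (the synchronization cost
// mentioned in the post).
struct ReceiverState {
    std::mutex mutex;
    int total = 0;

    void handle(const Message& message) {
        std::lock_guard<std::mutex> lock(mutex);
        if (message.name == "FOO_MESSAGE") {
            total += message.payload;
        } else if (message.name == "BAR_MESSAGE") {
            total -= message.payload;
        }
    }
};

// Usage sketch: "Process A" binds once, then calls B's handler directly.
int run_demo() {
    Operator op;
    ReceiverState state;
    op.publish("B", [&state](const Message& m) { state.handle(m); });

    Handler send = op.bind("B");
    assert(static_cast<bool>(send));  // a handler was bound

    send(Message{"FOO_MESSAGE", 5});
    send(Message{"BAR_MESSAGE", 2});
    return state.total;  // 5 - 2
}
```

The operator is consulted only once, at bind time; every later "send" is a plain function call in the sender's context, and the per-message cost moves from context switches to the lock inside the handler.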

So I would like to know what you think about the new design. Is it feasible? Is it really beneficial? Or is my description unclear? -- Justin545 (talk) 12:30, 25 May 2013 (UTC)


 * The question as to whether it's beneficial has to be informed by empirical evidence: first, evidence that there is a user-perceptible slowness; second, evidence (from a profiler) that the cause of this slowness is IPC; and third, solid evidence of what "slowness" means (whether it's latency or slow throughput). So to decide that you need a solid benchmark (without which you can't measure any progress you hope to make) and a real breakdown of the costs (to the user) of a given task. Deciding on the solution for a problem you haven't measured is solving a problem you don't yet understand. -- Finlay McWalterჷTalk 14:48, 25 May 2013 (UTC)


 * There has been plenty of empirical evidence reported:


 * "I generally estimate send->receive time of 30ms for a single dbus message going via the bus on capable ARM systems. That means the time for 10 IPC calls is a human-visible time."


 * "100ms / 4.7ms = 21 messages only. Moreover, this is the ideal case where the CPU has nothing else to do than sending messages ... In the reality, the CPU has lots'of things to do, so I could never send up to 21 messages."


 * This confirms that "there is a user-perceptible slowness" on ARM systems. And DBus is itself an IPC mechanism, so saying DBus is slow actually means the IPC is slow, which should explain "the cause of this slowness is IPC". -- Justin545 (talk) 16:40, 25 May 2013 (UTC)
 * Shared memory programming is already a thing that exists. D-Bus is not shared-memory programming.  If you see a performance problem when using some configuration on some specific platform, your first response shouldn't be to try to contort the platform into a different programming paradigm.  You should recognize that in your current implementation you chose the wrong paradigm for your implementation and/or its platform-specific performance needs.  Instead of contorting D-Bus into something that it's not, just consider what data your application software is sharing that could be better shared by a different mechanism.  Or, maybe you don't need to share so much data so often!  D-Bus is a general-purpose message passing system, and it can be coopted for any other purpose, but that doesn't mean it should be.  Nimur (talk) 17:50, 25 May 2013 (UTC)


 * Does "general purpose" necessarily mean inefficient...? I don't know how the DBus creators intended it to be used... maybe they had efficiency in mind, maybe they hadn't, or maybe they had no idea either and just left it undefined. Whatever it was, DBus has been widely adopted by many applications, and many of its users have complained about its bad performance. Perhaps, as you said, it's the users' fault for presuming DBus is an efficient IPC (if the DBus creators didn't say it should be efficient). But I think the real problem is how we get out of the current situation (how to improve the many current applications that have already adopted DBus). One solution is to change the applications one by one to use other platforms. But a faster solution would be changing DBus itself ... even if it isn't DBus's issue. If DBus can be improved, it will save many programmers from the nightmare of redesigning their applications. And my other question is: shouldn't an infrastructure be robust and efficient? -- Justin545 (talk) 19:23, 25 May 2013 (UTC)


 * You won't believe what I have found. I just found it at the start of "D-Bus Specification":
 * -- Justin545 (talk) 15:06, 26 May 2013 (UTC)


 * That mailing list linked to some tests that showed, for instance, a 220 MHz ARM doing 240 a second, so for 10 calls a second you'd need roughly a 10 MHz ARM. Early ARMs cached using virtual addresses rather than real addresses, so there would also be an overhead for that, as well as them being less efficient. It still is a lot of time, but is it a problem that is affecting you? Dmcq (talk) 00:19, 26 May 2013 (UTC)


 * I programmed for an e-book device with a low-end CPU something like that. I don't think such DBus performance is acceptable. Even the iPhone 3G's CPU (620 MHz) is underclocked to 412 MHz, not so much faster than 220 MHz. But frankly, I can't believe how smooth the iPhone 3G's GUI is, considering that an Android phone typically needs faster, more expensive hardware to achieve the same smoothness. Besides, inefficient software usually means its power usage (crucial on mobile devices) is also inefficient, because it needs to execute more instructions to accomplish a job. -- Justin545 (talk) 06:01, 26 May 2013 (UTC)
 * I'm no expert on those systems but I don't think they use D-Bus for graphics; I thought it was more for high-level structured communication rather than anything optimised. My guess is that the iPhone's better graphics are due to better graphics hardware. I've just got myself a really cheap Nook Simple Touch to root and play with, and I'm really surprised by how fast it goes and how good the graphics are. Its processor has a top speed of 800 MHz, and I see the original Kindle in 2007 topped out at 400 MHz. Dmcq (talk) 11:24, 26 May 2013 (UTC)


 * ...I didn't say Android uses DBus for graphics ... what I was trying to say is that software/firmware/driver design is crucial to the efficiency and response time of the system. Good software design should be able to compensate for insufficient hardware (e.g. the iPhone 3G's worse-than-ordinary 412 MHz CPU) or even make it better. -- Justin545 (talk) 14:46, 26 May 2013 (UTC)
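
The figures quoted in this thread (about 30 ms per message, 4.7 ms per message against a 100 ms budget, and 240 calls a second on a 220 MHz ARM) can be checked with a few lines of C++. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and `required_mhz` assumes that call throughput scales linearly with clock speed, which is a simplification rather than a measured fact.

```cpp
#include <cassert>
#include <cmath>

// Whole messages that fit into a time budget when each one costs
// `latency_ms` and nothing else competes for the CPU.
int messages_within(double budget_ms, double latency_ms) {
    assert(latency_ms > 0.0);
    return static_cast<int>(std::floor(budget_ms / latency_ms));
}

// Total time for `calls` IPC calls issued one after another.
double total_latency_ms(int calls, double latency_ms) {
    return calls * latency_ms;
}

// Clock rate needed for a target call rate, ASSUMING throughput scales
// linearly with clock speed (a simplification, not a measurement).
double required_mhz(double measured_mhz,
                    double measured_calls_per_s,
                    double target_calls_per_s) {
    assert(measured_calls_per_s > 0.0);
    return measured_mhz * target_calls_per_s / measured_calls_per_s;
}
```

With the thread's numbers, `messages_within(100.0, 4.7)` gives 21, `total_latency_ms(10, 30.0)` gives 300 ms (the "human-visible time" for 10 calls), and `required_mhz(220.0, 240.0, 10.0)` comes to a little over 9 MHz, which matches the "10 MHz ARM" estimate above.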

Alt + Tab doesn't work when some upstart of a program thinks it should have focus
I have a game called Nuclear Dawn which I play sometimes. I like to alt + tab out while a map is loading. This usually works just fine. Other times I am able to select the window I wish to go to using alt + tab but when I release the alt button, Nuclear Dawn takes the focus back. Why is this happening? How to override? --89.241.227.119 (talk) 19:41, 25 May 2013 (UTC)


 * Which operating system are you on? --Yellow1996 (talk) 20:56, 25 May 2013 (UTC)


 * Alternatively, tab-related settings might be in the in-game options menu; check there. --Yellow1996 (talk) 17:29, 26 May 2013 (UTC)