User:ScotXW/kdbus

kdbus is a character device inside the Linux kernel for inter-process communication (IPC). kdbus was designed to mostly re-implement D-Bus in kernel space.

The hurdles

 * 2011-11-09 "Fast interprocess communication revisited" by Neil Brown
 * 2012-07-03: "Missing the AF_BUS" by Jonathan Corbet. AF_BUS was another project to move more IPC functionality into the Linux kernel; it was rejected for mainline inclusion
 * 2013-02-11: "D-Bus soll in den Linux-Kernel" ("D-Bus is to go into the Linux kernel"), heise.de
 * 2013-05-30: "Automotive Linux Summit: Linux interprocess communication and kdbus" by Jake Edge
 * 2014-01-20: "Die Kernel-D-Bus-Implementation Kdbus" ("The kernel D-Bus implementation kdbus"), heise.de
 * 2015-04-15: "Obstacles for kdbus" by Jonathan Corbet
 * http://www.kroah.com/log/linux/af_bus.html
 * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel
 * Re: AF_BUS socket address family

To quote: https://www.youtube.com/watch?v=t0jgZKV4N_A "Well, airplane carriers carry air planes…"
 * It seems that some kernel maintainers would rather have some generic in-kernel IPC ("kGenericIPC") than only kdbus.
 * AF_BUS is said to have been "killed" ultimately by David Miller, the maintainer of the kernel's networking subsystem (/net), because he disagreed with …
 * Android's Binder:
 * generic: http://elinux.org/Android_Binder
 * technical http://www.cubrid.org/blog/dev-platform/binder-communication-mechanism-of-android-processes/
 * was accepted into staging and is now part of Linux kernel mainline! Greg said, "Binder was about memory and kdbus is about CPU"
 * While Wikipedia has articles on Android and even on libbionic, very little has been done there to make the technical plumbing of Android clear. Binder is one piece of it; ashmem, pmem, wakelocks, logger, etc. are other Android-specific Linux kernel subsystems/frameworks.
 * iOS initial release June 29, 2007
 * Android initial release September 23, 2008
 * Binder can do marshalling.

What is D-Bus again?
D-Bus or DBus is a specification for inter-process communication (IPC) (and remote procedure call (RPC)) in user-space.
 * D-Bus is a specification; multiple implementations exist: libdbus (the reference implementation by freedesktop.org), GDBus, QtDBus, sd-bus, dbus-java, and maybe more
 * just as a file is a file-system concept, a process is a kernel concept. Process = instance of a running program. Only the kernel can pass messages between processes! => D-Bus works on top of the existing IPC primitives in the (Linux) kernel.
 * D-Bus cannot simply avoid context switches. But I guess that a programmer using D-Bus can maybe write his stuff in a way that avoids some context switches compared to using kernel IPC directly. A process cannot even access a file on the hard disk without a context switch: to be precise, any process needs to ask the kernel to open a file by issuing a system call = context switch.
 * D-Bus replaced GNOME's CORBA and KDE's DCOP
 * I (User:ScotXW) am guessing here: The idea behind D-Bus has been to provide some easy-to-use way for IPC as compared to the IPC primitives in the kernel.
 * Also, D-Bus works as an abstraction layer, meaning it works on top of Linux, BSD, Windows, OS X, etc!


 * D-Bus has been developed completely in user-space; it is hard to say whether, during its inception or during its development, the developers meant for D-Bus to one day have an implementation inside some kernel
 * an attempt to integrate D-Bus into the Linux kernel was kdbus; it failed to be accepted into Linux kernel mainline

D-Bus is an IPC mechanism that allows communication between multiple computer programs (that is, processes) concurrently running on the same machine. D-Bus was developed as part of the freedesktop.org project, initiated by Havoc Pennington from Red Hat to standardize services provided by Linux desktop environments such as GNOME and KDE.

The freedesktop.org project also developed a free and open-source software library called libdbus as a reference implementation of the specification. This library is often confused with D-Bus itself. Other implementations of the D-Bus client library also exist, such as GDBus (GNOME), QtDBus (Qt/KDE), dbus-java and sd-bus (part of systemd).

What about the existing IPC primitives in the Linux kernel?
Sadly, File:Oversimplified Structure of the Linux kernel.svg doesn't explain how IPC is implemented inside the Linux kernel! Looking into [[The Linux Programming Interface]] by Michael Kerrisk, there is a lot of information about the IPC facilities, e.g.: »A running Linux system consists of numerous processes, many of which operate independently of each other.
 * Chapter 2.10 "Interprocess Communication and Synchronization":

Some processes, however, cooperate to achieve their intended purposes, and these processes need methods of communicating with one another and synchronizing their actions.

One way for processes to communicate is by reading and writing information in disk files. However, for many applications, this is too slow and inflexible.

Therefore, Linux, like all modern UNIX implementations, provides a rich set of mechanisms for interprocess communication (IPC), including the following:
 * signals, which are used to indicate that an event has occurred;
 * pipes (familiar to shell users as the | operator) and FIFOs, which can be used to transfer data between processes;
 * sockets, which can be used to transfer data from one process to another, either on the same host computer or on different hosts connected by a network;
 * file locking, which allows a process to lock regions of a file in order to prevent other processes from reading or updating the file contents;
 * message queues, which are used to exchange messages (packets of data) between processes;
 * semaphores, which are used to synchronize the actions of processes;
 * shared memory, which allows two or more processes to share a piece of memory.

When one process changes the contents of the shared memory, all of the other processes can immediately see the changes. The wide variety of IPC mechanisms on UNIX systems, ''with sometimes overlapping functionality'', is in part due to their evolution under different variants of the UNIX system and the requirements of various standards. For example, FIFOs and UNIX domain sockets essentially perform the same function of allowing unrelated processes on the same system to exchange data. Both exist in modern UNIX systems because FIFOs came from System V, while sockets came from BSD.«

The procfs, when mounted, exposes much of that process state…

Processes that need to communicate with one another are e.g. dconf, a back-end to GSettings, and everything that needs to access settings data. The settings data is of course stored in some file (plain text or some database file format), much as the Windows Registry stores its data in files. But there is a dconf daemon, and the communication with this dconf daemon takes place over IPC as defined by the D-Bus specification. (Processes that run in the background have been called "daemons" for aeons, but they could alternatively also be called "services"; Microsoft has done this, and it seems Red Hat and freedesktop.org also call daemons "services" now. There are system daemons/services and session daemons/services.)

I guess the rationale behind the dconf-daemon is the same as behind the OpenWrt's netifd: track the configuration, in case it changes, make sure that these changes take effect immediately without the user needing to do anything more.

Why does the communication with dconf-daemon take place over D-Bus? I guess, D-Bus offers far more convenient ways for applications developers. GVfs contains a collection of daemons which communicate with each other and the GIO module over D-Bus.

When an end-user application such as Inkscape or GIMP retrieves settings data from the dconf daemon, it (probably) does so via D-Bus.

D-Bus in the kernel
kdbus is the low-level, native kernel D-Bus transport and is actively being developed now. Tizen developers are trying to replace socket-based D-Bus with kdbus-based D-Bus in Tizen 3.0 and its products, facing many challenges such as compatibility, security, performance, and others. In this presentation, we will share our experience, the current state, the plans, and the benchmark results. We look forward to replacing socket-based D-Bus with kdbus-based D-Bus for Linux devices, providing better performance transparently.
 * https://events.static.linuxfound.org/sites/events/files/slides/linuxconjapan2014.pdf

Most more modern OS designs than Unix started out with a high-level IPC from the beginning, and then built the rest of the OS on top of it. Linux/Unix began with only the most basic low-level IPC primitives in place (Pipes and stream sockets). Building on those over time various higher-level IPC systems were built, but only very few stood the test of time or became universal. On current Linux systems the best established high-level IPC layer is D-Bus. It implements a reliable message passing scheme, with access control, multicasting, filtering, introspection and supports a flexible object model.

D-Bus is a powerful design. However, being mostly a userspace solution, its latency and throughput are not ideal. A full transaction consisting of method call and method reply requires 10 (!) copy operations for the messages passed. It is only useful for transfer of control messages, and not capable of streaming larger amounts of data.

In this talk I'd like to discuss the "kdbus" IPC system, a kernel implementation of the D-Bus logic and its userspace side. "kdbus" takes the concepts of classic D-Bus but makes them more universally useful, reducing latency and roundtrips, and increasing bandwidth, even covering streaming usecases. (For comparison, with kdbus, a full transaction takes only 2 copy operations, even supporting zero-copy for large messages.) I'll discuss the lessons we learnt from Android's binder, Solaris' doors IPC system, and Mach's port scheme. We'll discuss how we implemented a reliable (though probabilistic) multicasting scheme, and the tricks we used in userspace to make transparent zero-copy work, without compromising on security, and providing compatibility with classic D-Bus.

kdbus will soon show up in your favourite distribution as part of the systemd package, please attend if you want to know more about the ideas behind it.

kdbus also came with memfd, a new system call. memfd is a mechanism similar to Android's ashmem that allows zero-copy message passing. memfd effectively comes down to just a chunk of memory with an FD (file descriptor) attached that can be passed to mmap. The memfd_create function returns a raw shmem file, and there is optional support for sealing. memfd was mainlined into Linux kernel 3.17.

https://dvdhrm.wordpress.com/tag/memfd/

Method call transactions, signals, properties, OO, broadcasting, discovery, introspection, policy, activation, synchronization, type-safe marshalling, security, monitoring, exposes APIs/not streams, passing of credentials, file-descriptor passing, language agnostic, network transparency, no trust required, high-level error concept, …
 * D-Bus is powerful IPC:

Suitable only for control, not payload; inefficient (10 copies, 4 complete validations, 4 context switches per duplex method-call transaction); credentials one can send/receive are limited; no implicit timestamping; not available in early boot, initrd, late boot; hookup with security frameworks happens in userspace; activatable bus services are independent from other system services; the codebase is a bit too baroque (XML, …); no race-free exit-on-idle for bus-activated services.
 * D-Bus has limitations

Right approach: good concepts, generic, comprehensive, covers all areas. Established: it's the single most used local, high-level IPC system on Linux, with bindings for most languages. Used in the init system (regardless of whether systemd or Upstart), the desktops, embedded, …
 * D-Bus is fantastic, solves real problems

Suitable for large data (GiB!), zero-copy, optionally reusable; efficient (2 or fewer copies, 2 validations, 2 context switches per duplex method-call transaction); credentials sent along are comprehensive (uid, pid, gid, SELinux label, pid starttime, tid, comm, tid comm, argv, exe, cgroup, caps, audit, …); implicit timestamping; always available, from earliest boot to latest shutdown; open for LSMs to hook into from the kernel side; activation is identical to activation of other services; userspace is much simpler, no XML; priority queues, …; race-free exit-on-idle for bus-activated services.
 * kdbus

memfd
Probably should start by summarizing the advantages of memfd over existing interfaces & mechanisms
 * For kdbus, this is the zero-copy message passing (even better when you need to replay a message)
 * For those just wanting to use shared memory in userspace, the underlying file can never be truncated if you seal it against shrinking, so your code never has to muck with exception handling -- this is a pretty big thing for my project.


 * http://lwn.net/Articles/580249/
 * memfd is needed by kdbus for message passing
 * https://dvdhrm.wordpress.com/2014/06/10/memfd_create2/
 * http://www.heise.de/open/artikel/Kdbus-Neue-Interprozesskommunikation-fuer-den-Linux-Kernel-2089531.html
 * the new memfd_create() syscall was merged into Linux kernel 3.17
 * memfd is a requirement of the forthcoming kdbus
 * replace memfd ioctls with shmem memfd_create + fcntl logic
 * memfd as a mechanism for passing single-use messages is only more efficient than copying for messages >= 512kb, due to the mmap overhead (at least according to the LWN article)
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=9183df25fe7b194563db3fec6dc3202a5855839c

File sealing

 * https://dvdhrm.wordpress.com/2014/06/10/memfd_create2/

Other

 * systemd 202 starts supporting kdbus

kdbus is a solution for processes to talk to each other that is
 * fast (zero-copy if at all possible)
 * secure
 * behaves mostly like D-Bus – including its introspection and data-marshalling features
 * doesn't require a daemon process which manages all of the above
 * meant to have enough features to eventually supersede Android's binder and ashmem (and pmem) drivers

Please either tell us how to achieve that with a multicast AF_UNIX and some (OK … a lot) of libdbus_mcast-ish scaffolding around it, or kindly shut up.

Differences between kdbus and Binder
...

link dump

 * kdbus is not replacing but rather re-implementing the D-Bus mechanism within the Linux kernel.
 * The new kdbus stuff is basically a stand-alone device driver which creates a char device, and does not touch any other code. It's very isolated from the rest of the kernel and not intrusive to any other subsystems. That's quite different from the previous attempts.
 * https://github.com/gregkh/kdbus/blob/master/kdbus.txt
 * SIMPL, KBUS, AF_BUS, Bloom filters
 * memfd
 * https://github.com/gregkh/kdbus/blob/master/memfd.c
 * http://cgit.freedesktop.org/systemd/systemd/commit/?id=ddeb424198649c3993a54efc81652325b6e3bfa5 bus: add new API for kdbus memfd functionality 2013-05-10
 * https://lwn.net/Articles/580249/ civilized discussion on LWN.net
 * https://lwn.net/Articles/593918/ Sealed files 2014-04-09
 * http://thread.gmane.org/gmane.comp.video.dri.devel/102241 File Sealing & memfd_create 2014-03-19
 * http://lwn.net/Articles/594919/ File Sealing & memfd_create PATCH v2 0/3 2014-04-15
 * http://news.kde.org/2014/04/21/freedesktop-summit-2014-report 2014-04-21


 * Which is why we have HTTP, and no other way for applications to manipulate TCP/IP sockets, because that would be reinventing the wheel.
 * Once upon a time Linus would frown on these sorts of single-use interfaces. He tended to prefer focusing on making other primitives more performant so that you could compose more complex interfaces in userspace. And for the kernel that's an excellent rule of thumb.
 * Remember the HTTP server wars? Microsoft had a blazing-fast HTTP server in-kernel. So somebody coded one up in the Linux kernel, to one-up Microsoft. But then somebody with infinitely more sense tweaked some kernel APIs and wrote an HTTP server in user-space that bested both in-kernel servers.

AF_BUS

 * http://www.kroah.com/log/linux/af_bus.html
 * IVI (in-vehicle infotainment) people wanted a faster D-Bus (IVI ≠ ECU)
 * Collabora created, GENIVI sponsored
 * Kernel network protocol, removes 2 system calls per message
 * Faster than D-Bus daemon
 * Rejected by upstream kernel developers: marshalling