User:Ldsmithperrin/Unified Software Technologies

Unified Software Technologies is an independent software vendor (ISV) established in Orlando, Florida, in 2010 that specializes in the development and commercialization of lock-free technologies, including software libraries, applications, and systems.

Unified Software Technologies' stated mission is to help the information technology sector realize the benefits of lock-free data structures and algorithms in today's mobile devices, desktops, servers, cloud-based systems, and high-performance computing environments. The company is preparing to launch a suite of new software products with the stated goal of revolutionizing the performance and stability of multi-threaded, parallel, and distributed processes and systems using recent and patent-pending advancements in lock-free algorithms.

According to Ken Pierce, VP of Business Development at Unified Software Technologies, "Unified Software Technologies has developed a number of algorithms designed to unlock the processing potential of modern computer systems. These algorithms represent a valuable tool for optimizing and simplifying information technology projects, particularly those concerned with maximizing performance and efficiency, such as mobile, server and service-oriented platforms."

Background
One of the goals of lock-free algorithms is to increase processing capacity and performance when there is contention for shared resources. Resource contention will become more pronounced as processor vendors reach the upper limits of CPU clock rates and begin to add more processing units ("cores") to their CPU packages. To maintain optimal processing throughput, one of several approaches may be applied to reduce resource contention, including the adoption of lock-free algorithms.

Processing Capacity and Moore's Law
Commodity personal computer hardware has facilitated tremendous growth in service-oriented and networked computing. With the advent of networked applications and the rapid growth of the mobile marketplace, the demand for efficient, feature-rich server platforms will continue to grow.

Server platforms are at the forefront of innovation in computing efficiency, both in general and in specific areas such as reducing electrical power and cooling requirements. Demands for user-friendly and more feature-rich service platforms are tempered by business pressures such as the move to greener, more environmentally optimized technologies and practices.

Another trend has emerged concerning Moore's law and the growth of computing power. The focus of microprocessor performance growth has shifted from increasing processor frequency, or clock speed, to multi-processing, particularly the integration of multiple processor cores into a single package. Although this shift has delivered additional performance gains, it has done so at the cost of a commensurate increase in software complexity.

Shift to Multi-Processing Computing
The shift to multi-processing computing means that several aspects of computer and processor architecture now constrain methods of software optimization. Software concurrency is complicated, for instance, by innovations such as instruction reordering and multi-level caching. As demands for software performance increase, many such details of computer hardware architecture must be factored into high-performance software design and implementation. Traditional, lock-based methods of implementing software concurrency represent, for many systems, an expensive form of computational inefficiency. Lock-free algorithms provide a current and accessible solution for software vendors.

Proposed Benefits of Lock-Free Algorithms
When describing the benefits of lock-free data structures in their 1991 paper, A Lock-Free Multiprocessor OS Kernel, authors Henry Massalin and Calton Pu wrote that "Lock-Free synchronization avoids many serious problems caused by locks: considerable overhead, concurrency bottlenecks, deadlocks, and priority inversion in real-time scheduling."

Academic Research
Since the early 1990s, researchers have been studying mechanisms for improving performance in both operating system (kernel) design and individual data structures. Incremental progress has been made as the technique has been demonstrated. At the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, Maged M. Michael presented the IBM-sponsored paper Scalable Lock-Free Dynamic Memory Allocation, stating that "A lock-free memory allocator guarantees progress regardless of whether some threads are delayed or even killed and regardless of scheduling policies."

In addition to operating system and kernel applications, lock-free programming techniques have been applied to primitive data structures used by software applications. The performance benefits demonstrated by the academic community through the use of lock-free primitive data structures such as stacks, queues, and linked lists have provided a compelling case for adoption of the technique in commercially available products.
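As an illustration of the kind of primitive data structure mentioned above, the following is a minimal sketch of a lock-free stack in C++17 built on a compare-and-swap retry loop. It is not taken from any product or paper named in this article; the class name LockFreeStack and its members are hypothetical. To keep the sketch safe without a full memory-reclamation scheme, popped nodes are assumed to be parked on a retired list and freed only when the stack is destroyed.

```cpp
#include <atomic>
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical sketch of a lock-free stack (not any vendor's implementation).
template <typename T>
class LockFreeStack {
    struct Node {
        T value;
        Node* next = nullptr;          // link within the live stack
        Node* retired_next = nullptr;  // link within the retired list
    };
    std::atomic<Node*> head_{nullptr};
    std::atomic<Node*> retired_{nullptr};

public:
    LockFreeStack() = default;
    LockFreeStack(const LockFreeStack&) = delete;
    LockFreeStack& operator=(const LockFreeStack&) = delete;

    // Assumption: destruction happens after all other threads are done.
    ~LockFreeStack() {
        for (Node* n = head_.load(); n != nullptr;) {
            Node* next = n->next;
            delete n;
            n = next;
        }
        for (Node* n = retired_.load(); n != nullptr;) {
            Node* next = n->retired_next;
            delete n;
            n = next;
        }
    }

    void push(T value) {
        Node* node = new Node{std::move(value)};
        node->next = head_.load(std::memory_order_relaxed);
        // On failure, node->next is refreshed with the current head; retry.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    std::optional<T> pop() {
        Node* node = head_.load(std::memory_order_acquire);
        // On failure, node is refreshed with the current head; retry.
        while (node != nullptr &&
               !head_.compare_exchange_weak(node, node->next,
                                            std::memory_order_acquire,
                                            std::memory_order_acquire)) {
        }
        if (node == nullptr) return std::nullopt;
        T result = std::move(node->value);
        // Park the node instead of deleting it, so a delayed thread that
        // still holds a pointer to it never reads freed memory.
        node->retired_next = retired_.load(std::memory_order_relaxed);
        while (!retired_.compare_exchange_weak(node->retired_next, node,
                                               std::memory_order_release,
                                               std::memory_order_relaxed)) {
        }
        return result;
    }
};
```

Because nodes are never reused while the stack is alive, this sketch trades memory growth for simplicity; production designs typically replace the retired list with a reclamation scheme such as hazard pointers or epochs.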

Corporate Research
IBM is pursuing development of transactional memory in order to simplify parallel programming and provide lock-free benefits through hardware support. Intel has announced Transactional Synchronization Extensions (TSX) support, scheduled for first release in the Haswell processor.

The actual performance benefits of transactional memory are unknown and may vary greatly depending upon the nature of the running process. If contention for shared resources is high, hardware aborts of executing threads may degrade performance to below that of optimized software lock-free implementations, which obviate transaction acquire and release semantics.

Unified Software Technologies has researched a set of algorithms that are optimized for current microprocessors, are intended to provide true lock-free performance benefits, and offer a layer of abstraction for developers wishing to take advantage of new hardware features as they become available.

Problem Complexity
Algorithms that safely and accurately implement lock-free techniques are difficult to design and implement. According to Maged M. Michael and Michael L. Scott, "Most multiprocessors are multiprogrammed to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for a concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) nonblocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Nonblocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional and are widely regarded as inefficient."
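To make the compare-and-swap primitive mentioned in the quotation concrete, here is a minimal C++ sketch of the usual retry loop built on std::atomic. The helper name store_max is hypothetical and chosen only for illustration: it raises a shared integer to at least a candidate value without taking a lock.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically raises `target` to at least `candidate`.
// Returns true if this call changed the stored value.
bool store_max(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    while (observed < candidate) {
        // The swap succeeds only if `target` still equals `observed`.
        if (target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
            return true;
        }
        // On failure `observed` now holds the latest value, so the loop
        // condition is re-evaluated against what another thread wrote.
    }
    return false;
}
```

No thread ever blocks here: a failed swap means some other thread made progress, which is the property that distinguishes this style from lock-based updates.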

Academic Research
In addition to performance concerns over lock-free algorithms based upon atomic compare-and-swap instructions, such implementations may not actually be thread-safe. One category of problem that may occur is the ABA problem, which can arise when "multiple threads (or processes) accessing shared memory interleave".
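The ABA problem can be replayed deterministically on a single thread, as in the C++ sketch below. A plain compare-and-swap cannot tell that a value went from A to B and back to A, whereas pairing the value with a version counter, one common mitigation, exposes the intervening updates. All names here (VersionedCell, Snapshot, and the two demonstration functions) are hypothetical, and the sketch assumes a platform where 64-bit atomics are lock-free.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kA = 1, kB = 2, kC = 3;  // illustrative values

// Hypothetical versioned cell: the low 32 bits hold the value and the high
// 32 bits hold a counter that is bumped on every successful update.
class VersionedCell {
    std::atomic<std::uint64_t> state_;

    static std::uint64_t pack(std::uint32_t value, std::uint32_t version) {
        return (static_cast<std::uint64_t>(version) << 32) | value;
    }

public:
    struct Snapshot {
        std::uint32_t value;
        std::uint32_t version;
    };

    explicit VersionedCell(std::uint32_t initial) : state_(pack(initial, 0)) {}

    Snapshot load() const {
        std::uint64_t s = state_.load(std::memory_order_acquire);
        return {static_cast<std::uint32_t>(s),
                static_cast<std::uint32_t>(s >> 32)};
    }

    // Succeeds only if both the value and the version still match `expected`.
    bool compare_and_set(Snapshot expected, std::uint32_t desired) {
        std::uint64_t old_state = pack(expected.value, expected.version);
        std::uint64_t new_state = pack(desired, expected.version + 1);
        return state_.compare_exchange_strong(old_state, new_state,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire);
    }
};

// Replays an A -> B -> A interleaving; returns true if plain CAS is fooled.
bool plain_cas_misses_aba() {
    std::atomic<std::uint32_t> cell{kA};
    std::uint32_t seen = cell.load();  // "thread 1" reads A, then is delayed
    cell.store(kB);                    // "thread 2" changes A to B...
    cell.store(kA);                    // ...and back to A
    return cell.compare_exchange_strong(seen, kC);  // still succeeds
}

// Same interleaving; returns true if the stale snapshot is rejected.
bool versioned_cas_detects_aba() {
    VersionedCell cell{kA};
    VersionedCell::Snapshot seen = cell.load();
    cell.compare_and_set(cell.load(), kB);
    cell.compare_and_set(cell.load(), kA);
    return !cell.compare_and_set(seen, kC);  // version moved from 0 to 2
}
```

With pointers instead of integers, the undetected case is what allows a lock-free structure to link in a node that has since been freed or reused.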

Corporate Research
To continue improving performance and to support run-time performance optimizations, Intel has provided recommendations for optimal use of its shared L2 cache architecture. Applying optimization techniques that manage shared caches is very important to the optimal design of lock-free algorithms in terms of both performance and stability.
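One widely used cache-aware technique, shown here as a hedged sketch and not drawn from Intel's recommendations, is to pad frequently updated atomics so that each occupies its own cache line and threads do not invalidate one another's lines ("false sharing"). The 64-byte line size and the names PaddedCounter and Counters are assumptions made for illustration; the actual line size depends on the processor.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines, common on current x86 processors.
constexpr std::size_t kAssumedCacheLine = 64;

// Each counter is aligned and padded to a full (assumed) cache line.
struct alignas(kAssumedCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kAssumedCacheLine,
              "counter should fill exactly one assumed cache line");
static_assert(alignof(PaddedCounter) == kAssumedCacheLine,
              "counter should start on an assumed cache-line boundary");

// Hypothetical set of per-thread counters; slot i is written by thread i.
struct Counters {
    PaddedCounter per_thread[4];

    long total() const {
        long sum = 0;
        for (const PaddedCounter& c : per_thread) {
            sum += c.value.load(std::memory_order_relaxed);
        }
        return sum;
    }
};
```

The cost is memory: four counters occupy 256 bytes instead of 32, which is usually acceptable for a small number of heavily contended fields.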

Commercially Available Solutions
Unified Software Technologies has announced a set of libraries and a complete lock-free Web and application server, scheduled for release in the first quarter of 2012, with the stated goal of providing an optimal lock-free implementation for today's mobile, desktop, cloud, and server environments.

Parallel Scalable Solutions offers the NOBLE set of libraries, developed jointly by Hakan Sundell and Philippas Tsigas.