Crowdsourcing software development

Crowdsourcing software development, or software crowdsourcing, is an emerging area of software engineering. It is an open call for participation in any task of software development, including documentation, design, coding and testing. These tasks are normally carried out either by members of a software enterprise or by people contracted by the enterprise. In software crowdsourcing, however, any of these tasks can be assigned to or taken up by members of the general public. Individuals and teams may also participate in crowdsourcing contests.

Goals
Software crowdsourcing may have multiple goals.

Quality software: Crowdsourcing organizers need to define specific software quality goals and their evaluation criteria. Quality software often comes from competent contestants who can submit good solutions for rigorous evaluation.

Rapid acquisition: Instead of waiting for software to be developed, crowdsourcing organizers may post a competition hoping that something identical or similar has been developed already. This is to reduce software acquisition time.

Talent identification: A crowdsourcing organizer may be mainly interested in identifying talent, as demonstrated by participants' performance in the competition.

Cost reduction: A crowdsourcing organizer may acquire software at low cost by paying only a small fraction of the development cost, since the prize may consist partly of recognition awards.

Solution diversity: As teams will turn in different solutions for the same problem, the diversity in these solutions will be useful for fault-tolerant computing.

Ideas creation: One goal is to get new ideas from contestants and these ideas may lead to new directions.

Broadening participation: One goal is to recruit as many participants as possible, either to get the best solution or to spread relevant knowledge.

Participant education: Organizers are interested in teaching participants new knowledge. One example is nonamesite.com, sponsored by DARPA to teach STEM (science, technology, engineering, and mathematics).

Fund leveraging: The goal is to stimulate other organizations to sponsor similar projects to leverage funds.

Marketing: Crowdsourcing projects can be used for brand recognition among participants.
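
The solution diversity goal above can be illustrated with majority voting over independently written solutions, a technique commonly associated with N-version programming. The following is a minimal sketch, not a description of any particular platform: the function `majority_vote` and the three "team" functions are hypothetical names introduced only for this example.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def majority_vote(
    solutions: Sequence[Callable[[int], Hashable]], x: int
) -> Optional[Hashable]:
    """Run every independently written solution on x and return the output
    produced by a strict majority of them, or None if there is no majority.

    A solution that raises an exception simply casts no vote.
    """
    outputs = []
    for solve in solutions:
        try:
            outputs.append(solve(x))
        except Exception:
            continue  # a crashing version is treated as a failed vote
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(solutions) / 2 else None


# Three hypothetical contest submissions for "square a number"; one is faulty.
def team_a(n: int) -> int:
    return n * n


def team_b(n: int) -> int:
    return n ** 2


def team_c(n: int) -> int:
    return n + n  # bug: doubles instead of squaring


result = majority_vote([team_a, team_b, team_c], 3)
print(result)
```

With three diverse submissions, the single faulty one is outvoted; with only two disagreeing submissions no majority exists and the sketch returns `None`.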

Architecture support
A crowdsourcing support system needs to include: 1) Software development tools: requirements tools, design tools, coding tools, compilers, debuggers, IDEs, performance analysis tools, testing tools, and maintenance tools. 2) Project management tools: ranking, reputation, and award systems for products and participants. 3) Social network tools, which allow participants to communicate with and support each other. 4) Collaboration tools, for example a blackboard platform where participants can see a common area and suggest ideas to improve the solutions presented there.

Social networks
Social networks can provide communication, documentation, blogs, tweets, wikis, comments, feedback, and indexing.

Processes
Any phase of software development can be crowdsourced: requirements (functional, user interface, performance), design (algorithm, architecture), coding (modules and components), testing (including security testing, user interface testing, and user experience testing), maintenance, user experience, or any combination of these.

Existing software development processes can be modified to include crowdsourcing: 1) Waterfall model; 2) Agile processes; 3) Model-driven approach; 4) Open-source approach; 5) Software-as-a-Service (SaaS) approach, where service components can be published, discovered, composed, customized, simulated, and tested; 6) Formal methods.

Crowdsourcing can be competitive or non-competitive. In competitive crowdsourcing, only selected participants win, and in highly competitive projects many contestants compete but few win. In non-competitive crowdsourcing, either single individuals participate or multiple individuals collaborate to create software. Products can be cross-evaluated to ensure their consistency and quality and to identify talent, and the cross evaluation can itself be crowdsourced.

Items developed by crowdsourcing can be evaluated by crowdsourcing to determine the quality of the work produced, and the evaluation of that evaluation can in turn be crowdsourced to determine the quality of the evaluation.
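
One simple way to picture evaluating both the work and the evaluators is to take a consensus score per submission and then measure how far each reviewer strays from it. The sketch below assumes a 0 to 10 scoring scale, uses the median as the consensus, and uses mean absolute deviation as the reviewer-quality measure; these choices, the made-up scores, and the function names are all assumptions for illustration rather than an established crowdsourcing method.

```python
from statistics import mean, median
from typing import Dict

# scores[reviewer][submission] = score on a 0-10 scale (made-up data)
scores: Dict[str, Dict[str, float]] = {
    "rev1": {"s1": 8, "s2": 4, "s3": 6},
    "rev2": {"s1": 7, "s2": 5, "s3": 6},
    "rev3": {"s1": 2, "s2": 9, "s3": 1},
}


def consensus_scores(scores: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """First level: evaluate each submission by the median of its reviews."""
    submissions = {s for by_sub in scores.values() for s in by_sub}
    return {
        s: median(r[s] for r in scores.values() if s in r)
        for s in sorted(submissions)
    }


def reviewer_deviation(
    scores: Dict[str, Dict[str, float]], consensus: Dict[str, float]
) -> Dict[str, float]:
    """Second level: evaluate each reviewer by the mean absolute distance
    between their scores and the consensus (lower means closer agreement)."""
    return {
        rev: mean(abs(v - consensus[s]) for s, v in by_sub.items())
        for rev, by_sub in scores.items()
    }


consensus = consensus_scores(scores)
deviation = reviewer_deviation(scores, consensus)
least_consistent = max(deviation, key=deviation.get)
print(consensus, deviation, least_consistent)
```

In this toy data the third reviewer disagrees strongly with the other two and is flagged as the least consistent, which a platform could use to down-weight or re-check that reviewer's evaluations.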

Notable crowdsourcing processes include AppStori and Topcoder processes.

Pre-selection of participants is important for quality software crowdsourcing. In competitive crowdsourcing, a low-ranked participant should not compete against a high-ranked participant.
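
Pre-selection by rank can be sketched as grouping participants into rating brackets so that each contest pool contains competitors of similar standing. The bracket width of 400 points, the participant names, and the ratings below are hypothetical values chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_bracket(
    ratings: Dict[str, int], bracket_width: int = 400
) -> Dict[int, List[str]]:
    """Group participants into rating brackets.

    Bracket k holds ratings in [k * bracket_width, (k + 1) * bracket_width).
    Names inside a bracket are kept in alphabetical order.
    """
    brackets: Dict[int, List[str]] = defaultdict(list)
    for name, rating in sorted(ratings.items()):
        brackets[rating // bracket_width].append(name)
    return dict(brackets)


ratings = {"ana": 2350, "bo": 1210, "cy": 2390, "dee": 1450, "eli": 800}
pools = group_by_bracket(ratings)
print(pools)
```

Each resulting pool can then be invited to its own contest, so that a low-ranked participant is not matched against a high-ranked one.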

Platforms
Software crowdsourcing platforms including Apple Inc.'s App Store, Topcoder, and uTest demonstrate the advantages of crowdsourcing in terms of software ecosystem expansion and product quality improvement. Apple's App Store is an online iOS application market where developers can deliver their creative designs and products directly to smartphone customers. These developers are motivated to contribute innovative designs, for both reputation and payment, by the micro-payment mechanism of the App Store. In less than four years, Apple's App Store became a huge mobile application ecosystem with 150,000 active publishers and over 700,000 iOS applications. Around the App Store there are many community-based, collaborative platforms that act as incubators for smartphone applications. For example, AppStori introduces a crowdfunding approach to build an online community for developing promising ideas about new iPhone applications. IdeaScale is another platform for software crowdsourcing.

Another crowdsourcing example, Topcoder, creates a software contest model in which programming tasks are posted as contests and the developer of the best solution wins the top prize. Following this model, Topcoder has established an online platform to support its ecosystem and has gathered a virtual global workforce with more than 1 million registered members and nearly 50,000 active participants. These Topcoder members compete against each other in software development tasks such as requirement analysis, algorithm design, coding, and testing.

Sample processes
The Topcoder Software Development Process consists of a number of different phases, and within each phase there can be different competition types:


 * 1) Architecture;
 * 2) Component Production;
 * 3) Application Assembly;
 * 4) Deployment

Each step can be a crowdsourcing competition.
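
The idea that each phase is its own competition can be sketched as a pipeline in which every phase collects scored submissions and selects a winner before the next phase starts. This is an illustrative model only and not Topcoder's actual implementation: the `Submission` class, the scoring scale, and the contestant names are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Phase names taken from the list above.
PHASES = ["Architecture", "Component Production", "Application Assembly", "Deployment"]


@dataclass
class Submission:
    contestant: str
    score: float  # assigned by reviewers; the scale is hypothetical


def pick_winner(submissions: List[Submission]) -> Submission:
    """Return the highest-scoring submission of one competition."""
    return max(submissions, key=lambda s: s.score)


def run_pipeline(entries: Dict[str, List[Submission]]) -> Dict[str, str]:
    """Run one competition per phase, in order, and record each winner."""
    winners: Dict[str, str] = {}
    for phase in PHASES:
        winners[phase] = pick_winner(entries[phase]).contestant
    return winners


entries = {
    "Architecture": [Submission("ana", 80.0), Submission("bo", 91.5)],
    "Component Production": [Submission("cy", 77.0), Submission("ana", 70.0)],
    "Application Assembly": [Submission("dee", 88.0)],
    "Deployment": [Submission("bo", 60.0), Submission("eli", 95.0)],
}
winners = run_pipeline(entries)
print(winners)
```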

BugFinders testing process:
 * 1) Engage BugFinders;
 * 2) Define Projects;
 * 3) Managed by BugFinders;
 * 4) Review Bugs;
 * 5) Get Bugs Fixed; and
 * 6) Release Software.

Theoretical issues
Game theory has been used in the analysis of various software crowdsourcing projects.

Information theory can be a basis for metrics.
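
As one hedged illustration of an information-theoretic metric, Shannon entropy can quantify how diverse a set of contest submissions is once each submission has been labeled with the approach it takes. The labeling scheme and the example labels below are assumptions made for this sketch; the article does not prescribe a specific metric.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(labels: Iterable[str]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of labels.

    Returns 0 for an empty input or when every label is identical.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical approach labels for four submissions to the same task.
uniform_pool = ["dp", "greedy", "search", "heuristic"]
split_pool = ["dp", "dp", "greedy", "greedy"]
single_pool = ["dp", "dp", "dp", "dp"]

print(shannon_entropy(uniform_pool))
print(shannon_entropy(split_pool))
print(shannon_entropy(single_pool))
```

A higher value indicates that submissions are spread more evenly over distinct approaches, while a value of zero indicates that all contestants converged on the same approach.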

Economic models can provide incentives for participation in crowdsourcing efforts.

Reference architecture
Crowdsourcing software development may follow different software engineering methodologies using different process models, techniques, and tools. It also has specific crowdsourcing processes involving unique activities such as bidding on tasks, allocating experts, evaluating quality, and integrating software. To support the outsourcing process and facilitate community collaboration, a platform is usually built to provide the necessary resources and services. For example, Topcoder follows the traditional software development process with competition rules embedded, whereas AppStori allows flexible processes in which the crowd may be involved in almost all aspects of software development, including funding, project concepts, design, coding, testing, and evaluation.

The reference architecture hence defines umbrella activities and structure for crowd-based software development by unifying best practices and research achievements. In general, the reference architecture will address the following needs:


 * 1) Customizable to support typical process models;
 * 2) Configurable to compose different functional components;
 * 3) Scalable to support problems of varying size.

In particular, crowdsourcing is used to develop large and complex software in a virtualized, decentralized manner. Cloud computing is a colloquial expression used to describe a variety of different types of computing concepts that involve a large number of computers connected through a real-time communication network (typically the Internet). Moving crowdsourcing applications to the cloud offers several advantages: teams can focus on project development rather than on the supporting infrastructure, geographically distributed teams can collaborate, resources can scale with the size of the project, and work takes place in a virtualized, distributed, and collaborative environment.

The demands on software crowdsourcing systems are ever evolving as new development philosophies and technologies gain favor. The reference architecture presented above is designed to be general in many dimensions, including, for example, different software development methodologies, incentive schemes, and competitive or collaborative approaches. There are several clear research directions that could be investigated to enhance the architecture, such as data analytics, service-based delivery, and framework generalization. As systems grow, understanding how the platform is used becomes an important consideration: data regarding users, projects, and the interaction between the two can all be explored to investigate performance. These data may also provide helpful insights when developing tasks or selecting participants. Many of the components designed in the architecture are general purpose and could be delivered as hosted services. Hosting these services would significantly reduce the barriers to entry. Finally, through deployments of this architecture there is potential to derive a general-purpose framework that could be used for different software development crowdsourcing projects or, more widely, for other crowdsourcing applications. The creation of such frameworks has had transformative effects in other domains, for instance the predominant use of BOINC in volunteer computing.

Aspects and metrics
Crowdsourcing in general is a multifaceted research topic. The use of crowdsourcing in software development is associated with a number of key tension points, or facets, which should be considered (see the figure below). At the same time, research can be conducted from the perspective of the three key players in crowdsourcing: the customer, the worker, and the platform.

Task decomposition:

Coordination and communication:

Planning and scheduling:

Quality assurance: A software crowdsourcing process can be described as a game in which one party tries to minimize an objective function while the other party tries to maximize the same objective function, as though the two parties were competing with each other. For example, a specification team needs to produce quality specifications for the coding team to develop the code; the specification team tries to minimize the number of bugs in the specification, while the coding team tries to identify as many bugs as possible in the specification before coding.

The min-max process is important as it is a quality assurance mechanism and often a team needs to perform both. For example, the coding team needs to maximize the identification of bugs in the specification, but it also needs to minimize the number of bugs in the code it produces.
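
The min-max idea can be made concrete with a small two-player zero-sum game in pure strategies. In the sketch below the objective is the number of specification bugs found: the specification team (rows) chooses a review effort to minimize it, and the coding team (columns) chooses an inspection depth to maximize it. The payoff numbers and strategy names are invented for illustration and do not come from any measured project.

```python
from typing import List

# bugs_found[i][j]: bugs found in the specification when the specification
# team uses strategy i and the coding team uses strategy j (made-up values).
# Rows: 0 = "light review", 1 = "thorough review" (minimizing player).
# Columns: 0 = "skim", 1 = "deep inspection" (maximizing player).
bugs_found: List[List[int]] = [
    [5, 12],
    [1, 3],
]


def minimax_value(payoff: List[List[int]]) -> int:
    """Best guarantee for the row player, who minimizes the worst case."""
    return min(max(row) for row in payoff)


def maximin_value(payoff: List[List[int]]) -> int:
    """Best guarantee for the column player, who maximizes the worst case."""
    return max(min(col) for col in zip(*payoff))


upper = minimax_value(bugs_found)
lower = maximin_value(bugs_found)
has_saddle_point = upper == lower
print(upper, lower, has_saddle_point)
```

When the two values coincide, as in this example, the game has a saddle point: thorough review paired with deep inspection is stable because neither team can improve its outcome by changing strategy alone.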

Bugcrowd showed that participants identifying bugs in security testing behave as in a prisoner's dilemma.

Knowledge and Intellectual Property:

Motivation and Remuneration:

Levels
There are the following levels of crowdsourcing:

Level 1: single persons, well-defined modules, small size, limited time span (less than 2 months), quality products, and current development processes such as those of Topcoder and uTest. At this level, coders are ranked; websites contain online repositories of crowdsourcing materials; software can be ranked by participants; and platforms provide communication tools such as wikis, blogs, and comments, as well as software development tools such as IDEs, testing tools, compilers, simulators, and modeling and program analysis tools.

Level 2: teams of people (< 10), well-defined systems, medium to large size, medium time span (3 to 4 months), and adaptive development processes with intelligent feedback in a blackboard architecture. At this level, a crowdsourcing website may support adaptive and even concurrent development processes with intelligent feedback through the blackboard architecture; intelligent analysis of coders, software products, and comments; multi-phase software testing and evaluation; Big Data analytics; automated wrapping of software services into SaaS (Software-as-a-Service), annotation with ontologies, and cross-referencing to DBpedia and Wikipedia; automated analysis and classification of software services; and ontology annotation and reasoning, such as linking services with compatible inputs and outputs.

Level 3: teams of people (> 10), well-defined large systems, long time span (< 2 years), and automated cross verification and cross comparison among contributions. A crowdsourcing website at this level may include automated matching of requirements to existing components, including matching of specifications, services, and tests, as well as automated regression testing.

Level 4: multinational collaboration of large and adaptive systems. A crowdsourcing website at this level may contain domain-oriented crowdsourcing with ontology, reasoning, and annotation; automated cross verification and test generation processes; automated configuration of crowdsourcing platform; and may restructure the platform as SaaS with tenant customization.

Significant events
Microsoft crowdsourced parts of Windows 8 development. In 2011, Microsoft started blogs to encourage discussion among developers and the general public. In 2013, Microsoft also started crowdsourcing for its Windows 8 mobile devices. In June 2013, Microsoft announced crowdsourced software testing, offering $100K for innovative techniques to identify security bugs and $50K for a solution to the problem identified.

In 2011, the United States Patent and Trademark Office launched a crowdsourcing challenge under the America COMPETES Act on the Topcoder platform to develop image processing algorithms and software that recognize figure and part labels in patent documents, with a prize pool of US$50,000. The contest resulted in 70 teams collectively making 1,797 code submissions. The solution of the contest winner achieved high accuracy in terms of recall and precision for the recognition of figure regions and part labels.

Oracle uses crowdsourcing in their CRM projects.

Conferences and workshops
A software crowdsourcing workshop was held at Dagstuhl, Germany in September 2013.