Internet of Musical Things

The Internet of Musical Things (IoMusT) is a research area that aims to bring Internet of Things connectivity to musical and artistic practice. It encompasses concepts from music computing, ubiquitous music, human-computer interaction, artificial intelligence, augmented reality, virtual reality, gaming, participative art, and new interfaces for musical expression. From a computational perspective, IoMusT refers to local or remote networks of embedded devices capable of generating and/or playing musical content.

Introduction
The term "Internet of Things" (IoT) extends to any everyday object connected to the internet whose capabilities are increased by exchanging information with other elements of the network to achieve a common goal. Thanks to the technological advances of recent decades, its use has spread to several areas, assisting in medical analysis, traffic control, and home security. When its concepts meet music, the Internet of Musical Things (IoMusT) emerges.

The term "Internet of Musical Things" has been used in different ways by different authors. Hazzard et al., for example, use it in the context of musical instruments bearing a QR code that directs the user to a page with information about the instrument, such as its manufacturing date and history. Keller and Lazzarini use the term in ubiquitous music (ubimus) research, while Turchet et al. define IoMusT as a subfield of the Internet of Things in which interoperable devices connect to each other, aiding interaction between musicians and the audience.

Like the IoT, the Internet of Musical Things can encompass a variety of ecosystems. Generally, however, it is marked by being employed in musical activities (rehearsals, concerts, recordings, and music teaching) and by relying on service and information providers.

In addition to the technological and artistic advantages this field offers, new opportunities are arising for the music industry, enabling the emergence of services and applications that exploit the interconnection between logical and physical devices while always keeping the artistic purpose in mind.

Musical things
A musical thing is formally defined as a "computational device capable of acquiring, processing, acting, or exchanging data that serves a musical purpose." In short, these objects are entities that can be used for musical practice, can be connected in local and/or remote networks, and act as senders or receivers of messages. They can be, for example, a smart instrument (an instrument that uses sensors, actuators, and wireless connectivity for audio processing), a wearable device, or any other device capable of controlling, generating, or executing musical content over the network.

Unlike traditional audio devices such as microphones and speakers, musical things are not useful in isolation, so they must be inserted into a chain of equipment. This raises the need for standards, protocols, and means of communication between them. These challenges are analyzed below.

The challenges of creating musical things
The first challenge concerns the hardware used in musical things. First, one should keep in mind that these devices are not analog: they can be reprogrammed, and they must have internet connectivity and/or another way to communicate with other equipment. Second, they are not traditional computing devices. Whereas smartphones and personal computers are programmed for general-purpose use, musical things are built to perform specific tasks. Finally, it is important to note that they will be employed in an artistic and musical context, so their aesthetic characteristics are as important as their computational ones.

That said, the hardware challenges are clear: the processing capacity, storage, and power consumption of musical things must be sufficient to sustain artistic performances without making these objects expensive, unergonomic, or unwieldy. In addition, they should be able to take on different roles in different scenarios. Thus, they should allow users to add or remove components (such as sensors and actuators) so as to be adaptable, expressive, and versatile.

The second challenge deals with the behavior of musical things. They must primarily exchange sound data, but it is desirable that they also exchange control data and processing parameters. In this sense, they must adapt their operating mode in order to cooperate with the other elements present in the network, and also allow their software and operating logic to be updated remotely.
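The exchange of control data alongside audio can be sketched with a minimal loopback example. The message schema below (fields "device", "param", "value") is purely illustrative and not part of any IoMusT standard; real deployments typically rely on established protocols such as OSC or MIDI over the network.

```python
import json
import socket

# Hypothetical control message: a parameter update for a networked effects
# unit. The field names are invented for illustration, not a standard.
def make_control_message(device: str, param: str, value: float) -> bytes:
    return json.dumps({"device": device, "param": param, "value": value}).encode()

def parse_control_message(payload: bytes) -> dict:
    msg = json.loads(payload.decode())
    assert {"device", "param", "value"} <= msg.keys()
    return msg

# Loopback demonstration over UDP, a transport commonly used for
# low-latency control data (as in OSC).
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))  # let the OS pick a free port
send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(make_control_message("reverb-1", "wet", 0.35), recv.getsockname())
msg = parse_control_message(recv.recvfrom(1024)[0])
print(msg["param"], msg["value"])  # wet 0.35
send.close()
recv.close()
```

The same serialize/parse pair could carry any parameter a musical thing exposes; only the transport and schema would change in a real system.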

The third adversity is possibly the most sensitive and difficult topic. One has to think about what data can be shared and how. For audio, Pulse-Code Modulation (PCM) formats such as WAV come to mind, since they are the most common in real-time audio processing systems; however, latency and quality are not guaranteed. Compressed formats such as MP3, FLAC, or OGG, on the other hand, require more processing, and the resulting latency can make the environment impractical.
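The trade-off above can be made concrete with back-of-the-envelope arithmetic. The formulas follow directly from the PCM definition; the sample rates and buffer size chosen are common but illustrative values.

```python
# Rough figures for streaming uncompressed PCM (WAV-style) audio,
# the format most common in real-time systems.
def pcm_bitrate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Raw PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels

def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Delay contributed by one audio buffer, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz

# CD-quality stereo: 44.1 kHz, 16-bit, 2 channels
print(pcm_bitrate(44_100, 16, 2))                # 1411200 (about 1.4 Mbit/s)
# A 256-frame buffer at 48 kHz adds roughly 5.3 ms per hop
print(round(buffer_latency_ms(256, 48_000), 1))  # 5.3
```

This is why uncompressed PCM demands high, sustained bandwidth, while compressed formats trade bandwidth for the encoding/decoding delay mentioned above.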

Possible solutions for creating musical things
Possible solutions to these problems include using common IoT elements in music practice or adding networking capabilities to traditional audio objects. Effects units (such as guitar pedals) should be built so that the user is free to remove or insert buttons and sensors, while in logic units the software should be modifiable. Audio equipment should send and receive data over the network and be remotely controllable. This can be useful for adapting these elements to the different types of data circulating on the network.

Musical instruments, on the other hand, function similarly to smart musical instruments: they are equipped with sensors and actuators capable of capturing stimuli from the environment and from the musicians themselves. Musical aids such as metronomes and tuners can be transposed to digital media, while performance aids such as light and smoke cannons can be controlled and synchronized over the network.

However, IoMusT is not only about adapting what already exists, but also about creating new devices capable of generating new perspectives for musical practice.

Related fields
This section reviews some of the application domains related to an IoMusT environment. The review is not intended to be exhaustive; it aims to describe the main features and functionalities of each area.

Network musical performance
A networked performance is a real-time, computer-mediated interaction that allows artists dispersed across the globe to interact with each other as if they were in the same environment. While not intended to replace the traditional model, it contributes to music creation and its social interactions, promoting creativity and the exchange of cultures.

Among its main characteristics are: low latency, so that the sounds produced are heard almost instantaneously; synchronization, to prevent long delays from hindering interaction in the environment; interoperability through standardization, which allows different devices to communicate over the network; scalability, which makes the system comprehensive and allows distributed participation among users; and easy integration and participation, aspects that ensure that users have no difficulty finding devices on the network and can connect or disconnect whenever they want. The main challenges in this area are the requirements for high bandwidth and in-order delivery of the transmitted stream, and sensitivity to delay in the delivery of data packets.
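A rough one-way latency budget illustrates why low latency dominates this list. The ~25 ms threshold below is an often-cited rule of thumb for synchronous ensemble playing, and the component values are illustrative assumptions, not measurements.

```python
# Sketch of a one-way latency budget for a networked-performance link.
# All numbers are illustrative assumptions.
SPEED_IN_FIBER_KM_PER_MS = 200  # light in optical fiber travels ~200 km/ms

def one_way_latency_ms(distance_km: float, buffer_ms: float,
                       processing_ms: float, network_overhead_ms: float) -> float:
    """Sum of propagation delay and the other per-hop contributions."""
    propagation = distance_km / SPEED_IN_FIBER_KM_PER_MS
    return propagation + buffer_ms + processing_ms + network_overhead_ms

# Example: two musicians 600 km apart, one 5.3 ms audio buffer per end-to-end
# path, 2 ms of codec/processing delay, 4 ms of routing overhead.
budget = one_way_latency_ms(distance_km=600, buffer_ms=5.3,
                            processing_ms=2.0, network_overhead_ms=4.0)
print(round(budget, 1), "ms")                       # 14.3 ms
print("playable" if budget <= 25 else "too slow")   # playable
```

The propagation term alone shows why intercontinental distances (thousands of kilometres) quickly exhaust the budget regardless of how well the endpoints are engineered.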

Interactive art
Art has always had its interaction marked by the relationship between the artist and the medium they use to materialize the work, while the audience had only the role of passive observer. This began to change when artistic movements led by Allan Kaprow and the Fluxus and Gutai groups began to allow more active audience participation. In this context, interactive art emerged, characterized by allowing the viewer a degree of active involvement in the show, whether by walking among the installations and sculptures or by producing sounds, images, and movements.

The architecture of these environments is designed to handle different types of signals, ranging from audio and video to those produced by the human body, such as heartbeats. As such, they also require functionality that ensures interoperability and handles data delivery delays.

Ubiquitous music
Ubiquitous music, usually abbreviated to ubimus, is a research field that combines music, technology, and creative processes with strong social and community engagement. Although its original proposal is focused on music production, current technological developments have opened new social and cognitive dimensions to this field, leading it to become increasingly interested in educational and artistic topics. Thus, current perspectives encompass a wide diversity of subjects and actors, ranging from casual participants to highly trained musicians.

The ubimus ecosystem supports the integration of audio tools and audience interaction, and can be reconfigured to meet the needs of users. Consequently, the desired concepts are not dependent on specific implementations. Other important features are conceptual approaches and reliance on empirical methods. These aspects encourage the development of technologies for music creation, especially those that make use of common objects and spaces in the daily lives of those involved in the process.

Web Audio, cloud computing and edge computing
Web Audio is a JavaScript API for processing and synthesizing audio in web applications, representing a technological evolution in this segment. It offers features common to digital audio workstations (DAWs), such as audio signal routing, low latency, and effects application. It also allows participative networked performances and expands the possibilities of using smartphones in these media.

Its environment uses audio nodes to manipulate sound in a musical context. Nodes are connected by their inputs and outputs to create paths for routing audio, which is modified by effect nodes along the way. In this way, it can support numerous sources with different layouts, remain flexible, and build complex functions with dynamic effects.
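Web Audio itself is a JavaScript API; the Python sketch below only mimics the routing model just described (nodes connected output-to-input, with effect nodes transforming the signal along the path). The class and method names are invented for illustration and do not reproduce the real API.

```python
# Minimal stand-in for an audio-node graph: each node applies a per-sample
# transform and forwards the result to its destination, like Web Audio's
# AudioNode.connect() chaining (names here are illustrative only).
class Node:
    def __init__(self, fn):
        self.fn = fn       # per-sample transform for this node
        self.dest = None
    def connect(self, other):
        self.dest = other
        return other       # returning the target allows chained connects
    def process(self, samples):
        out = [self.fn(s) for s in samples]
        return self.dest.process(out) if self.dest else out

source = Node(lambda s: s)                       # pass-through source stand-in
gain = Node(lambda s: s * 0.5)                   # like a gain node at 0.5
clip = Node(lambda s: max(-1.0, min(1.0, s)))    # crude limiter
source.connect(gain).connect(clip)               # source -> gain -> limiter

print(source.process([0.8, 2.4, -3.0]))          # [0.4, 1.0, -1.0]
```

In the real API the chain would read `oscillator.connect(gainNode).connect(audioCtx.destination)`; the point here is only the graph-of-nodes routing idea.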

Web Audio paves the way for using web browsers for musical purposes. Among the advantages observed from this are easy distribution (no installation required) and maintenance, platform and architecture independence, security (the browser can prevent plugins with incorrect behavior from affecting the system), and emergence of new types of collaboration.

Cloud computing, on the other hand, is a structure composed of distributed servers that simulate a centralized network, allowing load balancing and resource replication, reducing network consumption, and improving scalability and reliability. It aims to provide numerous services, ranging from file storage to intercommunication between music applications, offering an unprecedented level of participation and performance.

Its main feature is to allow users to access the services without the need for knowledge about the technology used. Thus, they can access them on demand and regardless of location. Other points worth highlighting in this network are: broad access, elasticity, and resource management.

Cloud computing infrastructure is mostly composed of numerous physical machines connected in a network. Each machine runs the same software configuration but can differ in central processing unit, memory, and disk storage capacity. This model was developed with three main objectives in mind: i) reduce the cost of acquiring and assembling the elements that form the network infrastructure, allowing it to be heterogeneous and adaptable to the resources required; ii) provide flexibility in adding or replacing computing resources; iii) ease access to the services provided, so that users only need an operating system, a browser, and Internet access to reach the resources available in the cloud.

Despite all the advantages listed above, cloud computing's centralized mode of operation places a heavy service load on the network, particularly in terms of cost and bandwidth for data transmission. In addition, network performance worsens as the amount of data increases. To address this problem, edge computing has emerged: a paradigm that combines cloud computing properties with real-time communication. The term "edge" refers to all the computational and network resources between the data sources and the cloud servers. In this way, objects in the environment not only consume data and services but also perform computational processing, decreasing stress on the network and significantly reducing latency in message exchange.

The key attributes of this computing model revolve around geographic distribution, mobility support, location recognition, computing resources and services close to the end user, low latency, context sensitivity, and heterogeneity.

Wearable technologies
Wearable computing is an approach that has been redefining how human-machine interaction happens, with electronic devices directly connected to the user's body. These wearable devices are built so that the technologies and structures they contain are imperceptible, acting as an extension of the human being. Among the most popular models today are smartwatches and smartbands.

Although small, they can continuously detect, collect, and upload numerous physiological and sensory data, aiming to improve typical everyday activities such as making payments, assisting in location tracking, monitoring physical and mental health, providing analysis of physical activity, and aiding artistic practice.

They must be able to fulfill three main goals: assign mobility to the user, that is, allow them to use the device in various locations; augment reality, such as generating images or sounds that are not part of the real world; and provide context sensitivity, which is the ability of the equipment to adapt to different environments and stimuli.

It is important to note that although they have connectivity and handle a large amount of data, not all wearable devices are IoT elements, and consequently, IoMusT elements. To be considered as such, they must have access to the internet.

Following a slightly different line of thought, but still using concepts from wearable computing, are e-textiles: clothing enhanced with sensors, offering some advantages over conventional wearable devices, such as more comfort, more natural interfaces for human interaction, and less intrusiveness. Devices worn next to the human body can thus be classified according to where they are worn (wrist, head, feet, and so on) and whether they already exist or are still in the prototyping phase.

IoMusT Challenges
In addition to the problems inherent to the technology and those already present in IoT, the Internet of Musical Things faces specific problems, ranging from technological issues to artistic and environmental ones. The main ones are highlighted below.

Technology challenges
The feasibility of IoMusT depends on network aspects such as bandwidth, latency, and jitter. Networks must therefore advance beyond the current state of the art to provide better connection conditions, handle these three aspects, and ensure synchronization and good-quality representation of multimodal audio content.

Latency, reliability, and synchronization emerge as the main demands in real-time audio transmission over a network, whether local or remote, wired or wireless. This is due to the random character of this type of communication, which can cause losses in the transmitted data and desynchronization between streams, even in small networks.

Synchronization is also difficult to achieve between devices that do not share a global clock. Even when they do, devices on different networks require periodic resynchronization, and existing protocols are insufficient to meet this demand.
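One classical approach to such resynchronization is NTP-style offset estimation from four timestamps, assuming symmetric network paths. The sketch below implements the standard formulas; the timestamps are invented for illustration.

```python
# NTP-style clock offset estimation between two networked devices.
# t1: client send time, t2: server receive time (server clock),
# t3: server reply time (server clock), t4: client receive time.
def clock_offset_ms(t1: float, t2: float, t3: float, t4: float) -> float:
    """Estimated server-minus-client offset, assuming symmetric path delays."""
    return ((t2 - t1) + (t3 - t4)) / 2.0

def round_trip_ms(t1: float, t2: float, t3: float, t4: float) -> float:
    """Network round-trip time, excluding the server's processing time."""
    return (t4 - t1) - (t3 - t2)

# Illustrative timestamps (ms): the server clock runs ~40 ms ahead.
print(clock_offset_ms(t1=0.0, t2=45.0, t3=46.0, t4=12.0))  # 39.5
print(round_trip_ms(t1=0.0, t2=45.0, t3=46.0, t4=12.0))    # 11.0
```

The symmetric-path assumption is exactly what breaks down on jittery consumer networks, which is why musical applications need more frequent resynchronization than the estimate alone suggests.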

Interoperability and standardization are essential pillars of this environment because the devices have no prior knowledge of each other or of the elements to which they will connect. Given the heterogeneity of these objects, in many cases they neither operate under the same protocols nor can they interpret the data coming from their neighbors.

Artistic challenges
The main difference between IoMusT and IoT is the former's concern with artistic issues. Despite advantages such as collaboration among musicians in different locations around the globe, massive connectivity, and new forms of audience participation, some problems stand out: the rupture with the traditional model of artistic interaction, as observed in bands and orchestras; lack of visual feedback; deciding which elements will be displayed and/or controlled by the audience; absence of backup systems for remote concerts; expensive, inaccessible, and unergonomic devices; and lack of investment in the necessary infrastructure.

Legal challenges, privacy and security
With the enormous amount of data generated in these environments, legal concerns about personal data arise, since the devices can collect information from the users involved. Issues also appear involving infringement of protected material, copyright, and intellectual property.

Security issues are also worth mentioning. Because IoMusT communicates over the network, it is subject to attempts to steal sensitive data, denial-of-service attacks, and trojans. Possible solutions involve encryption algorithms, but these can lead to high energy and memory consumption on the devices.

Social challenges
One of the first thinkers to analyze the impact of technology on society was Herbert Marcuse. Among the problems cited by the author are: abundance of technology for one part of the population and scarcity for another; establishment of standards and demands by the ruling class; submission of workers to large corporations; retention of economic power and loss of individuality of thought. All these problems are present in IoMusT as well.

Alongside this, other problems can be accentuated, such as unequal access to technologies, since people living in suburban or rural areas do not have the same possibilities of access as people living in denser areas; lack of infrastructure, which widens the socio-cultural gap between people and classes; excessive consumption; the constant need for innovation; and social apartheid.

Economic challenges
While IoMusT can revolutionize the music industry by providing artificial intelligence algorithms capable of mixing and altering sound, reducing production costs, it can also negatively impact the creative part of this field by replacing human tasks with machine-based solutions, as well as causing reduced employment opportunities in the field.

Environmental challenges
With the growth of electronic devices generated by this field, environmental concerns also arise, especially regarding waste generation, pollution in the manufacture and use of these materials, use of potentially toxic chemicals, consumption of non-renewable resources, and possible ecological disturbances.

Possible usage scenarios
IoMusT allows some musical activities, such as live performances and rehearsals, to be rethought, multiplying the possibilities of interaction between the actors involved (musicians, audience, sound engineers, teachers, students, etc.). With this in mind, some usage scenarios are detailed below.

Scenario 1 - Augmented and immersive experiences
Imagine that when people arrive at a concert of their favorite band, they can choose different interfaces that will accompany them throughout the performance. One person might choose smartglasses (a computing device that adds information according to what the wearer sees), another chooses a wristband that responds to musical stimuli, and a third selects a set of sensors and speakers. All these objects can track the user's movements and send this information to the band. The band, in turn, can tailor its performance according to the audience's emotions, as well as send them stimuli that will be interpreted by the objects they are wearing.

Scenario 2 - Co-located hearing and remote hearing
Again, imagine users with wearable equipment capable of capturing their physiological data. By recording the wearer's movements and physiological signals (such as heart rate), musicians can decide which song to perform next, choreographers can create steps that best suit the recorded feelings, and the audience itself can use this data to control elements that aid the show, such as light and smoke cannons.

Meanwhile, people unable to attend the performance venue can experience the concert through virtual reality glasses or 360° video systems, allowing them to see behind the scenes of the stage and details around the musicians. IoMusT also envisions applications that allow remote control, so that aspects of the concert can be modified by audiences around the globe.

Scenario 3 - Remote rehearsals
Another possible scenario is a studio that uses IoMusT concepts to record solo artists, duos and small groups as well as orchestras with a variety of instruments. For this, the recording interface can adapt its size according to the amount of equipment connected to it. Musicians can record even if they are not in the same physical location, and audio files can be recorded for later mixing and mastering. Other possibilities include capturing audio from an instrument that is not in the same physical location, remote mixing and configuration of audio systems, obtaining performance data from musicians, and many others.

Scenario 4 - Music learning
Music learning is enriched by IoMusT through applications that display the scores to be played, capture audio in real time, and suggest improvements. Smart glasses can also indicate the correct position of the fingers on the instrument and share data in the cloud, where teachers can view it and indicate improvements and the next steps to be taken.

Scenario 5 - Improvisation session with electroacoustic instruments and musical things
This scenario is a jam session combining traditional instruments and electronic devices that exchange information over a network. The instruments can be plugged into speakers or connected to patches, while the users/musicians manipulate them from computer systems. Graphic elements such as videos, animations, and musical information can be displayed to assist the process; some users can participate only by controlling parameters of the instruments, such as volume, recording, and effects (delay and reverb, for example), as well as changing colors and resolutions in the graphics. A sound technician can also manage the connections, removing participants with poor network capacity or connecting those who wish to exchange information.