User:Qzheng75/sandbox

= Computing in the 21st century =

By the start of the 21st century, the world of computing has seen tremendous scientific advancement, and countless resplendent breakthroughs mark our entrance into a new era. On this wiki page, our team selected six areas in the world of computing, each of which has been well developed and tested in industry. In each of the following sections, we present the outstanding publications, platforms, related tech communities, and some critical glossary terms in these fields. If you would like to learn more about these fields, you can explore the articles and links we provide in the works cited section at the end.

Have you ever wondered how different factors affect the rise and fall of housing prices in your neighborhood, or how Google classifies spam emails for you? If so, machine learning will be your best friend. According to the Georgia Tech Machine Learning Center, machine learning aims to produce machines that can learn from their experiences and make predictions based on those experiences and other data they have analyzed. By designing algorithms and applying statistical methods together, we can program computers to complete tasks that humans find laborious. By training a model on data, we can "teach" the machine to learn underlying patterns in the provided data that are hard for humans to detect. Machine learning, together with its descendant, deep learning, enables diverse and intricate technologies such as computer vision, natural language processing, and data classification.

While not fully realized just yet, quantum computing is an exciting new realm of computing with huge potential to grow and to help not only the computing world but all corners of society. Quantum computing is much like classical computing, but instead of using classical bits as the basis of information, quantum computers use qubits. While a regular bit can store information in only two ways, either on or off, a qubit can store this information in a superposition of the two states, so a register of n qubits can represent 2^n states at once. To give just one idea of how much more powerful this is: with the fastest known classical methods, factoring the extremely large numbers used in cryptography (products of two large primes) would take the Earth's strongest supercomputers millions, even billions of years, but a sufficiently large quantum computer could in principle factor such a number in minutes.

The internet may seem like an old topic, given how long ago it was created. However, there is still a lot of potential in this field. The current growth of machine learning and cloud computing requires the internet to transfer large amounts of data rapidly and safely. There is also a trend for developers to move toward the software side, as opposed to the traditional hardware approach, to increase efficiency. Furthermore, to improve latency in the transfer of information, researchers have been working on the protocol side, specifically on network scheduling. This wiki page provides entry points, including tools and peer-reviewed papers, into all of the areas mentioned above.

In an age where everything is interconnected through the cloud, the threat of cybersecurity attacks is at an all-time high. We are familiar with the basic phishing methods that creep into our inboxes and the viruses across the internet waiting to infect our laptops, but cybersecurity protects not only our data but also every physical object connected through the Internet of Things. Computer science technology is advancing too rapidly for us to keep track of, and much larger networks foundational to our everyday life, such as power grids, are attacked every day. Artificial intelligence has entered the cybersecurity scene on both the attacking and defending sides. This wiki page explores how it has changed the game, utilizing offensive and defensive approaches to protect our networks while also predicting future threats and possible vulnerabilities in ways traditional methods would struggle to achieve.
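The superposition idea above can be sketched numerically. The snippet below is a toy state-vector simulation in plain Python (not a real quantum computer): a single qubit is a pair of complex amplitudes, a Hadamard gate puts it into an equal superposition, and measurement probabilities come from squaring amplitude magnitudes.

```python
import math

# A qubit's state is two complex amplitudes (alpha, beta) with
# |alpha|^2 + |beta|^2 = 1; |alpha|^2 is the probability of measuring 0.
ZERO = (1 + 0j, 0 + 0j)  # the classical "off" state |0>

def hadamard(state):
    """Apply a Hadamard gate, creating an equal superposition from |0> or |1>."""
    alpha, beta = state
    s = 1 / math.sqrt(2)
    return (s * (alpha + beta), s * (alpha - beta))

def probabilities(state):
    """Measurement probabilities for outcomes 0 and 1."""
    alpha, beta = state
    return (abs(alpha) ** 2, abs(beta) ** 2)

superposed = hadamard(ZERO)
p0, p1 = probabilities(superposed)
print(round(p0, 3), round(p1, 3))  # each outcome is equally likely: 0.5 0.5
```

A register of n qubits needs 2^n amplitudes to describe, which is exactly why simulating large quantum systems classically becomes infeasible so quickly.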

We hope that you can learn about these technologies through this wiki site. Whether you are a beginner in computing or a computing enthusiast, you can see the Social or Platform parts of each section below to access prestigious computing communities and learning resources. If you want to engage with more advanced topics, be sure to read the selected publications listed in each section to learn about the frontier areas of computing. We hope you enjoy viewing this wiki site!

Selected published papers:
==== Statistical Modeling: The Two Cultures by Leo Breiman ====

As the title suggests, Leo Breiman's paper "Statistical Modeling: The Two Cultures" describes the two cultures in statistical modeling: the classical statistical approach and the modern machine learning approach.

The classical statistical approach focuses on developing mathematical models based on assumptions about the underlying data distribution, using classical statistical methods such as statistical inference and hypothesis testing.

The machine learning approach, on the other hand, focuses on developing algorithms that can learn from data without the strong assumptions about the underlying data distribution that traditional statistical methods may rely on. Instead, the approach relies on algorithms that let the machine discover underlying patterns in the data. This approach is often used for tasks such as computer vision (image and pattern detection) and natural language processing (NLP).

Breiman introduces various techniques in machine learning, from basic linear regression to random forests. In the paper, Breiman argues that the machine learning approach is often more suitable for current real-world problems where prediction accuracy is more important than statistical inference. He also suggests that researchers should let the two cultures cooperate, so that the finalized model can combine the strengths of both and produce a more comprehensive approach to the problem at hand.
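Breiman's contrast can be made concrete with a toy example. Below, the same tiny made-up dataset is fit two ways in plain Python: a one-variable least-squares line (the data-modeling culture, which assumes a linear form) and a 1-nearest-neighbor predictor (the algorithmic culture, which assumes nothing about the form).

```python
# Toy data (made up): x = house size (hundreds of sq ft), y = price (arbitrary units).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1]

def fit_least_squares(xs, ys):
    """Data-modeling culture: assume y = a*x + b and estimate a and b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def predict_1nn(xs, ys, x_new):
    """Algorithmic culture: predict the y of the closest training point."""
    i = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return ys[i]

a, b = fit_least_squares(xs, ys)
print(a * 2.4 + b)              # the linear model's prediction at x = 2.4
print(predict_1nn(xs, ys, 2.4)) # the nearest neighbor's prediction at x = 2.4
```

The linear model interpolates smoothly between points, while the nearest-neighbor predictor simply copies observed data; on messy real-world data the algorithmic approach often predicts better, which is Breiman's central argument.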

Overall, Breiman's paper highlights the characteristics of the two approaches in data analysis and the significance of both the classical statistical approach and especially the machine learning approach when developing statistical models. In addition, the paper emphasizes the importance of empirical evaluation in assessing model performance.

==== Tidy data by Hadley Wickham ====

In machine learning, data preprocessing is a critical step. Decent data preprocessing enables the trained model to capture better patterns in the provided data. In this paper, Wickham introduces the concept of "tidy data." He proposes three standards for tidy data, equivalent to Codd's 3rd normal form, as follows:

1. Each variable forms a column.

2. Each observation forms a row.

3. Each type of observational unit forms a table.
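As an illustration of rules 1 and 2, here is a small plain-Python sketch (the data are made up, modeled on an example in the paper) that "melts" a messy wide table, where column headers are values rather than variable names, into tidy rows of one observation each, similar to what Wickham's reshaping tools automate.

```python
# Messy: the income bracket is stored in the column headers, not in a variable.
messy = [
    {"religion": "Agnostic", "<$10k": 27, "$10-20k": 34},
    {"religion": "Atheist", "<$10k": 12, "$10-20k": 27},
]

def melt(rows, id_var, var_name, value_name):
    """Turn each non-id column into a (variable, value) pair: one row per observation."""
    tidy = []
    for row in rows:
        for col, val in row.items():
            if col != id_var:
                tidy.append({id_var: row[id_var], var_name: col, value_name: val})
    return tidy

tidy = melt(messy, "religion", "income", "count")
for row in tidy:
    print(row)
```

After melting, each variable (religion, income, count) forms a column and each observation forms its own row, satisfying the first two standards.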

In contrast, Wickham describes the problems messy datasets cause in machine learning, demonstrating through examples the common scenarios that violate the three principles.

Furthermore, Wickham uses a case study about individual-level mortality data from Mexico to illustrate how to organize the data to fit the aforementioned criteria and how well-organized data and feature engineering can be used in data analytics and data visualization.

In conclusion, the paper emphasizes the importance of well-designed data structures for machine learning.

Scikit-learn: beginner-friendly toolbox
Scikit-learn is a popular machine learning library written in Python, C, and C++. It provides a simple and efficient toolset for data analysis and modeling, offering a variety of supervised and unsupervised machine learning algorithms as well as tools for data preprocessing, model evaluation, and data visualization. These models range from basic linear regression to advanced machine learning features such as neural network models and natural language processing tools.

Scikit-learn is designed to be beginner-friendly. The package has not only extensive documentation but also active community support, which makes it easier for beginners in machine learning to learn and apply it to a wide range of scenarios. With scikit-learn, beginners can quickly and easily start creating models that can be used to solve real-world problems.

Here is a list of commonly-used tools in the scikit-learn package:


 * Classification algorithms: logistic regression, k-nearest neighbors (KNN), decision trees, and support vector machines (SVM).
 * Regression algorithms: linear regression, Ridge regression, and Lasso regression.
 * Clustering algorithms: K-Means, DBSCAN, and hierarchical clustering.
 * Dimension reduction algorithms: Principal Component Analysis (PCA).
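All of the estimators above share scikit-learn's uniform fit/predict interface. The sketch below is a minimal pure-Python k-nearest-neighbors classifier written to mimic that interface; it is a toy, not scikit-learn's actual implementation (with scikit-learn installed, you would use `sklearn.neighbors.KNeighborsClassifier` in the same way).

```python
from collections import Counter

class ToyKNNClassifier:
    """A minimal k-nearest-neighbors classifier with a scikit-learn-style API."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # KNN is a "lazy" learner: training just stores the data.
        self.X_, self.y_ = X, y
        return self

    def predict(self, X):
        preds = []
        for point in X:
            # Order training indices by squared Euclidean distance to `point`.
            by_dist = sorted(
                range(len(self.X_)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X_[i], point)),
            )
            # Majority vote among the k closest labels.
            votes = Counter(self.y_[i] for i in by_dist[: self.n_neighbors])
            preds.append(votes.most_common(1)[0][0])
        return preds

X = [[0, 0], [0, 1], [5, 5], [6, 5]]
y = ["blue", "blue", "red", "red"]
clf = ToyKNNClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[0.5, 0.5], [5.5, 5.0]]))  # -> ['blue', 'red']
```

The uniform interface is a large part of why scikit-learn is beginner-friendly: swapping in a different algorithm usually means changing only the constructor call.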

Visit the official website of Scikit-learn here!

Google Cloud
Google Cloud provides a comprehensive collection of services for machine learning that can be used to build and deploy machine learning models at a large scale using the robust computing resources powered by Google, regardless of a user's level of experience in data science or machine learning. Some popular services include:

Cloud AutoML:

 * Simplifies the process of creating custom machine learning models, making them accessible to a broader range of users.
 * Provides services including image recognition, natural language processing, and structured data classification, so you can select the service that best caters to your needs.

TensorFlow:

 * A software library for building, training, and deploying machine learning models.
 * Strong hardware support: computations can be spread across multiple devices, such as CPUs and GPUs, allowing users to take advantage of the processing power of modern hardware, which is harder to access at a personal level.
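At its core, TensorFlow automates gradient-based training of models. The plain-Python loop below sketches that idea on the smallest possible model, a one-parameter linear fit trained by gradient descent on squared error; TensorFlow does the same thing with automatic differentiation, at scale, across CPUs and GPUs (the data here are made up).

```python
# Fit y = w * x to toy data by gradient descent on the mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by the "true" rule y = 2x

w = 0.0    # initial parameter guess
lr = 0.01  # learning rate
for step in range(500):
    # Gradient of MSE = mean((w*x - y)^2) with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step downhill

print(round(w, 3))  # converges toward the true slope 2.0
```

Frameworks like TensorFlow compute `grad` automatically for models with millions of parameters, which is what makes the hardware support above so valuable.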

Visit the official website of Google Cloud here!

Kaggle:
Kaggle is a community for data scientists and machine learning enthusiasts. It provides a large number of datasets that you can use in your machine learning projects. In addition to being a repository of datasets, Kaggle provides tutorials for newcomers to grasp critical machine learning concepts and build their first data science project. Kaggle is also powered by a supportive and active community, in which you can present the results of your projects and discuss data science problems with others.

I made a short video to give you a tour of the Kaggle community. Follow this link to watch it.

Contribute your dataset here!

Zooniverse:
Zooniverse is a citizen science platform that features people-powered research. Citizens can participate in scientific research by classifying data and contributing to various research projects. It was launched in 2007 and has since grown to become the world's largest platform for online citizen science. The platform allows researchers to harness the power of human intelligence to help preprocess large datasets that would be impossible for a single researcher to handle alone so that these datasets can be applied to various machine-learning techniques.

When volunteers classify data on Zooniverse, they create labeled datasets that can be used to train machine learning models. These models can then automate the classification process, allowing researchers to analyze larger datasets more efficiently. In addition, you can participate in citizen science projects on the platform, on topics such as antibiotic-resistant bacterial infections, marine ecosystems, and more.

Visit Zooniverse's website here!

Published papers:
Post-Quantum Cryptography by Don Monroe

Modern exchanges of digital information are built upon public key encryption, a form of encryption that relies on the difficulty of factoring an extremely large integer that is the product of two large primes. With classical computing, factoring such a number is impractical and time consuming, but in theory, with quantum computing, the process becomes exponentially faster. This poses a threat to existing information protected by public key encryption when, or if, quantum computing ever becomes prominent. However, data currently protected with a symmetric key system, like the popular Advanced Encryption Standard (AES), will be at relatively less risk from a potential rise in quantum computers. According to the paper, a quantum algorithm developed in 1996 by Lov Grover can crack symmetric encryption faster than classical computing by roughly a square-root factor. This means that data secured with a symmetric key system might still be safe if key lengths are increased.
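That square-root speedup has a simple arithmetic consequence for key lengths. The snippet below (plain Python, numbers only, no real cryptography) compares brute-force work for a classical attacker, about 2^n key guesses, with Grover's roughly sqrt(2^n) = 2^(n/2) steps, which is why doubling the key length restores the original security margin.

```python
def classical_work(key_bits):
    """Brute-force key search: about 2^n guesses in the worst case."""
    return 2 ** key_bits

def grover_work(key_bits):
    """Grover's algorithm: roughly the square root, i.e. 2^(n/2) steps."""
    return 2 ** (key_bits // 2)

# AES-128 under Grover offers only about 64-bit effective security...
assert grover_work(128) == classical_work(64)
# ...while doubling the key length to 256 bits restores a 128-bit margin.
assert grover_work(256) == classical_work(128)
print("AES-256 under a Grover attack still costs about 2^128 steps")
```

This is the back-of-the-envelope reasoning behind the paper's claim that symmetric schemes survive the quantum transition far better than public-key ones.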

Problems can potentially arise with the sharing of the key between senders and recipients, but recent advances in mathematics and cryptography have worked toward a solution. In July of 2022, the NIST (National Institute of Standards and Technology) announced four finalist algorithms in its program to find new encryption schemes resistant to both classical and quantum computer attacks. According to the article, NIST expects to release a formal standard in 2024.

In conclusion, Monroe's paper informs readers of the potential threats to cybersecurity that could arise alongside a surge in quantum computing. The paper also details the rising encryption techniques that could be resistant to both classical and quantum attacks.

A Holographic Wormhole Traversed in a Quantum Computer by Adam R. Brown & Leonard Susskind

This paper details the successful transfer of information through a simulated wormhole on a nine-qubit quantum computer. The idea of a wormhole comes from the studies of Albert Einstein and Nathan Rosen, who noticed that the mathematical description of a black hole contains not one exterior region but two. The two regions are connected via a kind of wormhole known as an Einstein-Rosen bridge.

Later the same year as the discovery of the Einstein-Rosen bridge, Einstein and Rosen, this time working with Boris Podolsky, examined a phenomenon known as quantum entanglement. This phenomenon described how quantum systems could be linked through strange, classically unexplainable patterns, even over extremely long distances. Einstein famously called this phenomenon "spooky action at a distance".

Over time, the idea of Einstein-Rosen bridges and quantum entanglements became more and more connected even though, at the time of discovery, they were thought to be entirely unrelated. The paper states that the recent successful transfer of information through a quantum computer simulated wormhole connects the two conjectures together into one and the same idea.

To summarize, the holographic wormhole simulated in a quantum computer used quantum entanglement to send a message through a wormhole, supporting the conjecture that the two ideas describe one and the same natural phenomenon. The paper finishes by proposing that the simulation can be used in future science for even more breakthroughs in both quantum mechanics and traditional gravitational physics.

Platforms:
Microsoft Azure Quantum

Microsoft Azure Quantum is a service that hosts many resources for those who have an interest in quantum computing. Azure Quantum partners with companies like IonQ and Quantinuum to provide users with access to quantum hardware.

Azure Quantum provides users with a Quantum Development Kit (QDK) and the language Q#, which is used to program on quantum hardware. Users can also choose to run their programs on a local simulation or an actual quantum computer. Along with this, users have access to Quantum Katas, a learning tool that guides users through the intricacies of quantum computing and programming for qubits. This makes Azure Quantum very beginner friendly, and helps to prop open the door for those who are merely curious about quantum computing.

Additionally, at the time of writing, Azure Quantum hosts open office hours every Thursday 8:30 AM PST to help users or review quantum projects. This makes quantum computing even more accessible and allows users to connect with a professional in the field of quantum computing.

Explore more about Microsoft Azure Quantum here.

Google Cirq

Google’s Cirq is an open source Python library that allows users to write, manipulate, and optimize quantum circuits. Cirq offers useful resources such as quantum circuit simulation and quantum virtual machines. Cirq is especially helpful for users who are fluent in Python, since its simulations run in Python.

Cirq also has an extensive support system. As previously mentioned, Cirq is open source, which means that anyone can read and modify the code, so users can adapt Cirq to their liking. Cirq additionally holds weekly open-source meetings with developers, which is especially helpful for those who wish to advance the Cirq software. Along with this, Cirq offers several options for users who need help: users struggling with technical issues can turn to Cirq's dedicated forum on Stack Exchange, and users who wish to dive into Cirq's source code to dissect the program and assess its capabilities can do so.

Explore more about Cirq here.

Social / How to engage:
IEEE Quantum

IEEE (Institute of Electrical and Electronics Engineers) has a subgroup specifically focusing on anything related to quantum technologies or quantum physics. This could include quantum computing, information, optimization, engineering, and algorithms. IEEE Quantum hopes to redesign and challenge quantum technologies in preparation for the future and lay the groundwork for the later expansion of quantum technology. IEEE Quantum is the first interdisciplinary organization to rally around quantum computing. Their community welcomes all members, from researchers and engineers to architects and government.

IEEE Quantum was launched in 2019 under IEEE’s Future Directions initiative and serves as the IEEE’s leading community for all updates, news, projects, and events relating to quantum computing. Some of the members on the IEEE’s steering committee include: Greg Byrd, Scott Koziol, Lia Yeh, Amr Helmy, and Georgia Tech’s own Tom Conte. The IEEE Quantum community hosts yearly Quantum weeks, education talks, and even releases quantum technology focused podcasts.

Learn more about IEEE Quantum here.

OneQuantum

OneQuantum is a global community centered around quantum technology and guiding humanity’s transition into the quantum era. OneQuantum provides many benefits to members, including local chapters, mentoring, career fairs, and skill development. OneQuantum is led by André M. König, who studied quantum computing at MIT and received an MBA in Economics from the University of Chicago Booth School of Business.

OneQuantum also advocates for a unified community, enforcing a zero-tolerance policy for harassment and discrimination while striving to be inclusive. OneQuantum has 14,000 members and 21 communities worldwide, spread across Africa, Asia, Europe, Latin America, the Middle East, and North America. OneQuantum additionally offers a Women in Quantum (WIQ) program with resources directed specifically toward women in the field of quantum computing. This program is led by Denise Ruffner and is the largest women-in-quantum community globally. The WIQ program offers mentoring, career advantages such as career fairs and a job posting board, and scholarships to help members of the community attend quantum events or school.

Learn more about OneQuantum here.

Published papers:
==== Internet congestion control by S.H. Low, F. Paganini, and J.C. Doyle ====

This article gives an overarching view of congestion control, specifically how it is implemented by TCP (Transmission Control Protocol), a protocol that accounts for around 90% of internet traffic around the world. Since TCP was developed long ago (in the 1970s), it has limitations by today's technology and standards. Specifically, the article shows that TCP congestion control is at its most efficient when the queue operates close to capacity. Most real-world TCP connections are short, transfer few packets, and require low latency, while a few long connections transfer many packets and have little need for low latency. Yet in the current congestion control design, TCP has no system for prioritizing connections, so a few large data transfers can block the queue and cause unnecessary loss and delay for the short, latency-sensitive connections.

After showing these limitations, the article presents a new protocol design that is scalable, focusing on compatibility with arbitrary networks, link capacities, and delays.
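TCP's congestion control can be sketched with a toy simulation of its core AIMD (additive-increase, multiplicative-decrease) rule: the sender grows its congestion window by one packet per round trip until a loss occurs, then halves it. This plain-Python model is a deliberate simplification of the dynamics Low, Paganini, and Doyle analyze (real TCP also has slow start, timeouts, and more).

```python
def simulate_aimd(capacity, rounds, cwnd=1):
    """Additive increase / multiplicative decrease of a congestion window.

    capacity: packets the bottleneck can carry per round trip.
    Returns the window size after each simulated round trip.
    """
    history = []
    for _ in range(rounds):
        if cwnd > capacity:              # overshooting capacity -> packet loss
            cwnd = max(1, cwnd // 2)     # multiplicative decrease: halve
        else:
            cwnd += 1                    # additive increase: +1 packet per RTT
        history.append(cwnd)
    return history

# The window repeatedly climbs past the link capacity, then halves: a sawtooth.
print(simulate_aimd(capacity=8, rounds=20))
```

The sawtooth shows why the article's critique holds: a long transfer keeps driving the queue to overflow, inflicting loss and delay on any short connections sharing it.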

==== A Survey of Software-Defined Networking: Past, Present, and Future of Programmable Networks by B.A.A. Nunes, M. Mendonca, X. Nguyen, K. Obraczka, and T. Turletti ====

Software-defined networking is a major trend in computing as we move through the 21st century. It focuses on software-based development rather than the traditional hardware-focused design; in simple terms, it makes the network programmable and extensible. This article introduces the history of this transition from hardware to software: early developments such as open signaling (the first step in moving from hardware to software networking), the software-defined network architecture that is currently in common use (network stack and flow-control design), and different emulation tools for programmers. Furthermore, the article provides entry points into future research in networking, including improvements in switch design, information-centric networking, and compatibility with cloud computing.

Protocols:
Protocols are the foundation of all communication: they are formats that both senders and receivers agree upon. Protocols provide a standard that lets all devices interpret data uniformly. The most common transport protocols today are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). In general,

TCP provides many functionalities beyond the basic ones, such as flow control, acknowledgments, and connection setup (the three-way handshake). However, TCP transfers data more slowly than UDP.

Compared to TCP, UDP provides almost no functionality beyond basic, unreliable datagram delivery, leaving most features for developers to implement; application protocols such as DNS (turning domain names into IP addresses) commonly run on top of it precisely because it is so lightweight. As a result, UDP is faster than TCP.
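The difference is easy to see with Python's standard socket library: a UDP exchange needs no connection setup at all, as each side simply sends and receives datagrams (here over the loopback interface; a TCP version would first need listen/accept and connect, i.e. the three-way handshake).

```python
import socket

# A UDP "server": bind to an address and wait for datagrams. No accept(), no handshake.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.settimeout(5)
server.bind(("127.0.0.1", 0))   # port 0 = let the OS pick a free port
addr = server.getsockname()

# A UDP "client": no connect() needed either; just fire off a datagram.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(5)
client.sendto(b"hello", addr)

data, client_addr = server.recvfrom(1024)    # receive one datagram
server.sendto(data.upper(), client_addr)     # echo it back, uppercased

reply, _ = client.recvfrom(1024)
print(reply)  # b'HELLO'

client.close()
server.close()
```

This works sequentially in a single process because `sendto` does not block waiting for the receiver; the OS simply queues the datagram, which is exactly the fire-and-forget behavior that makes UDP fast.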

Programming Languages:
Unsurprisingly, programming languages are the heart of digital products. For networking developers, the most important functionality they need in a programming language is the ability to access and manage low-level parts of a computer - for example, the architecture and the hardware.

Currently, the most common programming languages used for networking are C and C++. These are traditional languages that provide little high-level abstraction and allow programmers to access the hardware side of computers. The two languages have their ups and downs: while they are incredibly fast and efficient, they are not inherently secure. Because they don't provide built-in safety checks, it is up to developers to implement good security checks to protect applications from malicious attacks; since these checks are not enforced at the language level, loopholes are likely to occur.

Rust is a language that is not yet commonly used but has great potential. Rust allows low-level programming, and although it performs safety checks, it is not much slower than C and C++, since the language compiles to low-level instructions rather closely.

Google QUIC


QUIC stands for Quick UDP Internet Connections, which was designed by Jim Roskind at Google. QUIC is now supported by Chrome, Firefox, and Safari. Besides Google, many big companies have also migrated their applications and servers to QUIC, including all of Facebook's apps and server infrastructure and Uber's mobile app. With 8.9% of all websites using QUIC, Google is still constantly investing in the development of QUIC, which provides many job opportunities for networking researchers.

Source code of QUIC

OpenFlow
OpenFlow marks a recent trend toward software-defined access to the forwarding plane (also known as the data plane) of network devices. Traditionally, configuring physical devices requires hardware knowledge; tools like OpenFlow help developers configure physical networking devices and work more efficiently. It was first developed by Stanford researchers and later first adopted at scale by Google. Knowing how to use such tools, how they work, and what their limitations are will allow developers to take on engineering and research jobs.

Published papers:
Internet of Things: Technology and Value Added

by Felix Wortmann and Kristina Fluchter

The International Telecommunication Union (ITU) defines the term "Internet of Things" (IoT) as a global infrastructure that interconnects physical and virtual things using existing and evolving information and communication technologies. Having gained tremendous popularity over the years, this technology has rapidly expanded into numerous areas of everyday life, including smart industry, homes, energy, transport, and health, with the market projected to grow to over $7.1 trillion.

The article defines IoT and its growing applications, highlighting its value and benefits. Innovation with IoT involves merging physical and digital components to create new products and business models. Technology advancements allow for the digitization of functions and capabilities of industrial products, creating opportunities for companies to generate incremental value. Value creation in IoT involves combining physical products with IT-based digital services to enhance their primary functions. Connecting related products can lead to larger product systems and systems of systems, expanding industry boundaries and changing competitive dynamics. Examples include using IoT to turn a light bulb into a low-cost security system or a bin into an automatic replenishment service.

Although the authors acknowledge and promote the benefits of IoT, they shed light on the possible obstacles and concerns regarding its integration into society. They highlight challenges such as security and privacy issues, interoperability problems, and the necessity for standardized protocols and frameworks. Additionally, the article conducts an examination of the social and ethical implications of IoT, such as job displacement and the impact on personal privacy.

The authors conclude by emphasizing the importance of collaboration among stakeholders to fully realize the potential of IoT and stress the need for a comprehensive approach to IoT implementation, which encompasses not only the technology but also organizational and cultural changes. In summary, the article provides a valuable introduction to the potential value and challenges of IoT adoption.

Security of the Internet of Things: Vulnerabilities, Attacks, and Countermeasures

by Ismail Butun, Patrik Österberg, and Houbing Song

Wireless Sensor Networks (WSNs) are gaining popularity for their affordability, ease of installation, and long-term operation. However, their integration with the Internet of Things (IoT) has introduced new security challenges, as WSNs lack physical protection and are vulnerable to security attacks, especially for applications requiring confidentiality, integrity, and availability. This article examines the security concerns surrounding IoT and suggests countermeasures to mitigate these risks. It identifies vulnerabilities in IoT devices, such as weak authentication and authorization mechanisms, malware injection, and denial-of-service attacks. The authors classify the security attacks on WSNs and IoT into passive and active attacks.

As the IoT landscape continues to expand, security remains a major concern due to the complexity and heterogeneity of devices. To address these issues, high-level architectural security design and low-level security protocols should be developed, tailored to the specific needs of resource-constrained devices. Lightweight security protocols and cryptography algorithms can enhance security. Effective defense mechanisms are necessary for prevention, detection, and mitigation of security attacks, including timely installation of security patches and the use of Intrusion Detection System (IDS) techniques to identify and prevent zero-day attacks. Future research should prioritize security in the design of routing, key distribution, trust management, and data aggregation schemes across all network layers.

The article provides a comprehensive review of known security attacks on WSNs in the IoT context and their corresponding defense mechanisms, with the hope of inspiring more robust and secure network solutions from researchers in the field. It emphasizes the need for collaboration among stakeholders to develop and implement effective security measures at all levels of the IoT ecosystem, from device manufacturers to end-users. The authors stress that securing WSNs is crucial for securing IoT and that effective defense mechanisms are essential for the proliferation and public acceptance of IoT technology.

Platforms:
Virtual Assistant Technologies in IoT

Virtual assistant technologies are software programs that can understand and respond to natural language commands or questions given by users through text or speech. They are designed to simulate a conversation with a human user and provide relevant information or perform tasks based on the user's requests. They are becoming increasingly popular in the age of the Internet of Things (IoT), as more and more devices are connected to the internet and can be controlled through voice commands or other natural language interfaces. Some of the most well-known virtual assistants include Amazon's Alexa, Google Assistant, Apple's Siri, and Microsoft's Cortana.

Here is a video of things Amazon's Alexa can do

These virtual assistants use a combination of natural language processing, machine learning, and artificial intelligence algorithms to understand and respond to user requests. They can perform a wide range of tasks, such as setting reminders, answering questions, playing music, controlling smart home devices, and even making phone calls or sending messages. One of the key benefits of virtual assistant technologies is their ability to integrate with other IoT devices, allowing users to control multiple devices with a single voice command. For example, a user could ask their virtual assistant to turn on the lights, adjust the thermostat, and start playing music all at once. This level of convenience and automation is a major driver of the growing popularity of virtual assistants in the IoT landscape.
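The "single command, multiple devices" idea can be sketched as a routine that fans one trigger phrase out to several device actions. Everything below is hypothetical plain Python, not a real assistant or smart-home API; it only illustrates the dispatch pattern.

```python
# Hypothetical smart-home devices, each exposing one simple action.
def lights_on():
    return "lights: on"

def set_thermostat(temp_f):
    return f"thermostat: {temp_f}F"

def play_music():
    return "speaker: playing music"

# A "routine" maps one voice trigger to a list of device actions.
ROUTINES = {
    "good morning": [lights_on, lambda: set_thermostat(70), play_music],
}

def handle_command(phrase):
    """Dispatch a recognized phrase to every device action in its routine."""
    actions = ROUTINES.get(phrase.lower().strip())
    if actions is None:
        return ["sorry, I don't know that command"]
    return [action() for action in actions]

print(handle_command("Good morning"))
```

A real assistant replaces the dictionary lookup with natural language understanding and the functions with network calls to IoT devices, but the fan-out structure is the same.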

However, as with any technology, there are also potential drawbacks and concerns surrounding virtual assistant technologies. One major concern is privacy and security, as these devices are constantly listening for voice commands and transmitting data back to their servers. There have been instances of virtual assistants accidentally recording private conversations or being hacked by malicious actors. Another concern is the potential for these technologies to further exacerbate the digital divide, as those without access to virtual assistant devices may be left behind in a world increasingly reliant on IoT technologies.

Despite this, virtual assistant technologies continue to evolve and improve, with new features and integrations being added regularly. As the IoT landscape continues to expand, it is likely that virtual assistants will play an increasingly important role in controlling and interacting with the devices around us.

Virtual Private Network (VPN) VPNs enable users to securely and privately connect to the internet, no matter where they are located or what device they are using. A VPN encrypts an individual's internet traffic and routes it through a remote server, which can be located anywhere in the world. This makes it more difficult for third parties, such as Internet Service Providers (ISPs), hackers, and governments, to monitor or intercept the user's online activity. As concerns over privacy and security online have grown in recent years, VPNs have become increasingly popular. In particular, VPNs can provide users with a degree of protection from cyber attacks, data breaches, and online surveillance. In addition to bypassing internet censorship, VPNs can provide access to content that might be restricted in certain countries.

Video: What is VPN

VPNs can be divided into two types: remote access VPNs and site-to-site VPNs. An individual can use a remote access VPN to connect to a remote server and access the internet securely. This is commonly used by employees who need to access company resources while working from a remote location. In contrast, site-to-site VPNs allow entire networks to connect over the internet securely. This is often used by businesses with multiple offices or locations.
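As an illustration, a remote access VPN client such as WireGuard is typically set up with a small configuration file along these lines; the keys, addresses, and hostname below are placeholders, not real values:

```ini
# Hypothetical WireGuard client config for a remote-access VPN.
[Interface]
# The client's private key and its address inside the VPN
PrivateKey = <client-private-key>
Address = 10.0.0.2/32
DNS = 10.0.0.1

[Peer]
# The remote VPN server; AllowedIPs = 0.0.0.0/0 tunnels all traffic through it
PublicKey = <server-public-key>
Endpoint = vpn.example.com:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25
```

A site-to-site setup looks similar, except each peer's AllowedIPs lists the subnets of the other office's network rather than all traffic.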

One of the main benefits of using a VPN is the increased level of privacy it provides. By encrypting internet traffic and routing it through a remote server, VPNs help protect users from online surveillance and data collection. This is especially important for individuals who may be accessing sensitive information, such as financial data or personal medical records. Another benefit is the increased level of security it provides. By encrypting internet traffic, VPNs make it more difficult for hackers and other malicious actors to intercept and steal sensitive data. This is especially important for businesses and organizations that may be handling sensitive customer data or proprietary information.

However, VPNs are not without their limitations and potential drawbacks. For example, some VPNs may slow down internet speeds due to the added encryption and routing of traffic. Additionally, some VPNs may not be fully secure, as not all VPN providers have the same level of security and privacy protections in place.

In conclusion, VPNs are a valuable technology that can provide increased privacy and security for users online. Whether for personal use or for businesses, VPNs offer a way to protect sensitive data and information from cyber threats and surveillance. However, it is important to carefully research and choose a reputable VPN provider in order to ensure the highest level of security and privacy protections.

Social / How to engage:
Smart Cities Smart cities refer to the integration of technology and urban development to create more efficient and sustainable cities. This integration involves the use of information and communication technologies (ICT) to manage and optimize city infrastructure, including transportation systems, public utilities, and public safety. Smart cities also rely on data collected from sensors, devices, and citizen input to inform decision-making processes and improve overall quality of life for residents.

The concept of smart cities has gained momentum in recent years as urban populations continue to grow and the need for efficient and sustainable urban development becomes more pressing. According to a report by the United Nations, 68% of the world's population is expected to live in urban areas by 2050, highlighting the need for smart city solutions to address the challenges of urbanization.

One of the key features of smart cities is the use of IoT devices and sensors to collect data and monitor various aspects of city life. This data can be used to optimize energy usage, reduce traffic congestion, and improve public safety. For example, sensors in parking lots can help drivers find available parking spaces, reducing congestion and emissions. Similarly, sensors in streetlights can be used to adjust lighting levels based on pedestrian and vehicular traffic, improving energy efficiency and safety.
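The adaptive streetlight example above can be sketched as a simple control rule; the thresholds and sensor readings below are invented for illustration:

```python
# Minimal sketch of sensor-driven streetlight dimming: off in daylight,
# dim on quiet streets to save energy, full brightness under heavy traffic.
# The cutoffs here are made-up illustrative values.

def streetlight_level(motion_events_per_min: int, lux: float) -> int:
    """Return a brightness percentage from two sensor readings."""
    if lux > 50:                   # ambient light sensor says daylight
        return 0
    if motion_events_per_min == 0:
        return 20                  # quiet street: dim to save energy
    if motion_events_per_min < 5:
        return 60
    return 100                     # busy street: full brightness for safety

print(streetlight_level(0, 5.0))    # 20
print(streetlight_level(12, 5.0))   # 100
```

In a deployed smart city system this logic would run against live IoT sensor feeds rather than hard-coded arguments.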

Another important aspect of smart cities is the use of ICT to improve communication and citizen engagement. For example, smart city apps can provide real-time information on public transportation schedules, traffic updates, and emergency alerts. Additionally, citizen input can be collected through social media and other digital channels to inform city decision-making processes and improve overall service delivery.

Despite the potential benefits of smart cities, there are also concerns regarding privacy and security. The collection of large amounts of data from citizens and infrastructure raises questions about data ownership, access, and security. It is important for smart cities to have robust privacy and security policies in place to protect citizens' data and prevent misuse. Overall, the development of smart cities is an important step toward creating more efficient, sustainable, and livable urban environments. As urban populations continue to grow, smart city solutions will become increasingly necessary.

Healthcare and the Internet of Things In the last few decades, the Internet of Things (IoT) has emerged as a rapidly growing technology that is revolutionizing various industries, including healthcare. In healthcare, IoT technology can improve patient outcomes, enhance the quality of care, and reduce costs. The use of IoT in healthcare has many benefits. One of the significant advantages of IoT is remote patient monitoring. Patients with chronic conditions such as diabetes, heart disease, or respiratory illness can use IoT devices to monitor their vital signs, including blood pressure, glucose levels, and oxygen saturation, at home. This allows patients to receive early warnings about any changes in their health status, enabling them to seek medical attention promptly. IoT devices also help patients to manage their conditions better, reducing the need for hospitalization and improving their quality of life.
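A minimal sketch of the remote monitoring idea follows; the normal ranges are illustrative placeholders, not clinical guidance:

```python
# Sketch of threshold-based remote patient monitoring: an IoT device streams
# vital signs, and readings outside a normal range trigger an early warning.
# The ranges below are invented for illustration only.

NORMAL_RANGES = {
    "systolic_bp": (90, 140),   # mmHg
    "glucose": (70, 180),       # mg/dL
    "spo2": (92, 100),          # % oxygen saturation
}

def check_vitals(reading: dict) -> list:
    """Return the names of any vital signs outside their normal range."""
    alerts = []
    for name, value in reading.items():
        low, high = NORMAL_RANGES[name]
        if not (low <= value <= high):
            alerts.append(name)
    return alerts

print(check_vitals({"systolic_bp": 150, "glucose": 110, "spo2": 96}))
# ['systolic_bp'] -> the device would prompt the patient to seek care
```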

Another benefit of IoT in healthcare is improved clinical decision-making. Physicians and other healthcare providers can access real-time patient data from IoT devices, allowing them to make informed decisions about patient care. IoT technology also enables the integration of medical devices, electronic health records (EHRs), and other healthcare systems, making it easier for healthcare providers to access patient data and coordinate care. IoT is also changing the way medical research is conducted. IoT devices can collect massive amounts of data on patient behavior, environmental factors, and health outcomes, allowing researchers to gain new insights into disease prevention and treatment. For example, researchers can use IoT devices to study the effects of environmental factors such as air quality and temperature on patient health outcomes, enabling them to develop targeted interventions to improve health outcomes.

Despite the many benefits of IoT in healthcare, the technology also presents several challenges. One of the significant challenges is data privacy and security. IoT devices collect sensitive patient data, including personal information and medical records. This data must be protected from unauthorized access and breaches to ensure patient privacy and prevent data theft. Additionally, IoT devices must be designed with robust security features to prevent cyberattacks and protect patient data. Another challenge of IoT in healthcare is the interoperability of devices and systems. IoT devices from different manufacturers may not be compatible with each other, making it challenging to integrate data from different sources. This can lead to data silos, making it difficult for healthcare providers to access complete patient data and coordinate care effectively.

In conclusion, the Internet of Things is transforming healthcare by providing remote patient monitoring, improving clinical decision-making, and enabling medical research. However, the adoption of IoT in healthcare also presents significant challenges, including data privacy and security and device interoperability. Addressing these challenges will be critical to realizing the full potential of IoT in healthcare and improving patient outcomes.

Mathematics: Auto-differentiation
Automatic Differentiation in Machine Learning: a Survey

by Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind, edited by Leon Bottou



Automatic Differentiation is a family of techniques for efficiently computing derivatives of complex functions that has been around since the 1950s. It does this by breaking functions down into compositions of so-called "elementary operations" and finding partial derivatives using the intermediate variables generated. It can do this using forward accumulation, which directly calculates the partial derivatives making up the gradient, or using backward accumulation, also known as backpropagation. Automatic differentiation has several implementation methods, each generally differentiated (pun intended), and therefore selected, by the overhead it produces via arithmetic and bookkeeping. One major concern in the design of an implementation of an auto-differentiation algorithm is a class of bugs known as perturbation confusion bugs, which arise when the algorithm fails to keep separate the epsilons generated by two distinct differentiations working on the same code.
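The forward accumulation the survey describes can be sketched with dual numbers, where each value carries its derivative and every elementary operation propagates both via the chain rule. This is a minimal illustration of the idea, not any particular library's implementation:

```python
import math

# Forward-mode automatic differentiation via dual numbers: a Dual carries a
# value (val) and its derivative (dot), and each elementary operation below
# propagates both exactly, with no symbolic or numerical approximation.

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule on the derivative part
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def sin(x: Dual) -> Dual:
    # chain rule for an elementary operation
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# d/dx [x * sin(x)] at x = 2 should be sin(2) + 2*cos(2)
x = Dual(2.0, 1.0)    # seed the derivative of the input with 1
y = x * sin(x)
print(y.dot)          # matches sin(2) + 2*cos(2)
```

Reverse accumulation instead records the computation first and propagates derivatives from outputs back to inputs, which is what makes backpropagation efficient for functions with many inputs and one output.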

A Comparison of Automatic Differentiation and Continuous Sensitivity Analysis for Derivatives of Differential Equation Solutions

by Yingbo Ma, Vaibhav Dixit, Michael J Innes, Xingjian Guo, and Chris Rackauckas



Given the ubiquitous nature of the derivatives of differential equation solutions in applied mathematics, there are many ways to algorithmically analyze them. This article compares two methods: discrete local sensitivity analysis implemented using automatic differentiation, and continuous adjoint sensitivity analysis. Both methods are special cases of local sensitivity analysis, which is the calculation of derivatives of differential equation solutions with respect to system parameters. This is commonly done using numerical differentiation in cases where storage is not a problem, but that approach does not scale efficiently. Both forward-continuous sensitivity analysis and discrete sensitivity analysis are solutions to this problem. The paper tested both algorithms, using the Julia language, on the non-stiff Lotka-Volterra equations, a finite N × N difference discretization of the two-dimensional stiff Brusselator reaction-diffusion PDE, a standard stiff pollution model, and a non-stiff pharmacokinetic/pharmacodynamic system.
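For a sense of what local sensitivity analysis computes, here is a naive finite-difference sketch of d(prey)/dα for the Lotka-Volterra system. The parameters, step sizes, and simple Euler integrator are all choices made here for illustration, and the paper's methods exist precisely because this brute-force approach scales poorly:

```python
# Naive local sensitivity of a Lotka-Volterra solution with respect to the
# prey growth-rate parameter alpha, via central finite differences.
# All parameter values and the Euler scheme are illustrative choices.

def prey_at_T(alpha, beta=0.4, delta=0.1, gamma=0.4,
              x0=10.0, y0=5.0, T=1.0, dt=1e-3):
    """Euler-integrate dx/dt = a*x - b*x*y, dy/dt = d*x*y - g*y; return x(T)."""
    x, y = x0, y0
    for _ in range(int(T / dt)):
        dx = alpha * x - beta * x * y
        dy = delta * x * y - gamma * y
        x, y = x + dt * dx, y + dt * dy
    return x

# Central finite difference in alpha: d x(T) / d alpha
a, h = 1.5, 1e-5
sensitivity = (prey_at_T(a + h) - prey_at_T(a - h)) / (2 * h)
print(sensitivity)   # positive: a larger growth rate means more prey at T
```

Each parameter probed this way costs two extra full solves of the ODE, which is what makes automatic differentiation and adjoint methods attractive for systems with many parameters.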

Platforms:
PyTorch PyTorch is a library for machine learning with interfaces for both Python and C++. It is run by the PyTorch Foundation, a subsidiary of the Linux Foundation. It used to coexist with a similar library, Caffe2, the successor to Caffe (Convolutional Architecture for Fast Feature Embedding), with which it was initially incompatible; in 2017 the Open Neural Network Exchange (ONNX) was developed to bridge the two, and they were merged by 2018. By virtue of being a machine learning platform, PyTorch has robust support for both forward and backward automatic differentiation. Crucially, its autograd module computes gradients using automatic differentiation; its optim module includes several methods for optimization; and its nn module is specifically for implementing neural networks, containing several further submodules for more specialized implementations and methods.
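Assuming PyTorch is installed, a minimal use of the autograd module looks like this:

```python
import torch

# Mark a tensor as requiring gradients, build a small computation, and let
# reverse-mode automatic differentiation (backpropagation) fill in .grad.

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x      # y = x^2 + 2x
y.backward()            # reverse accumulation through the recorded graph
print(x.grad)           # dy/dx = 2x + 2 = 8 at x = 3
```

The same mechanism scales from this one-variable example to the millions of weights in a neural network, which is what optim and nn build on.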



AuTO AuTO is a framework for automatic differentiation in topology optimization. It is largely used for compliance minimization. More specifically, it is predominantly used to decrease the load required in finding sensitivities, which can be quite difficult and taxing on processor power when derived and implemented manually, especially in the case of non-triviality with respect to constraints, objectives, and material models. AuTO gets around most of these issues by using automatic differentiation (AD), extensively employing JAX, a high-performance Python library for automatic differentiation, to compute sensitivities for a topology optimization problem defined by the user. Within its introductory paper, the framework is demonstrated through a litany of examples that showcase its functionality with respect to compliance minimization and compliant mechanism design, and it even briefly touches upon AuTO's microstructural design abilities.

Social / How to engage:
TensorFlow TensorFlow is a software library developed by Google for machine learning and artificial intelligence. It is specialized for training deep neural networks, but it can be used for many different applications within the realm of machine learning. TensorFlow 2.0, a major redesign of the library, was released by Google in September 2019. It is compatible with a wide variety of languages, including Java, C++, Python, and JavaScript. It presents opportunities for direct engagement with algorithmic concepts like automatic differentiation in that it gives users the tools to implement them in their own software.
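Assuming TensorFlow is installed, a small sketch of that direct engagement is computing a gradient with the GradientTape API:

```python
import tensorflow as tf

# GradientTape records operations as they execute, then applies automatic
# differentiation to compute gradients of the result with respect to inputs.

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 3          # y = x^3
grad = tape.gradient(y, x)
print(float(grad))      # dy/dx = 3x^2 = 27 at x = 3
```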



Published papers:
Systematic Review on Identification and Prediction of Deep Learning-Based Cyber Security Technology and Convergence Fields

by Seung-Yeon Hwang, Dong-Jin Shin and Jeong-Joon Kim

As technology has advanced through the fourth industrial revolution, cyberspace across various fields, previously operated on separate networks, is now rapidly converging in this new era of “super-connection”. Now that drones based on artificial intelligence, cloud computing, robots, and big data are all interconnected in a much larger and more complex cyber system, there is potential for serious security threats. In an ever-advancing field of computer science, this article proposes the use of deep learning technology to predict the future evolution of attackers and identify future risks in our current systems by analysing papers published from 2010 to 2020 and patents in the information security field from four major societies. Three algorithms are primarily used to analyse the papers and patents:


 * latent Dirichlet allocation (LDA)
 * dynamic topic model (DTM)
 * long short-term memory (LSTM)

The LDA algorithm is well suited to natural language processing and summarisation. In this case, it clusters documents through a probability model by analysing the distribution of keywords by topic and the distribution of topics by document.
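A toy illustration of this kind of LDA topic clustering, using scikit-learn (assumed available) on a handful of made-up abstracts rather than the study's real corpus of papers and patents:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Fit a 2-topic LDA model to four invented "abstracts" and recover each
# document's topic distribution, mirroring the clustering step described above.
docs = [
    "malware detection neural network deep learning",
    "deep learning model training neural network",
    "encryption key exchange secure protocol",
    "secure encryption protocol key management",
]
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
topics = lda.transform(counts)      # one topic distribution per document
print(topics.argmax(axis=1))        # most likely topic for each document
```

On a real corpus of thousands of documents the same calls are used, just with many more topics and a vocabulary learned from the full text.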

The DTM algorithm is again a probabilistic model, but a time-based one. It works hand in hand with LDA to analyse the trends occurring in the information security world, granting a clear overview of technological progress in cybersecurity and highlighting emerging technologies and convergence issues.

Finally, the LSTM algorithm can predict future data. While DTM captures broad trends over previous data, LSTM also considers fine-grained past data; rather than just general trends, we can achieve even greater precision this way.

Not only does this project propose ways of analysing risk as cybersecurity evolves, but it can also help identify future implementations of emerging technologies to increase cybersecurity. An example case may be identifying AI-related techniques such as deep learning to proactively respond to the evolution of attackers. Overall, this is a major tool that will increase information risk management.

A Systematic Review of Defensive and Offensive Cybersecurity with Machine Learning 

by Imatitikua D. Aiyanyo, Hamman Samuel and Heuiseok Lim

Today, artificial intelligence and machine learning pose a huge threat to cybersecurity; however, they are also increasingly being utilised to defend against these threats at the network and host levels. Host-based defense systems are what most people are familiar with, including mechanisms such as firewalls, antiviruses, and Intrusion Detection Systems (IDS). Although very advanced, these systems are inherently imperfect, as a “security chain is only as strong as the weakest link” and such vulnerabilities lead to cascading security compromises at multiple sub-levels. This gives rise to the need for ML cybersecurity systems, which can be categorised into two different approaches:


 * Defensive approach - uses reactive strategies that focus on prevention, detection, and responses
 * Offensive approach - proactively predicts and removes threats in the system using ethical hacking techniques

The offensive approach aims to mimic exploits and attacks as cyber attackers would, eliminating vulnerabilities ahead of time. ML-based cybersecurity solutions are so useful because they can handle and analyse large amounts of data with complex detection logic where traditional methods would struggle. This is even clearer in the defensive approach. ML techniques can be categorised into three main groups: supervised, unsupervised, and semi-supervised learning. Supervised learning requires training data labelled “threat” or “no threat” in order to assess future unknown threats. Unsupervised ML methods do not rely on training data or curated labels; rather, they group threats with similar attributes together based on general patterns, such as signals for attacks. Lastly, semi-supervised ML is the in-between for when training data may be insufficient and yet unsupervised alternatives may not give the best results; here, a small labelled data set is used in combination with unsupervised approaches. The biggest challenge in training these models is the high dimensionality of network traffic data, which makes it difficult for researchers to train models that differentiate between anomalous and normal behaviour.
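A hedged sketch of the unsupervised idea, using scikit-learn's IsolationForest (assumed available): group normal traffic by its general pattern and flag outliers, with no curated labels. The two features (packets per second, mean packet size) and all numbers are invented:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Simulate 200 normal traffic samples and inject one flood-like outlier,
# then let an unlabelled anomaly detector flag it.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[100, 500], scale=[10, 50], size=(200, 2))
attack = np.array([[2000, 60]])          # a flood of many tiny packets
traffic = np.vstack([normal, attack])

model = IsolationForest(random_state=0).fit(traffic)
labels = model.predict(traffic)          # 1 = normal, -1 = anomaly
print(labels[-1])                        # the injected attack is flagged
```

A supervised system would instead train a classifier on traffic already labelled “threat” or “no threat”; the unsupervised approach shown here trades that labelling cost for a coarser notion of "unusual".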

Some common attacks found in most of the data sets used in training these ML systems were:


 * Distributed Denial of Service (DDoS) - deploys attack vectors over the internet in a “distributed way” and generates lethal traffic through the aggregation of these forces
 * Remote to Local (R2L) - involves attackers exploiting vulnerabilities, such as guessable passwords, to gain local access to a remote machine
 * User to Root (U2R) - involves attackers with access to a normal user account attempting to escalate to root (administrator) privileges without official permission or approval

While ML techniques can be used by attackers to enhance their attack vectors, data-sniffing activities, phishing, and other social engineering attacks, machine learning is the solution just as much as it is the threat.

Platforms:
Cloudflare Cloudflare is an IT service management company that provides a global network designed to make everything you connect to the Internet secure, private, fast, and reliable. Companies around the world use it to protect corporate networks, employees, and devices. In fact, according to The Hill, Cloudflare is used by more than 20 percent of the entire Internet for its web security services as of 2022. Its state-of-the-art protection includes built-in software-defined Zero Trust services, DDoS mitigation, firewalls, and traffic acceleration, which during Q1 of that year blocked what was then the largest DDoS attack in history.

Deep Instinct Deep Instinct is a cybersecurity company that utilises deep learning to prevent and detect malware. Deep Instinct's advanced artificial intelligence made it a World Economic Forum Technology Pioneer in 2017. One advantage of such technology is speed: it stops threats in under 20 ms, 750 times faster than the fastest known ransomware is able to encrypt. Zero-day attacks are when hackers leverage an unknown security vulnerability to gain access into a system, and Deep Instinct prides itself on its effectiveness at preventing these attacks while maintaining a low false-positive rate (benign activity incorrectly identified as an intrusion).

Social / How to engage:
GreyHat cybersecurity club

GreyHat is Georgia Tech's cybersecurity club. Every year, they field a competitive Capture-the-Flag (CTF) team; competitions consist of a number of intentionally vulnerable websites, cryptographic codes, and malware binaries that can be hacked. They are designed to be educational and cultivate practical skills in analysing, exploiting, and defending computer systems. For Tech students who have no experience in cybersecurity, they also host a beginner-to-intermediate CTF each fall to help students get one foot in the door. There are two weekly meetings, one of which is dedicated to learning how to solve CTF problems.

More on the GreyHat Club can be found on their website

Coursera

Coursera is an online educational platform that offers courses and degrees from world-class universities and companies, founded in 2012 by Stanford University computer science professors Andrew Ng and Daphne Koller. In-demand skills are taught by the companies themselves, often for free, providing great opportunities to expand your resume from beginner level and up. Some examples include Introduction to Cybersecurity for Business by the University of Colorado Boulder and the IBM Cybersecurity Analyst Professional Certificate by IBM. Coursera allows you to explore these fields at no financial risk and promote your industry-specific skill set with credible certificates you earn along the way. With self-paced guided projects, it is extremely convenient for those with busy schedules.

Glossary


 * Natural Language Processing (NLP): According to IBM, NLP refers to the branch of computer science—and more specifically, the branch of artificial intelligence or AI—concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
 * Supervised/Unsupervised learning: According to the book Pattern Recognition and Machine Learning by Christopher Bishop, supervised learning problems are the problems of inferring a function from labeled training data consisting of input-output pairs, while unsupervised learning problems are the problems of trying to find hidden structure in unlabeled data.
 * [[File:Software-defined-networking-SDN-architecture-source-Open-Networking-Foundation-ONF1.png|thumb]]Network Stack is an abstract design of how networking protocols are implemented; it is also commonly referred to as a protocol stack. The bottommost layer, the physical layer, is the foundation of the internet; it consists of the physical devices that transfer data (e.g. switches, routers, modems, etc.). Above that is the control layer, where the software-defined network we mentioned before resides; it communicates with the physical layer through control-plane protocols such as OpenFlow. Above the control layer is the application layer, which communicates with the control layer through protocol APIs.
 * Reliability is a concept or functionality that ensures packets are delivered and processed in order on the client side. This is usually implemented through acknowledgments and retransmission timers, as in TCP, whose connections begin with the three-way handshake we mentioned before.
 * Intelligent Virtual Assistant: It is an AI-enabled chat assistant that generates personalized responses by combining analytics and cognitive computing based on individual customer information, past conversations, and location, leveraging the corporate knowledge base and human insight. Think of Amazon's Alexa or Apple's Siri.


 * VPN: A Virtual Private Network is a mechanism for creating a secure connection between a computing device and a computer network, or between two networks, using an insecure communication medium such as the public Internet. It adds security and anonymity to users when they connect to web-based services and sites. A VPN hides the user’s actual public IP address and “tunnels” traffic between the user’s device and the remote server. Most users sign up for a VPN service for online anonymity and to avoid being tracked, and they often use public Wi-Fi, where increased risks threaten the safety of their data.




 * Qubit: A qubit, or quantum bit, is the basic unit of information in a quantum computer. Like a bit in a classical computer, a qubit is used to store information, but unlike the bit, the qubit has different states. A classical bit can be in only one of two states: either on (1) or off (0). A qubit, however, can be anywhere in between 0 and 1; it can be represented by a linear combination of the 0 and 1 states, called a superposition. Qubits are exponentially more powerful than classical bits: simply describing the state of a 500-qubit computer classically would require on the order of 2^500 values, which is impossible for any classical machine. Qubits are far more fragile than classical bits, however, and are susceptible to interference from their environment and to unwanted entanglement with other qubits. While classical bits are implemented in silicon, qubits can be made from trapped ions, photons, or atoms. These physical implementations force qubits to be housed at temperatures close to absolute zero, or zero kelvin.
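The superposition described above is conventionally written as a state vector in standard Dirac notation:

```latex
% A qubit as a superposition of the computational basis states
|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle,
\qquad |\alpha|^2 + |\beta|^2 = 1
% Measurement yields 0 with probability |\alpha|^2 and 1 with probability |\beta|^2.
```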


 * Quantum Entanglement: Quantum entanglement is a phenomenon first discussed in a famous 1935 paper by Einstein, Podolsky, and Rosen (EPR), describing the natural phenomenon in which two particles become “linked together” in a certain way, even over extremely long distances. For example, measuring one entangled particle's physical property, such as electron spin, determines the result of the same measurement on the other particle, no matter how far apart they are. The EPR paper was actually written to attack the emerging quantum field, since the three scientists believed that quantum theory was incomplete. Quantum entanglement was famously called “spooky action at a distance” by Albert Einstein because it appeared to contradict the part of Einstein's theory of relativity stating that information cannot travel faster than the speed of light. Thus, quantum entanglement adds to the rift between Einstein's classical physics, built on concrete realness, and the emerging world of quantum physics, governed by probability. Recently, scientists have begun to merge the idea of wormholes, described in a 1935 paper by Einstein and Rosen, with quantum entanglement, arguing that the two ideas are one and the same. This idea offers a bridge between the quantum world and the concrete world of classical physics.
 * Phishing: Phishing is a type of online scam where a malicious actor or organisation attempts to trick individuals into divulging their sensitive information, such as login credentials, credit card details, or other personal information, by posing as a trustworthy entity in an electronic communication, such as an email or text message. Phishing attempts often use social engineering tactics to make the message appear legitimate, such as by mimicking the branding of a well-known company or organisation.
 * Cloud Computing: The cloud (or cloud computing) refers to a network of remote servers that are hosted on the Internet and used for storing, managing, and processing data, instead of on a disconnected local computer or server. Most servers today are connected to the Cloud because cloud computing allows individuals and organisations to access larger servers, storage, and applications from anywhere with an Internet connection.
 * Backpropagation: Backpropagation is a type of algorithm used largely to train neural networks. It is used to compute the gradient of a network's loss with respect to its weights, and it is the reverse-accumulation mode of automatic differentiation.
 * Quadrature: Quadrature is a mathematical term for solving problems via area computation, typically by converting the problem into an integral or set of integrals.