New Issue available online
Table of Contents
This new issue contains the following articles:
|
|
| |
|
|
|
| |
|
| |
|
A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part II:
Applications, Cognitive Models, and Challenges |
|
|
|
Denis Kleyko, Dmitri Rachkovskij, Evgeny Osipov and Abbas Rahimi |
|
|
|
This is Part II of the two-part comprehensive survey devoted to a computing framework
most commonly known under the names Hyperdimensional Computing and Vector Symbolic
Architectures (HDC/VSA). Both names refer to a family of computational models that
use high-dimensional distributed representations and rely on the algebraic properties
of their key operations to incorporate the advantages of structured symbolic representations
and vector distributed representations. Holographic Reduced Representations [321,
326] is an influential HDC/VSA model that is well known in the machine learning domain
and often used to refer to the whole family. However, for the sake of consistency,
we use HDC/VSA to refer to the field. Part I of this survey [222] covered foundational
aspects of the field, such as the historical context leading to the development of
HDC/VSA, key elements of any HDC/VSA model, known HDC/VSA models, and the transformation
of input data of various types into high-dimensional vectors suitable for HDC/VSA.
This second part surveys existing applications, the role of HDC/VSA in cognitive computing
and architectures, as well as directions for future work. Most of the applications
lie within the Machine Learning/Artificial Intelligence domain; however, we also cover
other applications to provide a complete picture. The survey is written to be useful
for both newcomers and practitioners. |
|
|
|
Pages:
1–52 |
DOI:
10.1145/3558000 |
|
| |
|
|
|
|
|
|
| |
|
Camera Measurement of Physiological Vital Signs |
|
|
|
Daniel McDuff |
|
|
|
The need for remote tools for healthcare monitoring has never been more apparent.
Camera measurement of vital signs leverages imaging devices to compute physiological
changes by analyzing images of the human body. Building on advances in optics, machine
learning, computer vision, and medicine, these techniques have progressed significantly
since the invention of digital cameras. This article presents a comprehensive survey
of camera measurement of physiological vital signs, describing the vital signs that
can be measured and the computational techniques for doing so. I cover both clinical
and non-clinical applications and the challenges that need to be overcome for these
applications to advance from proofs of concept. Finally, I describe the current resources
(datasets and code) available to the research community and provide a comprehensive
webpage (https://cameravitals.github.io/) with links to these resources and a categorized
list of all papers referenced in this article. |
|
|
|
Pages:
1–40 |
DOI:
10.1145/3558518 |
|
| |
|
|
|
|
|
|
| |
|
Trustworthy AI: From Principles to Practices |
|
|
|
Bo Li, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi and Bowen Zhou |
|
|
|
The rapid development of Artificial Intelligence (AI) technology has enabled the deployment
of various systems based on it. However, many current AI systems are found vulnerable
to imperceptible attacks, biased against underrepresented groups, and lacking in user
privacy protection. These shortcomings degrade user experience and erode people’s
trust in all AI systems. In this review, we provide AI practitioners with a comprehensive
guide for building trustworthy AI systems. We first introduce the theoretical framework
of important aspects of AI trustworthiness, including robustness, generalization,
explainability, transparency, reproducibility, fairness, privacy preservation, and
accountability. To unify currently available but fragmented approaches toward trustworthy
AI, we organize them in a systematic approach that considers the entire lifecycle
of AI systems, ranging from data acquisition to model development, to system development
and deployment, finally to continuous monitoring and governance. In this framework,
we offer concrete action items for practitioners and societal stakeholders (e.g.,
researchers, engineers, and regulators) to improve AI trustworthiness. Finally, we
identify key opportunities and challenges for the future development of trustworthy
AI systems, where we identify the need for a paradigm shift toward comprehensively
trustworthy AI systems. |
|
|
|
Pages:
1–46 |
DOI:
10.1145/3555803 |
|
| |
|
|
|
|
|
|
| |
|
A Survey on Recent Approaches to Question Difficulty Estimation from Text |
|
|
|
Luca Benedetto, Paolo Cremonesi, Andrew Caines, Paula Buttery, Andrea Cappelli, Andrea Giussani and Roberto Turrin |
|
|
|
Question Difficulty Estimation from Text (QDET) is the application of Natural Language
Processing techniques to the estimation of a value, either numerical or categorical,
which represents the difficulty of questions in educational settings. We give an introduction
to the field, build a taxonomy based on question characteristics, and present the
various approaches that have been proposed in recent years, outlining opportunities
for further research. This survey provides an introduction for researchers and practitioners
into the domain of question difficulty estimation from text and acts as a point of
reference about recent research in this topic to date. |
|
|
|
Pages:
1–37 |
DOI:
10.1145/3556538 |
|
| |
|
|
|
|
|
|
| |
|
Lexical Complexity Prediction: An Overview |
|
|
|
Kai North, Marcos Zampieri and Matthew Shardlow |
|
|
|
The occurrence of unknown words in texts significantly hinders reading comprehension.
To improve accessibility for specific target populations, computational modeling has
been applied to identify complex words in texts and replace them with simpler alternatives.
In this article, we present an overview of computational approaches to lexical complexity
prediction focusing on the work carried out on English data. We survey relevant approaches
to this problem which include traditional machine learning classifiers (e.g., SVMs,
logistic regression) and deep neural networks as well as a variety of features, such
as those inspired by literature in psycholinguistics as well as word frequency, word
length, and many others. Furthermore, we introduce readers to past competitions and
available datasets created on this topic. Finally, we include brief sections on applications
of lexical complexity prediction, such as readability and text simplification, together
with related studies on languages other than English. |
|
|
|
Pages:
1–42 |
DOI:
10.1145/3557885 |
|
| |
|
|
|
|
|
|
| |
|
A Survey of User Perspectives on Security and Privacy in a Home Networking Environment |
|
|
|
Nandita Pattnaik, Shujun Li and Jason R. C. Nurse |
|
|
|
The security and privacy of smart home systems, particularly from a home user’s perspective,
have been a very active research area in recent years. However, via a meta-review
of 52 review papers covering related topics (published between 2000 and 2021), this
article shows a lack of a more recent literature review on user perspectives of smart
home security and privacy since the 2010s. This identified gap motivated us to conduct
a systematic literature review (SLR) covering 126 relevant research papers published
from 2010 to 2021. Our SLR led to the discovery of a number of important areas where
further research is needed; these include holistic methods that consider a more diverse
and heterogeneous range of home devices, interactions between multiple home users,
complicated data flow between multiple home devices and home users, some less studied
demographic factors, and advanced conceptual frameworks. Based on these findings,
we recommend key future research directions, e.g., research for a better understanding
of security and privacy aspects in different multi-device and multi-user contexts,
and a more comprehensive ontology on the security and privacy of the smart home covering
varying types of home devices and behaviors of different types of home users. |
|
|
|
Pages:
1–38 |
DOI:
10.1145/3558095 |
|
| |
|
|
|
|
|
|
| |
|
Emotion Ontology Studies: A Framework for Expressing Feelings Digitally and its Application
to Sentiment Analysis |
|
|
|
Eun Hee Park and Veda C. Storey |
|
|
|
Emotion ontologies have been developed to capture affect, a concept that encompasses
discrete emotions and feelings, especially for research on sentiment analysis, which
analyzes a customer's attitude towards a company or a product. However, there have
been limited efforts to adapt and employ these ontologies. This research surveys and
synthesizes emotion ontology studies to develop a Framework of Emotion Ontologies
that can be used to help a user select or design an appropriate emotion ontology to
support sentiment analysis and increase the user's understanding of the roles of affect,
context, and behavioral information with respect to sentiment. The framework, which
is derived from research on emotion ontologies, psychology, and sentiment analysis,
classifies emotion ontologies as discrete emotion or one of two hybrid ontologies
that are combinations of the discrete, dimensional, or componential process emotion
paradigms. To illustrate its usefulness, the framework is applied to the development
of an emotion ontology for a sentiment analysis application. |
|
|
|
Pages:
1–38 |
DOI:
10.1145/3555719 |
|
| |
|
|
|
|
|
|
| |
|
Trust in Edge-based Internet of Things Architectures: State of the Art and Research
Challenges |
|
|
|
Lidia Fotia, Flávia Delicato and Giancarlo Fortino |
|
|
|
The Internet of Things (IoT) aims to enable a scenario where smart objects, inserted
into information networks, supply smart services for human beings. The introduction
of edge computing in IoT can reduce the decision-making latency, save bandwidth resources,
and expand the cloud services to be allocated at the network’s edge. However, edge-based
IoT systems currently face challenges in their decentralized trust management. Trust
management is essential to obtain reliable mining and data fusion, improved user privacy
and data security, and provisioning of services with context-awareness. In this survey,
we first examine the edge-based IoT architectures currently reported in the literature.
Then a complete review of trust requirements in edge-based IoT systems is produced.
Also, we discuss blockchain as a solution to several trust problems in
IoT and analyze in detail the correlation between blockchain and edge computing. Finally,
we provide a detailed analysis of performance aspects of trusted edge-based IoT systems
and recommend promising research directions. |
|
|
|
Pages:
1–34 |
DOI:
10.1145/3558779 |
|
| |
|
|
|
|
|
|
| |
|
Evaluating the Cybersecurity Risk of Real-world, Machine Learning Production Systems |
|
|
|
Ron Bitton, Nadav Maman, Inderjeet Singh, Satoru Momiyama, Yuval Elovici and Asaf Shabtai |
|
|
|
Although cyberattacks on machine learning (ML) production systems can be harmful,
today, security practitioners are ill-equipped, lacking methodologies and tactical
tools that would allow them to analyze the security risks of their ML-based systems.
In this article, we perform a comprehensive threat analysis of ML production systems.
In this analysis, we follow the ontology presented by NIST for evaluating enterprise
network security risk and apply it to ML-based production systems. Specifically, we
(1) enumerate the assets of a typical ML production system, (2) describe the threat
model (i.e., potential adversaries, their capabilities, and their main goal), (3)
identify the various threats to ML systems, and (4) review a large number of attacks,
demonstrated in previous studies, which can realize these threats. To quantify the
risk posed by adversarial machine learning (AML) threat, we introduce a novel scoring
system that assigns a severity score to different AML attacks. The proposed scoring
system utilizes the analytic hierarchy process (AHP) for ranking—with the assistance
of security experts—various attributes of the attacks. Finally, we developed an extension
to the MulVAL attack graph generation and analysis framework to incorporate cyberattacks
on ML production systems. Using this extension, security practitioners can apply attack
graph analysis methods in environments that include ML components thus providing security
practitioners with a methodological and practical tool for both evaluating the impact
and quantifying the risk of a cyberattack targeting ML production systems. |
|
|
|
Pages:
1–36 |
DOI:
10.1145/3559104 |
|
| |
|
|
|
|
|
|
| |
|
Edge Computing with Artificial Intelligence: A Machine Learning Perspective |
|
|
|
Haochen Hua, Yutong Li, Tonghe Wang, Nanqing Dong, Wei Li and Junwei Cao |
|
|
|
Recent years have witnessed the widespread popularity of the Internet of Things (IoT).
By providing sufficient data for model training and inference, IoT has promoted the
development of artificial intelligence (AI) to a great extent. Under this background
and trend, the traditional cloud computing model may nevertheless encounter many problems
in independently tackling the massive data generated by IoT and meeting corresponding
practical needs. In response, a new computing model called edge computing (EC) has
drawn extensive attention from both industry and academia. With the continuous deepening
of the research on EC, however, scholars have found that traditional (non-AI) methods
have their limitations in enhancing the performance of EC. Seeing the successful application
of AI in various fields, EC researchers start to set their sights on AI, especially
from a perspective of machine learning, a branch of AI that has gained increased popularity
in the past decades. In this article, we first explain the formal definition of EC
and the reasons why EC has become a favorable computing model. Then, we discuss the
problems of interest in EC. We summarize the traditional solutions and highlight
their limitations. By explaining the research results of using AI to optimize EC and
applying AI to other fields under the EC architecture, this article can serve as a
guide to explore new research ideas in these two aspects while enjoying the mutually
beneficial relationship between AI and EC. |
|
|
|
Pages:
1–35 |
DOI:
10.1145/3555802 |
|
| |
|
|
|
|
|
|
| |
|
A Survey of Security and Privacy Issues in V2X Communication Systems |
|
|
|
Takahito Yoshizawa, Dave Singelée, Jan Tobias Muehlberg, Stephane Delbruel, Amir Taherkordi, Danny Hughes and Bart Preneel |
|
|
|
Vehicle-to-Everything (V2X) communication is receiving growing attention from industry
and academia as multiple pilot projects explore its capabilities and feasibility.
With about 50% of global road vehicle exports coming from the European Union (EU),
and within the context of EU legislation around security and data protection, V2X
initiatives must consider security and privacy aspects across the system stack, in
addition to road safety. However, our survey of relevant standards,
research outputs, and EU pilot projects indicates otherwise; we identify multiple
security- and privacy-related shortcomings and inconsistencies across the standards.
We conduct a root cause analysis of the reasons and difficulties associated with these
gaps, and categorize the identified security and privacy issues relative to these
root causes. As a result, our comprehensive analysis sheds light on a number of areas
that require improvements in the standards, which are not explicitly identified in
related work. Our analysis fills gaps left by other related surveys, which are focused
on specific technical areas but do not necessarily point out underlying root issues
in standard specifications. We bring forward recommendations to address these gaps
for the overall improvement of security and safety in vehicular communication. |
|
|
|
Pages:
1–36 |
DOI:
10.1145/3558052 |
|
| |
|
|
|
|
|
|
| |
|
Advancing SDN from OpenFlow to P4: A Survey |
|
|
|
Athanasios Liatifis, Panagiotis Sarigiannidis, Vasileios Argyriou and Thomas Lagkas |
|
|
|
Software-defined Networking (SDN) marked the beginning of a new era in the field of
networking by decoupling the control and forwarding processes through the OpenFlow
protocol. The Next Generation SDN is defined by Open Interfaces and full programmability
of the data plane. P4 is a domain-specific language that fulfills these requirements
and has seen wide adoption in recent years in academia and industry. This work
is an extensive survey of the P4 language covering domains of application, a detailed
overview of the language, and future directions. |
|
|
|
Pages:
1–37 |
DOI:
10.1145/3556973 |
|
| |
|
|
|
|
|
|
| |
|
Android Source Code Vulnerability Detection: A Systematic Literature Review |
|
|
|
Janaka Senanayake, Harsha Kalutarage, Mhd Omar Al-Kadri, Andrei Petrovski and Luca Piras |
|
|
|
The use of mobile devices is rising daily in this technological era. A continuous
and increasing number of mobile applications are constantly offered on mobile marketplaces
to fulfil the needs of smartphone users. Many Android applications do not address
the security aspects appropriately. This is often due to a lack of automated mechanisms
to identify, test, and fix source code vulnerabilities at the early stages of design
and development. Therefore, the need to fix such issues at the initial stages rather
than providing updates and patches to the published applications is widely recognized.
Researchers have proposed several methods to improve the security of applications
by detecting source code vulnerabilities and malicious codes. This Systematic Literature
Review (SLR) focuses on Android application analysis and source code vulnerability
detection methods and tools by critically evaluating 118 carefully selected technical
studies published between 2016 and 2022. It highlights the advantages, disadvantages,
applicability of the proposed techniques, and potential improvements of those studies.
Both Machine Learning (ML)-based methods and conventional methods related to vulnerability
detection are discussed while focusing more on ML-based methods, since many recent
studies conducted experiments with ML. Therefore, this article aims to enable researchers
to acquire in-depth knowledge in secure mobile application development while minimizing
the vulnerabilities by applying ML methods. Furthermore, researchers can use the discussions
and findings of this SLR to identify potential future research and development directions. |
|
|
|
Pages:
1–37 |
DOI:
10.1145/3556974 |
|
| |
|
|
|
|
|
|
| |
|
A Survey on Video Moment Localization |
|
|
|
Meng Liu, Liqiang Nie, Yunxiao Wang, Meng Wang and Yong Rui |
|
|
|
Video moment localization, also known as video moment retrieval, aims to search for a
target segment within a video described by a given natural language query. Beyond
the task of temporal action localization whereby the target actions are pre-defined,
video moment retrieval can query arbitrary complex activities. In this survey paper,
we aim to present a comprehensive review of existing video moment localization techniques,
including supervised, weakly supervised, and unsupervised ones. We also review the
datasets available for video moment localization and group results of related work.
In addition, we discuss promising future directions for this field, in particular
large-scale datasets and interpretable video moment localization models. |
|
|
|
Pages:
1–37 |
DOI:
10.1145/3556537 |
|
| |
|
|
|
|
|
|
| |
|
Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence |
|
|
|
Jacky Cao, Kit-Yung Lam, Lik-Hang Lee, Xiaoli Liu, Pan Hui and Xiang Su |
|
|
|
Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with
physical environments for mobile devices. MAR systems enable users to interact with
MAR devices, such as smartphones and head-worn wearables, and perform seamless transitions
from the physical world to a mixed world with digital entities. These MAR systems
support user experiences using MAR devices to provide universal access to digital
content. Over the past 20 years, several MAR systems have been developed; however,
the studies and design of MAR frameworks have not yet been systematically reviewed
from the perspective of user-centric design. This article presents the first effort
of surveying existing MAR frameworks (count: 37) and further discusses the latest
studies on MAR through a top-down approach: (1) MAR applications; (2) MAR visualisation
techniques adaptive to user mobility and contexts; (3) systematic evaluation of MAR
frameworks, including supported platforms and corresponding features such as tracking,
feature extraction, and sensing capabilities; and (4) underlying machine learning
approaches supporting intelligent operations within MAR systems. Finally, we summarise
the development of emerging research fields and the current state-of-the-art and discuss
the important open challenges and possible theoretical and technical directions. This
survey aims to benefit both researchers and MAR system developers alike. |
|
|
|
Pages:
1–36 |
DOI:
10.1145/3557999 |
|
| |
|
|
|
|
|
|
| |
|
Technical Requirements and Approaches in Personal Data Control |
|
|
|
Junsik Sim, Beomjoong Kim, Kiseok Jeon, Moonho Joo, Jihun Lim, Junghee Lee and Kim-Kwang Raymond Choo |
|
|
|
There has been a trend of moving from simple de-identification to providing extended
data control to data owners (e.g., data portability and right to be forgotten), partly
due to the introduction of the General Data Protection Regulation (GDPR). Hence, in
this paper, we survey the literature to provide an in-depth understanding of the existing
approaches for personal data control (e.g., we observe that most existing approaches
are generally designed to facilitate compliance), as well as the privacy regulations
in Europe, the United Kingdom, California, South Korea, and Japan. Based on the review,
we identify the associated technical requirements, as well as a number of research
gaps and potential future directions (e.g., the need for transparent processing of
personal data and establishment of clear procedure in ensuring personal data control). |
|
|
|
Pages:
1–30 |
DOI:
10.1145/3558766 |
|
| |
|
|
|
|
|
|
| |
|
Blockchain-Based Federated Learning for Securing Internet of Things: A Comprehensive
Survey |
|
|
|
Wael Issa, Nour Moustafa, Benjamin Turnbull, Nasrin Sohrabi and Zahir Tari |
|
|
|
The Internet of Things (IoT) ecosystem connects physical devices to the internet,
offering significant advantages in agility, responsiveness, and potential environmental
benefits. The number and variety of IoT devices are sharply increasing, and as they
do, they generate significant data sources. Deep learning (DL) algorithms are increasingly
integrated into IoT applications to learn and infer patterns and make intelligent
decisions. However, current IoT paradigms rely on centralized storage and computing
to operate the DL algorithms. This key central component can potentially cause issues
in scalability, security threats, and privacy breaches. Federated learning (FL) has
emerged as a new paradigm for DL algorithms to preserve data privacy. Although FL
helps reduce privacy leakage by avoiding transferring client data, it still has many
challenges related to models’ vulnerabilities and attacks. With the emergence of blockchain
and smart contracts, the utilization of these technologies has the potential to safeguard
FL across IoT ecosystems. This study aims to review blockchain-based FL methods for
securing IoT systems holistically. It presents the current state of research in blockchain,
how it can be applied to FL approaches, current IoT security issues, and responses
to outline the need to use emerging approaches toward the security and privacy of
IoT ecosystems. It also focuses on IoT data analytics from a security perspective
and the open research questions. It also provides a thorough literature review of
blockchain-based FL approaches for IoT applications. Finally, the challenges and risks
associated with integrating blockchain and FL in IoT are discussed to be considered
in future works. |
|
|
|
Pages:
1–43 |
DOI:
10.1145/3560816 |
|
| |
|
|
|
|
|
|
| |
|
A Review on C3I Systems’ Security: Vulnerabilities, Attacks, and Countermeasures |
|
|
|
Hussain Ahmad, Isuru Dharmadasa, Faheem Ullah and Muhammad Ali Babar |
|
|
|
Command, Control, Communication, and Intelligence (C3I) systems are increasingly used
in critical civil and military domains for achieving information superiority, operational
efficacy, and greater situational awareness. The critical civil and military domains
include, but are not limited to, battlefield, healthcare, transportation, and rescue
missions. Given the sensitive nature and modernization of tactical domains, the security
of C3I systems has recently become a critical concern. This is because cyber-attacks
on C3I systems have catastrophic consequences including loss of human lives. Despite
the increasing number of cyber-attacks on C3I systems and growing concerns about C3I
systems’ security, there is a paucity of comprehensive reviews that systematize the
body of knowledge on the security of C3I systems. Therefore, in this article, we have
gathered, analyzed, and synthesized the body of knowledge on the security of C3I systems.
We have identified and reported security vulnerabilities, attack vectors, and countermeasures/defenses
for C3I systems. In particular, this article has enabled us to (i) propose a taxonomy
for security vulnerabilities, attack vectors, and countermeasures; (ii) interrelate
attack vectors with security vulnerabilities and countermeasures; and (iii) propose
future research directions for advancing the state-of-the-art on the security of C3I
systems. We believe that our findings will serve as a guideline for practitioners
and researchers to advance the state-of-the-practice and state-of-the-art on the security
of C3I systems. |
|
|
|
Pages:
1–38 |
DOI:
10.1145/3558001 |
|
| |
|
|
|
|
|
|
| |
|
Path Planning for UAV Communication Networks: Related Technologies, Solutions, and
Opportunities |
|
|
|
Junhai Luo, Zhiyan Wang, Ming Xia, Linyong Wu, Yuxin Tian and Yu Chen |
|
|
|
Path planning has been a hot and challenging field in unmanned aerial vehicles (UAV).
With the increasing demand of society and the continuous progress of technologies,
UAV communication networks (UAVCN) are also flourishing. The mobility of UAV nodes
allows for flexible network deployment, but some challenges are brought, such as power
constraints, throughput, cost, and time efficiency. Therefore, path planning is significant
for UAVCN. This article presents a review of UAVCN path planning. We first introduce
the network structure and performance evaluation of UAVCN. We then investigate the
generic UAV path planning algorithms and the path planning algorithms in UAVCN. The
advantages and disadvantages of each path planning algorithm and the functional
problems are discussed. The challenges faced in path planning for UAVCN, the solutions,
state-of-the-art, and representative results are also presented. In addition, we illustrate
future research directions for UAVCN path planning, which can provide some
help to researchers. |
|
|
|
Pages:
1–37 |
DOI:
10.1145/3560261 |
|
| |
|
|
|
|
|
|
| |
|
Explainable AI (XAI): Core Ideas, Techniques, and Solutions |
|
|
|
Rudresh Dwivedi, Devam Dave, Het Naik, Smiti Singhal, Rana Omer, Pankesh Patel, Bin Qian, Zhenyu Wen, Tejal Shah, Graham Morgan and Rajiv Ranjan |
|
|
|
As our dependence on intelligent machines continues to grow, so does the demand for
more transparent and interpretable models. In addition, the ability to explain the
model generally is now the gold standard for building trust and deployment of artificial
intelligence systems in critical domains. Explainable artificial intelligence (XAI)
aims to provide a suite of machine learning techniques that enable human users to
understand, appropriately trust, and produce more explainable models. Selecting an
appropriate approach for building an XAI-enabled application requires a clear understanding
of the core ideas within XAI and the associated programming frameworks. We survey
state-of-the-art programming techniques for XAI and present the different phases of
XAI in a typical machine learning development process. We classify the various XAI
approaches and, using this taxonomy, discuss the key differences among the existing
XAI techniques. Furthermore, concrete examples are used to describe these techniques
that are mapped to programming frameworks and software toolkits. It is the intention
that this survey will help stakeholders in selecting the appropriate approaches, programming
frameworks, and software toolkits by comparing them through the lens of the presented
taxonomy. |
|
|
|
Pages:
1–33 |
DOI:
10.1145/3561048 |
|
| |
|
|
|
|
|
|
| |
|
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural
Language Processing |
|
|
|
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi and Graham Neubig |
|
|
|
This article surveys and organizes research works in a new paradigm in natural language
processing, which we dub “prompt-based learning.” Unlike traditional supervised learning,
which trains a model to take in an input x and predict an output y as P(y|x), prompt-based
learning is based on language models that model the probability of text directly.
To use these models to perform prediction tasks, the original input x is modified
using a template into a textual string prompt x′ that has some unfilled slots, and
then the language model is used to probabilistically fill the unfilled information
to obtain a final string x̂, from which the final output y can be derived. This framework
is powerful and attractive for a number of reasons: It allows the language model to
be pre-trained on massive amounts of raw text, and by defining a new prompting function
the model is able to perform few-shot or even zero-shot learning, adapting to new
scenarios with few or no labeled data. In this article, we introduce the basics of
this promising paradigm, describe a unified set of mathematical notations that can
cover a wide variety of existing work, and organize existing work along several dimensions,
e.g., the choice of pre-trained language models, prompts, and tuning strategies. To
make the field more accessible to interested beginners, we not only make a systematic
review of existing works and a highly structured typology of prompt-based concepts
but also release other resources, e.g., a website, NLPedia–Pretrain, that includes a constantly
updated survey and paper list. |
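
As a rough illustration of the pipeline this abstract describes (input x, template into a prompt x′ with an unfilled slot, slot filling, then mapping the filled string to an output y), here is a minimal Python sketch. It is not taken from the article: the template text, the answer words, the cue-word lists, and `toy_score` are all hypothetical, and `toy_score` is only a stand-in for a real pre-trained language model's probability of a filled string.

```python
import re
from typing import Dict

# Hypothetical template with one unfilled slot [Z]; not from the article.
TEMPLATE = "{x} Overall, it was a [Z] movie."

# Hypothetical mapping from slot answers to output labels y.
ANSWER_TO_LABEL: Dict[str, str] = {"great": "positive", "terrible": "negative"}

# Hypothetical cue words used only by the toy scorer below.
POSITIVE = {"love", "great", "fantastic"}
NEGATIVE = {"hate", "boring", "terrible"}


def apply_template(x: str) -> str:
    """Prompting function: turn the input x into a prompt x' with slot [Z]."""
    return TEMPLATE.format(x=x)


def toy_score(text: str) -> float:
    """Stand-in for a language model score of a filled string.

    Counts pairs of cue words that agree in sentiment minus pairs that
    disagree, so consistent strings score higher. A real system would use
    a pre-trained language model here instead.
    """
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    agreeing_pairs = pos * (pos - 1) // 2 + neg * (neg - 1) // 2
    return float(agreeing_pairs - pos * neg)


def predict(x: str) -> str:
    """Fill the slot with each candidate answer, keep the best, map it to y."""
    prompt = apply_template(x)
    best = max(ANSWER_TO_LABEL, key=lambda z: toy_score(prompt.replace("[Z]", z)))
    return ANSWER_TO_LABEL[best]


if __name__ == "__main__":
    print(predict("I love this film."))
    print(predict("What a boring plot, I hate it."))
```

No model is trained on labeled pairs in this sketch; the task is specified entirely by the template and the answer mapping, which is the property the abstract highlights for few-shot and zero-shot use.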
|
|
|
Pages:
1–35 |
DOI:
10.1145/3560815 |
|
| |
|
|
|
|
|
|
| |
|
An Experimental Investigation of Text-based CAPTCHA Attacks and Their Robustness |
|
|
|
Ping Wang, Haichang Gao, Xiaoyan Guo, Chenxuan Xiao, Fuqi Qi and Zheng Yan |
|
|
|
Text-based CAPTCHA has become one of the most popular methods for preventing bot attacks.
With the rapid development of deep learning techniques, many new methods to break
text-based CAPTCHAs have been developed in recent years. However, a holistic and uniform
investigation and comparison of these attacks’ effects is lacking due to inconsistent
choices of model structures, training datasets, and evaluation metrics. In this article,
we perform an experimental investigation on the effects of existing attacks on text-based
CAPTCHA schemes. We first summarize existing text-based CAPTCHAs using a newly proposed
taxonomy based on their resistance mechanisms and systematically review corresponding
attacks in terms of methods and pros/cons. Then, we introduce a unified attack framework
that contains a number of different attack modules and transfer learning strategies.
Applying this framework, we extensively evaluate the performance of known attacks
on 20 CAPTCHA schemes in terms of accuracy and efficiency; then, we investigate the
robustness of these widely used schemes and discover the effects of previously unexplored
attacks. Finally, we discuss future CAPTCHA designs based on our experimental results
and findings. Our work also contributes to the CAPTCHA community by offering an open-access
dataset that contains 22 different CAPTCHA sample sets. |
|
|
|
Pages:
1–38 |
DOI:
10.1145/3559754 |
|
| |
|
|
|
|
|
|
|
|
|
| |
The ACM Digital Library is published by the Association for Computing Machinery.
Copyright © 2023 ACM, Inc.