New Issue available online

Table of Contents

This new issue contains the following articles:

Deep Learning for Android Malware Defenses: A Systematic Literature Review

Yue Liu, Chakkrit Tantithamthavorn, Li Li and Yepang Liu

Malicious applications (particularly those targeting the Android platform) pose a
serious threat to developers and end-users. Numerous research efforts have been devoted
to developing effective approaches to defend against Android malware. However, given
the explosive growth of Android malware and the continuous advancement of malicious
evasion technologies like obfuscation and reflection, Android malware defense approaches
based on manual rules or traditional machine learning may not be effective. In recent
years, a dominant research field called deep learning (DL), which provides a powerful
feature abstraction ability, has demonstrated a compelling and promising performance
in a variety of areas, like natural language processing and computer vision. To this
end, employing DL techniques to thwart Android malware attacks has recently garnered
considerable research attention. Yet, no systematic literature review focusing on
DL approaches for Android malware defenses exists. In this article, we conducted a
systematic literature review to search and analyze how DL approaches have been applied
in the context of malware defenses in the Android environment. As a result, a total
of 132 studies covering the period 2014–2021 were identified. Our investigation reveals
that, while the majority of these sources mainly consider DL-based Android malware
detection, 53 primary studies (40.1%) design defense approaches based on other scenarios.
This review also discusses research trends, research focuses, challenges, and future
research directions in DL-based Android malware defenses.

Pages: 1–36
DOI: 10.1145/3544968

An Empirical Survey on Long Document Summarization: Datasets, Models, and Metrics

Huan Yee Koh, Jiaxin Ju, Ming Liu and Shirui Pan

Long documents such as academic articles and business reports have been the standard
format to detail important issues and complicated subjects that require extra
attention. An automatic summarization system that can effectively condense long documents
into short and concise texts to encapsulate the most important information would thus
be significant in aiding the reader’s comprehension. Recently, with the advent of
neural architectures, significant research efforts have been made to advance automatic
text summarization systems, and numerous studies on the challenges of extending these
systems to the long document domain have emerged. In this survey, we provide a comprehensive
overview of the research on long document summarization and a systematic evaluation
across the three principal components of its research setting: benchmark datasets,
summarization models, and evaluation metrics. For each component, we organize the
literature within the context of long document summarization and conduct an empirical
analysis to broaden the perspective on current research progress. The empirical analysis
includes a study on the intrinsic characteristics of benchmark datasets, a multi-dimensional
analysis of summarization models, and a review of the summarization evaluation metrics.
Based on the overall findings, we conclude by proposing possible directions for future
exploration in this rapidly growing field.

Pages: 1–35
DOI: 10.1145/3545176

Post-hoc Interpretability for Neural NLP: A Survey

Andreas Madsen, Siva Reddy and Sarath Chandar

Neural networks for NLP are becoming increasingly complex and widespread, and there
is growing concern about whether these models are responsible to use. Explaining models helps
to address the safety and ethical concerns and is essential for accountability. Interpretability
serves to provide these explanations in terms that are understandable to humans. Additionally,
post-hoc methods provide explanations after a model is learned and are generally model-agnostic.
This survey provides a categorization of how recent post-hoc interpretability methods
communicate explanations to humans, discusses each method in depth, and covers how the
methods are validated, as validation is a common concern.

Pages: 1–42
DOI: 10.1145/3546577

A Survey of Joint Intent Detection and Slot Filling Models in Natural Language Understanding

Henry Weld, Xiaoqi Huang, Siqu Long, Josiah Poon and Soyeon Caren Han

Intent classification, to identify the speaker’s intention, and slot filling, to label
each token with a semantic type, are critical tasks in natural language understanding.
Traditionally the two tasks have been addressed independently. More recently joint
models that address the two tasks together have achieved state-of-the-art performance
for each task and have shown there exists a strong relationship between the two. In
this survey, we bring the coverage of methods up to 2021 including the many applications
of deep learning in the field. As well as a technological survey, we look at issues
addressed in the joint task and the approaches designed to address these issues. We
cover datasets, evaluation metrics, and experiment design and supply a summary of
reported performance on the standard datasets.

Pages: 1–38
DOI: 10.1145/3547138

Taxonomy of Machine Learning Safety: A Survey and Primer

Sina Mohseni, Haotao Wang, Chaowei Xiao, Zhiding Yu, Zhangyang Wang and Jay Yadawa

The open-world deployment of Machine Learning (ML) algorithms in safety-critical applications
such as autonomous vehicles needs to address a variety of ML vulnerabilities such
as interpretability, verifiability, and performance limitations. Research explores
different approaches to improve ML dependability by proposing new models and training
techniques to reduce generalization error, achieve domain adaptation, and detect outlier
examples and adversarial attacks. However, there is a missing connection between ongoing
ML research and well-established safety principles. In this article, we present a
structured and comprehensive review of ML techniques to improve the dependability
of ML algorithms in uncontrolled open-world settings. From this review, we propose
the Taxonomy of ML Safety that maps state-of-the-art ML techniques to key engineering
safety strategies. Our taxonomy of ML safety presents a safety-oriented categorization
of ML techniques to provide guidance for improving dependability of the ML design
and development. The proposed taxonomy can serve as a safety checklist to aid designers
in improving coverage and diversity of safety strategies employed in any given ML
system.

Pages: 1–38
DOI: 10.1145/3551385

Eye-tracking Technologies in Mobile Devices Using Edge Computing: A Systematic Review

Nishan Gunawardena, Jeewani Anupama Ginige and Bahman Javadi

Eye-tracking provides invaluable insight into the cognitive activities underlying
a wide range of human behaviours. Identifying cognitive activities provides valuable
perceptions of human learning patterns and signs of cognitive diseases like Alzheimer’s,
Parkinson’s, and autism. Also, mobile devices have changed the way that we experience
daily life and have become a pervasive part of it. This systematic review provides a detailed
analysis of mobile device eye-tracking technology reported in 36 studies published
in high-ranked scientific journals from 2010 to 2020 (September), along with several
reports from grey literature. The review provides in-depth analysis on algorithms,
additional apparatus, calibration methods, computational systems, and metrics applied
to measure the performance of the proposed solutions. Also, the review presents a
comprehensive classification of mobile device eye-tracking applications used across
various domains such as healthcare, education, road safety, news, and human authentication.
We have outlined the shortcomings identified in the literature and the limitations
of the current mobile device eye-tracking technologies, such as using the front-facing
mobile camera. Further, we have proposed an edge computing driven eye-tracking solution
to achieve the real-time eye-tracking experience. Based on the findings, the article
outlines various research gaps and future opportunities that are expected to be of
significant value for improving the work in the eye-tracking domain.

Pages: 1–33
DOI: 10.1145/3546938

Deep Learning in Sentiment Analysis: Recent Architectures

Tariq Abdullah and Ahmed Ahmet

Humans are increasingly integrated with devices that enable the collection of vast
unstructured opinionated data. Accurately analysing subjective information from this
data is the task of sentiment analysis (an actively researched area in NLP). Deep
learning provides a diverse selection of architectures to model sentiment analysis
tasks and has surpassed other machine learning methods as the foremost approach for
performing sentiment analysis tasks. Recent developments in deep learning architectures
represent a shift away from Recurrent and Convolutional neural networks and the increasing
adoption of Transformer language models. Utilising pre-trained Transformer language
models to transfer knowledge to downstream tasks has been a breakthrough in NLP. This
survey applies a task-oriented taxonomy to recent trends in architectures with a focus
on the theory, design and implementation. To the best of our knowledge, this is the
only survey to cover state-of-the-art Transformer-based language models and their
performance on the most widely used benchmark datasets. This survey paper provides
a discussion of the open challenges in NLP and sentiment analysis. The survey covers
five years from 1st July 2017 to 1st July 2022.

Pages: 1–37
DOI: 10.1145/3548772

A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

Alberto Blanco-Justicia, David Sánchez, Josep Domingo-Ferrer and Krishnamurty Muralidhar

We review the use of differential privacy (DP) for privacy protection in machine learning
(ML). We show that, driven by the aim of preserving the accuracy of the learned models,
DP-based ML implementations are so loose that they do not offer the ex ante privacy
guarantees of DP. Instead, what they deliver is basically noise addition similar to
the traditional (and often criticized) statistical disclosure control approach. Due
to the lack of formal privacy guarantees, the actual level of privacy offered must
be experimentally assessed ex post, which is done very seldom. In this respect, we
present empirical results showing that standard anti-overfitting techniques in ML
can achieve a better utility/privacy/efficiency tradeoff than DP.

Pages: 1–16
DOI: 10.1145/3547139

Privacy Intelligence: A Survey on Image Privacy in Online Social Networks

Chi Liu, Tianqing Zhu, Jun Zhang and Wanlei Zhou

Image sharing on online social networks (OSNs) has become an indispensable part of
daily social activities, but it has also increased the risk of privacy invasion. An
online image can reveal various types of sensitive information, prompting the public
to rethink individual privacy needs in OSN image sharing critically. However, the
interaction of images and OSN makes the privacy issues significantly complicated.
The current real-world solutions for privacy management fail to provide adequate personalized,
accurate, and flexible privacy protection. Constructing a more intelligent environment
for privacy-friendly OSN image sharing is urgent in the near future. Meanwhile, given
the dynamics in both users’ privacy needs and OSN context, a comprehensive understanding
of OSN image privacy throughout the entire sharing process is preferable to any views
from a single side, dimension, or level. To fill this gap, we contribute a survey
of “privacy intelligence” that targets modern privacy issues in dynamic OSN image
sharing from a user-centric perspective. Specifically, we present the important properties
and a taxonomy of OSN image privacy, along with a high-level privacy analysis framework
based on the lifecycle of OSN image sharing. The framework consists of three stages
with different principles of privacy by design. At each stage, we identify typical
user behaviors in OSN image sharing and their associated privacy issues. Then a systematic
review of representative intelligent solutions to those privacy issues is conducted,
also in a stage-based manner. The analysis results in an intelligent “privacy firewall”
for closed-loop privacy management. Challenges and future directions in this area
are also discussed.

Pages: 1–35
DOI: 10.1145/3547299

A Survey on DNS Encryption: Current Development, Malware Misuse, and Inference Techniques

Minzhao Lyu, Hassan Habibi Gharakheili and Vijay Sivaraman

The domain name system (DNS) that maps alphabetic names to numeric Internet Protocol
(IP) addresses plays a foundational role in Internet communications. By default, DNS
queries and responses are exchanged in unencrypted plaintext, and hence, can be read
and/or hijacked by third parties. To protect user privacy, the networking community
has proposed standard encryption technologies such as DNS over TLS (DoT), DNS over
HTTPS (DoH), and DNS over QUIC (DoQ) for DNS communications, enabling clients to perform
secure and private domain name lookups. We survey the DNS encryption literature published
from 2016 to 2021, focusing on its current landscape and how it is misused by malware,
and highlighting the existing techniques developed to make inferences from encrypted
DNS traffic. First, we provide an overview of various standards developed in the space
of DNS encryption and their adoption status, performance, benefits, and security issues.
Second, we highlight ways that various malware families can exploit DNS encryption
to their advantage for botnet communications and/or data exfiltration. Third, we discuss
existing inference methods for profiling normal patterns and/or detecting malicious
encrypted DNS traffic. Several directions are presented to motivate future research
in enhancing the performance and security of DNS encryption.

Pages: 1–28
DOI: 10.1145/3547331

Adversarial Attacks and Defenses in Deep Learning: From a Perspective of Cybersecurity

Shuai Zhou, Chi Liu, Dayong Ye, Tianqing Zhu, Wanlei Zhou and Philip S. Yu

The outstanding performance of deep neural networks has promoted deep learning applications
in a broad set of domains. However, the potential risks caused by adversarial samples
have hindered the large-scale deployment of deep learning. In these scenarios, adversarial
perturbations, imperceptible to human eyes, significantly decrease the model’s final
performance. Many papers have been published on adversarial attacks and their countermeasures
in the realm of deep learning. Most focus on evasion attacks, where the adversarial
examples are found at test time, as opposed to poisoning attacks where poisoned data
is inserted into the training data. Further, it is difficult to evaluate the real
threat of adversarial attacks or the robustness of a deep learning model, as there
are no standard evaluation methods. Hence, with this article, we review the literature
to date. Additionally, we attempt to offer the first analysis framework for a systematic
understanding of adversarial attacks. The framework is built from the perspective
of cybersecurity to provide a lifecycle for adversarial attacks and defenses.

Pages: 1–39
DOI: 10.1145/3547330

Quantum Software Components and Platforms: Overview and Quality Assessment

Manuel A. Serrano, José A. Cruz-Lemus, Ricardo Perez-Castillo and Mario Piattini

Quantum computing is the latest revolution in computing and will probably come to
be seen as an advance as important as the steam engine or the information society.
In the last few decades, our understanding of quantum computers has expanded and multiple
efforts have been made to create languages, libraries, tools, and environments to
facilitate their programming. Nonetheless, quantum computers are complex systems at
the bottom of a stack of layers that programmers need to understand. Hence, efforts
towards creating quantum programming languages and computing environments that can
abstract low-level technology details have become crucial steps to achieve a useful
quantum computing technology. However, most of these environments still lack many
of the features that would be desirable, such as those outlined in The Talavera Manifesto
for Quantum Software Engineering and Programming. For advancing quantum computing,
we will need to develop quantum software engineering techniques and tools to ensure
the feasibility of this new type of quantum software. To contribute to this goal,
this paper provides a review of the main quantum software components and platforms.
We also propose a set of quality requirements for the development of quantum software
platforms and the conduct of their quality assessment.

Pages: 1–31
DOI: 10.1145/3548679

Recent Advances in Baggage Threat Detection: A Comprehensive and Systematic Survey

Divya Velayudhan, Taimur Hassan, Ernesto Damiani and Naoufel Werghi

X-ray imagery systems have enabled security personnel to identify potential threats
contained within the baggage and cargo since the early 1970s. However, the manual
process of screening the threatening items is time-consuming and vulnerable to human
error. Hence, researchers have utilized recent advancements in computer vision techniques,
revolutionized by machine learning models, to aid in baggage security threat identification
via 2D X-ray and 3D CT imagery. However, the performance of these approaches is severely
affected by heavy occlusion, class imbalance, and limited labeled data, further complicated
by ingeniously concealed emerging threats. Hence, the research community must devise
suitable approaches by leveraging the findings from existing literature to move in
new directions. Towards that goal, we present a structured survey providing systematic
insight into state-of-the-art advances in baggage threat detection. Furthermore, we
also present a comprehensible understanding of X-ray-based imaging systems and the
challenges faced within the threat identification domain. We include a taxonomy to
classify the approaches proposed within the context of 2D and 3D CT X-ray-based baggage
security threat screening and provide a comparative analysis of the performance of
the methods evaluated on four benchmarks. Besides, we also discuss current open challenges
and potential future research avenues.

Pages: 1–38
DOI: 10.1145/3549932

A Comprehensive Survey on Poisoning Attacks and Countermeasures in Machine Learning

Zhiyi Tian, Lei Cui, Jie Liang and Shui Yu

The prosperity of machine learning has been accompanied by increasing attacks on the
training process. Among them, poisoning attacks have become an emerging threat during
model training. Poisoning attacks have profound impacts on the target models, e.g.,
making them unable to converge or manipulating their prediction results. Moreover,
the rapid development of recent distributed learning frameworks, especially federated
learning, has further stimulated the development of poisoning attacks. Defending against
poisoning attacks is challenging and urgent. However, a systematic review from a
unified perspective is still missing. This survey provides an in-depth and up-to-date
overview of poisoning attacks and corresponding countermeasures in both centralized
and federated learning. We first categorize attack methods based on their goals.
Secondly, we offer detailed analysis of the differences and connections among the
attack techniques. Furthermore, we present countermeasures in different learning frameworks
and highlight their advantages and disadvantages. Finally, we discuss the reasons
for the feasibility of poisoning attacks and address potential research directions
from the attack and defense perspectives separately.

Pages: 1–35
DOI: 10.1145/3551636

Remote Electronic Voting in Uncontrolled Environments: A Classifying Survey

Michael P. Heinl, Simon Gölz and Christoph Bösch

Remote electronic voting, often called online or Internet voting, has been subject
to research for the last four decades. It is regularly discussed in public debates,
especially in the context of enabling voters to conveniently cast their ballot from
home using their personal devices. Since these devices are not under the control of
the electoral authority and could be potentially compromised, this setting is referred
to as an “uncontrolled environment” for which special security assumptions have to
be considered. This paper employs general election principles to derive cryptographic,
technical, and organizational requirements for remote electronic voting. Based on
these requirements, we have extended an existing methodology to assess online voting
schemes and develop a corresponding reference attacker model to support the preparation
of tailored protection profiles for different levels of elections. After presenting
a broad survey of different voting schemes, we use this methodology to assess and
classify those schemes comparatively by leveraging four election-specific attacker
models.

Pages: 1–44
DOI: 10.1145/3551386

Formal Concept Analysis Applications in Bioinformatics

Sarah Roscoe, Minal Khatri, Adam Voshall, Surinder Batra, Sukhwinder Kaur and Jitender Deogun

The bioinformatics discipline seeks to solve problems in biology with computational
theories and methods. Formal concept analysis (FCA) is one such theoretical model,
based on partial orders. FCA allows the user to examine the structural properties
of data based on which subsets of the dataset depend on each other. This article surveys
the current literature related to the use of FCA for bioinformatics. The survey begins
with a discussion of FCA, its hierarchical advantages, several advanced models of
FCA, and lattice management strategies. It then examines how FCA has been used in
bioinformatics applications, followed by future prospects of FCA in those areas. The
applications addressed include gene data analysis (with next-generation sequencing),
biomarkers discovery, protein-protein interaction, disease analysis (including COVID-19,
cancer, and others), drug design and development, healthcare informatics, biomedical
ontologies, and phylogeny. Some of the most promising prospects of FCA are identifying
influential nodes in a network representing protein-protein interactions, determining
critical concepts to discover biomarkers, integrating machine learning and deep learning
for cancer classification, and pattern matching for next-generation sequencing.

Pages: 1–40
DOI: 10.1145/3554728

Honeyword-based Authentication Techniques for Protecting Passwords: A Survey

Nilesh Chakraborty, Jianqiang Li, Victor C. M. Leung, Samrat Mondal, Yi Pan, Chengwen Luo and Mithun Mukherjee

Honeyword (or decoy password) based authentication, first introduced by Juels and
Rivest in 2013, has emerged as a security mechanism that can provide security against
server-side threats on the password-files. From the theoretical perspective, this
security mechanism reduces attackers’ efficiency to a great extent as it detects the
threat on a password-file so that the system administrator can be notified almost
immediately as an attacker tries to take advantage of the compromised file. This paper
aims to present a comprehensive survey of the relevant research and technological
developments in honeyword-based authentication techniques. We cover twenty-three techniques
related to honeyword, reported under different research articles since 2013. This
survey paper helps readers to (i) understand how the honeyword-based security mechanism
works in practice, (ii) get a comparative view on the existing honeyword-based techniques,
and (iii) identify the existing gaps that have yet to be filled and the emergent research
opportunities.

Pages: 1–37
DOI: 10.1145/3552431

Evaluating Recommender Systems: Survey and Framework

Eva Zangerle and Christine Bauer

The comprehensive evaluation of the performance of a recommender system is a complex
endeavor: many facets need to be considered in configuring an adequate and effective
evaluation setting. Such facets include, for instance, defining the specific goals
of the evaluation, choosing an evaluation method, underlying data, and suitable evaluation
metrics. In this article, we consolidate and systematically organize this dispersed
knowledge on recommender systems evaluation. We introduce the Framework for Evaluating
Recommender systems (FEVR), which we derive from the discourse on recommender systems
evaluation. In FEVR, we categorize the evaluation space of recommender systems evaluation.
We postulate that the comprehensive evaluation of a recommender system frequently
requires considering multiple facets and perspectives in the evaluation. The FEVR
framework provides a structured foundation to adopt adequate evaluation configurations
that encompass this required multi-facetedness and provides the basis to advance in
the field. We outline and discuss the challenges of a comprehensive evaluation of
recommender systems and provide an outlook on what we need to embrace and do to move
forward as a research community.

Pages: 1–38
DOI: 10.1145/3556536