November 17th, 2021 in Coimbra, Portugal

Data: Acquisition to Analysis

A SenSys/BuildSys 2021 Workshop


Zoom Link:

YouTube Link:

NOTE: Times listed below are in UTC; click "Link to World Clock" to convert to your local time.

The workshop will be held on Wednesday, November 17th, from 14:00 to 16:00 and from 17:00 to 18:30 UTC.

Welcome + Introduction!

14:00 - 14:10 Link to World Clock

** Panel Discussion at: 15:00 - 16:00 ** Link to World Clock

Accepted Papers (Session 1):

14:10 - 14:55 Link to World Clock

Person Re-ID Testbed with Multi-Modal Sensors

14:10 - 14:25

Guangliang Zhao, Guy Ben-yosef, Jianwei Qiu, Yang Zhao, Prabhu Janakaraj, Sriram Boppana, Austars R Schnore (General Electric Research)

Dataset: Environmental Impact on the Long-Term Connectivity and Link Quality of an Outdoor LoRa Network

14:25 - 14:35

Pei Tian (Shanghai Advanced Research Institute, CAS), Fengxu Yang (Shanghai Advanced Research Institute, CAS & ShanghaiTech University), Xiaoyuan Ma (SKF Group, China), Carlo Alberto Boano (Graz University of Technology), Xin Tian (Shanghai Advanced Research Institute, CAS), Ye Liu (Nanjing Agricultural University), Jianming Wei (Shanghai Advanced Research Institute, CAS)

Dataset: Enabling Offline Tuning of Fat Channel Communication

14:35 - 14:45

Konrad-Felix Krentz, Padmal Madhushanka, Bappaditya Mandal, Robin Augustine, Thiemo Voigt (Uppsala universitet)

Dataset: Large-scale Urban IoT Activity Data for DDoS Attack Emulation

14:45 - 14:55

Arvin Hekmati, Eugenio Grippo, Bhaskar Krishnamachari (University of Southern California)

** Panel Discussion !! **

15:00 - 16:00 Link to World Clock

Break (Lunch/Dinner/Night snack)

16:00 - 17:00 Link to World Clock

Accepted Papers (Session 2):

17:00 - 18:30 Link to World Clock

Footstep-Induced Floor Vibration Dataset: Reusability and Transferability Analysis

17:00 - 17:15

Zhizhang Hu, Yue Zhang, Shijia Pan (University of California, Merced)

Dataset: Analysis of IFTTT Recipes to Study How Humans Use Internet-of-Things (IoT) Devices

17:15 - 17:25

Haoxiang Yu, Jie Hua, Christine Julien (University of Texas at Austin)

Dataset: Thermal Energy Harvesting Profiles in Residential Settings

17:25 - 17:35

Victor Ariel Leal Sobral (University of Virginia), John Lach (The George Washington University), Jonathan L. Goodall, Bradford Campbell (University of Virginia)

Dataset: Container Escape Detection for Edge Devices

17:35 - 17:45

James Pope, Francesco Raimondo (University of Bristol), Vijay Kumar (Toshiba Europe Limited), Ryan McConville, Robert Piechocki, George Oikonomou (University of Bristol), Thomas Pasquier (University of British Columbia), Bo Luo, Dan Howarth (Smartia Ltd), Ioannis Mavromatis, Pietro Carnelli, Adrian Sanchez-Mompo, Theodoros Spyridopoulos, Aftab Khan (Toshiba Europe Limited)


17:45 - 17:50

Dataset: A Low-resolution infrared thermal dataset and potential privacy-preserving applications

17:50 - 18:00

Shuai Zhu, Thiemo Voigt, Daniel F. Perez-Ramirez, Joakim Eriksson (RISE Research Institutes of Sweden)

Dataset: Longitudinal personal thermal comfort preference data in the wild

18:00 - 18:10

Matias Quintana, Mahmoud Abdelrahman, Mario Frei (National University of Singapore), Federico Tartarini (Berkeley Education Alliance for Research in Singapore), Clayton Miller (National University of Singapore)

Dataset: Motion Tracklet Oriented 6-DoF Inertial Tracking Using Commodity Smartphones

18:10 - 18:20

Peize Li, Chris Xiaoxuan Lu (University of Edinburgh)

Open Discussion

18:20 - 18:30?

Panel Discussion (15:00 - 16:00)


Data Acquisition, Analysis and Reuse for AI + IoT Applications


Dr. Shijia Pan

Assistant Professor, University of California, Merced

Dr. Shijia Pan is an assistant professor in the Computer Science and Engineering Department at the University of California, Merced. She received her bachelor’s degree in Computer Science and Technology from the University of Science and Technology of China (USTC) and her Ph.D. degree in Electrical and Computer Engineering from Carnegie Mellon University (CMU). Her research interests include cyber-physical systems, Internet-of-Things (IoT), and ubiquitous computing. She has worked across multiple disciplines, focusing on self-assessing and self-adaptive heterogeneous cyber-physical systems for accurate indoor occupant information inference with limited resources.

Dr. Radislav A. Potyrailo

Principal Scientist at GE Research

Dr. Radislav A. Potyrailo is a Principal Scientist at GE Research. He received his PhD in Analytical Chemistry from Indiana University, Bloomington, IN in 1998. Dr. Potyrailo directs programs on innovative multi-response gas and biological sensors for diverse applications. His passion is to bring innovative sensing systems from laboratory feasibility to field validation and commercialization. Dr. Potyrailo has served as a technical lead on GE R&D programs transitioned to GE businesses or GE partners for commercialization. Examples include an optical multi-parameter chemical sensor for GE Water, wireless gas sensors for GE Oil & Gas, a multi-parameter oil sensor for GE Renewable Energy, and a GE Ventures start-up company on radio-frequency sensors. Dr. Potyrailo has served as a Principal Investigator on US Government programs funded by AFRL, DARPA, DHS, DOE, DTRA, NIH, NIOSH, and other agencies. Dr. Potyrailo has summarized some of his results in 140+ granted US Patents and numerous technical publications on transducers, sensing materials, and data analytics; his Google Scholar h-index is 50 and his i10-index is 215. Dr. Potyrailo is the initiator and a co-organizer of the First Gordon Research Conference on Combinatorial and High Throughput Materials Science. He serves as an editor of the Springer-Nature book series Integrated Analytical Systems. Dr. Potyrailo is the North America Regional Chair of the International Society for Olfaction and Chemical Sensing, and is the Chair of the Device Working Group of the MEMS and Sensors Industry Group. His recognitions include the Prism Award by Photonics Media, the Innovation Award by the Association for Sensor and Measurement Technology, and SPIE Fellow.

Dr. Tao Gao

Assistant Professor, University of California, Los Angeles

Dr. Gao explores the visual roots of human social perception and cognition. He builds models of artificial social intelligence with human-like visual commonsense which, just by sharing the same visual environment, can cooperate and communicate with humans in intuitive, effective, and trustworthy ways. He obtained his Ph.D. in cognitive psychology from Yale in 2011. He was a post-doctoral fellow in the Center of Brain, Mind and Machine at MIT from 2011 to 2015. He then worked at GE Research as a computer vision scientist from 2015 to 2017. He has been jointly appointed to the departments of Statistics, Communication and Psychology at UCLA since 2017.

Dr. Qiang Xu

Founder and Chief Scientist at XYZ10

Qiang is the founder and chief scientist of XYZ10, a startup company offering a reliable indoor-outdoor positioning service. Qiang received his Ph.D. from the University of Michigan in 2013. Before founding XYZ10, Qiang worked at NEC Labs America, Boeing Labs, Microsoft Research, and AT&T Labs Research. Along this career path, Qiang has collected diverse hands-on experience in systems, networking, AIoT, machine learning, etc.

Check Call for Papers for information on submission!


As the enthusiasm for and success of the Internet of Things (IoT), Cyber-Physical Systems (CPS), and Smart Buildings grow, so too does the volume and variety of data collected by these systems. How do we ensure that this data is of high quality, and how do we maximize the utility of collected data such that many projects can benefit from the time, cost, and effort of deployments?

The Data: Acquisition To Analysis (DATA) workshop aims to look broadly at interesting data from interesting sensing systems. The workshop considers problems, solutions, and results from all across the real-world data pipeline. We solicit submissions on unexpected challenges and solutions in the collection of datasets, on new and novel datasets of interest to the community, and on experiences and results—explicitly including negative results—in using prior datasets to develop new insights.

The workshop aims to bring together a community of application researchers and algorithm researchers in the sensing systems and building domains to promote breakthroughs from integration of the generators and users of datasets. The workshop will foster cross-domain understanding by enabling both the understanding of application needs and data collection limitations.


The workshop seeks contributions across two major thrusts, but is open to a broad view of interesting questions around the collection, dissemination, and use of data as well as interesting datasets:

The collection and use of data

  • Challenges and solutions in data collection, especially around security and privacy
  • Expectations and norms for data collection from sensor networks, especially those that involve human factors
  • Novel insights from existing datasets
  • Metadata management for complex datasets
  • Synthetic data, including its generation, application, and utility
  • Success stories: key properties of useful datasets and how to generalize these
  • Preprocessing, cleaning, and fusing datasets
  • Analysis and visualization of the data
  • Shortcomings of prior datasets, and how to address these in the future
  • Position papers on policies and norms from experimental design through data management and use are explicitly welcomed

New and interesting datasets, including but not limited to:

  • Shopping-related sensing data
  • Animal-related data or sensed data
  • Anonymized or synthetic health-related data
  • Indoor localization, especially unprocessed/unfiltered physical layer measurements
  • Smart building, occupancy, motion data, energy, human comfort, vibration, BIM
  • Vehicular, GPS, cellular, or Wi-Fi traces and remote sensing
  • Reproductions of prior work that validate, refute, or enhance results
  • Anonymized contact tracing, interaction, and exposure notification data

To enable the longevity of submitted datasets, we plan to provide a central repository where the data and information about the data can be archived for at least 5 years.

Submission Guidelines

Submissions may range from 1 to 5 pages in PDF format, excluding references, using the standard ACM conference template. DATA 2021 follows a single-blind review policy: the names and affiliations of all authors must be present in the submitted manuscript. Submissions are strongly encouraged to use only as much space as needed to clearly convey the significance of the work. We fully expect many submissions, especially datasets, to use only 1-2 pages, but wish to allow those interested in fully elucidating positions on data collection and use, or insights from reproducibility efforts, ample space to do so.

Dataset submissions should prefix paper titles with “Dataset: ” and must include a description of the dataset as well as a reasonable accompanying data sample. Once accepted, a fully described dataset must be shared to a public repository by the camera-ready deadline. License issues will generally be resolved by following a procedure similar to that of CRAWDAD, and special treatments, if needed, will be discussed separately with the TPC chairs. Dataset submissions must include a link to the dataset at the time of submission.

Datasets will be reviewed by an artifact evaluation committee. To support this, dataset submissions must include:

  • A link to the full dataset (not just a single sample) at the time of submission
  • An example analysis or result from the dataset (what kind of insights might folks glean?)
  • Steps to run an analysis on the dataset, e.g.
    • A graph and the steps (sample code) to generate the graph
    • A video demonstrating access and manipulation of the data or execution of queries and results on the data
    • Other evidence or demonstration of how the dataset can be accessed and used
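
As a rough illustration of the "steps to run an analysis" item above, a minimal sketch of an accompanying script might look like the following. Everything in it is an assumption made for illustration: the inline sample, the column names (timestamp, sensor_id, temperature_c), and the helper mean_by_sensor are hypothetical, and a real submission would read its own published files and produce whatever result or graph best shows the dataset's value.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical inline sample standing in for a dataset file.
# A real submission would open its own published data instead.
SAMPLE_CSV = """timestamp,sensor_id,temperature_c
2021-11-17T14:00:00Z,a,21.0
2021-11-17T14:00:00Z,b,19.5
2021-11-17T14:10:00Z,a,22.0
2021-11-17T14:10:00Z,b,20.5
"""


def mean_by_sensor(csv_text):
    """Return {sensor_id: mean temperature} for the rows in csv_text."""
    readings = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        readings[row["sensor_id"]].append(float(row["temperature_c"]))
    return {sensor: mean(values) for sensor, values in readings.items()}


if __name__ == "__main__":
    # Print one summary line per sensor, sorted by sensor id.
    for sensor, avg in sorted(mean_by_sensor(SAMPLE_CSV).items()):
        print(f"{sensor}: {avg:.2f} C")
```

A script of this size, together with a short note on how to obtain the data and run it, is one possible way to show reviewers that the dataset can be loaded and used.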

The evaluation committee will work with submitters to ask clarifying questions, etc. The goal is not to be a barrier to submission, but instead to help make sure datasets are usable and useful for folks in the future.

Each accepted submission is required to have at least one author attend the workshop and present to the workshop attendees.

Submission site: link

Important Dates

Abstract Registration: September 24th, 2021, AOE, HotCRP

Submission Deadline: September 24th, 2021, AOE

Notifications: October 15th, 2021

Camera-ready: October 26th, 2021 (firm deadline)

Workshop: November 17th, 2021

Useful links

Submission Site (HotCRP)


Co-Chairs & TPC Chairs

Gabe Fierro University of California, Berkeley

Yang Zhao GE Research

Steering Committee

Jie Gao Stony Brook University

Pei Zhang Carnegie Mellon University

Flora Salim RMIT University

Mikkel Baun Kjærgaard University of Southern Denmark

Shijia Pan University of California, Merced

Pat Pannuto University of California, Berkeley

Prabal Dutta University of California, Berkeley

Jie Liu Harbin Institute of Technology

Chien-Chun Ni Yahoo! Research

Haeyoung Noh Carnegie Mellon University


Shiwei Fang University of North Carolina at Chapel Hill

Technical Program Committee

Wan Du University of California, Merced

Andreas Reinhardt Technical University of Clausthal

Shiwei Fang University of North Carolina at Chapel Hill

Trevor Pering Google

Rachel Cardell-Oliver University of Western Australia

Clayton Miller National University of Singapore

Branden Ghena Northwestern University

Jorge Ortiz Rutgers University

Jason Koh Mapped

Zhenxiong Li University of Colorado Denver

Luca Davoli University of Parma

Jonathon Fagert Baldwin Wallace University

Artifact Evaluation Committee

Jens Hjort Schwee University of Southern Denmark

Jingxiao Liu Stanford University

Adeola Bannis Carnegie Mellon University

Nishant Bhaskar University of California San Diego

Rishiraj Adhikary Indian Institutes of Technology


The 4th DATA workshop is part of (co-located with) SenSys/BuildSys 2021.

For venue details, visa information, etc., please visit the SenSys venue page.