The 5th IEEE Workshop on
Human-in-the-Loop Methods and Future of Work in BigData
(IEEE HMData 2021)

co-located with IEEE Bigdata 2021
Orlando, FL, USA, Dec. 15th (Fixed)
Now Taking Place Virtually


About IEEE HMData 2021


The HMData workshop, which originally started as the "Human-Machine collaboration in BigData" workshop, will investigate the opportunities and challenges of human-machine collaboration in work with bigdata, which are described by two terms: Human-in-the-Loop Methods and Future of Work. Human-in-the-Loop focuses on the employer's viewpoint, while Future of Work focuses more on the worker's viewpoint; in both, the division of labor among humans and machines is a key issue. This area is likely to be heavily AI-driven, and we invite papers covering the following aspects: (1) Capturing human capabilities through intelligent models, and adapting those models as perceptions, needs, and skills change. (2) High-level tools that enable all stakeholders in the new ecosystem, including regulators for policies and AI workers, to specify their requirements. (3) System design and engineering of job platforms for the collection, storage, retrieval, and analysis of the data deluge about workers, jobs, and their activities. (4) Benchmarking and the development of appropriate metrics to measure system performance as well as human aspects, such as satisfaction, capital advancement, and equity.

We welcome interesting ideas and results on any relevant topic, and we also encourage papers on new projects inspired by the COVID-19 crisis, such as those on human-in-the-loop solutions in the pandemic and those re-evaluating how we organize labor and how we share work with machines in the future. To make the workshop an attractive venue for such work, we solicit practitioner papers as well as research papers, in order to facilitate discussion among researchers who know solutions and practitioners who know problems. We would also like to make the workshop valuable for young researchers. All papers accepted for the workshop will be included in the Workshop Proceedings published by the IEEE Computer Society Press, made available at the Conference.


This workshop covers a wide range of topics in human-machine collaboration in work with bigdata. Keywords include: crowdsourcing, collaborative recommendation, crowdsensing, workflow models for humans and machines, incentives, human-assisted bigdata analysis, bigdata-human interaction, human-machine collaboration in real-world applications (such as natural disaster response, education, and citizen science), and ethical, legal, and social issues (ELSI) in Human-in-the-Loop systems and Future of Work. We expect submissions to address some of the following issues:
  1. capturing human characteristics and capabilities,
  2. stakeholder requirement specification,
  3. social processes around the human-in-the-loop systems,
  4. platforms and ecosystems,
  5. computation capabilities, and
  6. benchmarks and metrics for human-in-the-loop systems and Future of Work.


Keynote

Data Excellence: Better Data for Better AI
Lora Aroyo (Google)
Abstract: The efficacy of machine learning (ML) models depends on both algorithms and data. Training data defines what we want our models to learn, and testing data provides the means by which their empirical progress is measured. Benchmark datasets define the entire world within which models exist and operate, yet research continues to focus on critiquing and improving the algorithmic aspect of the models rather than critiquing and improving the data with which our models operate. If "data is the new oil," we are still missing work on the refineries by which the data itself could be optimized for more effective use. In this talk, I will discuss data excellence and lessons learned from software engineering to achieve the scale and rigor in assessing data quality.

Bio: Lora Aroyo is a Full Professor in Computer Science, currently working as visiting research faculty at Google, NYC. Previously she was a visiting scholar at the Columbia Data Science Institute at Columbia University, New York. She is also Chief of Science for the NY-based startup Tagasauris, which works on hybrid machine learning and human-assisted computing strategies to enrich multimedia (e.g., video, images, and text) with meaningful information about its content, and ultimately to improve video search and discovery. Lora is an active member of the Human Computation, User Modeling & Semantic Web communities. She is president of the User Modeling community UM Inc, which serves as a steering committee for the ACM Conference Series “User Modeling, Adaptation and Personalization” (UMAP) sponsored by SIGCHI and SIGWEB. She is also a member of the ACM SIGCHI conferences board. Since 2010 she has actively worked towards shaping the concept of “User-Centric Data Science”, which ultimately led her to form and head the User-centric Data Science group at the Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands. As an expert in user-centric data science, Lora conceived the vision of a user-centric experimental lab for computer science researchers at the VU University Amsterdam. She headed the team that made it possible in 2010 to open VU INTERTAIN Lab – the first of its kind in an academic environment. Throughout her career, Lora has been a principal investigator of a large number of research projects, and she has organized conferences, workshops, and tutorials to bring together methods and tools from human computation, linked (open) data, data science & human-computer interaction with the goal of building hybrid human-AI systems for augmenting both machine and human intelligence for understanding text, images, and videos with humans-in-the-loop and machines-in-the-loop.
Her research projects focusing on semantic search, recommendation systems, and personalized access to online multimedia collections have had a major impact and established her as a recognized leader in human computation techniques for specific domains, such as digital humanities, cultural heritage, and interactive TV.

Program (PDF Version)

The workshop starts at 1PM on Dec. 15 (Eastern Standard Time)
New York, USA Wed, 15 Dec 2021 at 13:00 EST
Paris, France Wed, 15 Dec 2021 at 19:00 CET
Tokyo, Japan Thu, 16 Dec 2021 at 03:00 JST

Labels: [LR] Research Paper (Long Presentation), [R] Research Paper (Short Presentation), [W] Project-in-Progress Paper (Short Presentation), [P] Practitioner Paper (Short Presentation)

1:00PM Opening (WS Chairs)

1:05-2:56 Session 1 (Chair: Alex Quinn)

1:05 Keynote by Lora Aroyo (Google)
Data Excellence: Better Data for Better AI

1:55-2:00 Short Break

2:00 [LR] Yuya Itoh and Shigeo Matsubara. Adaptive Budget Allocation for Cooperative Task Solving in Crowdsourcing
2:20 [P] Thais Rodrigues Neubauer, Glaucia Pamponet Sobrinho, Marcelo Fantinato, and Sarajane Marques Peres. Visualization for Enabling Human-in-the-Loop in Trace Clustering-based Process Mining Tasks
2:32 [W] Kanta Negishi, Hiroyoshi Ito, Masaki Matsubara, and Atsuyuki Morishima. A Skill-based Worksharing Approach for Microtask Assignment
2:44 [W] Shotaro Ishihara, Yuta Matsuda, and Norihiko Sawa. Editors-in-the-loop News Article Summarization Framework with Sentence Selection and Compression

2:56-3:20 Break

3:20-4:30 Session 2 (Chair: Senjuti Basu Roy)

3:20-3:40 [LR] Naofumi Osawa, Hiroyoshi Ito, Yukihiro Fukusima, Takashi Harada, and Atsuyuki Morishima. BUBBLE: A Quality-Aware Human-in-the-loop Entity Matching Framework
3:40-3:52 [R] Yunchong Zhang, Baisong Liu, Jiangbo Qian, Jiangcheng Qin, Xueyuan Zhang, and Xueyong Jiang. An Explainable Person-Job Fit Model Incorporating Structured Information
3:52-4:04 [P] Rina Kagawa, Yukino Baba, and Hideo Tsurushima. A Practical and Universal Framework for Generating Publicly Available Medical Notes of Authentic Quality via the Power of Crowds
4:04-4:16 [P] Ying Zhong, Masaki Kobayashi, Masaki Matsubara, and Atsuyuki Morishima. Does Multi-Hop Crowdsourcing Work? A Case Study on Collecting COVID19 Local Information
4:16-4:28 [W] Ayato Watanabe and Keishi Tajima. Spammer Detection Based on Task Completion Time Variation in Crowdsourcing

4:30-4:40 Break

4:40-5:50 Session 3 (Chair: Atsuyuki Morishima)

4:40-5:00 [LR] Yuhao Chen and Farhana Zulkernine. BIRD-QA: A BERT-based Information Retrieval Approach to Domain Specific Question Answering
5:00-5:12 [R] Catherine Inibhunu, Carolyn McGregor, and Edward Pugh. An Alert Notification Subsystem for AI Based Clinical Decision Support: A Prototype in NICU
5:12-5:24 [W] Munenari Inoguchi. Development of Prototype System to Generate Chronological Response Scenario Dataset by Assembling Multi-responders’ Action Logs at Past Disaster
5:24-5:36 [W] Ryusei Arisawa, Panote Siriaraya, Da Li, Kazutoshi Sumiya, Yukiko Kawai, and Shinsuke Nakajima. A Rival Recommendation Approach for Acoustic AR Running Support System Considering the Athletic Ability of Users
5:36-5:48 [W] Hisatoshi Toriya, Ashraf Dewan, and Itaru Kitahara. A Method to Correct Perspective Distortion of Ground Area without Camera Parameters

5:50 Closing Remarks (WS Chairs)

Important Dates

  • Oct 15 (Fri), 2021: Due date for workshop paper submissions (extended)
    (Authors have to submit the title and abstract by Oct. 8 (Fri))
  • Nov 10 (Wed), 2021: Notification of paper acceptance to authors
  • Nov 20 (Sat), 2021: Camera-ready versions of accepted papers due
  • Dec 15-18 (Wed-Sat), 2021: Workshops


All papers must be submitted electronically through the submission page. Please prefix your submission category, such as [Research Paper], to the title in the Title of Paper field on the submission page. For example, if you would like to submit a project-in-progress paper "Crowd-centric Approach to Digital Archive Maintenance," you have to put "[project-in-progress paper] Crowd-centric Approach to Digital Archive Maintenance" into the Title of Paper field.
All papers accepted for the workshop will be included in the Workshop Proceedings published by the IEEE Computer Society Press, made available at the Conference.

Submission Categories

  • Research papers (*) (long presentation): They report significant and original results relevant to the scope of this workshop. We solicit innovative or thought-provoking work, which does not necessarily have to be complete. The expected length is between 4 and 6 pages. The maximum length is 10 pages, though the paper length should be commensurate with the size of the contribution.
  • Practitioner papers (*) (long presentation): They present interesting problems that require human-in-the-loop solutions in a variety of application domains, or present interesting results of applying existing human-in-the-loop solutions to their domains. The expected length is between 4 and 6 pages. The maximum length is 10 pages, though the paper length should be commensurate with the size of the contribution.
  • Project-in-progress papers (short presentation): They present the goals, challenges, and preliminary results of research or real-world projects in progress. The maximum length is 3 pages.
(*) Some of the papers submitted to the research or practitioner paper categories may be accepted as project-in-progress papers and allotted to short presentation slots.


Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines given on the IEEE Bigdata 2021 CFP page.



Organization

Workshop Chairs

Senjuti Basu Roy (NJIT)
Alex Quinn (Purdue University)
Atsuyuki Morishima (University of Tsukuba)

Program Committee (to be extended)

  • Yukino Baba (University of Tsukuba)
  • Wolf-Tilo Balke (Technische Universitaet Braunschweig)
  • Ria Mae Borromeo (University of the Philippines Open University)
  • Francois Charoy (University of Lorraine, Inria, CNRS)
  • Marina Danilevsky (IBM Research - Almaden)
  • Ashraf Dewan (Curtin University)
  • Gianluca Demartini (University of Queensland)
  • Shady Elbassuoni (American University of Beirut)
  • Ujwal Gadiraju (Delft University of Technology)
  • David Gross Amblard (Rennes 1 University / IRISA Lab)
  • Munenari Inoguchi (University of Toyama)
  • Vana Kalogeraki (Athens University of Economics and Business)
  • Masaki Matsubara (University of Tsukuba)
  • Satoshi Oyama (Hokkaido University)
  • Raghav Rao (University of Texas at San Antonio)
  • Naoki Sakai (National Research Institute for Earth Science and Disaster Resilience)
  • Keishi Tajima (Kyoto University)
  • Hisatoshi Toriya (Akita University)
  • Saravanan Thirumuruganathan (QCRI)
  • Demetris Zeinalipour (University of Cyprus)