The First IEEE Workshop on
Human-Machine Collaboration in BigData

co-located with IEEE BigData 2017
Boston, Dec. 11th


About HMData 2017


Human power is a key factor in maximizing the impact of big data technologies. This workshop addresses human-in-the-loop approaches across the big data lifecycle: collecting, processing, analyzing, utilizing, archiving, and disposing of data. Its purpose is to give students, researchers, and practitioners an opportunity to identify important research problems and exchange ideas on human-in-the-loop approaches in the big data context. To make the workshop attractive to all of these groups, we plan to solicit practitioner papers as well as research papers, in order to facilitate discussion between researchers who know solutions and practitioners who know problems. We would also like to make the workshop valuable for young researchers, so we plan to provide student travel support. All papers accepted for the workshop will be included in the Workshop Proceedings published by the IEEE Computer Society Press, made available at the Conference.


This workshop covers a wide range of human-related topics in the big data context, such as crowdsourcing, collaborative recommendation, crowdsensing, workflow models for humans and machines, incentives, human-assisted big data analysis, human interaction with big data, and human-machine collaboration on real-world problems.


Targeted Crowdsourcing with a Billion (Potential) Users
Panos Ipeirotis (NYU Stern School of Business)
Abstract: We describe Quizz, a gamified crowdsourcing system that simultaneously assesses the knowledge of users and acquires new knowledge from them. Quizz operates by asking users to complete short quizzes on specific topics; as a user answers the quiz questions, Quizz estimates the user's competence. To acquire new knowledge, Quizz also incorporates questions for which we do not have a known answer; the answers given by competent users provide useful signals for selecting the correct answers for these questions. Quizz actively tries to identify knowledgeable users on the Internet by running advertising campaigns, effectively leveraging "for free" the targeting capabilities of existing, publicly available, ad placement services. Quizz quantifies the contributions of the users using information theory and sends feedback to the advertising system about each user. The feedback allows the ad targeting mechanism to further optimize ad placement. Our experiments, which involve over ten thousand users, confirm that we can crowdsource knowledge curation for niche and specialized topics, as the advertising network can automatically identify users with the desired expertise and interest in the given topic. We present controlled experiments that examine the effect of various incentive mechanisms, highlighting the need for having short-term rewards as goals, which incentivize the users to contribute. Finally, our cost-quality analysis indicates that the cost of our approach is below that of hiring workers through paid-crowdsourcing platforms, while offering the additional advantage of giving access to billions of potential users all over the planet, and being able to reach users with specialized expertise that is not typically available through existing labor marketplaces.
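
Illustrative note (not part of the abstract): one way to read "quantifies the contributions of the users using information theory" is as an information-gain calculation. The Python sketch below is only an illustration under stated assumptions, not the Quizz implementation. It assumes multiple-choice questions with a fixed number of choices, estimates a user's competence as a smoothed fraction of correct answers on questions with known answers, and assumes that wrong answers are spread evenly over the incorrect choices. The function names and the smoothing constants are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def estimate_competence(correct, total, prior_correct=1.0, prior_total=2.0):
    """Smoothed fraction of known-answer questions the user got right.

    The add-one style smoothing is an assumption made for this sketch.
    """
    return (correct + prior_correct) / (total + prior_total)


def information_gain_bits(q, n_choices):
    """Bits gained per answer from a user with accuracy q on an
    n_choices-way question, relative to a uniform random guess.

    Assumes the user's errors are spread evenly over the wrong choices.
    """
    wrong = (1.0 - q) / (n_choices - 1)
    return math.log2(n_choices) - entropy([q] + [wrong] * (n_choices - 1))


if __name__ == "__main__":
    # A user who answered 9 of 10 known-answer questions correctly.
    q = estimate_competence(correct=9, total=10)
    print(information_gain_bits(q, n_choices=4))
```

Under these assumptions, a user who always answers correctly contributes log2(n) bits per answer, a user who guesses uniformly at random contributes zero, and multiplying the per-answer value by the number of answers gives a rough total contribution that could be reported back to an ad targeting system.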

Bio: Panos Ipeirotis is a Professor and George A. Kellner Faculty Fellow at the Department of Information, Operations, and Management Sciences at the Leonard N. Stern School of Business of New York University. He received his Ph.D. degree in Computer Science from Columbia University in 2004. He has received nine Best Paper awards and nominations and a CAREER award from the National Science Foundation, and he is the recipient of the 2015 Lagrange Prize in Complex Systems for his contributions in the field of social media, user-generated content, and crowdsourcing.


Labels: [FR] Full Research Paper, [R] Project-in-Progress Paper (Research-oriented), [P] Project-in-Progress Paper (Practice-oriented)

8:45 Opening (WS Chairs)

8:50-10:00 Session 1 (Chair: Senjuti Basu Roy)

8:50 [FR] Austin Graham, Yan Liang, Le Gruenwald, and Christan Grant. Formalizing Interruptible Algorithms for Human over-the-loop Analytics
9:20 [R] Masahiro Kazama and Viviane Takahashi. Active Preference Learning for Generative Adversarial Networks
9:30 [R] Xiaoni Duan and Keishi Tajima. Improving Classification Accuracy in Crowdsourcing through Hierarchical Reorganization
9:40 [P] Naoki Kobayashi, Masaki Matsubara, Keishi Tajima, and Atsuyuki Morishima. A Crowd-in-the-Loop Approach for Generating Conference Programs with Microtasks
9:50 [P] Munenari Inoguchi, Keiko Tamura, Kei Horie, and Haruo Hayashi. Clarifying the Transition of Workload for Victims Life Reconstruction Support Programs in Affected Local Governments Using the Victims Master Database -Comparison between the 2007 Chuetsu-oki Earthquake and the 2016 Kumamoto Earthquake-

10:00-10:20 Coffee Break

10:20-12:00 Session 2 (Chair: Keishi Tajima)

10:20 [FR] Takahiro Komamizu, Toshiyuki Amagasa, and Hiroyuki Kitagawa. Implicit Order Join: Joining Log Data with Property Data by Discovering Implicit Order-oriented Keys with Human Assistance
10:50 [R] Joseph Cottam, Leslie Blaha, Dimitri Zarzhitsky, Mathew Thomas, and Elliott Skomski. Crossing the Streams: Fuzz testing with user input
11:00 [R] Yuzuki Furuhashi, Masaki Matsubara, and Atsuyuki Morishima. Crowd-based Best-effort Number Estimation
11:10 [P] Mamiko Matsubayashi and Keiko Kurata. Conceptual design for comprehensive research support platform
11:20 [P] Koyo Kobayashi, Hidehiko Shishido, Yoshinari Kameda, and Itaru Kitahara. A Method to Generate Disaster-Damage Map by Using 3D photometry and Crowd Sourcing

11:30 Participants Self-Introduction Session - All Participants

To prepare for the session, we ask participants to answer the participant survey questions. The collected information will be shared with participants at the workshop venue during the session to foster communication and future collaboration.

12:00-13:30 Lunch

13:30 Keynote by Panos Ipeirotis (NYU Stern School of Business)
Targeted Crowdsourcing with a Billion (Potential) Users

14:30-15:30 Session 3 (Chair: Atsuyuki Morishima)

14:30 [FR] Hiroki Morise, Satoshi Oyama, and Masahito Kurihara. Collaborative Filtering and Rating Aggregation Based on Multicriteria Rating
15:00 [R] Michalis Papakostas, Konstantinos Tsiakas, Theodoros Giannakopoulos, and Fillia Makedon. Towards Predicting Task Performance from EEG Signals
15:10 [P] Hidehiko Shishido, Yutaka Ito, Youhei Kawamura, Toshiya Matsui, Atsuyuki Morishima, and Itaru Kitahara. Proactive Preservation of World Heritage by Crowdsourcing and 3D Reconstruction Technology
15:20 [P] Keiko Tamura and Naoshi Hirata. "DEKATSU" Activity of Data and Service Collaboration among Private Companies and Academic Institutions for Tokyo Metropolitan Resilience Project

15:30-16:10 Coffee Break with Posters

16:10-17:10 Session 4 (Chair: Satoshi Oyama)

16:10 [FR] Yoshitaka Matsuda, Yu Suzuki, and Satoshi Nakamura. A Trade-off between Estimation Accuracy of Worker Quality and Task Complexity
16:40 [FR] Panote Siriaraya, Yuriko Yamaguchi, Mimpei Morishita, Yoichi Inagaki, Reyn Nakamoto, Jianwei Zhang, Junichi Aoi, and Shinsuke Nakajima. Using categorized web browsing history to estimate the user's latent interests for web advertisement recommendation

17:10 Closing Remarks (WS Chairs)

Important Dates

  • Oct 13 (Fri), 2017: Due date for workshop paper submission
  • Oct 27 (Fri), 2017: Due date for student travel support applications
  • Nov 6 (Mon), 2017: Notification of paper acceptance to authors
  • Nov 15 (Wed), 2017: Camera-ready versions of accepted papers due
  • Dec 11, 2017: Workshop Day


All papers must be submitted electronically through CyberChair. Please prefix the Title of Paper field on the submission page with your submission category, such as [Research Paper]. For example, to submit a project-in-progress paper titled "Crowd-centric Approach to Digital Archive Maintenance," enter "[project-in-progress paper] Crowd-centric Approach to Digital Archive Maintenance" in the Title of Paper field.
All papers accepted for the workshop will be included in the Workshop Proceedings published by the IEEE Computer Society Press, made available at the Conference.

Submission Categories

  • Research papers (*) (long presentation): These report significant and original results relevant to the scope of this workshop. We solicit innovative or thought-provoking work, which does not necessarily have to be complete. The expected length is 4 to 6 pages. The maximum length is 10 pages, though the length should be commensurate with the size of the contribution.
  • Practitioner papers (*) (long presentation): These present interesting problems that require human-in-the-loop solutions in a variety of application domains, or interesting results from applying existing human-in-the-loop solutions to those domains. The expected length is 4 to 6 pages. The maximum length is 10 pages, though the length should be commensurate with the size of the contribution.
  • Project-in-progress papers (short presentation): They present the goals, challenges, and preliminary results of research or real-world projects in progress. The maximum length is 3 pages.
(*) Some of the papers submitted to the research or practitioner paper categories may be accepted as project-in-progress papers and allotted to short presentation slots.


Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines on the IEEE BigData 2017 CFP page.

Student Travel Support

We plan to provide student travel support to a selected student (maximum 115,000 JPY, which is about 1,000 USD, paid after the conference). You can apply for student travel support by Oct. 27 if you are a student and are both the primary author and the presenter of an HMData paper, should it be accepted. The support is intended to help students present their work.
The eligibility criteria are:
  1. The applicant (student author) must be a student in his/her institution.
  2. The student author must attend the conference and present the work as the first author.
To apply for a grant, please follow these five steps:
  1. Send the following information to us by email.
    The subject of the email should be "HMData 2017 Student Travel Award Application".

    1. CV containing Personal Information (Full Name, Address, Email) and Student Information (Degree Program, University or College, Date you entered the current program, Date your degree will be completed, Advisor)
    2. Travel Expense Estimate (Air-ticket US$, Hotel US$, Registration US$)

  2. Ask your supervisor to email us, confirming that you are a current student under their supervision and that you will be attending HMData 2017 if the submission is accepted. The subject of the email should be "HMData 2017 Student Travel Award Application Verification".

  3. The accepted applicant will be required to send us the flight schedule (provided by the airline or the travel agency) and bank account information in advance. Please note that the currencies available for the cash transfer are USD, EUR, GBP, AUD, CAD, CHF, NZD, THB, SGD, HKD, KRW, TWD, PHP, IDR, INR, SEK, DKK, NOK, CZK, PLN, HUF, TRY, ZAR, AED, SAR, KWD, QAR, MXN, and JPY.

  4. The accepted applicant will be required to give us the original receipts (hotels, registration, air-tickets) and the original air-ticket stubs at the conference.

  5. After the conference, once the expenses have been verified against the original receipts, air-ticket stubs, and in some cases a credit card statement, we will transfer the cash to your bank account.

Award Details

  • Eligible applications will be evaluated on a competitive basis.
  • Student support is intended to cover a portion of the travel expenses for students who have difficulty in receiving other financial support.
  • Expenses eligible for coverage include conference registration fees, air-ticket expenses to and from the conference, and accommodation expenses while attending IEEE BigData.
  • The amount of the award depends on the type of travel and the applicant's needs and does not exceed 115,000 JPY (about 1,000 USD as of Oct. 5).



Atsuyuki Morishima (University of Tsukuba)
Senjuti Basu Roy (NJIT)
Lei Chen (HKUST)

Program Committee

Sihem Amer-Yahia (CNRS/LIG)
Wolf-Tilo Balke (Technische Universität Braunschweig)
Adam Bradley (Amazon)
Jay Byungkyu Kang (Nokia Technologies)
Reynold Cheng (University of Hong Kong)
Marina Danilevsky (IBM Research Almaden)
Takahiro Hara (Osaka University)
Yoshiharu Ishikawa (Nagoya University)
Munenari Inoguchi (Shizuoka University)
Hyunjoon Jung (Apple)
Vana Kalogeraki (Athens University of Economics and Business)
Hisashi Kashima (Kyoto University)
Itaru Kitahara (University of Tsukuba)
Dongwon Lee (Penn State University)
Guoliang Li (Tsinghua University)
Shinsuke Nakajima (Kyoto Sangyo University)
John O'Donovan (University of California, Santa Barbara)
Masato Oguchi (Ochanomizu University)
Satoshi Oyama (Hokkaido University)
Nobuyuki Shimizu (Yahoo! JAPAN Research)
Keishi Tajima (Kyoto University)
Saravanan T (QCRI)
Masashi Toyoda (The University of Tokyo)
Lirong Xia (Rensselaer Polytechnic Institute)
Vladimir Zadorozhny (University of Pittsburgh)
Demetris Zeinalipour (Max Planck Institute for Informatics and University of Cyprus)