When the workers in the loop are volunteers, we cannot rely on payment to motivate them. Of course, we need efficiency: if we waste volunteers' time, they will stop contributing, putting the sustainability of the system at risk. But we also need to support engagement through experience design to make the work worthwhile, which requires considering how incorporating AI changes workers' experiences. If applying machine learning to a task eliminates the "easy" work, leaving only increasingly difficult tasks, it may discourage volunteers and sabotage the entire system. Moreover, volunteers are smart and will figure out that they are working alongside machines; how do we manage this? Will it bias their work? Will they second-guess themselves, or be relieved to have machine support? We could see a boost in performance or a drop in engagement, which might change who participates and how, affecting data quality. We do not know how people will respond to this scenario, but the future of work will require that we find out, so that we can balance sustainable and engaging experiences against maximizing system efficiency to create effective human-in-the-loop systems.
CrowdWorks is the largest crowdsourcing platform in Japan, and we went public on the Tokyo Stock Exchange in 2014. Over 400 thousand companies, including Toyota, Honda, Sony, Panasonic, and many small businesses, are registered as clients, and over 3 million workers now use our services. There are over 200 types of work that can be requested of a crowd worker. Clients mainly offer jobs such as writing articles, design, programming, engineering, online secretarial work, and customer support. Who is registered on CrowdWorks? 97% of users are individuals. Most of our workers are in their 30s and 40s, but our service is used by people of all ages, and is also valued by seniors and stay-at-home mothers. In terms of contract amount, meaning the price of a job, the top category is programming and the second is design. In terms of the number of contractors, the top categories are design and writing.
I am exploring potential collaborations between business and academia to bridge theory and practice in the future of work. I have five real-world crowdsourcing business issues. The 1st issue is ensuring the quality of the evaluations of users and clients. On any crowdsourcing platform, the quality of the work must be ensured, and low-quality users and clients must be identified. However, the accuracy, reliability, and objectivity of the evaluations are still not guaranteed. The 2nd issue is optimizing the order price. Can we trust the "Invisible Hand"? The more users participate in the bidding, the more likely the order price is driven downward. On the other hand, the cheapest bidder does not always leave the client fully satisfied. Clients are often more satisfied with highly skilled workers, even if the price is much higher. The 3rd issue is preventing fraud and scams. In our business and many others, there are spammers and multi-level marketing companies that register as clients or workers and scam others to steal their money. We have built algorithms to detect such fraudulent users, but the scammers have developed ways to avoid detection, and there are limits to what humans can detect. The 4th issue is preventing money laundering. Similar to the previous issue, there are money launderers who actively work to steal money from our platform. They differ from MLMs and scammers in that they act as good clients for a long time -- building a good reputation over a few months -- and then suddenly launder money by abusing the CrowdWorks advance payment system.
The 5th issue is balancing demand and supply on the platform. Every crowdsourcing platform has both supply and demand; on our platform, demand is higher than supply. The reason is not completely known: perhaps contract prices are too low, or there are not enough workers with the skills required for the contracted work. With academia, I would like to explore ways in which AI can balance demand and supply. In the future, I imagine that when a client registers a job, an AI will tell the client the contract rate for that type of job and give suggestions to increase the rate (e.g., raise the price, or include more details about the job). I hope to work with each of you to discuss how current and future research can support businesses in resolving these issues.
In this presentation I discussed two related research challenges from a collaboration between psychology and computer science researchers. The research context is collaborative creativity, a research paradigm that brings groups of people together to come up with novel uses for commonplace objects (e.g., a coffee mug, a shoelace, a brick). One objective is to use a wide array of individual characteristics, personality dimensions, and prior performance, as well as combinations of such variables in group settings, to predict the number of unique ideas generated. The second, more challenging objective is to use a computational approach that taps the language of the generated ideas to evaluate their quality and novelty, so that computers and AI agents can facilitate human creativity and prevent inefficiency and process loss. This has proven, across other research domains, to be particularly difficult because of the known difficulties in deriving and using semantic meaning from human texts. We continue to work toward solving this problem and welcome solutions from the computational community.
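As a rough illustration of scoring novelty from idea text, the sketch below rates each idea by its dissimilarity to its nearest neighbour in the session, using simple bag-of-words cosine similarity. This is a baseline for illustration only, not the semantic methods under development, and the example ideas are invented.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def novelty_scores(ideas):
    """Novelty of each idea = 1 - similarity to its nearest neighbour."""
    vecs = [Counter(idea.lower().split()) for idea in ideas]
    return [1.0 - max((cosine(v, w) for j, w in enumerate(vecs) if j != i),
                      default=0.0)
            for i, v in enumerate(vecs)]

ideas = ["use the brick as a doorstop",
         "use the brick as a paperweight",
         "grind the brick into pigment for paint"]
print(novelty_scores(ideas))  # the third idea scores as most novel
```

A lexical measure like this misses paraphrases, which is exactly the semantic-meaning difficulty the abstract points to; embedding-based similarity would be the natural next step.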
Crisis mapping is the process of real-time collection and visualization of crisis data for humanitarian relief. Focusing on a real-life event, the 2010 Haiti earthquake, we explore the specific aspect of information categorization on a crisis mapping platform (known as Ushahidi). Information categorization is a process wherein messages from affected citizens (victims) are categorized by crowd volunteers (digital humanitarians) for use by crisis responders. In order to make categories actionable for onsite response and recovery efforts, first responders have to be confident that the categories are reliable. However, the reliability of information categorization has been questioned in the context of crowdsourced crisis mapping. This is because much of the victim reported information is ambiguous or incomplete and because crowd volunteers may not have the requisite training for processing such information. This leads to the following research questions: Are there features of citizen-reported crisis messages that can lead to reliable categorization? How can the capabilities of technology platforms help the online crowd volunteers in reliable categorization?
To investigate the reliability of categorization by crowd volunteers, we utilize agreement (majority vote or consensus) among volunteers and/or evaluators. We compare information categorization by Ushahidi volunteers during the event with post-event categorization by different groups of evaluators – CrowdFlower volunteers and a Registered Nurse-led team. We identify categories that have varied degrees of evaluator agreement. Subsequently, we employ collective sensemaking as an overarching framework to investigate the drivers of agreement within information categorization. Collective sensemaking is the shared comprehension of crisis messages that is facilitated through an understanding of the duality of (a) the crisis context and (b) the interaction of crowd volunteers with the crisis mapping platform. We develop a research model that characterizes agreement within information categorization in terms of social and situational cues, as well as information structuring and crisis mapping interactions (captured through posts within crisis reports). Based on an analysis of 1,459 crisis reports in the Ushahidi crisis mapping platform for the 2010 Haiti Earthquake, we identify the cues that are positively associated with agreement among crowd volunteers and crowd evaluators. In addition, crowd interaction with the platform is also seen to have a positive relationship with agreement regarding information categorization.
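The agreement notion above (majority vote among volunteers) can be made concrete with a small sketch; the category labels and reports below are hypothetical, not data from the study.

```python
from collections import Counter

def agreement(labels):
    """Fraction of annotators who voted for the majority category."""
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)

# Hypothetical volunteer labels for three crisis reports
reports = {
    "report_1": ["medical", "medical", "medical", "shelter"],
    "report_2": ["food", "water", "food", "food"],
    "report_3": ["shelter", "medical", "food", "water"],  # low consensus
}
for report_id, labels in reports.items():
    print(report_id, agreement(labels))
```

Reports like `report_3`, where no category attracts a clear majority, are the ones whose reliability the research questions above are concerned with.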
KEYWORDS: Crisis mapping, Digital humanitarianism, Crowd-sourcing, Information categorization, Collective sensemaking, Ushahidi, Haiti earthquake, Social cues, Situational cues, Posting mechanism
First, I come from the field of practice. Whenever a disaster occurs, I visit the affected area and monitor the activities of local responders and survivors. In the affected area, I identify issues that can be solved by ICT, develop support tools on-site as prototype systems, and implement them to validate the effects of ICT myself. I know there are many kinds of advanced technology, methodology, knowledge, and expertise in the field of ICT; however, there is no established way to integrate them to solve the actual problems arising at a disaster site. My main purpose in joining this meeting is to find effective Human-in-the-Loop solutions for disaster response and damage detection in actual disasters.
To conduct effective disaster response, responders have to grasp the actual damage situation in order to design a strategic plan. However, in a huge disaster this always takes much time. In my presentation, I describe how long it took to confirm disaster damage after the 2011 Tohoku Earthquake. In that case, it took one month to confirm human damage and six months to confirm building damage. Confirming building damage took so long because local responders inspected buildings by visiting them one by one; with this method, one inspection team can assess the damage of only about 30 buildings per day. In response, my research team established a web-GIS system in which remote users could judge whether each building had been washed away by referring to aerial photos taken after the disaster. This was a kind of crowdsourcing. At the time, we had no knowledge of how to promote it effectively and efficiently, so it took about three months to complete the judgement of building damage. After this activity, I met Prof. Morishima and learned the framework of crowdsourcing, and I found that crowdsourcing is an effective and essential way to clarify the damage situation urgently after a disaster occurs.
In recent research, I tried to detect roof damage from images taken by drones in a case study of the 2019 Yamagata-oki Earthquake. Murakami City in Niigata Prefecture was affected by this earthquake, and many buildings were damaged; most of the damage, however, was concentrated on building roofs. We therefore decided to detect roof damage with drones. Our research team designed the drone flight plan and operated the drones to photograph damaged roofs. We created an orthophoto mosaic from those images and published it for Murakami City officers on a cloud-based GIS platform. However, preparing this environment took much time because we did not have enough experience with image processing. Once it was ready, the officers surveyed the roof damage of each building by referring to the orthophoto mosaic, and they understood that survivors were suffering from roof damage. Murakami City then decided to create a new support program for roof damage relief. We are now attempting to monitor the progress of survivors' life reconstruction by taking images of roof damage periodically. Here, we try to use deep-learning-based object detection to detect roof damage, which is indicated by blue sheets covering damaged roofs. The accuracy of blue-sheet detection is not yet high, so we will look for effective Human-in-the-Loop methods to improve the training data and the processing model.
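One way to combine an imperfect blue-sheet detector with human review can be sketched as confidence-based triage, assuming the detector emits a confidence score per candidate building; the thresholds and detections below are invented for illustration.

```python
def triage(detections, accept_thr=0.9, reject_thr=0.3):
    """Auto-accept confident detections, send uncertain ones to a human."""
    accepted, review, discarded = [], [], []
    for det in detections:
        if det["score"] >= accept_thr:
            accepted.append(det)        # added to the confirmed damage map
        elif det["score"] >= reject_thr:
            review.append(det)          # routed to a human annotator
        else:
            discarded.append(det)       # likely background
    return accepted, review, discarded

detections = [
    {"building": "A-12", "score": 0.95},  # clearly a blue sheet
    {"building": "B-07", "score": 0.55},  # uncertain: ask a human
    {"building": "C-03", "score": 0.10},  # likely not a blue sheet
]
accepted, review, discarded = triage(detections)
print([d["building"] for d in review])  # humans see only the uncertain case
```

Human verdicts on the review queue can then be fed back as new training examples, which is the improvement loop for teaching data and model mentioned above.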
KEYWORDS: Disaster Response, Damage Detection, Rational Decision Making
We introduce CrowdFlow, a high-level language for complex crowdsourcing workflows based on collaborative data-centric workflows.
In this demo, I show the results of a workflow written with CrowdFlow and compiled for the platform Headwork. In this workflow, a worker can decide to answer a task or redistribute it to others if she cannot complete it herself.
Cyber-Physical Systems (CPS) are used in many applications where they must perform complex tasks with a high degree of autonomy in uncertain environments. Traditional design flows based on domain knowledge and analytical models are often impractical for tasks such as perception, planning in uncertain environments, and control with ill-defined objectives. Machine learning based techniques have demonstrated good performance on such difficult tasks, leading to the introduction of Learning-Enabled Components (LECs) in CPS. Model-based design techniques have been successful in the development of traditional CPS, and toolchains which apply these techniques to CPS with LECs are being actively developed. However, there are still several gaps in understanding the risks involved in the use of LEC-based approaches. In this talk we examine the underlying differences between traditional CPS design and CPS design with LECs. We also examine the problems of assuring the correctness of LECs and their impact on the overall safety of the system.
In recent times, data is considered synonymous with knowledge, profit, power, and entertainment, requiring the development of new techniques to extract useful information and insights from data. In this talk, I will describe our work in intervention-based data analysis toward the goal of understanding data and ensuring interpretability of query answers for a broad range of users. First, I will talk about approaches to explaining query answers in terms of "intervention" (how changes in the input data change the output of a query) and "counterbalances" (how an outlier can be explained by an outlier in the opposite direction). Then I will discuss how to facilitate the understanding and exploration of data and query answers with useful graphical interfaces. Finally, I will discuss how to go beyond correlation via causal inference from observational data, drawing on the Statistics literature, and how it benefits from techniques in data management.
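A toy version of the "intervention" idea: remove the tuples satisfying a candidate explanation predicate, re-run the aggregate query, and report the change in the answer. The table, query, and predicate below are invented and greatly simplify the actual framework.

```python
sales = [
    {"region": "north", "amount": 100},
    {"region": "north", "amount": 120},
    {"region": "south", "amount": 400},
    {"region": "south", "amount": 380},
]

def query(rows):
    """The query being explained: a simple SUM aggregate."""
    return sum(r["amount"] for r in rows)

def intervention_effect(rows, predicate):
    """How much the answer drops when tuples matching `predicate` are removed."""
    remaining = [r for r in rows if not predicate(r)]
    return query(rows) - query(remaining)

# Candidate explanation: "the south region drives the high total"
print(intervention_effect(sales, lambda r: r["region"] == "south"))  # 780
```

Ranking candidate predicates by such effects is one way to surface the tuples most responsible for an answer.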
Currently, crowdsourcing employs a small fraction of the human workforce, numbering in the hundreds of thousands. In the future, it is possible that the vast majority of the workforce will be involved in crowdsourcing-style platforms, so it is important to think about the design principles for these next-generation platforms. Such platforms must be inherently collaborative and customized to each task. They must also be installable on-premise so that they can be used by any organization. We are currently working on such systems for two domains, manual data cleaning and systematic review, with promising results.
As platforms are among the main drivers of the transformation of work, understanding current issues in crowdsourcing is a first step toward handling the forthcoming challenges of the future of work. This presentation offers an overview of current crowdsourcing issues in the Information Systems field, a branch of management science. Crowdsourcing studies in Information Systems address five main questions: What is crowdsourcing? Under which circumstances should work be outsourced to the crowd? How should the crowd be incentivized? What are the best ways to manage the crowd for creating value? What are the drawbacks of crowdsourcing? While existing studies aim to optimize crowdsourcing activities for all stakeholders, they also point out the growing risk of crowd exploitation. Envisioning the future of work therefore demands a deep understanding of the distribution of power between workers, requesters, and platforms. We contend that crowd exploitation can be overcome through ethical reflection with a greater focus on platforms' responsibilities.
In this talk I discussed recent research on ways to improve the crowd worker experience, including 1) work on understanding the phenomenon of workers abandoning tasks after having partially completed them (such partially completed tasks result in unrewarded work, since workers abandon the tasks before completion), and 2) research on the power imbalance between workers and requesters, in which we provided workers with tools to be aware of which tasks their work quality is being evaluated on, by automatically detecting gold questions in crowdsourcing tasks.
We present efficient algorithmic mechanisms to partition graphs with up to 1.8 billion edges into subgraphs which are fixed points of degree peeling. Fixed points that are larger than a desired interactivity parameter are further decomposed with a novel linear-time algorithm into what we call "Graph Waves and Fragments". Fixed points, Waves, and Fragments have visual representations that we call Vases and Trapezoid Forks. We illustrate these visual abstractions in 2D and 3D with a variety of publicly available data sets, including social, web, and citation networks.
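As background, iterative degree peeling at a single threshold k is the classic k-core computation; the sketch below shows only that building block (the fixed-point partitioning and the Waves/Fragments algorithm in the talk are more involved), and the example graph is illustrative.

```python
def k_core(adj, k):
    """Iteratively delete vertices of degree < k; return the survivors."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) < k:
                for u in adj[v]:
                    adj[u].discard(v)  # remove edges to the peeled vertex
                del adj[v]
                changed = True
    return set(adj)

# A triangle {1, 2, 3} with a pendant vertex 4: the 2-core is the triangle.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(k_core(graph, 2))  # {1, 2, 3}
```

A subgraph in which peeling at its own minimum degree removes nothing, like the triangle here, is the intuition behind a fixed point of degree peeling.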
Current crowdsourcing platforms (potentially turning into Future of Work platforms) are very limited: tasks are repetitive (micro-tasks), and participants' skills, career paths, and promotion mechanisms are not modeled, or only barely. Moreover, complex tasks (beyond the simple chaining of simple tasks) cannot be expressed. In this talk, we describe the concepts we are discussing within the HEADWORK project (ANR funding) that could overcome these limitations. For example, we propose that declarative workflows (in the spirit of Business Artifacts) should be at the center of the model, allowing complex jobs to be described, delegated, or redesigned by users themselves. We also present the HEADWORK prototype, which illustrates these notions along with skill modeling and formal verification of job designs.
Project homepage: http://headwork.gforge.inria.fr/
Sourcecode homepage: https://gitlab.inria.fr/druid/headwork (ask for access if needed)
Many sources of data are intensional in the sense that the data is not directly available in extension; instead, accessing the data has a cost (which can be a computational cost, a monetary cost, time, a privacy budget…). This is the case in crowdsourcing, where one needs to pay or incentivize workers and wait for tasks to be completed in order to get access to data; but it also applies to a large variety of settings (the deep Web, complex automated processes, reasoning over data, etc.). Intensional data management is about taking the cost of data access into account while solving a user's knowledge need, by building a recursive, dynamic, adaptive knowledge acquisition plan that minimizes access cost and provides probabilistic guarantees on the quality of the answer.
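A toy illustration of cost-aware acquisition planning: greedily query the source with the best expected gain per unit cost until a confidence target is reached. The sources, gains, costs, and the diminishing-returns update are all invented for illustration; real plans, as noted above, are recursive and adaptive rather than a fixed greedy sequence.

```python
def plan(sources, target_conf):
    """Greedy cost-aware acquisition: best expected gain per unit cost first."""
    conf, total_cost, order = 0.0, 0.0, []
    remaining = dict(sources)
    while conf < target_conf and remaining:
        name = max(remaining, key=lambda s: remaining[s][0] / remaining[s][1])
        gain, cost = remaining.pop(name)
        conf += (1.0 - conf) * gain  # diminishing returns on confidence
        total_cost += cost
        order.append(name)
    return order, conf, total_cost

# (expected gain, access cost) per source -- all numbers invented
sources = {"crowd": (0.6, 5.0), "web": (0.4, 1.0), "expert": (0.9, 20.0)}
order, conf, cost = plan(sources, target_conf=0.8)
print(order, round(conf, 3), cost)
```

The cheap web source is consulted first even though the expert is more reliable, which is the essence of trading answer quality against access cost.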
While the role of humans is increasingly recognized in the machine learning community, the representation of and interaction with models in current human-in-the-loop machine learning (HITL-ML) approaches are too low-level and far removed from humans' conceptual models. In this talk, I present ongoing work in my team to support human-machine co-creation with learning and human-in-the-loop techniques. In particular, I will focus on three topics: (1) how to use machine learning to leverage crowdsourced work effectively to achieve expert-level quality while minimizing expert workload; (2) SystemER: how to learn an explainable AI model with active learning and a declarative system; (3) HEIDL, a system that supports human-machine co-creation by exposing the machine-learned model through high-level, explainable linguistic expressions. In all three, the human's role is elevated from simply evaluating model predictions to interpreting and even updating the model logic directly, by enabling interaction with the rule predicates themselves. Raising the currency of interaction to such semantic levels calls for new interaction paradigms between humans and machines that improve productivity in the model development process. Moreover, by involving humans in the process, the human-machine co-created models generalize better to unseen data, as domain experts are able to instill their expertise by extrapolating from what automated algorithms have learned from a small amount of labelled data.
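The active-learning side of such systems can be sketched generically as uncertainty sampling: send the items the current model is least sure about to a human for labelling. This is a minimal illustration, not the actual SystemER implementation; the candidate pairs and probabilities below are made up.

```python
def uncertainty(prob):
    """0 for a confident prediction, 1 when the model is at 50/50."""
    return 1.0 - abs(prob - 0.5) * 2.0

def next_to_label(pool, model_prob, budget=2):
    """Pick the `budget` items the model is least certain about."""
    ranked = sorted(pool, key=lambda x: uncertainty(model_prob(x)), reverse=True)
    return ranked[:budget]

# Hypothetical entity-matching candidates with current model probabilities
pool = ["pair_a", "pair_b", "pair_c", "pair_d"]
probs = {"pair_a": 0.95, "pair_b": 0.52, "pair_c": 0.10, "pair_d": 0.48}
print(next_to_label(pool, probs.get))  # the two near-0.5 pairs go to a human
```

In the systems above, the human response goes beyond a label: experts can also inspect and edit the learned rules themselves.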
Among the diverse challenges that the future of work may face, I first argue that the retraining of human workers will be a critical issue in an AI-powered environment where humans and machines collaborate and compete. I then lay out a few intellectually interesting research questions that we need to solve.
In this talk, I have demonstrated FaiRank, an interactive system to explore fairness of ranking in online job marketplaces. FaiRank takes as input a set of individuals and their attributes, some of which are protected, and a scoring function, through which those individuals are ranked for jobs. It finds a partitioning of individuals on their protected attributes over which fairness of the scoring function is quantified.
A common practice in validating task assignment algorithms in crowdsourcing is to first ask a worker to perform all possible tasks. For example, if there are ten tasks, worker A should do all ten tasks. After the algorithm is executed and an assignment is generated, the metrics for only the assigned tasks are computed. If the algorithm assigns tasks 1 to 5 to worker A, metrics such as quality, cost, and latency are computed only for tasks 1 to 5. While this practice is cost-effective when comparing different task assignments, it does not capture the actual circumstances under which a task is completed. In this talk, I introduce a simple web application that takes three inputs: a database of tasks from the Figure Eight Open Data Library, HTML task templates, and a task assignment generated by an algorithm. It then allows workers to complete only the tasks assigned to them. The answers provided by workers, along with task metadata, are stored in an SQL database, which can be easily analyzed later on.
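The storage layer described above might look like the following minimal SQLite sketch; the table and column names are assumptions for illustration, not the application's actual schema.

```python
import sqlite3

# In-memory database standing in for the application's answer store
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE answers (
        task_id    TEXT,
        worker_id  TEXT,
        answer     TEXT,
        started_at TEXT,   -- task metadata useful for latency analysis
        ended_at   TEXT
    )
""")
conn.execute("INSERT INTO answers VALUES ('t1', 'w1', 'cat', '10:00', '10:01')")
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM answers").fetchone()[0]
print(count)  # 1
```

Keeping per-answer timestamps alongside the answer is what makes the later quality, cost, and latency analysis straightforward SQL.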
Body Worn Cameras (BWCs) are an emerging Information Technology (IT) artifact that has recently come into use in law enforcement. Although BWCs can bring many benefits, they may also result in negative outcomes, such as loss of citizens' privacy or failure to properly manage digital evidence. In this research, we develop a research model that focuses on police officers' perspective on the ethical use of BWCs, from the standpoint of organizational justice as well as the risks and benefits of using BWCs. The paper develops hypotheses to test the relationships between three factors -- 1) work environment, 2) risk of using BWCs, and 3) benefit of using BWCs -- and an outcome variable, police perceptions of the ethical use of BWCs. In addition, it tests the moderating effects of work-related uncertainty on the above three factors. Finally, relationships between organizational justice factors and work environment are also presented as hypotheses. The paper further develops three "gray" BWC scenarios that are neither clearly ethical nor clearly unethical, and we apply a survey methodology to test the research model in the context of these scenarios. The results show that work motivation and the risk of using BWCs are negatively related to police officers' perceptions of the ethical use of BWCs, that these relationships are negatively moderated by work-related uncertainty, and that justice constructs are positively associated with perceived work motivation. Understanding such issues will assist in the development of policies and help provide actionable guidelines for BWC use.
KEYWORDS: Police Body Worn Camera, Perceptions about Ethical Use of IT Artifact, Emerging Technology