Call for Papers: ICML 2021 Workshop
Machine Learning for Data: Automated Creation, Privacy, Bias
Website: https://sites.google.com/view/ml4data
Virtual conference
Date: July 23 or 24 (TBD), 2021 (submission deadline: June 10, 2021)
=============================
Call for Papers:
We invite researchers to submit their recent work that studies how ML techniques can be used to facilitate and automate a range of data operations (e.g. ML-assisted labeling, synthesis, selection, augmentation), and the associated challenges of quality, security, privacy, and fairness for which ML techniques can also enable solutions. Topics of interest include but are not limited to:
– Methods of using ML to assist human annotators in data labeling.
– Methods of automated data engineering, such as synthesis, augmentation, and re-weighting.
– Theories, methods, and studies to characterize, detect, or mitigate data bias.
– Methods of detecting private information in data and preserving privacy.
– Systems for automating data operations and analytics.
– Applications based on data-human-machine interactions.
Authors are welcome to submit 4-6 page papers, with unlimited space for references and supplementary materials. Submissions should follow the ICML 2021 style and formatting guidelines. The review process is double-blind. Submissions must not have been previously published or have appeared in the ICML main conference. Work currently under submission to another conference is welcome. Papers can be submitted at the following link: https://cmt3.research.microsoft.com/ICML2021ML4data
Submissions will be accepted as contributed talks or poster presentations. Accepted papers will be posted on the workshop website. Accepted papers are free to appear in other journals or conference proceedings.
Key Dates:
Submission Deadline: June 10, 2021 (11:59pm AOE)
Acceptance Notification: July 1, 2021
Workshop: July 23 or July 24 (TBD), 2021
Speakers:
Kamalika Chaudhuri (UCSD)
Aleksandra Korolova (USC) (tentative)
Hoifung Poon (Microsoft)
Alex Ratner (UW)
Dawn Song (UCB)
Eric Xing (CMU)
Organizers:
Zhiting Hu (UCSD, Amazon)
Willie Neiswanger (Stanford)
Benedikt Boecking (CMU)
Erran Li (Amazon, Columbia)
Yi Xu (Amazon)
Belinda Zeng (Amazon)
Workshop Overview:
As the use of machine learning (ML) becomes ubiquitous, there is a growing understanding of and appreciation for the role that data plays in building successful ML solutions. Classical ML research has focused primarily on learning algorithms and their guarantees. Recent progress has shown that data is playing an increasingly central role in creating ML solutions, such as the massive text data used for training powerful language models, (semi-)automatic engineering of weak supervision data that enables applications in few-label settings, and various data augmentation and manipulation techniques that lead to performance boosts on many real-world tasks. On the other hand, data is one of the main sources of security, privacy, and bias issues in deploying ML solutions in the real world.
This workshop will focus on the new perspective of machine learning for data: specifically, how ML techniques can be used to facilitate and automate a range of data operations (e.g. ML-assisted labeling, synthesis, selection, augmentation), and the associated challenges of quality, security, privacy, and fairness for which ML techniques can also enable solutions. In this workshop, we aim to bring together researchers and practitioners working on methodology, theory, applications, and systems to exchange ideas, identify key challenges, and advance the field towards the most exciting and promising future directions.