Artist and researcher Caroline Sinders is using research-based art projects to examine data and the impact of technology on society.
While data analysis has become one of the most valuable assets in a company’s arsenal, it is not without its flaws.
A key issue is how societal biases such as sexism can appear in datasets and AI algorithms because of the data that is fed into them. One woman who is trying to combat sexist data is Caroline Sinders, a machine learning design researcher and artist.
Speaking to SiliconRepublic.com, Sinders said she likes to use art as a mechanism for criticism.
“Art allows me to visualize current emergencies, or current imaginings, or possible speculative solutions. It also allows me to really play.”
Sinders has worked on a number of projects using data science, machine learning and art. One of these is the Feminist Data Set, a multi-year research-based art project that interrogates every step of the AI process, including data collection, data labeling, data training, and choosing the algorithm to use and the algorithmic model, in order to control for bias.
She said one of the reasons she wanted to bring art into the project was to engage community members in the process and allow people to ask questions about how to create an algorithmic model in a feminist way.
“If I had done this in a much more controlled environment, like in a lab for example, this would have been a much shorter project and we probably would have had a lot fewer participants,” she said.
“What I like about it is making an art project, extending it, it allows me to do things over and over again, and I can change things, I can adjust things, I can move things around. But also, it allows me to follow the provocations that the participants give. Instead of saying, ‘Oh, that’s a great idea, but it’s not important,’ it allows me to actually say like, ‘Oh, actually, let’s follow that thread for a second.’”
She added that running it as strictly as a formal research project would also limit the types of text participants could submit. Currently, participants can contribute any type of text to the project, including poetry, blog posts and song lyrics.
She said the text model will be “wrong” because of the different types of text being used. She also said that, unlike in natural language processing, she’s interested in manually annotating the data “to then try to fill some kind of additional narrative within it.”
“This is also an artistic choice. This becomes a form of poetry, which also becomes a form of text itself that can be turned back into text, but this is not how you would generate an NLP model. And I think that’s okay because it’s still an illustrative step because it’s like a kind of analysis maybe.”
Within the Feminist Data Set project, Sinders also created Technically Responsible Knowledge (TRK), a tool and advocacy initiative that highlights unfair labor in the machine learning pipeline.
It includes an open source data labeling and training tool and a salary calculator, and was designed to be usable by non-coders.
“I wanted to include this aspect of the datasheets, well, what’s a summary that someone would add about this? Who made it and what is it about and where did it come from? Why does this exist? And that becomes the way to sign a database,” she explained.
“One of the things I’m really interested in is this idea that maybe datasets should be thought of as organic entities that will one day expire. So what is the life cycle or lifespan of a data set? And then a data set needs a label. It should have the day it was made or the day it was finished, and who made it and where they are from. So those were other things that I incorporated into that as well.”
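The kind of label Sinders describes can be pictured as a small record attached to a data set. The following is only a minimal sketch of that idea, not TRK's actual format: the class name, every field name and the notion of a fixed lifespan in days are assumptions made for illustration, and the example values are placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class DatasetLabel:
    """Hypothetical data set label; all field names are illustrative."""
    name: str
    summary: str          # what the data set is about and why it exists
    makers: List[str]     # who made it
    origin: str           # where the makers are from
    finished_on: date     # the day the data set was finished
    lifespan_days: int    # assumed shelf life before the data counts as expired

    def expires_on(self) -> date:
        """Date after which the data set is treated as expired."""
        return self.finished_on + timedelta(days=self.lifespan_days)

    def is_expired(self, today: Optional[date] = None) -> bool:
        """True once `today` is past the expiry date (defaults to the current date)."""
        today = today or date.today()
        return today > self.expires_on()


# Placeholder values, not taken from any real data set.
label = DatasetLabel(
    name="example-text-collection",
    summary="Placeholder summary of what the data set contains and why.",
    makers=["Workshop participants"],
    origin="Placeholder location",
    finished_on=date(2020, 1, 1),
    lifespan_days=365,
)
```

Keeping the expiry as a computed date rather than a stored flag means the label never needs updating: anyone reading it later can check whether the data set has outlived its intended lifespan.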
AI, machine learning and the public good
Outside of her Feminist Data Set project, Sinders is also extremely passionate about design for the public good and has noticed many examples of how machine learning can be beneficial to society.
One area in which she saw AI used for social good was when she was a writing fellow with the Google People and AI Research (PAIR) team, where she looked at how different cities are using artificial intelligence.
One example was Amsterdam using artificial intelligence alongside humans to analyze non-emergency calls from the public, such as reports of fallen trees or illegal parking.
“They apparently had a lot of great success using that. It has helped them create different buckets and then helps people sort faster for the most part.
“One of the reasons they wanted to do this is that they recognized that a phone tree that they designed is probably really confusing to a regular constituent or consumer. They know which department should handle fallen trees, but a consumer might not know that.”
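The sorting into buckets described above can be illustrated with a toy example. This is not Amsterdam's system and it does not use machine learning: it is a plain keyword matcher standing in for a trained classifier, and the bucket names, keywords and the `route_report` function are all hypothetical. Reports that match nothing are handed to a person, echoing the idea of AI working alongside humans.

```python
import re
from typing import Dict, Set

# Hypothetical buckets and keywords; a real system would learn these from data.
BUCKETS: Dict[str, Set[str]] = {
    "parks_and_trees": {"tree", "trees", "branch"},
    "parking_enforcement": {"parked", "parking"},
}


def route_report(text: str, buckets: Dict[str, Set[str]] = BUCKETS) -> str:
    """Return the bucket whose keywords appear most often in the report."""
    # Split into whole words so that "street" does not match "tree".
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        name: sum(word in keywords for word in words)
        for name, keywords in buckets.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched: leave the decision to a human operator.
    return best if scores[best] > 0 else "needs_human_review"
```

A resident only has to describe the problem in their own words, and the routing step, rather than the caller, works out which department should handle it.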
Sinders also said machine learning has a big role to play when it comes to the climate crisis. When she was an artist-in-residence at the European Commission, researchers explained how they used machine learning to analyze changes in thousands of images of coastlines to monitor erosion, as well as other tools such as heat maps.
“Machine learning is just able to sort through those images much faster than a person can. And it’s also provided these different levels of analysis of how things have changed. So then machine learning becomes this additional extension of the researcher in a way and is able to provide this really useful analysis,” she said.
“I think there are a lot of interesting moves in the climate change space of companies using machine learning to help analyze aspects of climate change already, but then also designing and creating simulations of what the future is if we change different parts of our present,” she added. “I think this is a really great use of machine learning.”