At first, this might seem like an odd and shocking topic to read about, let alone to analyze from the top down. Human trafficking and Data Science? OK, anything is possible nowadays, but you obviously need data. Where would it come from? And how exactly would you use it?
Those are some of the questions that might be bothering you. But if you take a step back and think about it, what I’m about to describe is nothing short of genius.
During my days in New York City, I was fortunate enough to attend several of the many meetups happening there. There were two talks at one particular meetup I went to in SoHo; the first was about a very interesting NLP-based chatbot that was being designed and deployed by a New York-based company. The one that followed blew my mind. The speaker, Mr. Eric Schles, is an Adjunct Professor at New York University and works as an Innovation Specialist at 18F, an office within the General Services Administration of the U.S. Government.
It all started when Eric was 12 and visited our beautiful country, India. On a family vacation, he witnessed an enslaved man in a small town outside Bangalore. He realized that the world was a much harsher place than he had thought and decided to dedicate his life to eradicating slavery! The human trafficking industry generates over 32 billion dollars a year, which is preposterous! After starting out as an economist, Eric got his master’s in Computer Science and began his journey as a Data Scientist.
How did he approach this problem?
Ingeniously, Eric used the web as his database to fight human trafficking. Especially in the USA, online classifieds such as Craigslist and Backpage are used extensively for jobs, gigs, help wanted, and buying, selling, renting or trading absolutely anything. Sadly, this includes humans as well. Prostitution is heavily advertised on such pages, with explicit images, phone numbers, addresses and other contact information.
Breaking it down, Eric’s goal was to identify victims of human trafficking. At the outset, it was unclear whether the majority of the women in these ads were being forced or not, and the ones who were forced would be the true victims. To find out, Eric built a Craigslist-scraping tool in Python, which he has made available on his GitHub profile.
During the talk, Eric went through almost all of his code to show the outcome of his efforts. He was able to classify several hundred Craigslist postings as likely coming from sex workers. Posts on Craigslist vary widely: some come from people seeking personal satisfaction, some from people who want to meet others, and some likely from sex workers.
So, in essence, this was a classification problem to begin with. Eric made use of TextBlob’s Naive Bayes classifier and built an excellent tool to address it (see “Simple Text Classification with Python and TextBlob”). Eric also showed us two graphs, which he developed through his scraping-and-classifying technique, on the ‘demand’ for and ‘supply’ of prostitutes.
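To make the classification idea concrete, here is a minimal sketch of a Naive Bayes text classifier written with only the Python standard library. It is a stand-in for illustration and is not Eric’s code or TextBlob’s implementation; the class name `TinyNaiveBayes`, the labels `flag` and `ok`, and the four training sentences are all made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


class TinyNaiveBayes:
    """Hypothetical minimal multinomial Naive Bayes with add-one smoothing."""

    def __init__(self, labeled_texts):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of posts
        self.vocab = set()
        for text, label in labeled_texts:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total_posts = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, post_count in self.label_counts.items():
            # Start from the log prior of the label.
            score = math.log(post_count / total_posts)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                smoothed = self.word_counts[label][word] + 1
                score += math.log(smoothed / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy training data; a real classifier needs many labeled posts.
train = [
    ("available tonight call now discreet", "flag"),
    ("new in town call anytime discreet", "flag"),
    ("looking for a hiking partner this weekend", "ok"),
    ("selling a used bike in good condition", "ok"),
]

classifier = TinyNaiveBayes(train)
print(classifier.classify("discreet call tonight"))
print(classifier.classify("used bike for sale"))
```

With real data, the quality of the hand-labeled training posts matters far more than the classifier itself, which is one reason a ready-made library classifier is a sensible starting point.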
After seeing the shocking statistic that the ‘demand’ for and ‘supply’ of prostitutes in New York City rise significantly during breaks like Thanksgiving, Eric directed his efforts toward finding the true victims of sex trafficking by using metrics. For example, if several addresses were provided and all of them were associated with the same contact information, he triangulated the true position and saved all of the phone numbers and, in many cases, the explicit images of the sex worker. The results were staggering.
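One way to picture the “same contact information, several addresses” metric is the sketch below. It is only an illustration under assumptions: the posting fields `text` and `address`, the helper names, the US-style phone pattern, and the sample postings (which use fictional 555 numbers) are all hypothetical and not taken from Eric’s code.

```python
import re
from collections import defaultdict

# Assumed US-style phone pattern, e.g. (212) 555-0101 or 212.555.0101
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def normalize_phone(raw):
    """Keep only the digits so different spellings of a number compare equal."""
    return re.sub(r"\D", "", raw)


def addresses_by_phone(postings):
    """Map each phone number found in the posting text to its set of addresses."""
    grouped = defaultdict(set)
    for post in postings:
        for raw in PHONE_RE.findall(post["text"]):
            grouped[normalize_phone(raw)].add(post["address"])
    return grouped


def shared_numbers(postings, min_addresses=2):
    """Return the numbers that appear with at least `min_addresses` addresses."""
    return {
        phone: sorted(addresses)
        for phone, addresses in addresses_by_phone(postings).items()
        if len(addresses) >= min_addresses
    }


# Made-up postings for illustration only.
postings = [
    {"text": "call (212) 555-0101", "address": "12 Example St"},
    {"text": "text 212.555.0101 anytime", "address": "98 Sample Ave"},
    {"text": "reach me at 718-555-0199", "address": "5 Placeholder Rd"},
]

print(shared_numbers(postings))
```

A number that keeps showing up across different addresses is only a signal worth a closer look, not proof of anything on its own.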
Through his brilliant approach, Eric extracted the results, phone numbers, and contact information and alerted the law enforcement agencies. Through this work and the collaborative efforts of many people across the country, the following has been achieved or made possible:
- Many victims of sex trafficking have been identified and saved.
- The CEO of Backpage was put behind bars, warning others that trafficking has a cost.
- The method and algorithm can be implemented on Craigslist to automatically remove a posting if it is related to prostitution, much like a spam classifier.
Self-care and Thoughts:
After the talk, Eric and I were both going to Brooklyn and so needed to catch the same train. This is where the real talk happened. As an aspiring Data Scientist, I shared my passion with Eric during our journey, and he could tell how much I wanted to use these techniques to help fight poverty, illiteracy and more. He said that it’s fantastic that people want to change the world. We both agreed that anyone pursuing these great goals must be very empathetic and compassionate, as those emotions are the motivation to address these issues.
But there is also an occupational hazard; Eric described how incredibly saddening it was to meet these victims in person and hear their stories. He said that sometimes he would be extremely overwhelmed and find it hard to function, which led him to say, “It’s equally important to take care of yourself and do things you like, if not more important.” I couldn’t agree with him more. So for people aspiring to change the world, know that your mental and physical health come first, or are at least tied for first place with your goals and your vision of improved lives.
You can’t solve an issue of this magnitude on your own, which is why you must find people who are obsessed with the same issue you care about and get their feedback and cooperation.
Eric has made available not only his code but also other Python tutorials (NLP, Deep Learning) on his GitHub profile, for everyone to see and be inspired by. He has also posted a tutorial on how he implemented the code related to sex slavery, following the scraping, running and classifying approach.
Eric and I both encourage people to go through the code and resources available… All in all, I would never have imagined that Data Science could be used to fight slavery, but it can. And not only slavery: Data Science is being used, as we speak, on issues such as Disease Progression, Education, Economic Development and many more! It’s equally important to have a comprehensive grasp of the issue itself, such as the disease in question, and not just of algorithms and computational expertise. Data Science and Machine Learning algorithms are being used to create state-of-the-art Self-Driving Cars and whatnot, but at the same time, similar algorithms can help build sustainable interventions that affect someone’s life directly. This is the beauty of Data Science.