In the most recent edition of The Economist, an article titled “New schemes teach the masses to learn AI” appeared. The article profiles the efforts of fast.ai, a Bay Area non-profit that aims to demystify deep learning and equip the masses to use the technology. I was mentioned in the article as an example of the success of this approach: “A graduate from fast.ai’s first year, Sara Hooker, was hired into Google’s highly competitive ai residency program after finishing the course, having never worked on deep learning before.” I have spent the last few days feeling uneasy about the article. On the one hand, I do not want to distract from the recognition of fast.ai. Rachel and Jeremy are both people that I admire, and their work to provide access to thousands of students across the world is both needed and one of the first programs of its kind. However, not voicing my unease is equally problematic since it endorses a simplified narrative that is misleading for others who seek to enter this field. It is true both that I attended the first session of fast.ai and that I was subsequently offered a role as an AI Resident at Google Brain. Nevertheless, attributing my success to a part-time evening 12-week course (parts 1 and 2) creates the false impression of a quick Cinderella story for anyone who wants to teach themselves machine learning. Furthermore, this implication minimizes my own effort and journey. For some time, I have had clarity about what I love to do. I was not exposed to either machine learning or computer science during my undergraduate degree. I grew up in Africa, in Mozambique, Lesotho, Swaziland and South Africa. My family currently lives in Monrovia, Liberia. My first trip to the US was a flight to Minnesota, where I had accepted a scholarship to attend a small liberal arts school called Carleton College. I arrived for international student orientation without ever having seen the campus before.
Coming from Africa, I also did not have any reference point for understanding how cold Minnesota’s winters would be. Despite the severe weather, I enjoyed a wonderful four years studying a liberal arts curriculum and majoring in Economics. My dream had been to be an economist for the World Bank. This was in part because the most technical people I was exposed to during my childhood were economists from organizations like the International Monetary Fund and the World Food Program. I decided to delay applying for a PhD in economics until a few years after graduation, instead accepting an offer to work with PhD economists in the Bay Area on antitrust issues. We applied economic modeling and statistics to real-world cases and datasets to assess whether price fixing had taken place or to determine whether a firm was misusing its power to harm consumers. A few months after I moved to San Francisco, some fellow economists (Jonathan Wang, Cecilia Cheng, Asim Manizada, Tom Shannahan, and Eytan Schindelhaim) and I started meeting on weekends to volunteer for nonprofits. We didn’t really know what we were doing, but we thought offering our data skills to non-profits for free might be a useful way of giving back. We emailed a Bay Area non-profit listserv and were amazed by the number of responses. We clearly saw that many non-profits possessed data, but they were uncertain about how to use it to accelerate their impact. That year, we registered as a non-profit called Delta Analytics and were joined by volunteers who worked as engineers, data analysts and researchers. Delta remains entirely run by volunteers, does not have any full-time staff, and offers all engagements with non-profits for free. By the time I applied to the Google AI Residency, we had completed projects with over 30 non-profits. Delta was a turning point in my journey because the data of the partners we worked with was often messy and unstructured.
The assumptions required to impose a linear model (such as homoscedasticity, no autocorrelation, normal distribution) were rarely satisfied. I saw first-hand how linear functions, a favorite tool of economists, fell short. I decided that I wanted to learn about more complex forms of modeling. I joined a startup called Udemy as a data analyst. At the time, Udemy was a 150-person startup that aimed to help anyone learn anything. My boss carved out projects for me that were challenging, had wide impact and pushed me technically. One of the key projects I worked on during my first year was collecting data, then developing and deploying Udemy’s first spam detection algorithm. Working on projects like spam detection convinced me that I wanted to grow technically as an engineer. I wanted to be able to iterate quickly and have end-to-end control over the models I worked on, including deploying them into production. This required becoming proficient at coding. I had started my career working in STATA (a statistical package similar to MATLAB), R, and SQL. Now, I wanted to become fluent in Python. I took part-time night classes at Hackbright and started waking up at 4 am most days to practice coding before work. This is still a regular habit, although now I do so to read papers not directly related to my field of research and carve out time for new areas I want to learn about. After half a year, while I had improved at coding, I was still not proficient enough to interview as an engineer. At the time, the Udemy data science team was separate from my Analytics team. Udemy invested in me. They approved my transfer to engineering where I started as the first non-PhD data scientist. I worked on recommendation algorithms and learned how to deploy models at scale to millions of people. The move to engineering accelerated my technical growth and allowed me to continue to improve as an engineer. In parallel with my growth at Udemy, I was still working on Delta projects.
There are two that I particularly enjoyed. The first (alongside Steven Troxler, Kago Kagichiri, Moses Mutuku) was with Eneza Education, an ed-tech social impact company in Nairobi, Kenya. Eneza used pre-smartphone technology to empower more than 4 million primary and secondary students to access practice quizzes by mobile texting. Eneza’s data provided wonderful insights into cell phone usage in Kenya as well as the community’s learning practices. We worked on identifying difficult quizzes that deterred student activity and on better tailoring pathways to individual need and ability. The second project was with Rainforest Connection (alongside Sean McPherson, Stepan Zapf, Steven Troxler, Cassandra Jacobs, Christopher Kaushaar) where the goal was to identify illegal deforestation using streamed audio from the rainforest. We worked on infrastructure to convert the audio into spectrograms. Once converted, we structured the problem as image classification and used convolutional neural networks to detect whether chainsaws were present in the audio stream. We also worked on models to better triangulate the sound detected by the recycled cellphones. In early 2017, I decided to start working on a curriculum to teach fundamental principles of machine learning. The decision was motivated by a desire to move Delta from a non-profit that bridged the skill gap to one that also built technical capacity all over the world. By empowering local communities to leverage their data, we were encouraging a more sustainable long-term intervention. I left Udemy and worked full-time with a group of volunteers at Delta (Hannah Song, Amanda Su, Jack Pfeiffer, Rosina Norton, Emily Rourke, Kevin Pan, Melissa Fabros) to develop a curriculum that included both theory and coding modules. I relocated with Hannah Song to Nairobi, Kenya to teach our pilot course. We constructed a local dataset by calling the Kiva API to pull all loans given out in Kenya over the last 10 years.
Melissa Fabros, Lina Huang and Sydney Wong are currently teaching a second iteration of this course in Agadir, Morocco and the teaching team has grown to include more incredible volunteers including Brian Spiering, Mario Carrillo, Thuongvu Ho and Parikshit Sharma. All in all, I have just described four years of effort and participation in the machine learning community that preceded my involvement with fast.ai. I will never know exactly why I was accepted to the Google AI residency program. However, I doubt it was solely because I took the fast.ai course. This is not to discount the value of what Rachel and Jeremy are doing. fast.ai is very special: it is part of a wider statement about access, empowerment, and democratization. The curriculum has open-sourced knowledge previously confined to a narrow set of research labs and PhD programs. Most importantly, they have moved discourse about deep learning out of the confines of academic conferences and made students feel comfortable applying technological advances to solve problems all over the world. However, I am concerned that the story in The Economist not only displaces my own narrative but also sets unrealistic expectations for anyone setting out in the field. The article discounts how difficult the road is and may unintentionally cause students to question themselves when they do not achieve their desired outcome immediately. The hard truth is that effort alone rarely fully explains someone’s achievements. Many people believed in me along the way, pushed me out of my comfort zone, and gave me the opportunity to work on high-impact, non-trivial problems that showcased my ability. I benefited from being in San Francisco, where the concentration of technical talent meant access to mentorship and immersion in interesting technology. There was also an element of luck. The Google AI residency program did not exist a few years ago. I am part of the second cohort in the program’s history.
The premise of the residency is as revolutionary as the motivation of fast.ai; the residency was created to open up the research field to candidates from diverse and atypical backgrounds. The success of the program is clear from the number of residency programs that have since been announced at other top research labs (Uber, Facebook, Microsoft, OpenAI). Our field needs more diversity. There must be more people like me who feel welcome and who are given the tools to succeed. However, part of preparing people to succeed is to be candid about how challenging it can be and how many failures there are along the way (see an excellent set of articles about failure here). It is misleading for The Economist to suggest that after a 12-week part-time course, you are at the finish line. It is also disconcerting to imply that Google Brain is the finish line. I love my job and the people I work with, but we must be cautious of suggesting this is the only outcome that deserves to be highlighted. There are many talented individuals around the world that are working on important problems and are not working at Google. If you are considering embarking on your own self-taught journey, I would suggest asking yourself one very important question — are you still passionate about machine learning if you do not make it to a company like Google? If you are uncertain about whether the answer is yes, proceed with caution. Ultimately, there is no certain path to achieving your goals. I work in an area I love with collaborators who continually inspire me to improve. Google Brain has given me the opportunity to work on important research on deep neural network interpretability and model compression. I have also had the chance to be part of initiatives like the Google Brain research lab led by Moustapha Cisse in Accra, Ghana. The tricky thing is that the more you know, the more you realize there is to know. There remain many sub-fields where my exposure is limited. 
I continue to ask questions, and always follow up when a concept is not clear. I never pretend to understand something when I do not. However, my unusual background also brings certain advantages. I have insights that my colleagues do not, and my exposure across fields often leads me to connect concepts in unexpected and novel ways. In 2017, I began teaching fundamentals of machine learning with the same motivation as Rachel and Jeremy. We are not alone: efforts like the Deep Learning Indaba, the AIMS masters program, Data Science Africa, the deep learning summer school and blog posts like distill.pub aim to open up the field to newcomers. My motivation for teaching is to ensure that anyone interested in machine learning is able to understand fundamental concepts in our field. For some, machine learning will be a passing hobby. For others, it will be the beginning of a journey to becoming a researcher/engineer/data scientist/analyst. Both outcomes are useful and enrich our community. We need more people with passing familiarity to enrich discussion of policy and how machine learning technology impacts society. We also need to encourage those who want to go further and contribute to fields of research and applied technology. My journey is undoubtedly still underway, and despite being slow and gradual, has always been deeply fulfilling. Acknowledgments: Thanks to Melissa Fabros, Amanda Su and Prajit Ramachandran who provided useful feedback and valuable edits to earlier drafts of this article. Note: The Economist never reached out to me or asked me to comment on the article prior to publication.
Author: Sara Hooker is a researcher at Google Brain working on training models beyond test-set accuracy to fulfill multiple criteria: interpretable, compact, robust, fair.