Shahan Shahid Nawaz
I am a data scientist passionate about solving exciting problems and about data visualization.
I hold a Master of Information Management & Systems from UC Berkeley's School of Information, where I
studied data science, machine learning, natural language processing, and data visualization. I am
currently a Data Scientist/Researcher on the Othering and Belonging Institute's Equity Metrics team,
where I use data science to tackle questions of inclusivity, including residential zoning reform in
California.
During my master's, I built several data science projects involving Machine Learning,
Dimensionality Reduction, and Spatial Analysis; coded hardware to build a smart scent diffuser;
learnt swimming; and studied Punjabi in the Gurmukhi script.
In the past, I helped implement a randomized controlled trial in Pakistan aimed at mending the
citizen-state social contract. Between 2018 and 2020 I was based in Hong Kong, where I served as the
Centre for Asian Philanthropy and Society's main data resource and helped produce cross-regional,
one-of-a-kind data analysis.
I am open to Data Scientist/Analyst roles starting March 2024.
Email / CV / LinkedIn / Twitter / GitHub
Click on the sections below to browse through examples of my work.
|
|
Through projects, I have explored my interests in hardware, data science, programming, sports,
design, and the news.
|
Inclusiveness Index 2022 | February 2022
Tool: R
Wrote reproducible R code to gather, clean and analyze data from multiple sources to
refresh the Inclusiveness Index, an annual global ranking that gauges the degree of
inclusivity across the world. The index relies on data measuring inclusivity towards race,
gender, sexual orientation, religion, disability, economic status, and political
participation. My work served as the input for the visualization linked above.
|
|
Weather Chanel | October 2022
Tool: Arduino
Conceived of and programmed an ambient user interface to communicate current weather
through scent (akin to "smelling the rain") and color combinations. Art and 3D modeling by
Helena Guo. Supported by Ellis Martin, Alan Kyle, and Akshay Dan Bhavish Juleemun.
|
|
How the New York Times Front Page Works | December 2021
Tool: Python (pandas, NLTK)
Used natural language processing tools in Python to quantitatively determine the lexical
difference between an article's headline and its preview text, as both appeared in a
sample of print New York Times front pages. I found that certain parts of speech,
namely plural nouns, singular nouns, and singular verbs, appear more frequently in
headlines than they do in an article's front-page preview. This finding supported the idea
that print headlines are a unique form because they have to convey context and capture
readers' interest.
|
|
Pakistan Super League Player Draft System | August 2018 - December 2021
Tool: Python (pandas)
Used Python to build a player draft simulator that outputs the most equitable pick order,
ensuring no team has an unfair advantage at the time of player selection. The system has
been in use for the Pakistan Super League, one of the world's most recognizable cricket
leagues, since 2018.
|
|
One-Point Perspective Photos | March 2017
Tools: IFTTT (If This Then That), Tumblr, Instagram
Built a semi-automated Tumblr website to display hand-picked one-point perspective photos
from the web. Inspired by Stanley Kubrick's use of the one-point perspective and
Kogonada's video ode to it.
|
|
As an ardent follower of the news with training in data analysis, I am interested in visualizations
that are simple and explainable, yet compelling. Please click on the links below to view my work in
more detail; representative visualizations are highlighted.
|
Excelling at Telling Stories with Data: Learn and Get Inspired | August 2022
Blog post introducing data visualization, featuring notable visualizations for inspiration
along with best practices.
|
|
The Healthcare Industry's Ongoing Great Reshuffle | April 2022
Tools: JavaScript, D3
Collaborated with Noelle Fa-Kaji and Daniela Perez to tell the story of the US healthcare industry's
labor reshuffle away from low-paid, high-stress work as a scrollytelling piece. My contributions
included figuring out how to animate line charts in D3 (which we use to set up the story), leading
the design of the narrative arc, and determining the most suitable copy to accompany the
visualizations. This was the final project for a semester-long course in Information Visualization,
and we received very positive feedback from faculty.
|
|
Improving a Choropleth | March 2022
Tool: Tableau
Found that a simple change, from shading country areas to representing their values as dots, tells
the story of the 2017 Freedom of the Press Index better.
|
|
Redesigning to Improve Visualizations | March 2022
Tool: Microsoft Excel
Practiced iteration as a way of improving two published visualizations. Focused on giving more
prominence to the story by choosing an appropriate chart type (vertical bars in a red shade
representing the size of the drop in the value of each bank) and well-thought-out titles and
subtitles.
|
|
Visualizing in a Constrained Space | February 2022
Tool: Observable
Used visualization as an aid for telling a compelling story in the context of transportation,
specifically travelling by train versus by plane. My story tied together ticket prices, journey
durations, and CO2 emissions with the help of three unique, bicolor charts.
|
|
Brainstorming Visualizations of Categorical Data | August 2020
Tool: Wireframes/Sketches
Suggested interesting ways of visualizing data from the Centre for Asian Philanthropy and Society's
one-of-a-kind research looking at policy support for philanthropy across Asia. Most metrics were
categorical in nature or followed a Likert scale, posing an interesting challenge: finding different
ways of plotting the same type of data.
|
|
I've had an unconventional, winding path to data science: I did not do well in traditional computer
science courses, and instead learnt Python from a colleague who read a web scraping script to me
line by line. I have an enduring passion for empathetic data science education. Some of the
trainings I've attempted are listed below. These trainings are also attempts to implement
presentation best practices (e.g., brevity, readable slides, use of isotypes and visuals, presenter
focus).
|
Introduction to Python through Web Scraping | September 2016
Tools: Python, Microsoft PowerPoint
Delivered this training to 20+ individuals, and meant it as a gentle introduction to computing. I
believe web scraping serves as a powerful, and visual, programming context that a lot of people are
familiar with, and I structured the training around that. This was my first experience leading such
a training, and I took away important lessons, such as conducting workshops instead of lectures.
Training materials are accessible here.
|
|
Webscraping and Beyond | August 2017
Tools: Python, Microsoft PowerPoint
Implemented takeaways from the previous effort. Namely, I divided the training into two lectures
(the first structured around ensuring participants had the right setup/workspace and programming
fundamentals/syntax, and the second on data scraping). I made sure to convey the power of scraping,
and detailed ethical practices in this context (e.g., using APIs, reading TORs, scraping for
educational purposes). Training materials are accessible here.
|
|
Fundamentals of Data Science in Stata | May 2020
Tools: Stata, Microsoft PowerPoint
Delivered as part of an organization-wide effort to transition from Microsoft Excel to Stata, and
to allow colleagues to explore and analyze data on their own. This was a smaller, more intimate set
of sessions, meaning I could troubleshoot in the moment and ensure a shared level of understanding.
All colleagues lauded the effort, and leadership hailed it as a success. Training materials are
accessible here.
|
|
Writing allows me to tap into my undergraduate training in sociology, and I have had the chance to
cover music and sport, subjects I am very passionate about.
|
Hamnawa: Newsletter on Pakistani Music | September 2020 - Present
Platform: Substack
Frequent contributor to Zeerak Ahmed's newsletter, which covers music and its interplay with
technology and society in Pakistan. My contributions can be read in full here: 1, 2, 3, 4, 5, 6.
|
|
Capturing Ashiq Hussain | December 2016
Platform: Huffington Post (India)
Experimental, longform, non-fiction piece chronicling my experience tracing a local musician whom I
chanced upon during a road trip through the heart of Pakistan. The piece was accompanied by 9 tracks
I had recorded from the musician, though these are broken now.
|
|
Inside Lahore Music Meet | January 2018
Platform: The News (Pakistan)
Recap of a local, two-day, in-person indie music festival published online and in print.
|
|
Cricket World Cup 2015 Predictor | February 2015
Platform: Dawn.com (Pakistan)
FiveThirtyEight-style prediction model for the 2015 Cricket World Cup; I scraped the necessary data,
conceptualized the prediction model, contributed to its implementation, and wrote the copy for the
story.
|
|
I am an ardent follower of illustration and design, and strive for aesthetic clarity in my own designs.
|
Feel free to take this website's source code (I did too!); just add a link back to Jon Barron's
website.
|
|