Data Science - The Fundamentals

In addition to each Alumni Career Paths panel discussion, we ask our panelists several fundamental questions about their fields so you don't have to! Use the answers they've provided below to get a quick take on whether you want to investigate this career path further, to learn about the differences between roles and organizations in this field, and as a starting point for informational interviews if you want to learn more:

  1. What are the responsibilities of someone in your role?
  2. Is a postdoc required, recommended, useful, or unnecessary to enter or excel in this field?
  3. What types of experience are important to highlight in your resume and interview?
  4. What characteristics make someone good at this position?
  5. What do the typical application and interview processes entail?
  6. What possibilities do international folks have to work at your company/organization?

Data Science questions answered in February 2022 by:

Ibraheem Ali, PhD
Sciences Data Librarian, UCLA (UCSF alumnus 2018 Microbiology and Immunology Program)

Sandeep Sanga, PhD
Head of Real World Data Strategy and Alliances, Astellas Pharma US (UCSF postdoc alumnus 2010)

Dominic Tong, PhD
Senior Data Scientist, Insight RX (UCSF alumnus 2019 Bioengineering Program)

Data Science questions answered in February 2021 by:

Ravi Patel, PhD
Data Scientist / Computational Biologist

Peter Cimermancic, PhD
Verily (BMI, PhD 2014)

Natalie Korn, PhD
App Annie (BMI, PhD 2013)

Joanna Lipinski, PhD
Twist Bioscience (BMI, PhD 2013)


What are the responsibilities of someone in your role?

Ibraheem Ali

As a librarian I am called on to help formulate research questions and find diverse sources. I help with data analysis and methods for visualizing data in order to best collect information from the data. This also includes consulting on or validating statistics, and instructing on data visualization tools and methods.

 

Sandeep Sanga

Work with appropriate teams to identify and prioritize research questions, define data requirements to support the research roadmap, identify data sources and any data or digital gaps, plan to close gaps, enable the closure of those gaps, and support analysis of data and enable data science journeys. Lead a team of data scientists that are comfortable operating in the unknown, and enable them to be innovative and work on problems they have a passion for solving, while also ensuring those passions align with organizational objectives, and enabling their career progression.

 

Dominic Tong

I'm a data scientist at an early-stage precision drug dosing startup, which means I wear many hats every day. The core of my role is to convert the data we gather from our partner hospitals into action for our business, whether that is informing our business teams on how we're growing, our customer success teams on which of our partner hospitals require some attention or intervention, or executing research projects in the pharmacometrics space. I also spend time on both ends of that core mission: the Data Science team handles the extraction, transformation, and loading of data into our databases which precedes any analysis; and I communicate our insights to our sales, customer success, and leadership teams to help them make data-informed decisions.

 

Ravi Patel

I am responsible for developing and improving existing clinical genetic testing products. Our team uses computational biology and data science approaches to improve the clinical insights our tests can offer. We collaborate with experimental biologists, software engineers, and product managers to develop the bioinformatics and statistics that are used to process thousands of clinical samples every day.

 

Peter Cimermancic

Responsibilities of an individual contributor are similar to those of a grad student, with perhaps more teamwork. Responsibilities of a manager are similar to those of a PI, with more product and strategy focus. I'm a data scientist at Verily, and I've been here for 4.5 years. In my first role as an individual contributor, I was responsible for planning and executing analyses of proteomics and metabolomics data from clinical studies. Moreover, I was driving the development of new computational tools for mass spectrometry, in collaboration with several teams at Alphabet. I've also been proactively identifying and exploring several other research problems related to our vision (as 20% projects), either myself or by mentoring interns. One of my responsibilities was also to present our findings to partners, support our business development team with potential partner pitches, and evaluate startups for the Verily VC arm.

Recently, I've been managing and tech leading a Pathology Machine Learning team. In addition to overseeing projects of my reports and colleagues from other teams, my responsibilities now also include people management (reports’ career growth, performance evaluation, hiring, developing a productive environment, nurturing an effective team culture, etc.), vision and strategy setting with senior leadership, interteam coordination, managing partner relations, identifying new project opportunities, and driving high-level operations.

 

Natalie Korn

The major responsibility of a data scientist at App Annie is ownership of the market estimates in our products. Other engineers design the front-end and back-end, but the actual logic of how to create an estimate is our domain! Whatever product we are working on, we’re in charge of the numbers. We’re also heavily involved in design and prioritization of products.

On a day-to-day basis, my job is to boot up a cluster on AWS and work on any of two or three current projects. I also answer questions from sales and product managers about our products and have a hand in keeping track of bugs and QA. Every day I have a team standup where we talk about what we’re doing, and between one and six other meetings. An example of a week-long project would be deciding we want a new feature in a model. I do exploratory data analysis, discuss with the product manager and engineers, and write a class that produces a table containing the feature. I then work with the data engineer to hook that table up to the model pipeline.

 

Joanna Lipinski

As a director I am no longer hands-on. My primary function is to take care of my team, guide them and ensure smooth execution of projects. I focus on the big picture and rely on my team’s technical expertise to execute. I work with the executive team on long term planning and vision and ensure we have adequate resources and skills/talent to support company-wide needs for scientific computing (bioinformatics and data science).


Is a postdoc required, recommended, useful, or unnecessary to enter or excel in this field?

Ibraheem Ali

Unnecessary.

 

Sandeep Sanga

Useful.

 

Dominic Tong

Unnecessary. A postdoc gives you more research experience and more things to talk about during your interview, but I don't view it as necessary for what I do. If you don't have the coding background, you may find it useful to postdoc for a couple of years in a computational lab, but coding is also a skill you can learn independently.

 

Ravi Patel

About half of our team has done a postdoc. A postdoc is not required for data science or computational biology roles. Many people do get a postdoc before ultimately transitioning into data science, but many people transition to data science roles directly after their PhD. If you know you are interested in a career in data science you should only consider a postdoc if you believe that you will gain some set of skills that will make you stand out as a data science candidate. One example is doing a postdoc working on medical devices when you know you are interested in becoming a data scientist working on wearable devices. The type of data you are working with during that postdoc should be directly relevant to the types of roles you are excited about.

 

Peter Cimermancic

A postdoc is unnecessary in the field of data science/ML research. Candidates can join Verily at different stages of their careers. College and master's graduates generally join at an entry level, and are expected to execute independently in ~2 years. PhD graduates join a level higher and are expected to independently design/plan analyses, execute as an individual contributor and drive smaller teams, as well as proactively identify operational and/or research opportunities. Postdocs in my field (machine learning and computational biology) are uncommon, and generally result in a hire at the same level as PhD graduates. A postdoc is unnecessary when you're certain to continue your career in industry, unless you would like to boost your resume or change your field of study. Industry experience and specialization on the job are more valuable than doing a postdoc. Many pharma and biotech companies offer industry postdocs, but if your goal is a career in industry, I'm strongly opposed to them; they're just fancier internship programs and can pay below the fair market value. If your goal is academia, then industry postdocs can be very interesting. In summary, I highly recommend knowing your career goals (industry vs. academia) by the midpoint of your PhD, and then aligning your next steps accordingly.

 

Natalie Korn

Unnecessary.

 

Joanna Lipinski

Unlike in academia, in industry there is no prescribed path. I’ve worked with and hired people from many different backgrounds and varied levels of experience. Some people came in with a postdoc, others straight out of a PhD or a data science boot camp. Some don’t even have a PhD; they just worked their way up (usually starting in a lower-level job and moving up with experience).


What types of experience are important to highlight in your resume and interview?

Ibraheem Ali

Experience with data management, participating in the process of peer-review of literature, or writing literature reviews. Working with groups or committees on projects.

 

Sandeep Sanga

Nothing specific, but it should be clear that you are hungry to gain real-world experience. Keep trying things and pivoting towards areas where you see the light.

 

Dominic Tong

Initiative: we are a startup, so we have many more problems than people to solve them. Showing that you can identify high-value problems, adapt what you know to solve them, and deliver results is incredibly important. Communication: you will have to present your work in many mediums - a Zoom presentation, a poster, a talk in front of people, a paper, a Slack message, a check-in meeting - to many people of varying degrees of expertise. Being able to identify what matters most to the people you're talking to and deliver that information clearly is a great thing to show off. Technical skills: yes, you have to know relevant skills to do the work. Showing you're thoughtful about how to put together an analysis or project in code is essential to my job.

 

Ravi Patel

Focus on the translatable skills from your academic career. Did you have to perform complicated statistical analyses, work with messy high dimensional data or use a university computing cluster? Those are all great things to mention in your resume. You should also try to highlight the impact of your work in each and every bullet point in your experience section. Publications are less relevant than the other impacts of your work such as time saved, grant/fellowship dollars earned, or challenges overcome by your work. As a PhD student you are often acting as your own project manager for whatever you are working on, so make sure to talk about the project management aspects of your role including designing experiments, coordinating lab duties, etc.

 

Peter Cimermancic

Aim to cover all of the minimum requirements, and as many preferred ones as possible. Be succinct and on point. Cover not only the "what's", but also "how's".

When reviewing resumes and interviewing, we focus on:

  • technical/domain knowledge fit (all of the resumes cover this well),
  • complexity (what made your project difficult/challenging?),
  • execution (can you finish projects? how did you succeed in completing your work?),
  • communication (can I follow what you're talking about?), and
  • leadership (are you proactive, can you drive/organize a small team around a larger project?).
  • And googliness.

Use quantitative metrics. For example: “I published 10 first-author papers”, “I collaborated with 5 academic labs”, “I mentored 3 junior colleagues”, “Performance of my model is 5% better than that of state-of-the-art model X”.

A few tips not entirely related to this question: Recruiters and hiring managers will spend 30-60 seconds on your application. To repeat, you've got ~45 seconds of our attention. Keep your resume short and tailored to the job requirements. Keyword searches are a thing, so use as many phrases from the job description in your resume as possible. Use your peer network and try to meet with someone from the company: getting a referral from within is very valuable and will speed up the hiring process. When it comes to the interview, keep your answers short and to the point. I know you want to prove you're smart and that you've done a lot more beyond just what the interviewer is asking, but long answers could mean the interviewer doesn't get through their list of planned questions and so can't evaluate you properly. Interviewing is a skill, so expect a few interviews to go poorly; applying to a few companies/roles that you're less likely to accept can be a good learning experience.

 

Natalie Korn

Side projects that showcase your ability to code are important. Also remember that data science is not just .fit(). Show real-world value in the interpretation of your results, or build a full-stack app.

The other common knock on academics is that we just think and never actually finish anything. So, when given a technical question in your interview, ask clarifying questions early, and err on the side of talking and decision making over thinking and tinkering. The best interviews are conversations.

 

Joanna Lipinski

Resume:

  • Keep it succinct & easy to read
  • Highlight skills that are applicable to the position (keep it focused)
  • Be honest and don’t exaggerate

Interview:

  • Ability to execute/deliver on a project
  • In-depth understanding of data and solutions (algorithms, tech-stack, etc) that went into your project; understanding limitations of your solution
  • Demonstrate you can be pragmatic - sometimes it is necessary to deliver the “good enough” solution rather than the ideal one

What characteristics make someone good at this position?

Ibraheem Ali

Being able to juggle multiple competing deadlines on projects with unrelated subject matter. Working with uncertainty about direction and scope of responsibilities. Working with others to create direction and boundaries on responsibilities and effort.

 

Sandeep Sanga

Innovative personality. Comfortable operating on the fringes relative to the pack.

 

Dominic Tong

You have to be a leader, even as an individual contributor. That means taking ownership of your projects, being accountable and communicating with your team and other teams about deadlines and expectations. It means taking the initiative to identify and solve problems as you see them. It means effectively prioritizing your own time, whether that's to work on the long-standing research project or taking a necessary day off to breathe and reset. It means respectfully speaking up when you have an idea to contribute.

 

Ravi Patel

There is certainly a set of technical skills that are required for this job, such as bioinformatics, statistics, and algorithmic programming. However, to excel in this position one needs to be willing to apply creative problem-solving skills and know how to communicate with a wide variety of people, e.g., software engineers, genetic counselors, and marketing. Communication to external stakeholders is the number one responsibility of all people in leadership roles on our team.

 

Peter Cimermancic

General attributes: technical competence, learning quickly, self-initiative, focus, drive, being a doer and team player. Specific to compbio/ML: more width than depth (math, ML, stats, coding, biology, domain knowledge), integration of ideas, critically following literature, risk-taker.

 

Natalie Korn

The ability to learn new things is important. Don’t be hooked on any particular language or design; things move quickly. I spent three years 'mastering' pandas in my PhD, and on the first day of my new job they told me to forget pandas! We use PySpark.

It’s not all “move fast and break things.” We can allow ourselves to be meticulous in our standards for data quality, but the pace is much faster than I was used to and direction can change very quickly.

 

Joanna Lipinski

One of the characteristics that is absolutely necessary is the ability to get along with people. Interpersonal conflicts are very costly - they kill productivity - and certainly are not fun to deal with. Technical skills can be taught; much harder to teach someone to be a good teammate.

As to more technical skills, I’d be looking for a solid understanding of how ML algorithms work (you don’t have to be an expert on all, but you should have a solid understanding of those that you’ve used), validation techniques, and a deep understanding of your data. A basic understanding of good engineering practices, such as use of source control, documentation, code commenting, and unit testing, is also a must.


What do the typical application and interview processes entail?

Ibraheem Ali

CV/resume along with personal and diversity statements. Usually a full-day interview to speak with committee/team members, facilities, and other key stakeholders.

 

Sandeep Sanga

Presentation on a topic that demonstrates you have technical, scientific, and communication skills. Multiple interviews with potential teammates, collaborators, and management.

 

Dominic Tong

We put up a job posting, you meet some of the criteria and apply. You're a unicorn if you hit every single requirement of the job description, so don't hesitate to apply if you hit some of the requirements! We read over your application, mostly your resume, and if we like what you're bringing, we'll schedule an introductory call. At this stage, you're one of potentially hundreds of resumes, so make sure you get friends and mentors to help with your resume. Better yet, meet someone at the company and have them refer you so your resume gets to the top of the pile. Then if you pass the initial 30 minute call, we'll give you a technical challenge, which is a mini-project designed to test your skills on what we consider to be the core parts of the job. Pass that, and we'll bring you in for an "on-site" interview over Zoom, where you'll meet the team and the company, and we'll get a chance to dive deeper into your resume, technical skills, and interpersonal skills. This is also your chance to evaluate the company and the team and see if it's the right fit for you. If the whole team is an enthusiastic yes on you, then we'll offer you the position and I'm sure the fine folks at UCSF OCPD have some great tips on that process!

 

Ravi Patel

The application process can be anything from applying on LinkedIn to getting a referral and introduction to the hiring manager for the role. I highly recommend leveraging your connections and getting referred to roles as often as possible. The data science and computational biology interview process is very similar at most companies but can vary depending on the size of the company/team and the urgency with which the role needs to be filled. The first step is usually a meeting with a recruiter. The recruiter will tell you more about the rest of the interview process and about the role and the company. They will usually ask a few questions about your resume and possibly some cultural interview questions. The next step is typically a meeting with the hiring manager, which will involve more in-depth questions about your resume and possibly some technical conversations. They will ask a lot of cultural interview questions as well. This is typically followed by a take-home assignment, which usually takes 5-15 hours. This assignment is usually representative of the type of work you would be doing in the role. The last step is an "on-site" interview, although these days it is over video conference. The on-site varies a lot from company to company. They may ask you to present some of your research or present the take-home assignment. You may have modules where you would be tested on skills relevant to the position, e.g., SQL and/or coding. There may be a data challenge or possibly a pair programming exercise. There will again be a lot of cultural interview questions and lots of opportunity for you to ask your interviewers questions.

 

Peter Cimermancic

Recruiter screen (keywords), hiring manager review (1 minute per candidate), phone screen by recruiters or hiring manager (30-45 minutes), on-site interview (three to five 45-minute interviews, each one covering a different area, including whiteboard coding, statistics, ML, domain knowledge, etc.). Importantly, a declined application does not mean you're not good enough: we're looking for a very specific set of skills, and no-hire decisions are mainly due to suboptimal fit and not lesser competency. Also, note that any interview process has low specificity and sensitivity, with a significant luck factor involved (the right interviewer, the right questions, etc.), so keep trying, even with the same company after a year or two.

 

Natalie Korn

Before 2021, all hiring was done through the Insight Data Science program. Now, we have a short phone screen, mostly for culture fit; however, we generally only extend our case study to quantitative PhDs. The case study is a one-week project; two days in, there is a call with the manager to answer any questions you have. The onsite interview begins with your 40-minute presentation of the case study, followed by 30-minute interviews with our team, a product manager, and a data engineer.

 

Joanna Lipinski

The interview process might vary slightly depending on the position and the hiring manager’s preferences. But here are the general steps:

  • Hiring manager reviews resumes, selects “promising” candidates
  • HR or the manager will reach out for an intro/initial screen call
  • A more technical call/screen; some of my managers also like to give out a coding test
  • Onsite interview (Zoom during COVID); might include a presentation
  • Reference check + salary negotiation with HR
  • Offer

What possibilities do international folks have to work at your company/organization?

Ibraheem Ali

There are opportunities for international folks to work at the organization.

 

Sandeep Sanga

We are a global company hiring in many regions, and we support H1B applications for candidates should they need them.

 

Dominic Tong

I am an international folk! Students on the F1 visa have a nice 1 year OPT and a 2 year STEM OPT extension, which are fairly straightforward for getting a job at any company. Past that, you'll need one of the work visas - TN for USMCA treaty folks, H1B for others, and I'm sure I don't know all of the visa types. The UCSF ISSO are not immigration lawyers, but they're super helpful with this process!

 

Ravi Patel

Larger organizations, like the one I work at, can typically sponsor visas and will absolutely consider candidates with OPT. Smaller companies typically can't sponsor visas, but occasionally do take candidates on OPT. Recent immigration policies have made many companies reluctant to consider candidates that will eventually need visa sponsorship, but many companies still are sponsoring visas, especially for highly technical roles. I highly recommend checking with the recruiter during your first interview if the company is able to sponsor visas for the role.

 

Peter Cimermancic

Verily is country-of-origin agnostic and will help you obtain work permits/visas. Verily is an equal opportunity employer and does not discriminate against any job applicant because of race, color, religion, national origin, sex, physical or mental disability, or age. In fact, we put in significant resources to attract talent from diverse and underrepresented backgrounds. Diversity, equity, and inclusion are very important values to us. I'm not a US citizen, and I joined Verily on a student J1 visa academic training program "extension". Verily covered all of the expenses from my green-card application, including hiring a law firm that made the process very easy (for example, they drafted detailed reference letters and more). Not having US citizenship doesn’t matter when applying to large companies but may limit your chances with some startups.

 

Natalie Korn

App Annie is a global company with comprehensive visa sponsorship; international students and H-1B visa holders are encouraged to apply!

 

Joanna Lipinski

For the most part, same as anybody else - it just takes some extra paperwork/going through legal. However, occasionally we do have to turn down a qualified candidate if there is no clear path to securing a work visa.
