The zero-to-project guide

How to start a research project from zero

No coding. No lab access. No paid programs. Just a real question, free public data, and a method that fits your timeline. Work through it in order.

Step 1

Pick a disease area you genuinely care about

Research is long and quiet. The students who finish are the ones who actually care about the question. Start from something personal: a cancer that touched your family, a biology topic that hooked you, a news story you could not stop thinking about.

  • Name one cancer type or theme (for example: pancreatic cancer, immunotherapy, pediatric brain tumors, early detection).
  • Write one sentence on why it matters to you. You will reuse this in every cold email.
  • Keep it narrow. "Cancer" is not a topic; "why pancreatic cancer is found so late" is.
Step 2

Do a real literature review

Two free tools do almost everything: PubMed for indexed biomedical papers, and Google Scholar for breadth and citation counts.

Where to search

  • PubMed — search your topic, filter to the last 1 to 2 years, sort by "Best match".
  • Google Scholar — find highly cited reviews to learn the landscape fast.
  • Open the reviews first. A good review hands you the open questions in its final section.
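
The PubMed search above can also be saved as a reproducible link. Here is a minimal sketch in Python, assuming NCBI's public E-utilities esearch endpoint. The function name pubmed_search_url and the example topic are illustrative and not part of this guide, and the "relevance" sort is assumed to correspond to "Best match". The code only builds the URL, so nothing is downloaded.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (public and free).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(topic, years=2, max_results=20):
    """Build a PubMed search URL limited to roughly the last `years` years."""
    params = {
        "db": "pubmed",           # search the PubMed database
        "term": topic,            # your narrow topic from Step 1
        "datetype": "pdat",       # filter on publication date
        "reldate": years * 365,   # look back this many days
        "sort": "relevance",      # assumed equivalent of "Best match"
        "retmax": max_results,    # how many paper IDs to return
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


# Illustrative topic; replace it with your own question.
url = pubmed_search_url("pancreatic cancer early detection")
print(url)
```

Pasting the printed link into a browser returns a list of paper IDs. Keeping the link in your notes lets you rerun the same search later.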

How to read an abstract, then the full paper

An abstract has four hidden parts. Find each one in a single read:

  • The question — what gap are they trying to close?
  • The method — how did they try to answer it?
  • The result — the one number or finding that matters.
  • The limitation — what they admit they could not do. Limitations are where your project lives.
Then go deep on the full paper

Read in this order: title, abstract, then the figures (the figures tell the story), then the introduction for background, then the discussion for what they think it means and what is still open. You do not need to understand every method on the first pass.
Step 3

Identify a gap, then run the "So What" test

A gap is an open question the field has not answered yet. You find it by reading, not by guessing. After about ten papers, you will start seeing the same unanswered question appear in everyone's "limitations" and "future work".

The "So What" test

State your question, then ask "so what?" If answering it would not change how anyone screens, treats, or understands the disease, it is not a question yet. Keep tightening until the answer would actually matter to someone.
A gap is the bridge nobody has built yet.
Step 4 • the most important skill

Think novel: how to find an idea nobody has done

Novel thinking is the whole game. Anyone can repeat a study. The students who get noticed find a question at the edge of two things that have not been connected yet. Here is how.

Technique 01

Read ten, list the silence

Read ten papers on your topic, then write down every question none of them answered. That list of silences is your gap map. The most repeated silence is your best idea.
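
A gap map can be as simple as a tally. The sketch below, in Python, assumes you have typed your notes as short phrases per paper. The paper labels and phrases are made up for illustration.

```python
from collections import Counter

# Hypothetical notes: for each paper, the open questions you wrote down.
notes = {
    "paper_01": ["no early-stage samples", "small cohort"],
    "paper_02": ["no early-stage samples", "single hospital"],
    "paper_03": ["small cohort", "no early-stage samples"],
}

# Count each silence once per paper, so one paper cannot dominate.
silences = Counter()
for questions in notes.values():
    silences.update(set(questions))

# Most repeated silence first.
gap_map = silences.most_common()
best_idea = gap_map[0][0]

print(gap_map)
print("Most repeated silence:", best_idea)
```

The tally only works if you phrase the same silence the same way each time, so settle on a short label for each one as you read.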

Technique 02

Cross-domain transfer

Take a method that works in one cancer type and apply it to another where nobody has tried it. A biomarker model built for breast cancer may never have been tested on pancreatic data.

Technique 03

Computational angle on a wet-lab question

Many questions that "need a lab" can be attacked with public data instead. Ask the biology question, then answer it by analyzing a dataset rather than running a bench experiment.

Technique 04

Build on an existing dataset

You do not have to generate new data. Start from TCGA (genomics) and TCIA (imaging). A new question asked of existing data is still original work.

Technique 05

Check that it is actually novel

Before you commit, search your exact question on PubMed and Google Scholar. If it is already answered, sharpen it: a new population, a new cancer type, a new comparison. Novelty is a search, not a hope.

Technique 06

Combine two datasets

Linking genomics from TCGA with imaging from TCIA on the same cancer type is a classic source of fresh, answerable questions that few high schoolers attempt.
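
A minimal sketch of that link in Python, assuming both tables share a patient identifier. The IDs, column names, and values below are invented for illustration. Real TCGA and TCIA exports will look different, and you should check that the identifiers truly match before trusting the result.

```python
# Hypothetical rows: one genomic table and one imaging table.
genomics = [
    {"patient_id": "P001", "kras_mutated": True},
    {"patient_id": "P002", "kras_mutated": False},
    {"patient_id": "P003", "kras_mutated": True},
]
imaging = [
    {"patient_id": "P002", "tumor_size_mm": 31},
    {"patient_id": "P003", "tumor_size_mm": 24},
    {"patient_id": "P004", "tumor_size_mm": 18},
]

# Index the imaging rows by patient so each lookup is quick.
imaging_by_id = {row["patient_id"]: row for row in imaging}

# Keep only patients who appear in both tables (an inner join).
linked = [
    {**g, **imaging_by_id[g["patient_id"]]}
    for g in genomics
    if g["patient_id"] in imaging_by_id
]

print(len(linked), "patients have both genomic and imaging data")
```

The number of patients who survive the link is worth reporting in your write-up, because it sets the real sample size for any combined question.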

Step 5

Choose a methodology that fits a high schooler

Four approaches are realistic without a university lab. Two need no coding and no lab at all.

Bioinformatics

Light coding • No lab

Analyze public genomic or imaging data. Start with spreadsheets, grow into Python or R.

Meta-analysis

No coding • No lab

Systematically combine results from many published studies into one quantitative answer.

Case series

No coding • Mentor helpful

Describe a pattern across a small set of (de-identified) patient cases, usually with a clinician.

Service-based

No coding • No lab

Design, run, and measure a real outreach or education project, then report the impact.

No coding and no lab? Two paths are built for you

A meta-analysis or a service-based project requires neither. They reward careful reading and clear writing, which you can start today.
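
To make the "one quantitative answer" of a meta-analysis concrete, here is a minimal sketch of one common pooling method, a fixed-effect inverse-variance average. It is an illustration under simplifying assumptions, not the only way to pool studies, and the function name and numbers are made up. It is written in Python, but the same arithmetic fits in a spreadsheet.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance weighted average of study effect sizes."""
    # Precise studies (small standard error) get more weight.
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes (for example, log odds ratios) from three studies.
effects = [0.20, 0.40, 0.30]
std_errors = [0.10, 0.10, 0.20]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)

# Approximate 95% confidence interval around the pooled effect.
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled effect {pooled:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

The third study has twice the standard error of the others, so it carries a quarter of their weight. That weighting is the core idea: larger, more precise studies count for more.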
Step 6

Computational vs wet lab vs clinical

Know which kind of research you are doing. It decides what you need and who you email.

Computational

You work with data on a laptop: genomics, imaging, statistics. Needs: a computer and curiosity, optionally Python. No lab required.

Best first stop for most students.

Wet lab

You run experiments at a bench: cells, samples, assays. Needs: physical lab access and a mentor who supervises you.

Get here through cold emailing local labs.

Clinical

You study patients, treatments, or outcomes. Needs: a clinician mentor and often ethics approval (an IRB).

Start with published trials or a case series.

Step 7

Fit it to your skills and your timeline

The biggest mistake is picking a project your calendar cannot finish. Match the scope to where you are.

  • Seniors and the time-pressed: do not start a long wet-lab project. A meta-analysis or a focused computational question can finish in weeks, not years.
  • Underclassmen with time: you can afford a longer arc, including learning Python while you work.
  • No coding yet: start with a meta-analysis or service project, and learn Python on the side for free (see below).
  • No lab access: stay computational or literature-based. You can do genuinely original work without ever entering a lab.
Public data lets you start now, from a laptop.
Step 8

Where to find free datasets

Every source here is free and open. Start your novelty hunt in TCGA and TCIA.

The Cancer Genome Atlas (TCGA)

Genomic and molecular data across dozens of cancer types. Browse it through the GDC Data Portal.

The Cancer Imaging Archive (TCIA)

A huge open library of de-identified medical images. cancerimagingarchive.net

cBioPortal

The friendliest front door to TCGA-style genomics, no coding needed to explore. cbioportal.org

GEO (Gene Expression Omnibus)

Open gene-expression datasets from thousands of studies. ncbi.nlm.nih.gov/geo

SEER

U.S. cancer incidence and survival statistics, ideal for population questions. seer.cancer.gov

PubMed & Google Scholar

Your literature "dataset" for any meta-analysis. PubMed • Scholar

Learn Python for free, while you work

Most early computational work is cleaning spreadsheets. Start with Harvard CS50 (free), then a pandas tutorial. You can begin a project before you finish either.
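
For a sense of what "cleaning spreadsheets" looks like, here is a minimal sketch using only Python's built-in csv module. The column names and rows are invented for illustration. A pandas tutorial will show shorter ways to do the same job.

```python
import csv
import io

# Hypothetical messy export; a real file would be opened with open(path).
raw = io.StringIO(
    "patient_id,age,stage\n"
    "P001, 64 ,II\n"
    "P002,,III\n"
    "P003,58,ii\n"
)

clean_rows = []
for row in csv.DictReader(raw):
    age = row["age"].strip()
    if not age:
        continue  # skip rows with a missing age
    clean_rows.append({
        "patient_id": row["patient_id"].strip(),
        "age": int(age),                        # text to whole number
        "stage": row["stage"].strip().upper(),  # "ii" becomes "II"
    })

print(clean_rows)
```

Dropping rows is a choice, not a default. Note how many rows you removed and why, so a mentor can check the decision.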
Step 9

Free programs worth knowing

Programs are optional. Cold emailing a lab usually beats waiting on an application. If you do apply, only consider free ones.

Selective and free

  • RSI (Research Science Institute) — the flagship free summer research program, run by the Center for Excellence in Education.
  • Inspire AI and similar student research programs — check each cycle's official site and confirm it is free.
  • University and NIH/NCI programs — many local universities run free or stipend-paid summer research for students.

How to vet any program

  • It is run by a university, hospital, or nonprofit, not a private company selling a "program".
  • It is free, and often pays you a stipend. It never charges tuition.
  • You do real work under a researcher, not a packaged "experience".
Do not pay for research

No legitimate lab charges you to work in it. Paid "summer research programs" that charge tuition are not real research. Skip them entirely and cold email a real lab instead.
Step 10

Find a mentor's email, and message the right ones

Who you email depends on your project. Computational work? Email researchers who already use TCGA or TCIA. Wet lab? Email local labs you could physically reach. Clinical? Email clinician-researchers at nearby medical schools.

Where the email address lives

  • The corresponding author email printed on their recent paper (PubMed or the journal page).
  • The university's faculty directory page.
  • The lab website "contact" or "people" page.
  • Their Google Scholar profile.
Your answer sheet

Tell us your situation, get a tailored plan

Answer five questions. We will tell you whether your combination is an optimal fit, recommend a methodology, point you to the right data, and hand you a pre-call checklist.

1. Are you working with friends, or solo?
2. What type of research are you drawn to?
3. Can you code?
4. Do you have lab access?
5. What is your timeline?
What you will get

A verdict on whether your mix is optimal, a recommended method (bioinformatics, meta-analysis, case series, or service-based), the exact datasets to start from, who to cold email, and a checklist to run before any call with a mentor.
One question away

Turn your idea into a lab seat

You have a question and a dataset. The last step is reaching the person who can mentor you. That is what the masterclass is for.