Excel to Python: A Practical Guide for Analysts

Michael Murr··10 min read

Last updated: June 2026

Quick answer

Moving from Excel to Python takes about 6 to 8 weeks at 3 hours per week to reach productive analyst-level fluency. The shortest path is to install Anaconda, open a Jupyter notebook, learn the five Pandas operations that replace 80% of your Excel work (read_csv, filter, merge, groupby, pivot_table), then rebuild one of your real weekly reports in Python. The mental shift is treating data as tables you query, not cells you click.

TL;DR

  • The core 5 Pandas operations cover most everyday Excel work (in our experience): read_csv, filter (boolean indexing), merge (VLOOKUP), groupby (SUMIF/COUNTIF), and pivot_table.
  • Rebuild one real report first. Don't learn Python abstractly. Pick the weekly report you hate most and rebuild it in a Jupyter notebook.
  • The mental model shift matters more than syntax. Excel is cell-by-cell. Pandas is table-by-table. Once that clicks, everything else accelerates.

Who this is for

This article is for analysts, ops people, finance teams, and PMs who live in Excel and want to move into Python for analysis. If you build pivot tables in your sleep, write nested IF formulas, and have a workbook with 14 sheets you maintain weekly, you are the target reader.

I have taught Python to dozens of Excel-first analysts. One of them rebuilt her four-hour weekly reporting process into a 15-minute Python script within two months. The path she took is what this article walks through.


Why move from Excel to Python at all?

Excel is not the problem. It is one of the most powerful tools ever made for analysts. The problem is what happens when your analysis grows past Excel's comfort zone.

You hit the Excel ceiling when:

  • Your workbook crashes when you open it
  • You have to copy-paste the same VLOOKUP across 40,000 rows every Monday
  • You need to combine three data sources that don't fit in one sheet
  • A junior analyst breaks the model by sorting a column wrong
  • You want to schedule the report to run automatically at 7am

Python handles all of these natively. Pandas can load a million-row CSV in seconds. A Python script you write once can run forever without anyone touching it. And once you connect Python to a database, AI tool, or scheduler, the leverage compounds.

For the broader case on why analysts should learn Python, see Python for business analysts. This article is the practical how-to.

What setup actually works for an Excel-first analyst?

Skip the long install guides. Here is the minimum setup for an Excel-first analyst:

  1. Install Anaconda (anaconda.com/download). It bundles Python, Pandas, NumPy, Matplotlib, and Jupyter in one installer.
  2. Open Anaconda Navigator and launch Jupyter Notebook. This is your replacement for Excel. A notebook is a web page where you write code in cells and see output below each cell.
  3. In a new notebook, type and run: import pandas as pd. That single import gives you almost everything you will use.

You are now ready to start. The whole setup takes 15 minutes. You do not need Visual Studio Code, virtual environments, or pip yet. Those come later.

The five Pandas operations that replace 80% of Excel

Here is the practical core. These five operations cover the vast majority of analyst work. Each one maps to something you already do in Excel.

1. Read your data (replaces Open → Import)

import pandas as pd

df = pd.read_csv("sales.csv")
df.head()

df (short for "dataframe") is the Python equivalent of an Excel sheet. df.head() shows the first 5 rows, like scrolling to the top of your workbook. To peek at the structure, use df.info() and df.describe().

For Excel files specifically: pd.read_excel("workbook.xlsx", sheet_name="Q3").

2. Filter rows (replaces AutoFilter)

In Excel you click the filter dropdown. In Pandas:

high_value = df[df["revenue"] > 10000]
us_only = df[df["country"] == "US"]
combined = df[(df["revenue"] > 10000) & (df["country"] == "US")]

This is called boolean indexing. The & is "and", | is "or". The mental model: you describe the rows you want, and Pandas hands them back.

3. Merge (replaces VLOOKUP / XLOOKUP)

This is the operation that converts the most Excel users. Goodbye VLOOKUP.

combined = pd.merge(orders, customers, on="customer_id", how="left")

In Excel terms: orders is your main table, customers is your lookup table, customer_id is the matching column, and how="left" keeps all rows from orders (like a VLOOKUP with IFERROR returning blank). Inner, right, and outer joins are also available with how="inner", how="right", how="outer".

Pandas does this in one line for a million rows. Excel's row limit is ~1,048,576 per sheet, but VLOOKUP and SUMIF crawl on datasets past 50,000 rows. Calculation speed, not row count, is what breaks Excel for large datasets.

4. Groupby (replaces SUMIF / COUNTIF / AVERAGEIF)

revenue_by_country = df.groupby("country")["revenue"].sum()
orders_per_customer = df.groupby("customer_id")["order_id"].count()
multi_metric = df.groupby("region").agg({"revenue": "sum", "orders": "count", "discount": "mean"})

Groupby is more powerful than any SUMIF chain you have written. You can group by multiple columns, apply multiple aggregations, and chain the result into the next operation.

5. Pivot tables (replaces, well, pivot tables)

pivot = df.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    fill_value=0
)

Same idea as Excel's pivot table. index is the rows, columns is the columns, values is what you sum or average, aggfunc is the operation. This produces a clean rectangular table you can export, plot, or feed into another analysis.

Most analysts I teach hit this and realize the entire pivot table system they have been clicking through for years is one function call.

A realistic side-by-side comparison

TaskExcelPandasSpeed-up
Open a 500k-row CSVOften crashes2 secondsMassive
VLOOKUP on 100k rows30+ seconds, lots of clicking1 line, instant10-50x
Pivot table from 3 sourcesPower Query, fragile3 merges + 1 pivotMore reliable
Repeat weeklyManual re-do every MondayRun one notebookHours saved
Email the resultCopy-paste, screenshotAuto-export to PDF/ExcelFully automated
Audit who changed whatTrack changes (clunky)Git historyCleanly version-controlled

The reliability gain is bigger than the speed gain for most analysts. A Python script that runs every Monday at 7am produces the same output every week. An Excel workbook gets corrupted, broken, or accidentally edited.

A practical 6-week plan

This is the path I take analyst students through. It assumes 3 hours per week.

Week 1: Setup and the dataframe mental model. Install Anaconda, open Jupyter, load your first CSV with pd.read_csv. Practice df.head(), df.info(), df.describe(). Get comfortable with the idea that a dataframe is a sheet you query, not click.

Week 2: Filtering and selecting. Master boolean indexing, .loc[], and column selection. Replicate the Excel filters you use every day.

Week 3: Merge. Practice pd.merge with left, inner, and outer joins. Rebuild the most common VLOOKUP in your current workbook.

Week 4: Groupby and aggregation. This is where Python's power becomes obvious. Rebuild your SUMIF dashboards.

Week 5: Pivot tables and charts. Use pivot_table and matplotlib or seaborn to recreate one full report visually.

Week 6: Automation. Schedule the script to run weekly. Export results to Excel or PDF. Email them automatically if needed.

By the end of Week 6 you should have one real weekly report fully rebuilt in Python. That single project is worth more than any course completion certificate.

For the broader Python learning path that this fits into, see the Python learning path for professionals. If you are coming from SQL rather than Excel, the SQL to Python transition guide is the parallel path.

Common mistakes I see

  1. Learning Pandas abstractly with toy datasets. The iris and titanic datasets in every tutorial do not stick. Your messy real workbook does. Bring real data from day one.
  2. Trying to replicate every Excel formula one-to-one. Some things Pandas does differently and better. Don't fight to recreate your nested IF chain. Use np.where or boolean indexing instead and move on.
  3. Skipping the mental model shift. Excel is "click cell, type formula." Pandas is "describe the operation on the whole table." Analysts who keep thinking in cells stall at week 2. Analysts who switch to thinking in tables move fast.

What to do next

Pick the right next step based on where you are.

If you have not installed anything yet, install Anaconda today and open one Jupyter notebook. That single 15-minute step has the highest payoff in the whole process.

If you have started learning Pandas and stalled, the issue is almost always that you're learning syntax without a real project. Pick the weekly report you hate most and start rebuilding it. The frustration of "I don't know this" gets replaced with "I need to learn exactly this one thing to finish my project."

If you want a structured path with someone watching your code, a 1-on-1 tutor can usually compress the 6-week plan into 4 weeks for analysts who already think in data. Book a free 15-minute Discovery Call and we will look at your actual workbooks and figure out the fastest path.

Frequently Asked Questions

Do I need to know SQL before Pandas?

No, but it helps. SQL and Pandas share the same mental model (operate on tables, not cells), so SQL fluency makes Pandas easier. If you don't know SQL, start with Pandas directly. You can pick up SQL later when you need to pull data from a database. The SQL to Python transition guide covers the reverse direction in detail.

Can I keep using Excel for some things?

Yes, and you should. Excel is still excellent for quick visual exploration, small one-off analyses, and sharing with stakeholders who do not have Python. The right model is "Python for repeatable work, Excel for visual one-offs." Most analysts I teach end up using both.

How long does it take to be productive in Pandas from an Excel background?

Six to eight weeks at three focused hours per week is the realistic answer. By Week 8 most analyst students have rebuilt one real weekly report and are faster in Python than Excel for that specific workflow. Full fluency across all common operations takes 3 to 4 months.

What about Power Query, Power BI, or VBA? Are those alternatives?

They are partial alternatives. Power Query handles much of what Pandas does for ETL inside Excel. Power BI is excellent for dashboarding. VBA can automate Excel itself. The reason analysts still move to Python is reach: Python connects to AI tools, databases, web APIs, machine learning libraries, and modern data infrastructure in ways the Microsoft stack does not.

Should I learn NumPy and Matplotlib at the same time as Pandas?

Touch them, don't deep-dive. NumPy is the math layer under Pandas; you'll use it occasionally. Matplotlib is for charts. Learn enough of both to make basic plots and array math work in Pandas. Going deep into either before Pandas fluency is a common time-waster.

Can AI tools like Claude or ChatGPT just write the Pandas code for me?

They can write a lot of it, and they should. The catch is that you still need to read, evaluate, and debug what they produce. An analyst who can't tell when Claude's groupby is wrong is in a worse spot than one who writes slower but correct code. AI tools amplify Pandas fluency. They do not replace it. I cover the right way to use AI for analyst work in best AI tools for analysts.


Ready to move from reading to building?

If you are serious about moving from Excel to Python, stop watching tutorials and start rebuilding one of your real reports. A tutor can compress the 6-week plan into 4 and catch the mistakes that slow most analysts down. Book a free 15-minute Discovery Call. No pitch, just a conversation about your workbooks and goals.

Book a Free Discovery Call →


Written by AI Tutor Code, private 1-on-1 online tutoring for professionals learning Python, AI, and modern ML tools. 200+ students taught. 3,000+ hours of private tutoring delivered. 4.9/5 average rating.

Related articles

Keep reading on related topics.

Enjoyed this article?

You can master this and more with a dedicated 1-on-1 tutor.

Book a Free Discovery Call