Excel, SQL, and Pandas: One-to-One Cheat Sheet for Everyday Data Tasks

- August 19, 2025

A compact, side-by-side reference that maps common data tasks across Excel, SQL, and Python (Pandas). Copy/paste the snippets and adapt table/column names to your data.

A lot of us think in Excel, work with data stored in SQL databases, and script repeatable workflows in Python (Pandas). This cheat sheet lines those worlds up so you can translate muscle memory across tools: if you know how to do it in one, you’ll see the equivalent in the others. It focuses on everyday tasks—loading data, filtering, selecting, sorting, aggregating, joining, creating new columns, handling missing values, exporting, and plotting—using short, copy-pasteable patterns.

What's included

Loading data, selecting columns, filtering, sorting
Grouping/aggregations, joins, computed columns
Missing data handling, exporting, and quick charts

Who this is for

Analysts, engineers, and anyone who bounces between CSVs, databases, and notebooks. If you know one of these tools, this guide helps you move just as fast in the others.

TL;DR: Same task, three idiomatic snippets—minimize guessing and edge-case surprises.

When to use what

Excel: quick ad-hoc exploration, lightweight cleanup, and presentation; great for small to medium files and one-off pivots.
SQL: the data already lives in a database; you need set-based operations, scalable joins/aggregations, and shared, reproducible queries.
Pandas: you want programmable analysis, feature engineering, automation, or to integrate with Python’s ecosystem (files, APIs, ML).

How to use this page

Pick the task (row) you need—filter, join, group, etc.
Copy the snippet from the column of the tool you’re using.
Compare columns to learn the equivalent in the other tools.

How to read this table

Each row is a task. Columns show a minimally idiomatic example for each tool. Replace table_name, column, and ranges with your own names. SQL examples are ANSI-style and may need small tweaks for your dialect (Postgres, MySQL, SQLite, SQL Server, etc.). Pandas examples assume:

Notes
SQL examples are ANSI-like and may vary by dialect (Postgres, MySQL, SQL Server, SQLite, etc.).
Pandas assumes import pandas as pd and a DataFrame named df unless otherwise shown.
Excel ranges are illustrative—adjust to your sheet.

Task	Excel	SQL	Python (Pandas)
Load Data	Open Excel file or use File → Open	`SELECT * FROM table_name;`	`df = pd.read_csv("file.csv")` (or `pd.read_excel("file.xlsx")`)
Filter Rows	`=FILTER(A2:B10, B2:B10>100)`	`SELECT * FROM table WHERE column > 100;`	`df[df["column"] > 100]`
Select Columns	Use column letters (e.g., A, B)	`SELECT column1, column2 FROM table;`	`df[["column1", "column2"]]`
Sort Data	Data → Sort	`SELECT * FROM table ORDER BY column DESC;`	`df.sort_values(by="column", ascending=False)`
Group By / Aggregate	Use Pivot Table	`SELECT dept, COUNT(*) FROM emp GROUP BY dept;`	`df.groupby("dept").size()` or `df.groupby("dept").agg({...})`
Count Rows	`=COUNTA(A2:A100)`	`SELECT COUNT(*) FROM table;`	`len(df)` or `df.shape[0]`
Average / Mean	`=AVERAGE(B2:B100)`	`SELECT AVG(salary) FROM emp;`	`df["salary"].mean()`
SUM	`=SUM(B2:B100)`	`SELECT SUM(sales) FROM data;`	`df["sales"].sum()`
Remove Duplicates	Data → Remove Duplicates	`SELECT DISTINCT column FROM table;`	`df.drop_duplicates()`
Join Tables	Use VLOOKUP or XLOOKUP	`SELECT * FROM A JOIN B ON A.id = B.id;`	`pd.merge(df1, df2, on="id")`
Create New Column	In new column: `=B2*0.1`	`SELECT salary, salary*0.1 AS bonus FROM emp;`	`df["bonus"] = df["salary"] * 0.1`
Rename Column	Rename manually	`SELECT column AS new_name FROM table;`	`df.rename(columns={"old": "new"})`
Handle Missing Data	`=IF(ISBLANK(A2), "N/A", A2)`	Use `IS NULL`, `COALESCE()`	`df.fillna("N/A")` or `df.dropna()`
Export Data	File → Save As (CSV/XLSX)	Use client tool or `COPY/INTO OUTFILE` (dialect)	`df.to_csv("output.csv", index=False)`
Data Visualization	Insert → Charts	Not native; use BI tools	`df.plot(kind="bar")` (or matplotlib/seaborn)

Quick Tips

Type safety: In Pandas, check dtypes with df.dtypes; cast with df.astype({...}) to avoid silent string/number issues.
Filtering nulls: In SQL use WHERE column IS NOT NULL. In Pandas use df[df["column"].notna()].
Multiple conditions: SQL uses AND/OR; Pandas uses bitwise & / | with parentheses, e.g. df[(df.a > 0) & (df.b == "X")].
Indexes: Sorting/grouping in Pandas can be faster if the key column is set as index: df.set_index("id", inplace=True).
Large CSVs: Use pd.read_csv(..., chunksize=100_000) to stream in batches.

Search This Blog

Web Development Notes