Python for Data — Set Up the Lab
pip install the data stack → launch Jupyter → run one cell. That's a lab.
Why a whole "lab" and not just python.exe?
You've already stood up a LAMP server, so you know the drill: real work needs a real environment, not a global free-for-all. Python's version of "lock it down" is the virtual environment — a self-contained folder that holds one project's Python and its exact packages. Different project, different folder, zero version fights. This is the single habit that separates people who enjoy Python from people who fight it.
🐘 PHP: A venv is the spiritual cousin of Composer's vendor/ directory — per-project dependencies that don't leak into the rest of the system. The difference is a venv also pins the interpreter, not just the libraries.
Step 1: Get Python itself
You want Python 3.11 or newer. Check what you've got first — open a terminal and ask:
python --version
# or, on many systems:
python3 --version
If that prints Python 3.11.x (or higher), you're set. If it says "command not found" or shows something ancient like 3.8, grab the installer from python.org/downloads. On Windows, tick "Add Python to PATH" during install (recent installers label the box "Add python.exe to PATH"); skipping that box is a very common reason beginners can't find python afterwards.
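If you'd rather ask Python itself than read the banner, the same check fits in a few lines. This is a minimal sketch: meets_minimum and MIN_VERSION are names made up for illustration, and the (3, 11) floor is simply the version this module asks for.

```python
import sys

MIN_VERSION = (3, 11)  # the floor this module asks for (illustrative constant)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True when a (major, minor, ...) version is new enough."""
    if version is None:
        version = sys.version_info  # the interpreter running this script
    # Compare only major and minor, e.g. (3, 12) >= (3, 11).
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "OK" if meets_minimum() else "too old, install a newer Python"
    print(f"Python {found}: {verdict}")
```

Run it with whichever command your terminal answered to above; the verdict applies to that interpreter only.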
On Windows the command is usually python; on Lubuntu/macOS it's often python3 (and pip3). Wherever you see python below, use whichever one your machine answers to.
Step 2: Make a home for the project
Pick a folder for your analytics work and create a virtual environment inside it:
mkdir ba-lab
cd ba-lab
python -m venv .venv
python -m venv .venv means "run the built-in venv module and build an environment in a folder called .venv." On Lubuntu and macOS the leading dot hides the folder from default listings; on Windows it is just part of the name. Nothing is installed globally: it all lives in that one folder, which you can delete anytime to start fresh.
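To see that the command really is just a standard-library module doing the work, here is a rough sketch that calls the same venv module from Python. The helper name create_env is hypothetical, and the demo builds its environment in a throwaway temp folder (with pip skipped, purely to keep it quick) so your real ba-lab is untouched.

```python
import tempfile
import venv
from pathlib import Path


def create_env(project_dir, with_pip=True):
    """Build a virtual environment at <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=True mirrors the command line, which bootstraps pip by default.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


if __name__ == "__main__":
    # Demo in a scratch folder that is deleted when the block ends.
    with tempfile.TemporaryDirectory() as scratch:
        env_dir = create_env(scratch, with_pip=False)
        # List what got created, including pyvenv.cfg, which records
        # the interpreter the environment was built from.
        print(sorted(entry.name for entry in env_dir.iterdir()))
```

For day-to-day work the one-line command is all you need; the sketch is only here to show there is no magic behind it.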
Step 3: Activate it
Creating the venv isn't enough; you have to step into it so your terminal uses that Python instead of the system one.
# Windows (PowerShell):
.venv\Scripts\Activate.ps1
# Lubuntu / macOS:
source .venv/bin/activate
Your prompt should now show (.venv) at the front. That little tag is your proof you're inside the lab. Type deactivate any time to step back out.
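The (.venv) tag is only a prompt decoration, so if you ever doubt it you can ask the interpreter directly. A small sketch, assuming a standard venv; the function name in_virtualenv is made up for this example.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
```

With the environment active, the interpreter path should sit inside your .venv folder; after deactivate it should point back at the system Python.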
If PowerShell refuses to run the activation script, run Set-ExecutionPolicy -Scope CurrentUser RemoteSigned once, then try activating again. It's a Windows safety setting, not a Python problem.
Step 4: Install the data stack
With the venv active, install the core business-analytics toolkit in one shot:
pip install pandas numpy matplotlib seaborn jupyterlab scikit-learn openpyxl
Quick tour of what you just installed, because names matter:
- pandas — spreadsheets in code. The single most important library in this whole module.
- numpy — fast numeric arrays; pandas is built on top of it.
- matplotlib + seaborn — charts. Seaborn makes matplotlib look good with less effort.
- jupyterlab — the notebook app where analysts actually live.
- scikit-learn — your first machine-learning models (Chapter 10).
- openpyxl: lets pandas read and write real .xlsx Excel files.
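Once pip finishes, a quick way to confirm everything landed in the venv is to ask Python's packaging metadata for each name. This is a sketch rather than an official check: installed_versions and STACK are names invented here, and the list simply repeats the packages from the install command.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names exactly as passed to pip install.
STACK = ["pandas", "numpy", "matplotlib", "seaborn",
         "jupyterlab", "scikit-learn", "openpyxl"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(STACK).items():
        print(f"{name:<14}{ver or 'MISSING'}")
```

Any line that says MISSING usually means the venv wasn't active when you ran pip install.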
Step 5: Launch the notebook
jupyter lab
This opens JupyterLab in your browser. A notebook is a stack of cells — you type code in one, press Shift+Enter, and the output appears right underneath. It's a conversation with your data: ask a question, see the answer, ask the next one. This back-and-forth is why notebooks won the data-science world.
- In JupyterLab, click the big Python 3 tile under "Notebook" to make a new notebook
- In the first cell, type import pandas as pd and press Shift+Enter
- No error = pandas loaded. In the next cell, run pd.__version__
- It prints a version like 2.2.1. Your lab is live.
Hello, Data
Goal: prove the whole stack works end to end by turning three rows of numbers into a chart — in four cells.
- Cell 1: import pandas as pd
- Cell 2: build a tiny table: sales = pd.DataFrame({'month': ['Jan','Feb','Mar'], 'revenue': [1200, 1800, 1500]})
- Cell 3: just type sales and run it; Jupyter renders a clean table
- Cell 4: sales.plot(x='month', y='revenue', kind='bar') draws a bar chart inline
You just did the entire analytics loop — load, inspect, visualise — in under a minute. Everything after this is just doing it with bigger, messier, more interesting data.
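If you want to push the "inspect" step one notch further, the same three-row table can answer simple questions directly. The sketch below is only one possible follow-up, not part of the exercise; it rebuilds the table so it runs on its own, and the variable names total, best_row, and best_month are illustrative.

```python
import pandas as pd

# Same three-row table as in the exercise.
sales = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar"],
    "revenue": [1200, 1800, 1500],
})

total = int(sales["revenue"].sum())       # 1200 + 1800 + 1500
best_row = sales["revenue"].idxmax()      # index label of the largest revenue
best_month = sales.loc[best_row, "month"]

print(f"total revenue: {total}")          # total revenue: 4500
print(f"best month: {best_month}")        # best month: Feb
```

Paste it into a fifth cell, or split it across cells if you prefer to see each answer separately.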
Keep the ba-lab folder: we'll build in it for the rest of the module.
With the venv active, run pip freeze > requirements.txt. Open the file: it lists every package in the environment, dependencies included, each pinned to the exact version you have installed. Anyone (including future-you on a new laptop) can recreate this lab with pip install -r requirements.txt. That one file is how teams stay in sync.
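To get a feel for what that file contains, here is a minimal sketch that reads name==version lines into a dictionary. The function parse_pins is hypothetical, it deliberately ignores anything that isn't a plain exact pin, and the sample versions are illustrative rather than what your own freeze will show.

```python
def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, ver = line.partition("==")
        pins[name.strip().lower()] = ver.strip()
    return pins


# Illustrative content only; your own file will list different versions.
sample = """\
numpy==1.26.4
pandas==2.2.1
# a comment line
"""

if __name__ == "__main__":
    for name, ver in parse_pins(sample).items():
        print(f"{name} is pinned to {ver}")
```

To try it on your real file, pass Path("requirements.txt").read_text() (with from pathlib import Path) instead of the sample string.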