
A real-world guide for anyone working with hiring data (or any data, really)
If you work in Talent, HR, or Engineering Analytics, you’ve probably wrestled with messy data. You have simple questions like “How many people did we hire?” or “Which recruiter is performing best?” — but getting clean answers feels harder than it should be.
The problem isn’t your skills. It’s your data model.
Let me show you exactly what that means, using a real hiring scenario.
The Scenario: Your Hiring Data Challenge
You need to answer basic questions:
- How many people did we hire?
- How long did it take to hire them?
- Which recruiter is performing better?
- Which location is struggling?
You have all the data. But something feels… off.
Let’s walk through three different ways to organize this data — and why only one actually works.
Step 1: The Raw Data Trap (Where Most People Start)
Here’s what your data probably looks like right now — one big Excel file:
Raw_Hiring_Data
The Problems Are Obvious:
- ❌ Names repeat constantly
- ❌ Location repeats for every row
- ❌ Recruiter information duplicated everywhere
- ❌ Calculating “time to hire” means manual work
- ❌ Filters break when data grows
This is called a flat model or raw data structure.
Reality check: It’s good for data storage. Terrible for analytics.
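To make the pain concrete, here is a minimal Python sketch of a flat table. The original spreadsheet is not reproduced here, so every column name and value below is made up for illustration.

```python
from datetime import date

# Hypothetical flat rows: column names and values are invented for this example.
raw_hiring_data = [
    {"candidate": "A. Rao", "location": "Pune", "recruiter": "Meera",
     "applied": date(2024, 1, 2), "hired": date(2024, 1, 30)},
    {"candidate": "B. Lee", "location": "Pune", "recruiter": "Meera",
     "applied": date(2024, 1, 10), "hired": date(2024, 2, 19)},
    {"candidate": "C. Diaz", "location": "Austin", "recruiter": "Sam",
     "applied": date(2024, 2, 1), "hired": date(2024, 2, 21)},
]

# Time to hire is not stored anywhere, so it has to be derived row by row.
days_to_hire = [(row["hired"] - row["applied"]).days for row in raw_hiring_data]
print(days_to_hire)  # [28, 40, 20]

# The same recruiter and location strings are typed again on every row.
repeated_recruiters = [row["recruiter"] for row in raw_hiring_data]
print(repeated_recruiters)  # ['Meera', 'Meera', 'Sam']
```

Three rows are manageable. The trouble starts when a recruiter's name is spelled two different ways across a few thousand rows and your filters quietly split them into two people.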
Step 2: The Normalized Approach (Engineering Thinking)
Now you think: “Why am I repeating the same information over and over?”
So you split the data into logical tables:
Candidate Table
Recruiter Table
Job Table
Hiring Table (connects everything)
This is the relational/normalized model.
- ✅ More accurate
- ✅ Far less data duplication
- ❌ But: Every dashboard query needs multiple joins
- ❌ Dashboards slow down as those joins pile up
- ❌ Confusing for non-technical users
Good for databases. Not ideal for analytics.
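Here is a rough sketch of what that join overhead looks like, using Python's built-in sqlite3 module as a stand-in for whatever database you use. The table layouts, column names, and sample rows are my assumptions, not the original tables.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidate (candidate_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recruiter (recruiter_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE job       (job_id INTEGER PRIMARY KEY, title TEXT, location TEXT);
CREATE TABLE hiring (
    hiring_id    INTEGER PRIMARY KEY,
    candidate_id INTEGER REFERENCES candidate(candidate_id),
    recruiter_id INTEGER REFERENCES recruiter(recruiter_id),
    job_id       INTEGER REFERENCES job(job_id),
    applied_on   TEXT,
    hired_on     TEXT
);
INSERT INTO candidate VALUES (1, 'A. Rao'), (2, 'B. Lee');
INSERT INTO recruiter VALUES (1, 'Meera');
INSERT INTO job VALUES (1, 'Data Analyst', 'Pune');
INSERT INTO hiring VALUES (1, 1, 1, 1, '2024-01-02', '2024-01-30'),
                          (2, 2, 1, 1, '2024-01-10', '2024-02-19');
""")

# One readable report row already needs three joins plus a date calculation.
rows = conn.execute("""
SELECT c.name, r.name, j.location,
       CAST(julianday(h.hired_on) - julianday(h.applied_on) AS INTEGER) AS days_to_hire
FROM hiring h
JOIN candidate c ON c.candidate_id = h.candidate_id
JOIN recruiter r ON r.recruiter_id = h.recruiter_id
JOIN job j       ON j.job_id = h.job_id
ORDER BY h.hiring_id
""").fetchall()
print(rows)
```

Each name is now stored once, which is the win. The cost is that every chart in a dashboard has to repeat some version of that query.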
Step 3: The Star Schema (This Is the Game-Changer)
Now we design specifically for analytics, not storage.
The Core Concept (Read This Twice):
- Fact Table = Your numbers (metrics, measurements)
- Dimension Tables = Your details (filters, context)
FACT TABLE: Where Your Numbers Live
Fact_Hiring
👉 One row = One hiring event
DIMENSION TABLES: Your Context and Filters
Dim_Candidate
Dim_Job
Dim_Recruiter
Dim_Date
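As a sketch, the five tables could be declared like this with Python's sqlite3 module. The table names, Time_To_Hire, Recruiter_Name, Department, Month, and Year come from this article. Every other column, including all the key columns, is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: descriptive columns you filter and group by.
CREATE TABLE Dim_Candidate (Candidate_Key INTEGER PRIMARY KEY, Candidate_Name TEXT);
CREATE TABLE Dim_Job       (Job_Key INTEGER PRIMARY KEY, Job_Title TEXT,
                            Department TEXT, Location TEXT);
CREATE TABLE Dim_Recruiter (Recruiter_Key INTEGER PRIMARY KEY, Recruiter_Name TEXT);
CREATE TABLE Dim_Date      (Date_Key INTEGER PRIMARY KEY, Full_Date TEXT,
                            Month INTEGER, Year INTEGER);

-- Fact: one row per hiring event, holding only keys and numbers.
CREATE TABLE Fact_Hiring (
    Hiring_Key    INTEGER PRIMARY KEY,
    Candidate_Key INTEGER NOT NULL REFERENCES Dim_Candidate(Candidate_Key),
    Job_Key       INTEGER NOT NULL REFERENCES Dim_Job(Job_Key),
    Recruiter_Key INTEGER NOT NULL REFERENCES Dim_Recruiter(Recruiter_Key),
    Hire_Date_Key INTEGER NOT NULL REFERENCES Dim_Date(Date_Key),
    Time_To_Hire  INTEGER  -- days from application to hire, precomputed at load time
);
""")

table_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(table_names)
```

Notice that Time_To_Hire is stored as a plain number in the fact table. The manual date math from the flat file happens once, when the data is loaded, instead of in every report.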
Step 4: How Everything Connects
Think of it literally like a star:
- Fact table sits in the center (the numbers)
- Dimension tables connect around it (the context)
- One-to-many relationships from dimensions to facts
This is why it’s called a Star Schema. It actually looks like a star.
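The one-to-many relationship can be checked in a few lines. This is a minimal sketch with one dimension and the fact table, again using sqlite3 with hypothetical key columns and sample rows. Your BI tool defines relationships in its own model view rather than through database constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Dim_Recruiter (Recruiter_Key INTEGER PRIMARY KEY, Recruiter_Name TEXT);
CREATE TABLE Fact_Hiring (
    Hiring_Key    INTEGER PRIMARY KEY,
    Recruiter_Key INTEGER NOT NULL REFERENCES Dim_Recruiter(Recruiter_Key),
    Time_To_Hire  INTEGER
);
INSERT INTO Dim_Recruiter VALUES (1, 'Meera');
-- One recruiter row, many hiring rows pointing at it.
INSERT INTO Fact_Hiring VALUES (1, 1, 28), (2, 1, 40);
""")

hires_for_meera = conn.execute(
    "SELECT COUNT(*) FROM Fact_Hiring WHERE Recruiter_Key = 1"
).fetchone()[0]

# A fact row pointing at a recruiter that does not exist is rejected.
try:
    conn.execute("INSERT INTO Fact_Hiring VALUES (3, 99, 15)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(hires_for_meera, orphan_rejected)  # 2 True
```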
Step 5: Why This Works So Well in Real Life
Now answering questions becomes ridiculously easy:
Question 1: Average Time to Hire by Recruiter
- Filter by Recruiter Name (from Dim_Recruiter)
- Calculate AVG(Time_To_Hire) from the Fact table
- Done.
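In SQL terms, that recipe is one join and one aggregate. A minimal sketch with sqlite3, where the key columns and sample rows are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Recruiter (Recruiter_Key INTEGER PRIMARY KEY, Recruiter_Name TEXT);
CREATE TABLE Fact_Hiring (
    Hiring_Key    INTEGER PRIMARY KEY,
    Recruiter_Key INTEGER REFERENCES Dim_Recruiter(Recruiter_Key),
    Time_To_Hire  INTEGER
);
-- Made-up sample data.
INSERT INTO Dim_Recruiter VALUES (1, 'Meera'), (2, 'Sam');
INSERT INTO Fact_Hiring VALUES (1, 1, 28), (2, 1, 40), (3, 2, 20);
""")

# Group by the dimension attribute, aggregate the fact measure.
avg_by_recruiter = conn.execute("""
SELECT r.Recruiter_Name, AVG(f.Time_To_Hire)
FROM Fact_Hiring f
JOIN Dim_Recruiter r ON r.Recruiter_Key = f.Recruiter_Key
GROUP BY r.Recruiter_Name
ORDER BY r.Recruiter_Name
""").fetchall()
print(avg_by_recruiter)  # [('Meera', 34.0), ('Sam', 20.0)]
```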
Question 2: Total Hires by Department
- Filter by Department (from Dim_Job)
- Count rows in Fact table
- Done.
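The same pattern works here, just with a different dimension and COUNT instead of AVG. Again a sketch with hypothetical keys, job titles, and rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Job (Job_Key INTEGER PRIMARY KEY, Job_Title TEXT, Department TEXT);
CREATE TABLE Fact_Hiring (
    Hiring_Key INTEGER PRIMARY KEY,
    Job_Key    INTEGER REFERENCES Dim_Job(Job_Key)
);
-- Made-up sample data: two jobs sit in the same department.
INSERT INTO Dim_Job VALUES (1, 'Data Analyst', 'Analytics'),
                           (2, 'Backend Engineer', 'Engineering'),
                           (3, 'Data Engineer', 'Engineering');
INSERT INTO Fact_Hiring VALUES (1, 1), (2, 2), (3, 3), (4, 2);
""")

# One fact row is one hire, so counting rows counts hires.
hires_by_department = conn.execute("""
SELECT j.Department, COUNT(*)
FROM Fact_Hiring f
JOIN Dim_Job j ON j.Job_Key = f.Job_Key
GROUP BY j.Department
ORDER BY j.Department
""").fetchall()
print(hires_by_department)  # [('Analytics', 1), ('Engineering', 3)]
```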
Question 3: Hiring Trend by Month
- Use Date dimension
- Group by Month/Year
- Done.
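For the trend, the date dimension does the calendar work so the query does not have to parse dates. A sketch assuming a hypothetical Hire_Date_Key column on the fact table and a YYYYMMDD integer key on Dim_Date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Date (Date_Key INTEGER PRIMARY KEY, Full_Date TEXT,
                       Month INTEGER, Year INTEGER);
CREATE TABLE Fact_Hiring (
    Hiring_Key    INTEGER PRIMARY KEY,
    Hire_Date_Key INTEGER REFERENCES Dim_Date(Date_Key)
);
-- Made-up sample data.
INSERT INTO Dim_Date VALUES (20240130, '2024-01-30', 1, 2024),
                            (20240219, '2024-02-19', 2, 2024),
                            (20240221, '2024-02-21', 2, 2024);
INSERT INTO Fact_Hiring VALUES (1, 20240130), (2, 20240219), (3, 20240221);
""")

# Month and Year are ordinary columns on the dimension, ready to group by.
hires_by_month = conn.execute("""
SELECT d.Year, d.Month, COUNT(*)
FROM Fact_Hiring f
JOIN Dim_Date d ON d.Date_Key = f.Hire_Date_Key
GROUP BY d.Year, d.Month
ORDER BY d.Year, d.Month
""").fetchall()
print(hires_by_month)  # [(2024, 1, 1), (2024, 2, 2)]
```

Adding a Quarter or Fiscal_Year column to Dim_Date later would give you new groupings without touching the fact table.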
No confusion. No broken filters. No weird results.
Step 6: Why Power BI, Tableau, and Looker Love This
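These tools work best with star schemas because:
- ✅ One clear fact table (no ambiguity)
- ✅ Clean, defined relationships
- ✅ Measures aggregate predictably, because filters flow from dimensions to the fact table in one direction
- ✅ Queries stay fast, since each one touches the fact table plus only the dimensions it needs
- ✅ New users can understand it quickly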
Hard truth: If your data model is wrong, your dashboard will always feel wrong — no matter how pretty you make it.
My Simple Test for a Good Data Model
If someone new can open your model and understand “what’s a fact vs what’s a dimension” in 5 minutes → good model.
If they need explanation every time → bad model.
That’s it.
The Bottom Line
Data modeling isn’t about fancy terminology or complex theory.
It’s about organizing data so questions become easy to answer.
If asking simple questions feels hard, your model is wrong. Simple as that.
What’s Next?
This was the foundation. Next topics I can cover:
- How to handle slowly changing dimensions (like when a recruiter changes teams)
- Snowflake schema vs Star schema
- Real Power BI implementation of this exact model
- Common mistakes that break everything
Let me know what you want to see next.
Have questions about your own data model? Drop them in the comments. I read and respond to all of them.