SQL for Accountants: Query Financial Databases & Extract Accounting Data
Why Accountants Need SQL
SQL vs Excel: The Core Difference
| Capability | Excel | SQL |
|---|---|---|
| Maximum practical rows | ~500K before slowdown | Billions of rows |
| Data access | Requires file export from ERP | Connects directly to database |
| Audit trail | Formula history is opaque | Every query logged by database |
| Multi-table analysis | VLOOKUP / INDEX-MATCH | JOIN across any number of tables |
| Duplicate detection | Conditional formatting, manual | GROUP BY + HAVING in one query |
| Concurrent access | One user at a time (file lock) | Hundreds of users simultaneously |
| Data freshness | Depends on export schedule | Real-time (live database connection) |
| Learning effort | Already known | Readable English-like syntax — 2-4 weeks to productive |
The key insight for accountants: SQL gives you direct database access without an IT intermediary. Instead of raising a ticket and waiting 3 days for a data extract, you write a 10-line SQL query and get the result in seconds. For internal audit, this means unfiltered access to journal entry logs for fraud detection. For finance analysis, it means querying this month's GL data before the period has even been formally closed in the ERP.
SQL Fundamentals for Finance
The Core SQL Clause Structure
SELECT column1, column2, SUM(amount) AS total_amount
FROM table_name
WHERE condition
GROUP BY column1, column2
HAVING SUM(amount) > 10000
ORDER BY total_amount DESC
LIMIT 100; -- MySQL/PostgreSQL syntax (TOP 100 in SQL Server)
Clause order matters: SELECT and FROM are mandatory; the others are optional and must appear in this sequence.
Data Types Relevant to Accounting
| Data Type | Use in Finance | Example |
|---|---|---|
| DECIMAL(15,2) | Monetary amounts — precise to 2 decimal places | DECIMAL(15,2) for invoice_amount up to 9 trillion |
| INTEGER / BIGINT | Count, document numbers, IDs | invoice_id INTEGER |
| VARCHAR(n) | Text: account names, narrations, GSTIN | VARCHAR(15) for GSTIN (15 characters exactly) |
| DATE | Transaction dates, due dates | DATE format: YYYY-MM-DD |
| DATETIME / TIMESTAMP | Audit timestamps — when entry was created | TIMESTAMP for created_at column |
| BOOLEAN / TINYINT(1) | Flags: is_reconciled, is_approved, is_reversed | TINYINT(1) in MySQL (1=true, 0=false) |
Filtering with WHERE: Accounting Examples
-- All unpaid invoices over Rs 1 lakh due before today
SELECT invoice_no, vendor_name, amount, due_date
FROM ap_invoices
WHERE payment_status = 'Unpaid'
AND amount > 100000
AND due_date < CURDATE(); -- CURDATE() in MySQL, GETDATE() in SQL Server
-- Transactions in a financial year (April 2024 to March 2025)
SELECT *
FROM journal_entries
WHERE transaction_date BETWEEN '2024-04-01' AND '2025-03-31';
-- Filter by list of accounts (useful for P&L extraction)
SELECT account_code, account_name, SUM(debit_amount) - SUM(credit_amount) AS balance
FROM journal_entries
WHERE account_code IN ('4001','4002','4003','4004','4010')
GROUP BY account_code, account_name;
-- Find transactions with NULL narration (data quality issue)
SELECT *
FROM journal_entries
WHERE narration IS NULL
OR narration = '';
-- Fuzzy search: find all rent-related expenses
SELECT *
FROM journal_entries
WHERE narration LIKE '%rent%'
OR narration LIKE '%lease%';
Aggregation Functions — The Core of Financial Reporting
-- Trial balance: sum debits and credits by account
SELECT
account_code,
account_name,
SUM(debit_amount) AS total_debit,
SUM(credit_amount) AS total_credit,
SUM(debit_amount) - SUM(credit_amount) AS net_balance
FROM journal_entries
WHERE posting_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY account_code, account_name
ORDER BY account_code;
-- Vendor spend summary: top 10 vendors by amount
SELECT
vendor_id,
vendor_name,
COUNT(*) AS invoice_count,
SUM(invoice_amount) AS total_spend,
AVG(invoice_amount) AS avg_invoice,
MAX(invoice_amount) AS largest_invoice
FROM ap_invoices
WHERE invoice_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY vendor_id, vendor_name
ORDER BY total_spend DESC
LIMIT 10;
JOINs for Accountants: Reconciliation Logic
JOINs are how SQL combines data from multiple tables — and for accountants, JOINs are the mechanism for reconciliation. Understanding which JOIN type to use is the key insight.
-- INNER JOIN: invoices that HAVE been paid (matched records only)
SELECT
i.invoice_no,
i.vendor_name,
i.invoice_amount,
p.payment_date,
p.payment_amount
FROM ap_invoices i
INNER JOIN ap_payments p ON i.invoice_no = p.invoice_reference
WHERE i.invoice_date BETWEEN '2024-04-01' AND '2025-03-31';
-- LEFT JOIN: ALL invoices + payments if they exist
-- Unpaid invoices have NULL in payment columns
SELECT
i.invoice_no,
i.vendor_name,
i.invoice_amount,
i.due_date,
p.payment_date,
p.payment_amount,
CASE WHEN p.payment_date IS NULL THEN 'Outstanding' ELSE 'Paid' END AS status
FROM ap_invoices i
LEFT JOIN ap_payments p ON i.invoice_no = p.invoice_reference;
-- LEFT ANTI JOIN: invoices with NO payment (outstanding only)
SELECT
i.invoice_no,
i.vendor_name,
i.invoice_amount,
i.due_date,
DATEDIFF(CURDATE(), i.due_date) AS days_overdue
FROM ap_invoices i
LEFT JOIN ap_payments p ON i.invoice_no = p.invoice_reference
WHERE p.invoice_reference IS NULL -- NULL means no match = unpaid
AND i.invoice_amount > 0
ORDER BY days_overdue DESC;
Real Accounting Queries with Code
Query 1: Aged Debtors Report
SELECT
customer_id,
customer_name,
SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) <= 0 THEN outstanding_amount ELSE 0 END) AS current_amt,
SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 1 AND 30 THEN outstanding_amount ELSE 0 END) AS days_1_30,
SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 31 AND 60 THEN outstanding_amount ELSE 0 END) AS days_31_60,
SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 61 AND 90 THEN outstanding_amount ELSE 0 END) AS days_61_90,
SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) > 90 THEN outstanding_amount ELSE 0 END) AS over_90_days,
SUM(outstanding_amount) AS total_outstanding
FROM ar_invoices
WHERE payment_status = 'Outstanding'
GROUP BY customer_id, customer_name
ORDER BY total_outstanding DESC;
Query 2: Duplicate Payment Detection
-- Find potential duplicate payments: same vendor, same amount, same date
SELECT
vendor_id,
vendor_name,
payment_amount,
payment_date,
COUNT(*) AS payment_count,
GROUP_CONCAT(payment_reference ORDER BY payment_reference) AS payment_refs
FROM ap_payments
GROUP BY vendor_id, vendor_name, payment_amount, payment_date
HAVING COUNT(*) > 1
ORDER BY payment_amount DESC;
-- More targeted: same vendor + same amount within 7 days (looser duplicate check)
SELECT
a.payment_reference AS payment_1,
b.payment_reference AS payment_2,
a.vendor_name,
a.payment_amount,
a.payment_date AS date_1,
b.payment_date AS date_2,
ABS(DATEDIFF(a.payment_date, b.payment_date)) AS days_apart
FROM ap_payments a
JOIN ap_payments b ON a.vendor_id = b.vendor_id
AND a.payment_amount = b.payment_amount
AND a.payment_reference < b.payment_reference -- avoid self-join duplicates
WHERE ABS(DATEDIFF(a.payment_date, b.payment_date)) <= 7
ORDER BY a.payment_amount DESC;
Query 3: Journal Entries by User (Fraud Detection)
-- Flag journal entries posted by users outside business hours
-- And identify users with unusual round-number postings
SELECT
created_by_user,
COUNT(*) AS entry_count,
SUM(debit_amount) AS total_debit_posted,
SUM(CASE WHEN debit_amount = ROUND(debit_amount, -3) THEN 1 ELSE 0 END) AS round_number_entries,
MIN(created_at) AS earliest_posting,
MAX(created_at) AS latest_posting
FROM journal_entries
WHERE posting_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY created_by_user
ORDER BY total_debit_posted DESC;
-- Entries posted on weekends or after hours (potential unauthorised access)
SELECT *
FROM journal_entries
WHERE DAYOFWEEK(created_at) IN (1, 7) -- 1=Sunday, 7=Saturday in MySQL
OR HOUR(created_at) NOT BETWEEN 8 AND 20 -- outside 8am-8pm
ORDER BY created_at DESC;
Query 4: GST Reconciliation — AP Module vs GSTR-3B
-- Compare input tax credit claimed in books vs GSTR-3B filed
-- Using CTE (Common Table Expression) for clarity
WITH books_itc AS (
SELECT
supplier_gstin,
SUM(igst_amount) AS books_igst,
SUM(cgst_amount) AS books_cgst,
SUM(sgst_amount) AS books_sgst,
SUM(igst_amount + cgst_amount + sgst_amount) AS books_total_itc
FROM ap_invoices
WHERE invoice_date BETWEEN '2024-04-01' AND '2025-03-31'
AND gst_claim_eligible = 1
GROUP BY supplier_gstin
),
gstr_data AS (
SELECT
supplier_gstin,
SUM(igst_amount) AS gstr_igst,
SUM(cgst_amount) AS gstr_cgst,
SUM(sgst_amount) AS gstr_sgst,
SUM(igst_amount + cgst_amount + sgst_amount) AS gstr_total_itc
FROM gstr2a_data
WHERE tax_period BETWEEN '2024-04' AND '2025-03'
GROUP BY supplier_gstin
)
SELECT
COALESCE(b.supplier_gstin, g.supplier_gstin) AS supplier_gstin,
b.books_total_itc,
g.gstr_total_itc,
(b.books_total_itc - g.gstr_total_itc) AS variance,
CASE
WHEN g.supplier_gstin IS NULL THEN 'In Books Only — Not in GSTR-2A'
WHEN b.supplier_gstin IS NULL THEN 'In GSTR-2A Only — Not in Books'
WHEN ABS(b.books_total_itc - g.gstr_total_itc) > 1 THEN 'Amount Mismatch'
ELSE 'Matched'
END AS reconciliation_status
FROM books_itc b
FULL OUTER JOIN gstr_data g ON b.supplier_gstin = g.supplier_gstin
WHERE COALESCE(b.supplier_gstin, g.supplier_gstin) IS NOT NULL
ORDER BY ABS(COALESCE(b.books_total_itc,0) - COALESCE(g.gstr_total_itc,0)) DESC;
Advanced SQL: CTEs, Window Functions & Tools
CTEs (Common Table Expressions) — WITH Clause
CTEs make complex queries readable by naming intermediate result sets. The GST reconciliation query above demonstrates their use. Key benefit: you can reference a CTE multiple times in the main query without recalculating it. CTEs are also essential for self-referencing queries like parent-child account hierarchies.
Window Functions for Finance Analysis
-- ROW_NUMBER: rank vendors by spend within each department
SELECT
department,
vendor_name,
total_spend,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY total_spend DESC) AS spend_rank
FROM vendor_spend_summary
WHERE spend_rank <= 5; -- Top 5 vendors per department (use subquery in WHERE)
-- LAG: month-over-month revenue comparison
SELECT
month_year,
revenue,
LAG(revenue, 1) OVER (ORDER BY month_year) AS prev_month_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month_year) AS mom_change,
ROUND(
(revenue - LAG(revenue, 1) OVER (ORDER BY month_year))
/ LAG(revenue, 1) OVER (ORDER BY month_year) * 100, 2
) AS mom_change_pct
FROM monthly_revenue_summary
ORDER BY month_year;
-- Running total of payments to a vendor
SELECT
payment_date,
vendor_name,
payment_amount,
SUM(payment_amount) OVER (PARTITION BY vendor_id ORDER BY payment_date) AS cumulative_payments
FROM ap_payments
WHERE vendor_id = 'V001'
ORDER BY payment_date;
Databases Used in Indian Companies
| Database | Common In | Free to Learn? |
|---|---|---|
| MySQL | SMEs, e-commerce, startups, Tally backend | Yes — MySQL Community Edition |
| PostgreSQL | Fintech, SaaS, modern startups | Yes — fully open source |
| Microsoft SQL Server | Mid-market corporates, ERP like Dynamics | Yes — SQL Server Express (free tier) |
| Oracle Database | Large enterprises, PSU banks, SAP on Oracle | Limited — Oracle Express Edition (free, limited) |
| SQLite | Mobile apps, lightweight tools | Yes — fully free, ideal for learning |
Free SQL Tools for Accountants
- DB Browser for SQLite: Install SQLite, load any CSV as a database table, run SQL queries — the fastest zero-cost way to learn SQL on your own data
- DBeaver: Universal database client that connects to MySQL, PostgreSQL, SQL Server, Oracle, and 80+ others with a free community edition
- MySQL Workbench: Official MySQL GUI — schema design, query editor, data export
- Azure Data Studio: Microsoft's free cross-platform tool for SQL Server and PostgreSQL
Where to Practice SQL
- SQLZoo.net — interactive SQL tutorials with immediate feedback, free
- HackerRank SQL challenges — practice problems from easy to expert, ranked leaderboard
- Mode Analytics SQL Tutorial — analytics-focused SQL using real datasets
- LeetCode SQL — interview-style SQL problems, useful for finance analyst job preparation
⚡ Take Action Now
Install DB Browser for SQLite (free, 20 MB download). Import any accounting CSV you already have (bank statement, invoice list) as a database table. Run your first SELECT query with a WHERE clause and GROUP BY. You will have written real SQL against real financial data within 30 minutes of starting.
Explore CorpReady Programs📚 Real Student Story
Rahul Agarwal, Internal Auditor at a listed NBFC, Pune — Rahul's team was conducting a fraud risk assessment across 3 million payment transactions from the previous 2 years. The IT department said it would take 2 weeks to write the custom reports requested. Rahul, who had recently learned SQL through CorpReady's analytics module, asked for direct read-only access to the MySQL database. Within 4 hours, he had written and run his duplicate payment detection query, a round-number payment analysis, and a weekend/after-hours journal entry report. His queries identified 27 suspicious payment patterns that the standard reports would not have surfaced. The audit report was completed in 6 days instead of 6 weeks, and Rahul was fast-tracked to Senior Auditor.
💼 What Firms Actually Want
Big Four and mid-tier audit firms are increasingly requiring SQL for IT audit and data analytics roles. Forensic accounting teams list SQL as mandatory for data-driven fraud investigations. Finance analytics roles in BFSI (banks, NBFCs, insurance) specifically require SQL for querying core banking systems, loan management systems, and policy administration databases. FP&A (Financial Planning & Analysis) roles in MNCs use SQL to pull actuals directly from ERP databases for variance analysis. The combination of accounting domain knowledge plus SQL creates a profile that stands out: the accountant who does not need IT to get data.
Frequently Asked Questions
Why should accountants learn SQL?
Accountants who know SQL can directly access financial databases without waiting for IT or developers to run reports. SQL handles datasets of 10 million+ rows that would crash Excel, queries leave an audit trail in database logs, and the ability to write your own data extraction queries makes accountants significantly more autonomous and valuable in analytics-driven finance roles.
What SQL database is most commonly used in Indian companies?
MySQL is the most common in Indian SMEs and startups. Microsoft SQL Server is dominant in mid-market and large corporates. PostgreSQL is widely used in fintech and SaaS companies. Oracle is used in large enterprises and public sector banks. For learning and practice, MySQL (free) or SQLite (via DB Browser, fully free) are the best starting points.
What is the difference between INNER JOIN and LEFT JOIN for accountants?
INNER JOIN returns only rows where a match exists in both tables — useful for finding matched invoices. LEFT JOIN returns all rows from the left table plus matching rows from the right (unmatched right-side rows appear as NULL). In accounting, LEFT JOIN is used for reconciliation: find all invoices in the AP ledger and show which ones have no corresponding bank payment — NULL equals unmatched equals outstanding.
How can SQL detect duplicate payments in accounting data?
Duplicate payment detection uses GROUP BY with HAVING COUNT greater than 1. Query the payments table, group by vendor ID, amount, and payment date, and return groups where the count exceeds 1. This finds payments to the same vendor for the same amount on the same date — the classic duplicate payment pattern in AP fraud or processing errors. A looser check uses a self-JOIN to find payments within 7 days of each other.
✅ Key Takeaways
- SQL gives accountants direct database access, eliminating the IT bottleneck for custom reports and enabling analysis of 10 million-row datasets that Excel cannot handle.
- LEFT JOIN with a WHERE IS NULL condition is the SQL equivalent of "find unmatched items" — the core mechanism for all reconciliation queries.
- Duplicate payment detection, aged debtors, GST reconciliation, and journal entry fraud detection are the four highest-impact SQL use cases for accountants.
- CTEs (WITH clause) transform complex multi-step SQL logic into readable, maintainable queries — essential for production accounting query libraries.
- Window functions (ROW_NUMBER, LAG, LEAD, running totals) enable the period-over-period analysis that previously required multiple Excel SUMIF formulas.
- DB Browser for SQLite and DBeaver are free tools that let you practice SQL on real data immediately — no server setup required.
Ready to Master SQL for Finance?
CorpReady Academy's programs build practical data skills alongside globally recognized credentials.
Explore CorpReady Programs Talk to a Counsellor