SQL for Accountants: Query Financial Databases & Extract Accounting Data

SQL (Structured Query Language) lets accountants query financial databases directly — extracting aged debtors, detecting duplicate payments, reconciling GST data, and running fraud detection queries on 10 million-row datasets in seconds. For finance professionals moving into analytics or audit roles, SQL eliminates dependency on IT for custom data extracts.

Why Accountants Need SQL

SQL vs Excel: The Core Difference

CapabilityExcelSQL
Maximum practical rows~500K before slowdownBillions of rows
Data accessRequires file export from ERPConnects directly to database
Audit trailFormula history is opaqueEvery query logged by database
Multi-table analysisVLOOKUP / INDEX-MATCHJOIN across any number of tables
Duplicate detectionConditional formatting, manualGROUP BY + HAVING in one query
Concurrent accessOne user at a time (file lock)Hundreds of users simultaneously
Data freshnessDepends on export scheduleReal-time (live database connection)
Learning effortAlready knownReadable English-like syntax — 2-4 weeks to productive

The key insight for accountants: SQL gives you direct database access without an IT intermediary. Instead of raising a ticket and waiting 3 days for a data extract, you write a 10-line SQL query and get the result in seconds. For internal audit, this means unfiltered access to journal entry logs for fraud detection. For finance analysis, it means querying this month's GL data before the period has even been formally closed in the ERP.

SQL Fundamentals for Finance

The Core SQL Clause Structure

SELECT   column1, column2, SUM(amount) AS total_amount
FROM     table_name
WHERE    condition
GROUP BY column1, column2
HAVING   SUM(amount) > 10000
ORDER BY total_amount DESC
LIMIT    100;    -- MySQL/PostgreSQL syntax (TOP 100 in SQL Server)

Clause order matters: SELECT and FROM are mandatory; the others are optional and must appear in this sequence.

Data Types Relevant to Accounting

Data TypeUse in FinanceExample
DECIMAL(15,2)Monetary amounts — precise to 2 decimal placesDECIMAL(15,2) for invoice_amount up to 9 trillion
INTEGER / BIGINTCount, document numbers, IDsinvoice_id INTEGER
VARCHAR(n)Text: account names, narrations, GSTINVARCHAR(15) for GSTIN (15 characters exactly)
DATETransaction dates, due datesDATE format: YYYY-MM-DD
DATETIME / TIMESTAMPAudit timestamps — when entry was createdTIMESTAMP for created_at column
BOOLEAN / TINYINT(1)Flags: is_reconciled, is_approved, is_reversedTINYINT(1) in MySQL (1=true, 0=false)

Filtering with WHERE: Accounting Examples

-- All unpaid invoices over Rs 1 lakh due before today
SELECT invoice_no, vendor_name, amount, due_date
FROM   ap_invoices
WHERE  payment_status = 'Unpaid'
  AND  amount > 100000
  AND  due_date < CURDATE();   -- CURDATE() in MySQL, GETDATE() in SQL Server

-- Transactions in a financial year (April 2024 to March 2025)
SELECT *
FROM   journal_entries
WHERE  transaction_date BETWEEN '2024-04-01' AND '2025-03-31';

-- Filter by list of accounts (useful for P&L extraction)
SELECT account_code, account_name, SUM(debit_amount) - SUM(credit_amount) AS balance
FROM   journal_entries
WHERE  account_code IN ('4001','4002','4003','4004','4010')
GROUP BY account_code, account_name;

-- Find transactions with NULL narration (data quality issue)
SELECT *
FROM   journal_entries
WHERE  narration IS NULL
   OR  narration = '';

-- Fuzzy search: find all rent-related expenses
SELECT *
FROM   journal_entries
WHERE  narration LIKE '%rent%'
   OR  narration LIKE '%lease%';

Aggregation Functions — The Core of Financial Reporting

-- Trial balance: sum debits and credits by account
SELECT
    account_code,
    account_name,
    SUM(debit_amount)  AS total_debit,
    SUM(credit_amount) AS total_credit,
    SUM(debit_amount) - SUM(credit_amount) AS net_balance
FROM   journal_entries
WHERE  posting_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY account_code, account_name
ORDER BY account_code;

-- Vendor spend summary: top 10 vendors by amount
SELECT
    vendor_id,
    vendor_name,
    COUNT(*)           AS invoice_count,
    SUM(invoice_amount) AS total_spend,
    AVG(invoice_amount) AS avg_invoice,
    MAX(invoice_amount) AS largest_invoice
FROM   ap_invoices
WHERE  invoice_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY vendor_id, vendor_name
ORDER BY total_spend DESC
LIMIT  10;

JOINs for Accountants: Reconciliation Logic

JOINs are how SQL combines data from multiple tables — and for accountants, JOINs are the mechanism for reconciliation. Understanding which JOIN type to use is the key insight.

-- INNER JOIN: invoices that HAVE been paid (matched records only)
SELECT
    i.invoice_no,
    i.vendor_name,
    i.invoice_amount,
    p.payment_date,
    p.payment_amount
FROM   ap_invoices   i
INNER JOIN ap_payments p ON i.invoice_no = p.invoice_reference
WHERE  i.invoice_date BETWEEN '2024-04-01' AND '2025-03-31';


-- LEFT JOIN: ALL invoices + payments if they exist
-- Unpaid invoices have NULL in payment columns
SELECT
    i.invoice_no,
    i.vendor_name,
    i.invoice_amount,
    i.due_date,
    p.payment_date,
    p.payment_amount,
    CASE WHEN p.payment_date IS NULL THEN 'Outstanding' ELSE 'Paid' END AS status
FROM   ap_invoices   i
LEFT JOIN ap_payments p ON i.invoice_no = p.invoice_reference;


-- LEFT ANTI JOIN: invoices with NO payment (outstanding only)
SELECT
    i.invoice_no,
    i.vendor_name,
    i.invoice_amount,
    i.due_date,
    DATEDIFF(CURDATE(), i.due_date) AS days_overdue
FROM   ap_invoices   i
LEFT JOIN ap_payments p ON i.invoice_no = p.invoice_reference
WHERE  p.invoice_reference IS NULL   -- NULL means no match = unpaid
  AND  i.invoice_amount > 0
ORDER BY days_overdue DESC;

Real Accounting Queries with Code

Query 1: Aged Debtors Report

SELECT
    customer_id,
    customer_name,
    SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) <= 0  THEN outstanding_amount ELSE 0 END) AS current_amt,
    SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 1  AND 30 THEN outstanding_amount ELSE 0 END) AS days_1_30,
    SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 31 AND 60 THEN outstanding_amount ELSE 0 END) AS days_31_60,
    SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) BETWEEN 61 AND 90 THEN outstanding_amount ELSE 0 END) AS days_61_90,
    SUM(CASE WHEN DATEDIFF(CURDATE(), due_date) > 90               THEN outstanding_amount ELSE 0 END) AS over_90_days,
    SUM(outstanding_amount) AS total_outstanding
FROM   ar_invoices
WHERE  payment_status = 'Outstanding'
GROUP BY customer_id, customer_name
ORDER BY total_outstanding DESC;

Query 2: Duplicate Payment Detection

-- Find potential duplicate payments: same vendor, same amount, same date
SELECT
    vendor_id,
    vendor_name,
    payment_amount,
    payment_date,
    COUNT(*) AS payment_count,
    GROUP_CONCAT(payment_reference ORDER BY payment_reference) AS payment_refs
FROM   ap_payments
GROUP BY vendor_id, vendor_name, payment_amount, payment_date
HAVING COUNT(*) > 1
ORDER BY payment_amount DESC;

-- More targeted: same vendor + same amount within 7 days (looser duplicate check)
SELECT
    a.payment_reference AS payment_1,
    b.payment_reference AS payment_2,
    a.vendor_name,
    a.payment_amount,
    a.payment_date AS date_1,
    b.payment_date AS date_2,
    ABS(DATEDIFF(a.payment_date, b.payment_date)) AS days_apart
FROM   ap_payments a
JOIN   ap_payments b ON a.vendor_id    = b.vendor_id
                     AND a.payment_amount = b.payment_amount
                     AND a.payment_reference < b.payment_reference  -- avoid self-join duplicates
WHERE  ABS(DATEDIFF(a.payment_date, b.payment_date)) <= 7
ORDER BY a.payment_amount DESC;

Query 3: Journal Entries by User (Fraud Detection)

-- Flag journal entries posted by users outside business hours
-- And identify users with unusual round-number postings
SELECT
    created_by_user,
    COUNT(*)           AS entry_count,
    SUM(debit_amount)  AS total_debit_posted,
    SUM(CASE WHEN debit_amount = ROUND(debit_amount, -3) THEN 1 ELSE 0 END) AS round_number_entries,
    MIN(created_at)    AS earliest_posting,
    MAX(created_at)    AS latest_posting
FROM   journal_entries
WHERE  posting_date BETWEEN '2024-04-01' AND '2025-03-31'
GROUP BY created_by_user
ORDER BY total_debit_posted DESC;

-- Entries posted on weekends or after hours (potential unauthorised access)
SELECT *
FROM   journal_entries
WHERE  DAYOFWEEK(created_at) IN (1, 7)   -- 1=Sunday, 7=Saturday in MySQL
   OR  HOUR(created_at) NOT BETWEEN 8 AND 20  -- outside 8am-8pm
ORDER BY created_at DESC;

Query 4: GST Reconciliation — AP Module vs GSTR-3B

-- Compare input tax credit claimed in books vs GSTR-3B filed
-- Using CTE (Common Table Expression) for clarity
WITH books_itc AS (
    SELECT
        supplier_gstin,
        SUM(igst_amount) AS books_igst,
        SUM(cgst_amount) AS books_cgst,
        SUM(sgst_amount) AS books_sgst,
        SUM(igst_amount + cgst_amount + sgst_amount) AS books_total_itc
    FROM   ap_invoices
    WHERE  invoice_date BETWEEN '2024-04-01' AND '2025-03-31'
      AND  gst_claim_eligible = 1
    GROUP BY supplier_gstin
),
gstr_data AS (
    SELECT
        supplier_gstin,
        SUM(igst_amount) AS gstr_igst,
        SUM(cgst_amount) AS gstr_cgst,
        SUM(sgst_amount) AS gstr_sgst,
        SUM(igst_amount + cgst_amount + sgst_amount) AS gstr_total_itc
    FROM   gstr2a_data
    WHERE  tax_period BETWEEN '2024-04' AND '2025-03'
    GROUP BY supplier_gstin
)
SELECT
    COALESCE(b.supplier_gstin, g.supplier_gstin) AS supplier_gstin,
    b.books_total_itc,
    g.gstr_total_itc,
    (b.books_total_itc - g.gstr_total_itc) AS variance,
    CASE
        WHEN g.supplier_gstin IS NULL THEN 'In Books Only — Not in GSTR-2A'
        WHEN b.supplier_gstin IS NULL THEN 'In GSTR-2A Only — Not in Books'
        WHEN ABS(b.books_total_itc - g.gstr_total_itc) > 1 THEN 'Amount Mismatch'
        ELSE 'Matched'
    END AS reconciliation_status
FROM   books_itc b
FULL OUTER JOIN gstr_data g ON b.supplier_gstin = g.supplier_gstin
WHERE  COALESCE(b.supplier_gstin, g.supplier_gstin) IS NOT NULL
ORDER BY ABS(COALESCE(b.books_total_itc,0) - COALESCE(g.gstr_total_itc,0)) DESC;

Advanced SQL: CTEs, Window Functions & Tools

CTEs (Common Table Expressions) — WITH Clause

CTEs make complex queries readable by naming intermediate result sets. The GST reconciliation query above demonstrates their use. Key benefit: you can reference a CTE multiple times in the main query without recalculating it. CTEs are also essential for self-referencing queries like parent-child account hierarchies.

Window Functions for Finance Analysis

-- ROW_NUMBER: rank vendors by spend within each department
SELECT
    department,
    vendor_name,
    total_spend,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY total_spend DESC) AS spend_rank
FROM   vendor_spend_summary
WHERE  spend_rank <= 5;  -- Top 5 vendors per department (use subquery in WHERE)

-- LAG: month-over-month revenue comparison
SELECT
    month_year,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month_year) AS prev_month_revenue,
    revenue - LAG(revenue, 1) OVER (ORDER BY month_year) AS mom_change,
    ROUND(
        (revenue - LAG(revenue, 1) OVER (ORDER BY month_year))
        / LAG(revenue, 1) OVER (ORDER BY month_year) * 100, 2
    ) AS mom_change_pct
FROM   monthly_revenue_summary
ORDER BY month_year;

-- Running total of payments to a vendor
SELECT
    payment_date,
    vendor_name,
    payment_amount,
    SUM(payment_amount) OVER (PARTITION BY vendor_id ORDER BY payment_date) AS cumulative_payments
FROM   ap_payments
WHERE  vendor_id = 'V001'
ORDER BY payment_date;

Databases Used in Indian Companies

DatabaseCommon InFree to Learn?
MySQLSMEs, e-commerce, startups, Tally backendYes — MySQL Community Edition
PostgreSQLFintech, SaaS, modern startupsYes — fully open source
Microsoft SQL ServerMid-market corporates, ERP like DynamicsYes — SQL Server Express (free tier)
Oracle DatabaseLarge enterprises, PSU banks, SAP on OracleLimited — Oracle Express Edition (free, limited)
SQLiteMobile apps, lightweight toolsYes — fully free, ideal for learning

Free SQL Tools for Accountants

Where to Practice SQL

⚡ Take Action Now

Install DB Browser for SQLite (free, 20 MB download). Import any accounting CSV you already have (bank statement, invoice list) as a database table. Run your first SELECT query with a WHERE clause and GROUP BY. You will have written real SQL against real financial data within 30 minutes of starting.

Explore CorpReady Programs

📚 Real Student Story

Rahul Agarwal, Internal Auditor at a listed NBFC, Pune — Rahul's team was conducting a fraud risk assessment across 3 million payment transactions from the previous 2 years. The IT department said it would take 2 weeks to write the custom reports requested. Rahul, who had recently learned SQL through CorpReady's analytics module, asked for direct read-only access to the MySQL database. Within 4 hours, he had written and run his duplicate payment detection query, a round-number payment analysis, and a weekend/after-hours journal entry report. His queries identified 27 suspicious payment patterns that the standard reports would not have surfaced. The audit report was completed in 6 days instead of 6 weeks, and Rahul was fast-tracked to Senior Auditor.

💼 What Firms Actually Want

Big Four and mid-tier audit firms are increasingly requiring SQL for IT audit and data analytics roles. Forensic accounting teams list SQL as mandatory for data-driven fraud investigations. Finance analytics roles in BFSI (banks, NBFCs, insurance) specifically require SQL for querying core banking systems, loan management systems, and policy administration databases. FP&A (Financial Planning & Analysis) roles in MNCs use SQL to pull actuals directly from ERP databases for variance analysis. The combination of accounting domain knowledge plus SQL creates a profile that stands out: the accountant who does not need IT to get data.

Frequently Asked Questions

Why should accountants learn SQL?

Accountants who know SQL can directly access financial databases without waiting for IT or developers to run reports. SQL handles datasets of 10 million+ rows that would crash Excel, queries leave an audit trail in database logs, and the ability to write your own data extraction queries makes accountants significantly more autonomous and valuable in analytics-driven finance roles.

What SQL database is most commonly used in Indian companies?

MySQL is the most common in Indian SMEs and startups. Microsoft SQL Server is dominant in mid-market and large corporates. PostgreSQL is widely used in fintech and SaaS companies. Oracle is used in large enterprises and public sector banks. For learning and practice, MySQL (free) or SQLite (via DB Browser, fully free) are the best starting points.

What is the difference between INNER JOIN and LEFT JOIN for accountants?

INNER JOIN returns only rows where a match exists in both tables — useful for finding matched invoices. LEFT JOIN returns all rows from the left table plus matching rows from the right (unmatched right-side rows appear as NULL). In accounting, LEFT JOIN is used for reconciliation: find all invoices in the AP ledger and show which ones have no corresponding bank payment — NULL equals unmatched equals outstanding.

How can SQL detect duplicate payments in accounting data?

Duplicate payment detection uses GROUP BY with HAVING COUNT greater than 1. Query the payments table, group by vendor ID, amount, and payment date, and return groups where the count exceeds 1. This finds payments to the same vendor for the same amount on the same date — the classic duplicate payment pattern in AP fraud or processing errors. A looser check uses a self-JOIN to find payments within 7 days of each other.

✅ Key Takeaways

  • SQL gives accountants direct database access, eliminating the IT bottleneck for custom reports and enabling analysis of 10 million-row datasets that Excel cannot handle.
  • LEFT JOIN with a WHERE IS NULL condition is the SQL equivalent of "find unmatched items" — the core mechanism for all reconciliation queries.
  • Duplicate payment detection, aged debtors, GST reconciliation, and journal entry fraud detection are the four highest-impact SQL use cases for accountants.
  • CTEs (WITH clause) transform complex multi-step SQL logic into readable, maintainable queries — essential for production accounting query libraries.
  • Window functions (ROW_NUMBER, LAG, LEAD, running totals) enable the period-over-period analysis that previously required multiple Excel SUMIF formulas.
  • DB Browser for SQLite and DBeaver are free tools that let you practice SQL on real data immediately — no server setup required.

Ready to Master SQL for Finance?

CorpReady Academy's programs build practical data skills alongside globally recognized credentials.

Explore CorpReady Programs Talk to a Counsellor