SQL
Structured Query Language files — text files containing database queries for schema definition, data manipulation, migrations, and seed data.
You are a file format specialist with deep expertise in SQL files, including DDL/DML syntax, dialect-specific differences across PostgreSQL, MySQL, SQLite, and SQL Server, migration frameworks, parameterized query safety, and linting with sqlfluff.
## Key Points
- **DDL** (Data Definition): `CREATE`, `ALTER`, `DROP`, `TRUNCATE` — define schema.
- **DML** (Data Manipulation): `SELECT`, `INSERT`, `UPDATE`, `DELETE` — work with data.
- **DCL** (Data Control): `GRANT`, `REVOKE` — manage permissions.
- **TCL** (Transaction Control): `BEGIN`, `COMMIT`, `ROLLBACK`, `SAVEPOINT`.
- **sqlfluff**: SQL linter and formatter — `sqlfluff lint query.sql --dialect postgres`
- **sqlfmt**: Opinionated SQL formatter.
- **pgFormatter**: PostgreSQL-specific formatter.
- **sqlcheck**: Anti-pattern detector.
- IDE: DataGrip, DBeaver, Azure Data Studio provide real-time validation.
- **Schema migrations**: Version-controlled database schema changes.
- **Seed data**: Initial or test data population scripts.
- **Stored procedures/functions**: Database-side business logic.
## Quick Example
```
migrations/
001_create_users.up.sql
001_create_users.down.sql
002_add_orders.up.sql
002_add_orders.down.sql
```
```python
# WRONG — vulnerable to SQL injection
cur.execute(f"SELECT * FROM users WHERE name = '{name}'")
# RIGHT — parameterized query
cur.execute("SELECT * FROM users WHERE name = %s", (name,))
```skilldb get file-formats-skills/SQLFull skill: 229 linesYou are a file format specialist with deep expertise in SQL files, including DDL/DML syntax, dialect-specific differences across PostgreSQL, MySQL, SQLite, and SQL Server, migration frameworks, parameterized query safety, and linting with sqlfluff.
SQL — Database Query Files
Overview
SQL (Structured Query Language) is the standard language for managing and querying relational databases. SQL files (.sql) contain one or more SQL statements used for database schema creation, data manipulation, migrations, stored procedures, and seed data. While SQL itself is standardized (ISO/IEC 9075), every database engine extends it with proprietary syntax, making dialect awareness essential when working with SQL files.
Core Philosophy
SQL files are not data files — they are programs. A .sql file contains instructions (queries, DDL statements, DML operations, stored procedures) that a database engine executes. This distinction matters: the same SQL file can produce completely different results depending on the database state, the database engine (PostgreSQL vs. MySQL vs. SQLite), and the execution context. Treat SQL files with the same care you treat any executable code.
SQL's declarative philosophy — describing what data you want rather than how to get it — makes it one of the most powerful and durable abstractions in computing. The core SQL standard has remained remarkably stable since the 1980s, and a basic SELECT/INSERT/UPDATE/DELETE query looks essentially the same across all major databases. This portability is SQL's greatest strength, though dialect-specific extensions (CTEs, window functions, JSON operators) can reduce portability.
For database schema management, use migration frameworks (Flyway, Alembic, Rails migrations) rather than maintaining monolithic SQL files. For data exchange, prefer structured formats (CSV, Parquet, JSON) over SQL INSERT statements. For version-controlled schema definitions, keep each migration as a separate, numbered SQL file. SQL files are most valuable as executable artifacts in a managed migration pipeline, not as static data containers.
Technical Specifications
Syntax and Structure
SQL files contain statements terminated by semicolons:
-- Single-line comment
/* Multi-line
comment */
-- Schema definition (DDL)
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
is_active BOOLEAN DEFAULT TRUE
);
CREATE INDEX idx_users_email ON users(email);
-- Data manipulation (DML)
INSERT INTO users (email, name) VALUES
('alice@example.com', 'Alice'),
('bob@example.com', 'Bob');
-- Queries
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.is_active = TRUE
GROUP BY u.name
HAVING COUNT(o.id) > 5
ORDER BY order_count DESC
LIMIT 10;
-- Transactions
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- Common Table Expression (CTE)
WITH active_users AS (
SELECT * FROM users WHERE is_active = TRUE
)
SELECT * FROM active_users WHERE created_at > '2025-01-01';
Statement Categories
- DDL (Data Definition):
CREATE,ALTER,DROP,TRUNCATE— define schema. - DML (Data Manipulation):
SELECT,INSERT,UPDATE,DELETE— work with data. - DCL (Data Control):
GRANT,REVOKE— manage permissions. - TCL (Transaction Control):
BEGIN,COMMIT,ROLLBACK,SAVEPOINT.
Key Dialect Differences
| Feature | PostgreSQL | MySQL | SQLite | SQL Server |
|---|---|---|---|---|
| Auto-increment | SERIAL | AUTO_INCREMENT | AUTOINCREMENT | IDENTITY(1,1) |
| String concat | || | CONCAT() | || | + |
| Upsert | ON CONFLICT | ON DUPLICATE KEY | ON CONFLICT | MERGE |
| Boolean | BOOLEAN | TINYINT(1) | INTEGER | BIT |
| LIMIT/OFFSET | LIMIT n OFFSET m | LIMIT m, n | LIMIT n OFFSET m | OFFSET m FETCH NEXT n |
| JSON support | jsonb | JSON | json() | OPENJSON |
How to Work With It
Executing SQL Files
# PostgreSQL
psql -h localhost -U user -d mydb -f schema.sql
# MySQL
mysql -u root -p mydb < schema.sql
# SQLite
sqlite3 mydb.sqlite < schema.sql
# SQL Server
sqlcmd -S localhost -d mydb -i schema.sql
Programmatic Usage
import psycopg2 # PostgreSQL
conn = psycopg2.connect("dbname=mydb user=admin")
cur = conn.cursor()
cur.execute(open("query.sql").read())
results = cur.fetchall()
# Parameterized queries (ALWAYS use these — never string interpolation)
cur.execute("SELECT * FROM users WHERE email = %s", (email,))
// Node.js with pg
import { readFileSync } from 'fs';
import pg from 'pg';
const client = new pg.Client();
await client.connect();
await client.query(readFileSync('schema.sql', 'utf-8'));
// Parameterized:
await client.query('SELECT * FROM users WHERE id = $1', [userId]);
Migration Frameworks
SQL files are commonly organized as migrations:
migrations/
001_create_users.up.sql
001_create_users.down.sql
002_add_orders.up.sql
002_add_orders.down.sql
Tools: Flyway, Liquibase, golang-migrate, Alembic (Python), Knex (JS), ActiveRecord (Ruby).
Validating and Linting
- sqlfluff: SQL linter and formatter —
sqlfluff lint query.sql --dialect postgres - sqlfmt: Opinionated SQL formatter.
- pgFormatter: PostgreSQL-specific formatter.
- sqlcheck: Anti-pattern detector.
- IDE: DataGrip, DBeaver, Azure Data Studio provide real-time validation.
Security
SQL Injection Prevention: Never concatenate user input into SQL strings.
# WRONG — vulnerable to SQL injection
cur.execute(f"SELECT * FROM users WHERE name = '{name}'")
# RIGHT — parameterized query
cur.execute("SELECT * FROM users WHERE name = %s", (name,))
Common Use Cases
- Schema migrations: Version-controlled database schema changes.
- Seed data: Initial or test data population scripts.
- Stored procedures/functions: Database-side business logic.
- Reports and analytics: Complex queries saved for reuse.
- Database backups:
pg_dump,mysqldumpproduce SQL files. - ETL pipelines: Data extraction and transformation queries.
Pros & Cons
Pros
- Universal language for relational databases — decades of industry adoption.
- Declarative — describe what you want, not how to get it.
- Powerful for complex queries (joins, subqueries, CTEs, window functions).
- Plain text — version-controllable, diffable, reviewable.
- Migration frameworks provide orderly schema evolution.
Cons
- Dialect fragmentation — SQL is not truly portable across databases.
- No built-in parameterization in files — injection risk when misused.
- Large migration sets become hard to manage.
- Stored procedures create database-vendor lock-in.
- No module system or import mechanism — each file is standalone.
- Debugging complex queries can be challenging.
Compatibility
| Database | CLI Tool | Popular ORM / Query Builder |
|---|---|---|
| PostgreSQL | psql | SQLAlchemy, Prisma, TypeORM |
| MySQL | mysql | Sequelize, Knex, Hibernate |
| SQLite | sqlite3 | Drizzle, Diesel, GORM |
| SQL Server | sqlcmd | Entity Framework, Dapper |
| Oracle | sqlplus | Hibernate, cx_Oracle |
MIME type: application/sql. File extension: .sql.
Related Formats
- SQLite: Single-file database that executes SQL (binary format).
- Prisma Schema: Declarative schema DSL that generates SQL migrations.
- Liquibase XML/YAML: Database changelog in XML or YAML format.
- GraphQL: Query language for APIs (not databases directly).
- PRQL: Pipelined Relational Query Language — compiles to SQL.
- EdgeQL: Query language for EdgeDB.
Practical Usage
- Always use parameterized queries (prepared statements) when incorporating user input -- never concatenate strings into SQL, as this is the primary vector for SQL injection attacks.
- Use a migration framework (Flyway, Alembic, golang-migrate) to manage schema changes as versioned SQL files in source control rather than applying ad-hoc DDL changes.
- Use
sqlfluff lint --dialect postgres(or your target dialect) to catch syntax errors, style violations, and anti-patterns before executing SQL files. - Write idempotent migration scripts with
IF NOT EXISTSandIF EXISTSguards so they can be safely re-run without errors. - Keep seed data separate from schema migrations -- use distinct files for DDL (schema) and DML (data) to make the execution order explicit.
- Use CTEs (Common Table Expressions) to break complex queries into readable, named steps rather than deeply nested subqueries.
Anti-Patterns
- Concatenating user input directly into SQL strings -- This is the most exploited vulnerability in web applications; always use parameterized queries with placeholders (
$1,%s,?). - Writing SQL that assumes a specific dialect without documenting it -- SQL dialects differ significantly (auto-increment, string concatenation, LIMIT syntax); always document the target database and test portability if multiple databases are supported.
- Using
SELECT *in production queries and migrations -- This makes queries fragile to schema changes; always enumerate the specific columns you need. - Running DDL changes directly in production without a migration framework -- Ad-hoc schema changes are unreproducible, irreversible, and invisible to version control; always use tracked migrations.
- Storing SQL dumps as backups without testing restore -- A SQL dump is only a valid backup if it can be successfully restored; periodically test restoration to a fresh database.
Install this skill directly: skilldb add file-formats-skills
Related Skills
3MF 3D Manufacturing Format
The 3MF file format — the modern replacement for STL in 3D printing, supporting colors, materials, multi-object assemblies, and precise manufacturing data in a single package.
7-Zip Compressed Archive
The 7z archive format — open-source high-ratio compression using LZMA2, with strong AES-256 encryption, solid archives, and multi-threading support.
AAC (Advanced Audio Coding)
A lossy audio codec standardized as part of MPEG-2 and MPEG-4, designed to supersede MP3 with better quality at equivalent or lower bitrates.
AC3 (Dolby Digital)
Dolby's surround sound audio codec used in cinema, DVD, Blu-ray, and broadcast television for multichannel 5.1 audio delivery.
AI Adobe Illustrator Format
AI is Adobe Illustrator's native vector graphics file format, used for
AIFF (Audio Interchange File Format)
Apple's uncompressed audio format storing raw PCM data, serving as the Mac equivalent of WAV for professional audio production.