# ANTLR Oracle PL/SQL Grammar

## Overview

The Oracle ANTLR parser is generated from grammar files that live in a fork of the upstream ANTLR grammars repository. The copies in this directory are downloaded from that fork via Maven; they are not maintained directly in KOS.

- **Upstream**: https://github.com/antlr/grammars-v4 (sql/plsql/)
- **Our Fork**: https://github.com/datadotworld/grammars-v4-oracle
- **Production Branch**: `oracle-kos-modifications`
- **License**: MIT License

---

## IntelliJ IDEA Setup for Large Generated Files

ANTLR-generated parser files (e.g., `PlSqlParser.java`) can exceed IntelliJ's default file size limits, causing indexing issues and disabling code completion. To fix this:

### Steps

1. **Open Custom Properties**
   - Go to `Help` → `Edit Custom Properties...`
   - If prompted, create the file

2. **Add These Properties**
   ```properties
   # Maximum file size (in KB) for which code insight (completion, inspections) stays enabled
   idea.max.intellisense.filesize=10000

   # Maximum file size (in KB) that IntelliJ will load and parse
   idea.max.content.load.filesize=20000
   ```

3. **Restart IntelliJ IDEA**
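
To see whether a generated file is actually larger than a configured limit, its size can be compared against a threshold in KB. The helper below is a minimal sketch: the function name `exceeds_kb` is hypothetical, and the path in the usage comment is a placeholder for wherever your build writes `PlSqlParser.java`.

```shell
# Succeeds (exit status 0) when FILE is larger than LIMIT_KB kilobytes.
exceeds_kb() {
  local file="$1" limit_kb="$2"
  local bytes
  # Arithmetic expansion strips any padding that wc adds on some platforms.
  bytes=$(( $(wc -c < "$file") ))
  [ "$bytes" -gt $(( limit_kb * 1024 )) ]
}

# Usage (placeholder path):
#   exceeds_kb path/to/PlSqlParser.java 10000 && echo "above idea.max.intellisense.filesize"
```

If the check still succeeds against the values you configured, raise the corresponding property further.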

---

## Pulling Grammar Files

### Using Claude Code (Recommended)

If using Claude Code, run the slash command from the Claude Code prompt (it is not a shell command):

```text
/pull-antlr-grammars oracle
```

This command will:
- Download grammar files from the fork
- Verify the download succeeded
- Check for KOS modifications
- Compile the grammar with ANTLR
- Run parser tests to verify everything works
- Report the resulting git diff, if there is one

### Manual Maven Command

Alternatively, download grammar files manually (run from project root):

```bash
mvn initialize -P download-antlr-grammars -pl kos-collectors-oracle
```

The `download-antlr-grammars` profile in `pom.xml` specifies the source URLs.

**After manual download**, you should:
1. Regenerate parser: `mvn clean compile -pl kos-collectors-oracle`
2. Reload Maven projects in IntelliJ (if using IntelliJ)
3. Run tests: `mvn test -pl kos-collectors-oracle -Dtest=OracleAntlrParserTest`
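
The download, regenerate, and test commands above can be chained so that a failure stops the sequence. This is a sketch rather than a project script: the function names and the `DRY_RUN` switch are hypothetical, and the IntelliJ reload in step 2 remains a manual action.

```shell
MODULE="kos-collectors-oracle"

# Print the command instead of executing it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Download grammars, regenerate the parser, then run the parser tests.
# Each step runs only if the previous one succeeded. Run from the project root.
refresh_oracle_grammar() {
  run mvn initialize -P download-antlr-grammars -pl "$MODULE" &&
  run mvn clean compile -pl "$MODULE" &&
  run mvn test -pl "$MODULE" -Dtest=OracleAntlrParserTest
}

# Usage:
#   DRY_RUN=1 refresh_oracle_grammar   # show the three commands
#   refresh_oracle_grammar             # execute them
```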

---

## Modifying the Grammar

### Process Overview

1. **Make changes** in the fork repository (`grammars-v4-oracle`) on the `oracle-kos-modifications` branch
2. **Document changes** in `sql/plsql/KOS_MODIFICATIONS.md`
3. **Commit and push** to the fork
4. **Pull into KOS** using the Maven command above
5. **Regenerate parser** with `mvn clean install -pl kos-collectors-oracle`
6. **Test** with `mvn test -pl kos-collectors-oracle -Dtest=OracleAntlrParserTest`

### Modification Workflow Options

**Option A - Edit in Fork:**
Edit grammar files directly in fork → commit/push → pull to KOS → test

**Option B - Edit in KOS First:**
Edit local grammar files → test → copy to fork → commit/push → confirm via download

Both approaches work. Option B allows faster iteration during development because each change can be tested locally without a push-and-download round trip.
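
For Option B, the "copy to fork" step might look like the sketch below. It assumes a local checkout of the fork with the grammar under `sql/plsql`; the function name and both directory arguments are hypothetical, so substitute your own paths.

```shell
# Copy every .g4 file from the local grammar directory into the fork checkout.
# Fails if the fork does not contain sql/plsql, rather than creating it silently.
copy_grammar_to_fork() {
  local src_dir="$1" fork_dir="$2"
  local dest="$fork_dir/sql/plsql"
  if [ ! -d "$dest" ]; then
    echo "not a fork checkout (missing $dest)" >&2
    return 1
  fi
  cp "$src_dir"/*.g4 "$dest"/
}

# Usage (placeholder paths):
#   copy_grammar_to_fork path/to/local/grammar ~/src/grammars-v4-oracle
#   git -C ~/src/grammars-v4-oracle status --short -- sql/plsql
```

Reviewing the `git status` output before committing helps confirm that only the intended grammar files changed.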

### Documentation

All grammar changes should be documented in the fork's `sql/plsql/KOS_MODIFICATIONS.md` file with:
- Date and description
- File and line numbers
- Reason for change
- Whether it's upstream-compatible
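
An entry covering these fields could be generated with a small helper like the one below. The layout is illustrative only, not the format `KOS_MODIFICATIONS.md` necessarily uses, and the function name and example values are hypothetical; match the existing entries in the file.

```shell
# Print a Markdown entry skeleton with date, description, location, reason,
# and upstream compatibility. The date is filled in from the system clock.
kos_mod_entry() {
  local description="$1" location="$2" reason="$3" upstream_compatible="$4"
  cat <<EOF
### $(date +%Y-%m-%d): $description

- **Location**: $location
- **Reason**: $reason
- **Upstream-compatible**: $upstream_compatible
EOF
}

# Usage (placeholder values), run inside the fork checkout:
#   kos_mod_entry "Short description" "PlSqlParser.g4, lines N-M" "Why it was needed" "yes" \
#     >> sql/plsql/KOS_MODIFICATIONS.md
```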

---

## Fork Branch Strategy

- **`main` branch**: Synced with upstream, minimal changes
- **`oracle-kos-modifications` branch**: Production branch with all KOS-specific changes

This separation allows easy upstream syncing while maintaining KOS customizations.
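
One way to sync with upstream under this strategy is sketched below. It rests on several assumptions: the upstream default branch is taken to be `master`, merging (rather than rebasing) is an arbitrary choice, and the `run` helper, `DRY_RUN` switch, and function name are hypothetical. Pushing is left out on purpose so the result can be reviewed first.

```shell
UPSTREAM_URL="https://github.com/antlr/grammars-v4.git"
UPSTREAM_BRANCH="master"   # assumption: verify upstream's default branch

# Print the command instead of executing it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Bring main up to date with upstream, then merge main into the production branch.
# Run inside a checkout of the fork.
sync_fork_with_upstream() {
  # Register the upstream remote; ignore the error if it already exists.
  run git remote add upstream "$UPSTREAM_URL" 2>/dev/null || true
  run git fetch upstream &&
  run git checkout main &&
  run git merge "upstream/$UPSTREAM_BRANCH" &&
  run git checkout oracle-kos-modifications &&
  run git merge main
}

# Usage:
#   DRY_RUN=1 sync_fork_with_upstream   # preview the git commands
```

After a real run, resolve any merge conflicts in the grammar files and rerun the parser tests in KOS before pushing.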

---

## Current KOS Modifications

As of 2026-02-04, the Oracle grammar has 4 KOS-specific modifications:

1. **Parenthesized SELECT statements** - Allow recursive parenthesization for nested queries
2. **CTE uses `select_only_statement`** - Enables WITH clauses inside CTEs
3. **Move `ASTERISK` to `select_list_elements`** - Simplifies `SELECT *` handling
4. **3-part qualified table names** - Supports `database.schema.table` notation

See `sql/plsql/KOS_MODIFICATIONS.md` in the fork for detailed documentation.

---

## License

MIT License

Permits commercial use, modification, and distribution. Requires preservation of copyright notice and license text in grammar files.
