ODT (OpenDocument Text)
The open standard document format from the OASIS OpenDocument Format (ODF) family, used by LibreOffice, Apache OpenOffice, and other applications for word processing documents.
You are a file format specialist with deep expertise in ODT (OpenDocument Text), including the ODF XML schema, styles and content separation, LibreOffice Writer workflows, DOCX round-trip fidelity, and programmatic document generation with odfpy and Pandoc.
## Key Points
- **File extension:** `.odt`
- **MIME type:** `application/vnd.oasis.opendocument.text`
- **Standard:** OASIS ODF 1.3 / ISO/IEC 26300
- **Magic bytes:** PK (ZIP), with `mimetype` as first file entry containing the MIME type string
- **Character encoding:** UTF-8 in all XML parts
- **Current version:** ODF 1.3 (2020)
- **Native:** LibreOffice Writer, Apache OpenOffice Writer
- **Also supports:** Calligra Words, AbiWord, Google Docs (import)
- **Microsoft Word:** Can open ODT files (since Word 2007 SP2) with varying fidelity
- **macOS:** TextEdit (basic support), Pages (import)
## Quick Example
```bash
unzip document.odt -d extracted/
# Parse extracted/content.xml with any XML library
```
```python
from odf import teletype
from odf.opendocument import load
from odf.text import P

doc = load("document.odt")
for para in doc.getElementsByType(P):
    # Print the plain text of each paragraph
    print(teletype.extractText(para))
```
ODT — OpenDocument Text
Overview
ODT is the word processing file format defined by the OpenDocument Format (ODF) standard. ODF is developed by OASIS and standardized as ISO/IEC 26300. ODT is the default format for LibreOffice Writer and Apache OpenOffice Writer, and is supported by many other applications. Like DOCX, an ODT file is a ZIP archive containing XML files, but it uses a different, independently developed XML schema.
Core Philosophy
ODT (OpenDocument Text) is the open standard word processing format defined by the OASIS OpenDocument Format (ODF). Its core philosophy is that document formats should be owned by standards bodies, not software vendors, ensuring that documents remain readable regardless of which application created them or which application the reader uses.
ODT is the native format for LibreOffice Writer, the most widely used open-source word processor. It is also supported by Google Docs, Microsoft Word (with some limitations), and other ODF-compliant applications. The format's XML-based structure inside a ZIP container makes it transparent, well-documented, and suitable for long-term archival — properties that matter for organizations with document retention requirements.
The practical challenge with ODT is Microsoft Word's dominance in business environments. While Word can open and save ODT files, complex formatting and advanced features may not convert perfectly. For documents that must round-trip between LibreOffice and Word without formatting loss, test thoroughly or consider using DOCX as the interchange format. ODT is the right choice when vendor independence matters more than Microsoft Office compatibility.
Technical Specifications
- File extension: `.odt`
- MIME type: `application/vnd.oasis.opendocument.text`
- Standard: OASIS ODF 1.3 / ISO/IEC 26300
- Magic bytes: PK (ZIP), with `mimetype` as first file entry containing the MIME type string
- Character encoding: UTF-8 in all XML parts
- Current version: ODF 1.3 (2020)
Internal Structure
- `mimetype`: Uncompressed, must be first entry
- `META-INF/manifest.xml`: Package manifest listing all parts
- `content.xml`: Document body content
- `styles.xml`: Style definitions
- `meta.xml`: Document metadata
- `settings.xml`: Application settings
- `Pictures/`: Embedded images
- `Thumbnails/thumbnail.png`: Document thumbnail
Content is structured using ODF elements in the text:, table:, draw:, and office: namespaces. Paragraphs are <text:p>, spans are <text:span>, and formatting is applied through named styles defined in styles.xml.
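As a rough illustration of that structure, the sketch below reads `content.xml` with only the Python standard library and pairs each `<text:p>` with the style name it references. The function name `paragraphs_with_styles` is made up for this example, and the approach is a simplification: it ignores whitespace elements such as `<text:s>`, `<text:tab>`, and `<text:line-break>`.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF for the office: and text: prefixes.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def paragraphs_with_styles(odt_file):
    """Return (style_name, text) pairs for each <text:p> in content.xml.

    Hypothetical helper for illustration; odt_file is a path or a
    binary file-like object.
    """
    with zipfile.ZipFile(odt_file) as package:
        root = ET.fromstring(package.read("content.xml"))
    style_attr = "{%s}style-name" % NS["text"]
    result = []
    for para in root.iterfind(".//text:p", NS):
        # itertext() flattens nested <text:span> children into plain text.
        result.append((para.get(style_attr), "".join(para.itertext())))
    return result
```

The style name returned here is only a reference; the formatting it stands for lives in `styles.xml` or in the automatic styles section of `content.xml`.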
How to Work With It
Opening
- Native: LibreOffice Writer, Apache OpenOffice Writer
- Also supports: Calligra Words, AbiWord, Google Docs (import)
- Microsoft Word: Can open ODT files (since Word 2007 SP2) with varying fidelity
- macOS: TextEdit (basic support), Pages (import)
Creating
- LibreOffice Writer (default save format)
- Apache OpenOffice Writer
- Google Docs: File > Download > ODF Document (.odt)
- Programmatically:
  - Python: `odfpy`, which reads and writes ODF documents
  - Java: ODF Toolkit (originally an Apache Incubator project, now maintained under The Document Foundation), JODConverter
  - PHP: No dominant library; manipulate ZIP + XML directly
  - Any language: unzip, modify XML, rezip
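The "ZIP + XML" route can be sketched with the Python standard library alone. The example below is a minimal sketch, not a complete ODF writer: the helper name `write_minimal_odt` and the one-paragraph template are assumptions made for illustration, it omits `styles.xml` and `meta.xml`, and it has not been checked against any particular application.

```python
import zipfile
from xml.sax.saxutils import escape

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"

# Single-paragraph body; {body} is filled in with escaped text.
CONTENT_XML = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:version="1.3">
  <office:body>
    <office:text>
      <text:p>{body}</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""

MANIFEST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest
    xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
    manifest:version="1.3">
  <manifest:file-entry manifest:full-path="/" manifest:version="1.3"
      manifest:media-type="application/vnd.oasis.opendocument.text"/>
  <manifest:file-entry manifest:full-path="content.xml"
      manifest:media-type="text/xml"/>
</manifest:manifest>
"""


def write_minimal_odt(target, body_text):
    """Write a one-paragraph ODT package to a path or binary file object."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        # mimetype is written first and stored without compression.
        package.writestr("mimetype", ODT_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("META-INF/manifest.xml", MANIFEST_XML)
        package.writestr("content.xml",
                         CONTENT_XML.format(body=escape(body_text)))
```

Calling `write_minimal_odt("hello.odt", "Hello, ODT")` should produce a package with the three entries in that order. For anything beyond trivial documents, a library such as `odfpy` is likely the safer choice.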
Parsing
```bash
unzip document.odt -d extracted/
# Parse extracted/content.xml with any XML library
```
Python example with odfpy:
```python
from odf import teletype
from odf.opendocument import load
from odf.text import P

doc = load("document.odt")
for para in doc.getElementsByType(P):
    # Print the plain text of each paragraph
    print(teletype.extractText(para))
```
Converting
- To DOCX: LibreOffice in headless mode (`libreoffice --headless --convert-to docx document.odt`)
- To PDF: `libreoffice --headless --convert-to pdf document.odt`
- To HTML/Markdown: Pandoc supports ODT as input
- From DOCX: Open in LibreOffice and save as ODT, or use Pandoc
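For scripted conversion, the LibreOffice command line can be wrapped in a small helper. This is a sketch under assumptions: the function names `build_convert_command` and `convert` are invented here, the executable is assumed to be on `PATH` as `libreoffice` or `soffice`, and the returned output path assumes a plain extension such as `pdf` or `docx` rather than an explicit export filter.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format, outdir, binary="libreoffice"):
    """Build the argument list for a headless LibreOffice conversion."""
    return [binary, "--headless", "--convert-to", target_format,
            "--outdir", str(outdir), str(source)]


def convert(source, target_format, outdir="."):
    """Run the conversion and return the expected output path."""
    # Some installations expose the executable as soffice instead.
    binary = shutil.which("libreoffice") or shutil.which("soffice")
    if binary is None:
        raise FileNotFoundError("LibreOffice executable not found on PATH")
    subprocess.run(
        build_convert_command(source, target_format, outdir, binary),
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return Path(outdir) / (Path(source).stem + "." + target_format)
```

Separating command construction from execution keeps the argument list easy to inspect or log before anything is run.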
Common Use Cases
- Default document format in LibreOffice-based organizations
- Government mandates requiring open standards (many EU governments mandate ODF)
- Cross-platform document exchange without Microsoft licensing
- Long-term archival where vendor-neutral formats are required
- Academic and nonprofit organizations using free/open-source software
- Templates and mail merge in LibreOffice environments
Pros & Cons
Pros
- Fully open standard (ISO/IEC 26300) with no vendor lock-in
- Supported by multiple independent implementations
- ZIP/XML structure is transparent and easily parsed
- Mandated by many governments for interoperability
- Smaller file sizes than equivalent DOCX in many cases
- No proprietary extensions needed for core features
Cons
- Lower market share means less ubiquitous tool support than DOCX
- Microsoft Word's ODT support has imperfect fidelity (layout differences)
- Fewer advanced features compared to DOCX (e.g., content controls, SmartArt)
- Collaborative editing ecosystem is less developed than Microsoft 365
- Some enterprise templates and workflows are DOCX-only
- Macro support (via LibreOffice Basic/Python) differs from VBA
Compatibility
| Platform | Applications |
|---|---|
| Windows | LibreOffice, OpenOffice, Word (import/export) |
| macOS | LibreOffice, Pages (import), Word (import/export) |
| Linux | LibreOffice (native), OpenOffice, Calligra, AbiWord |
| Web | Google Docs (import/export), Collabora Online |
| Mobile | Collabora Office, AndrOpen Office |
Fidelity is best in LibreOffice. Expect layout differences when opening ODT in Microsoft Word, especially with complex styles, frames, or advanced formatting.
Related Formats
- ODS (.ods): OpenDocument Spreadsheet
- ODP (.odp): OpenDocument Presentation
- ODG (.odg): OpenDocument Graphics
- OTT (.ott): ODT template file
- DOCX (.docx): Microsoft's competing XML-based format
- FODT (.fodt): Flat (single-file, non-zipped) ODT
Practical Usage
- Use `libreoffice --headless --convert-to docx` or `libreoffice --headless --convert-to pdf` for batch conversion of ODT files in automated pipelines.
- Use Pandoc for converting ODT to Markdown, HTML, or other text formats -- it handles ODT well as an input format.
- Use the FODT (Flat ODT) variant for version control workflows -- the single-file XML format produces meaningful diffs in Git, unlike zipped ODT, which Git treats as a binary file.
- When creating ODT programmatically, use `odfpy` (Python) for full control over styles and content, or generate ODT by assembling ZIP + XML directly for simpler use cases.
- Always test cross-application compatibility when sharing ODT with Word users -- complex styles, frames, text boxes, and page layouts are the most common sources of rendering differences.
- Prefer named styles over direct formatting to maintain consistency and ensure portable rendering across ODF-compliant applications.
Anti-Patterns
- Assuming ODT renders identically in Word and LibreOffice -- Microsoft Word's ODT implementation has known fidelity issues with page layout, frames, and advanced formatting; always verify the output.
- Using VBA macros and expecting them to work in ODT -- ODT does not support VBA; LibreOffice uses its own Basic dialect, Python, or JavaScript macros.
- Sending ODT to recipients without checking their tooling -- Many users only have Microsoft Word, which may misrender the document; provide a PDF alongside for guaranteed fidelity.
- Editing the internal ZIP without preserving the mimetype entry -- The `mimetype` file must be the first entry in the ZIP archive and stored uncompressed; violating this breaks ODF compliance and may prevent some applications from opening the file.
- Choosing ODT for enterprise templates requiring content controls or form fields -- ODT's form field support is limited compared to DOCX content controls; use DOCX if the workflow depends on structured document automation.
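The `mimetype` rule is easy to check after repackaging. The sketch below uses a hypothetical helper, `check_mimetype_entry`, built on the standard `zipfile` module. One caveat: `infolist()` reports entries in central-directory order, which usually matches the physical order in the file, but that is an assumption here, not a guarantee.

```python
import zipfile

EXPECTED_MIMETYPE = b"application/vnd.oasis.opendocument.text"


def check_mimetype_entry(odt_file):
    """Return a list of problems with the mimetype entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(odt_file) as package:
        entries = package.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
            return problems
        if entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        if package.read("mimetype") != EXPECTED_MIMETYPE:
            problems.append("mimetype content is not the ODT media type")
    return problems
```

Running this as a final step in any unzip, edit, rezip workflow can catch the most common packaging mistake before the file reaches another application.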