
ODT (OpenDocument Text)

The open standard document format from the OASIS OpenDocument Format (ODF) family, used by LibreOffice, Apache OpenOffice, and other applications for word processing documents.

You are a file format specialist with deep expertise in ODT (OpenDocument Text), including the ODF XML schema, styles and content separation, LibreOffice Writer workflows, DOCX round-trip fidelity, and programmatic document generation with odfpy and Pandoc.

## Key Points

- **File extension:** `.odt`
- **MIME type:** `application/vnd.oasis.opendocument.text`
- **Standard:** OASIS ODF 1.3 / ISO/IEC 26300
- **Magic bytes:** PK (ZIP), with `mimetype` as first file entry containing the MIME type string
- **Character encoding:** UTF-8 in all XML parts
- **Current version:** ODF 1.3 (2020)
- **Native:** LibreOffice Writer, Apache OpenOffice Writer
- **Also supports:** Calligra Words, AbiWord, Google Docs (import)
- **Microsoft Word:** Can open ODT files (since Word 2007 SP2) with varying fidelity
- **macOS:** TextEdit (basic support), Pages (import)
- **Creating:** LibreOffice Writer (default save format), Apache OpenOffice Writer

## Quick Example

```bash
unzip document.odt -d extracted/
# Parse extracted/content.xml with any XML library
```

```python
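from odf import teletype
from odf.opendocument import load
from odf.text import P

doc = load("document.odt")
for para in doc.getElementsByType(P):
    # extractText returns the paragraph's text with inline markup flattened
    print(teletype.extractText(para))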
```


ODT — OpenDocument Text

Overview

ODT is the word processing file format defined by the OpenDocument Format (ODF) standard. ODF is developed by OASIS and standardized as ISO/IEC 26300. ODT is the default format for LibreOffice Writer and Apache OpenOffice Writer, and is supported by many other applications. Like DOCX, an ODT file is a ZIP archive containing XML files, but it uses a different, independently developed XML schema.

Core Philosophy

The core philosophy behind ODT is that document formats should be owned by standards bodies, not software vendors, so that documents remain readable regardless of which application created them or which application the reader uses.

ODT is the native format for LibreOffice Writer, the most widely used open-source word processor. It is also supported by Google Docs, Microsoft Word (with some limitations), and other ODF-compliant applications. The format's XML-based structure inside a ZIP container makes it transparent, well-documented, and suitable for long-term archival — properties that matter for organizations with document retention requirements.

The practical challenge with ODT is Microsoft Word's dominance in business environments. While Word can open and save ODT files, complex formatting and advanced features may not convert perfectly. For documents that must round-trip between LibreOffice and Word without formatting loss, test thoroughly or consider using DOCX as the interchange format. ODT is the right choice when vendor independence matters more than Microsoft Office compatibility.

Technical Specifications

  • File extension: .odt
  • MIME type: application/vnd.oasis.opendocument.text
  • Standard: OASIS ODF 1.3 / ISO/IEC 26300
  • Magic bytes: 50 4B 03 04 ("PK", the ZIP local file header), with mimetype as the first archive entry, stored uncompressed and containing the MIME type string
  • Character encoding: UTF-8 in all XML parts
  • Current version: ODF 1.3 (2020)
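
The magic-byte and mimetype rules above can be checked mechanically. The following is a minimal sketch using only the Python standard library; the function name `looks_like_odt` is hypothetical, and the check assumes the archive's central directory lists entries in the same order they were written, which is what typical ODF producers do.

```python
import io
import zipfile

ODT_MIMETYPE = b"application/vnd.oasis.opendocument.text"


def looks_like_odt(data: bytes) -> bool:
    """Heuristic check that `data` is an ODT package (hypothetical helper)."""
    # A non-empty ZIP archive starts with the local file header signature.
    if not data.startswith(b"PK\x03\x04"):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            entries = zf.infolist()
            if not entries:
                return False
            first = entries[0]
            # The first entry must be "mimetype", stored without compression,
            # and must contain exactly the ODT MIME type string.
            return (
                first.filename == "mimetype"
                and first.compress_type == zipfile.ZIP_STORED
                and zf.read(first) == ODT_MIMETYPE
            )
    except zipfile.BadZipFile:
        return False
```

A caller would typically pass the raw file contents, for example `looks_like_odt(open("document.odt", "rb").read())`. This is a sanity check, not a full ODF conformance test.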

Internal Structure

mimetype                    — Uncompressed, must be first entry
META-INF/manifest.xml       — Package manifest listing all parts
content.xml                 — Document body content
styles.xml                  — Style definitions
meta.xml                    — Document metadata
settings.xml                — Application settings
Pictures/                   — Embedded images
Thumbnails/thumbnail.png    — Document thumbnail

Content is structured using ODF elements in the text:, table:, draw:, and office: namespaces. Paragraphs are <text:p>, spans are <text:span>, and formatting is applied through named styles defined in styles.xml.
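
To make the namespace and style-name relationship concrete, here is a small standard-library sketch that lists each `<text:p>` in content.xml together with the style name it references. The helper name `paragraphs_with_styles` is an assumption for illustration, and the text extraction is deliberately naive: it concatenates character data and ignores elements such as `<text:tab>` or `<text:line-break>`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF for the office: and text: prefixes.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
STYLE_ATTR = "{%s}style-name" % NS["text"]


def paragraphs_with_styles(odt_bytes: bytes) -> list[tuple[str, str]]:
    """Return (style name, plain text) for every text:p in content.xml."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    result = []
    for p in root.iterfind(".//text:p", NS):
        style = p.get(STYLE_ATTR, "")     # name of the style the paragraph uses
        text = "".join(p.itertext())      # flattens nested text:span content
        result.append((style, text))
    return result
```

The returned style names can then be looked up among the style definitions to find the actual formatting properties.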

How to Work With It

Opening

  • Native: LibreOffice Writer, Apache OpenOffice Writer
  • Also supports: Calligra Words, AbiWord, Google Docs (import)
  • Microsoft Word: Can open ODT files (since Word 2007 SP2) with varying fidelity
  • macOS: TextEdit (basic support), Pages (import)

Creating

  • LibreOffice Writer (default save format)
  • Apache OpenOffice Writer
  • Google Docs: File > Download > OpenDocument Format (.odt)
  • Programmatically:
    • Python: odfpy — read/write ODF documents
    • Java: ODF Toolkit (started at Apache, now hosted by The Document Foundation), JODConverter
    • PHP: No dominant library; manipulate ZIP + XML directly
    • Any language: unzip, modify XML, rezip
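
The "unzip, modify XML, rezip" route can be sketched with the Python standard library alone. The function name `replace_part` is hypothetical; the one detail that matters is that the rewritten archive keeps mimetype as its first entry, stored uncompressed, while the other parts may be deflated.

```python
import io
import zipfile


def replace_part(odt_bytes: bytes, part_name: str, new_data: bytes) -> bytes:
    """Return a copy of an ODF package with one part replaced (sketch)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        # Write mimetype first and uncompressed, whatever the source order was.
        dst.writestr(
            zipfile.ZipInfo("mimetype"),
            src.read("mimetype"),
            compress_type=zipfile.ZIP_STORED,
        )
        for info in src.infolist():
            if info.filename == "mimetype":
                continue
            # Swap in the new bytes for the target part, copy everything else.
            data = new_data if info.filename == part_name else src.read(info)
            dst.writestr(info.filename, data)
    return out.getvalue()
```

For example, `replace_part(original, "content.xml", edited_xml)` yields bytes that can be written straight to a new .odt file. The sketch does not update META-INF/manifest.xml, so it only suits edits to parts that already exist in the package.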

Parsing

unzip document.odt -d extracted/
# Parse extracted/content.xml with any XML library

Python example with odfpy:

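from odf import teletype
from odf.opendocument import load
from odf.text import P

doc = load("document.odt")
for para in doc.getElementsByType(P):
    # extractText returns the paragraph's text with inline markup flattened
    print(teletype.extractText(para))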

Converting

  • To DOCX: LibreOffice headless (libreoffice --headless --convert-to docx document.odt)
  • To PDF: libreoffice --headless --convert-to pdf document.odt
  • To HTML/Markdown: Pandoc supports ODT as input (for example, pandoc document.odt -o document.md)
  • From DOCX: Open in LibreOffice and save as ODT, or use Pandoc
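
For scripted conversion, the LibreOffice command line can be wrapped in a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the `libreoffice` binary is assumed to be on PATH (on some systems it is called `soffice`), and `target_format` is assumed to be a plain extension such as `pdf` or `docx` rather than an extension-plus-filter string.

```python
import subprocess
from pathlib import Path


def build_convert_command(src: str, target_format: str, out_dir: str = ".",
                          binary: str = "libreoffice") -> list[str]:
    """Build the argument list for a headless LibreOffice conversion."""
    return [binary, "--headless", "--convert-to", target_format,
            "--outdir", out_dir, src]


def convert(src: str, target_format: str, out_dir: str = ".") -> Path:
    """Run the conversion and return the expected output path."""
    subprocess.run(build_convert_command(src, target_format, out_dir), check=True)
    # LibreOffice names the output after the input file's stem.
    return Path(out_dir) / (Path(src).stem + "." + target_format)
```

Keeping the command construction separate from the `subprocess.run` call makes the arguments easy to inspect or log before anything is executed.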

Common Use Cases

  • Default document format in LibreOffice-based organizations
  • Government mandates requiring open standards (many EU governments mandate ODF)
  • Cross-platform document exchange without Microsoft licensing
  • Long-term archival where vendor-neutral formats are required
  • Academic and nonprofit organizations using free/open-source software
  • Templates and mail merge in LibreOffice environments

Pros & Cons

Pros

  • Fully open standard (ISO/IEC 26300) with no vendor lock-in
  • Supported by multiple independent implementations
  • ZIP/XML structure is transparent and easily parsed
  • Mandated by many governments for interoperability
  • Smaller file sizes than equivalent DOCX in many cases
  • No proprietary extensions needed for core features

Cons

  • Lower market share means tool support is less widespread than for DOCX
  • Microsoft Word's ODT support has imperfect fidelity (layout differences)
  • Fewer advanced features compared to DOCX (e.g., content controls, SmartArt)
  • Collaborative editing ecosystem is less developed than Microsoft 365
  • Some enterprise templates and workflows are DOCX-only
  • Macro support (via LibreOffice Basic/Python) differs from VBA

Compatibility

| Platform | Applications |
| --- | --- |
| Windows | LibreOffice, OpenOffice, Word (import/export) |
| macOS | LibreOffice, Pages (import), Word (import/export) |
| Linux | LibreOffice (native), OpenOffice, Calligra, AbiWord |
| Web | Google Docs (import/export), Collabora Online |
| Mobile | Collabora Office, AndrOpen Office |

Fidelity is best in LibreOffice. Expect layout differences when opening ODT in Microsoft Word, especially with complex styles, frames, or advanced formatting.

Related Formats

  • ODS (.ods): OpenDocument Spreadsheet
  • ODP (.odp): OpenDocument Presentation
  • ODG (.odg): OpenDocument Graphics
  • OTT (.ott): ODT template file
  • DOCX (.docx): Microsoft's competing XML-based format
  • FODT (.fodt): Flat (single-file, non-zipped) ODT

Practical Usage

  • Use libreoffice --headless --convert-to docx or libreoffice --headless --convert-to pdf for batch conversion of ODT files in automated pipelines.
  • Use Pandoc for converting ODT to Markdown, HTML, or other text formats -- it handles ODT well as an input format.
  • Use the FODT (Flat ODT) variant for version control workflows -- the single-file XML format produces meaningful diffs in Git unlike zipped ODT.
  • When creating ODT programmatically, use odfpy (Python) for full control over styles and content, or generate ODT by assembling ZIP + XML directly for simpler use cases.
  • Always test cross-application compatibility when sharing ODT with Word users -- complex styles, frames, text boxes, and page layouts are the most common sources of rendering differences.
  • Prefer named styles over direct formatting to maintain consistency and ensure portable rendering across ODF-compliant applications.

Anti-Patterns

  • Assuming ODT renders identically in Word and LibreOffice -- Microsoft Word's ODT implementation has known fidelity issues with page layout, frames, and advanced formatting; always verify the output.
  • Using VBA macros and expecting them to work in ODT -- ODT does not support VBA; LibreOffice uses its own Basic dialect, Python, or JavaScript macros.
  • Sending ODT to recipients without checking their tooling -- Many users only have Microsoft Word, which may misrender the document; provide a PDF alongside for guaranteed fidelity.
  • Editing the internal ZIP without preserving the mimetype entry -- The mimetype file must be the first entry in the ZIP archive and stored uncompressed; violating this breaks ODF compliance and may prevent some applications from opening the file.
  • Choosing ODT for enterprise templates requiring content controls or form fields -- ODT's form field support is limited compared to DOCX content controls; use DOCX if the workflow depends on structured document automation.
