Python package
The core of this package is the textract.parsers submodule, which is organized by file extension. For example, the .docx parser is located in textract.parsers.docx. Every parser submodule must have a method called extract that performs the default text extraction for that file type.
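The extension-based layout described above can be illustrated with a minimal, self-contained sketch. It does not import textract or reproduce its internals: the package name `demo_parsers`, the helper `_register_demo_parser`, and the `process` function are all hypothetical stand-ins, built in memory so the example runs anywhere. The only convention borrowed from the text is that each per-extension submodule exposes a callable named `extract`.

```python
import importlib
import os
import sys
import types

# Hypothetical stand-in package name; this is NOT textract itself.
PACKAGE = "demo_parsers"


def _register_demo_parser(ext, extract_func):
    """Create an in-memory submodule demo_parsers.<ext> that exposes extract()."""
    if PACKAGE not in sys.modules:
        pkg = types.ModuleType(PACKAGE)
        pkg.__path__ = []  # an empty __path__ marks the module as a package
        sys.modules[PACKAGE] = pkg
    name = f"{PACKAGE}.{ext}"
    module = types.ModuleType(name)
    module.extract = extract_func
    sys.modules[name] = module


def _extract_txt(filename):
    """Default extraction for .txt files: return the file contents."""
    with open(filename, encoding="utf-8") as handle:
        return handle.read()


_register_demo_parser("txt", _extract_txt)


def process(filename):
    """Pick the parser submodule from the file extension and call its extract()."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    if not ext:
        raise ValueError(f"cannot infer a parser for {filename!r}: no extension")
    try:
        parser = importlib.import_module(f"{PACKAGE}.{ext}")
    except ImportError:
        raise ValueError(f"no parser available for extension {ext!r}") from None
    return parser.extract(filename)
```

Under this assumed layout, supporting a new file type amounts to adding one more submodule named after the extension and giving it an `extract` callable; the dispatch code itself does not change.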