OCRopus is a state-of-the-art document analysis and OCR system featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multilingual capabilities. The system is being developed with generous support from Google and other organizations; the primary developers are at the IUPR Research Group at the DFKI Research Center.