Sparkflows provides connectors for reading unstructured file formats, such as binary files and PDFs, and for processing images with OCR. #input #binary
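
As a rough illustration of what reading unstructured files as binary involves, the sketch below loads each file in a directory as raw bytes and flags PDFs by their `%PDF-` header. It is plain Python and not the Sparkflows connector itself. The names `BinaryRecord` and `read_binary_files` are hypothetical, and the record layout (path, length, content) is an assumption loosely modeled on Apache Spark's `binaryFile` data source.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BinaryRecord:
    """One file loaded as raw bytes (hypothetical layout, for illustration only)."""
    path: str
    length: int
    content: bytes
    is_pdf: bool


def read_binary_files(directory, pattern="*"):
    """Read every file in `directory` matching `pattern` as raw bytes.

    Files are returned in sorted path order. A file is flagged as a PDF
    when its content starts with the PDF header bytes `%PDF-`.
    """
    records = []
    for file_path in sorted(Path(directory).glob(pattern)):
        if not file_path.is_file():
            continue  # skip sub-directories and other non-file entries
        content = file_path.read_bytes()
        records.append(
            BinaryRecord(
                path=str(file_path),
                length=len(content),
                content=content,
                is_pdf=content.startswith(b"%PDF-"),
            )
        )
    return records
```

Calling `read_binary_files("/data/docs", "*.pdf")` (the path is a placeholder) would return one record per matching file. The raw `content` bytes could then be passed to a PDF parser or an OCR engine in a later step.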