I have created a custom processor using the Google AutoML entity extractor and trained it on a few PDFs. The PDFs contain photo identity cards. I was able to test it in their UI, and it extracted the entities properly. Now I'm using their Java client library to do the same from code, based on their sample.
In the sample, they pass text content into the library; instead, I want to send the PDF content. I don't want to use a Google Cloud Storage bucket; I want to load the file locally and send it to the entity extractor. I tried using the
Document class as below:
Document.parseDelimitedFrom(FileInputStream("test.pdf")) but it gives me an error.
Any help is highly appreciated.
Document.parseDelimitedFrom(FileInputStream("test.pdf")) throws an error because the
parseDelimitedFrom() method expects a stream containing a serialized, length-delimited protobuf message, not the raw bytes of a local PDF file. That said, there is currently no provision to send local files for prediction, as seen in this REST API documentation. The
DocumentInputConfig parameter supports only a GCS source.
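To illustrate why the parse fails, here is a minimal sketch that uses only the standard library (no protobuf dependency). As I understand it, parseDelimitedFrom() first reads a base-128 varint as the message length and then tries to decode that many bytes as protobuf fields. The class name DelimitedPrefixDemo and the helper readVarint are hypothetical names for this illustration, not part of any Google library, and the header bytes are a typical PDF signature rather than the contents of your actual file.

```java
import java.nio.charset.StandardCharsets;

public class DelimitedPrefixDemo {

    // Decodes a base-128 varint from the start of data. This is the same
    // encoding protobuf uses for the length prefix of a delimited message.
    // Hypothetical helper written for this illustration only.
    static long readVarint(byte[] data) {
        long value = 0;
        int shift = 0;
        for (byte b : data) {
            // The low 7 bits carry the payload; the high bit means "more bytes follow".
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
            if (shift >= 64) {
                break;
            }
        }
        throw new IllegalArgumentException("truncated or overlong varint");
    }

    public static void main(String[] args) {
        // A PDF file starts with the ASCII signature "%PDF-", not with a protobuf length prefix.
        byte[] pdfHeader = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);

        // '%' is byte 0x25, so a delimited parser would take 37 as the message length
        // and then try to decode the following PDF bytes as protobuf fields.
        System.out.println("Bytes read as message length: " + readVarint(pdfHeader));
    }
}
```

The first byte of the PDF signature is interpreted as a length, and the bytes that follow are not valid protobuf fields for Document, which is why the call fails rather than producing a usable Document.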
I have raised this requirement as a feature request in Google's Issue Tracker. The issue can be found here: Issue #218865096. You can
STAR the issue to receive automatic updates and give it traction by referring to this link. Please note that there is no timeline or implementation guarantee for feature requests. All communication regarding this feature request will be done on the Issue Tracker.