I am trying to use AWS Transcribe to convert a WAV file to text. I have uploaded a WAV file to S3 with public read/write permissions; it is located here: https://s3.us-east-1.amazonaws.com/csld8xmsdksdf8s9sk3mmdjsdifkjksdijsldk/Transcribe2.wav. The WAV file is valid: I can download it in my browser and replay it (and it sounds like the original recording), so I think we can rule out an invalid input file, file permissions, etc.
I am using Java version 1.8.0_275 on macOS.
I expect my program to give me back the transcribed text: “Hello amazon Subscribe, what is this?”
Here is the actual program output, including exception:
/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java "-javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=60898:/Applications/IntelliJ IDEA CE.app/Contents/bin" -Dfile.encoding=UTF-8 -classpath /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/jaccess.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/nashorn.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/lib/tools.jar:/Users/
cdornin/work/transcribe/target/classes:/Users/cdornin/.m2/repository/org/apiguardian/apiguardian-api/1.0.0/apiguardian-api-1.0.0.jar:/Users/cdornin/.m2/repository/org/junit/platform/junit-platform-commons/1.4.0/junit-platform-commons-1.4.0.jar:/Users/cdornin/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar:/Users/cdornin/.m2/repository/org/slf4j/slf4j-api/1.7.25/slf4j-api-1.7.25.jar:/Users/cdornin/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/transcribe/2.15.65/transcribe-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/protocol-core/2.15.65/protocol-core-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/aws-json-protocol/2.15.65/aws-json-protocol-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/sdk-core/2.15.65/sdk-core-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/profiles/2.15.65/profiles-2.15.65.jar:/Users/cdornin/.m2/repository/org/reactivestreams/reactive-streams/1.0.2/reactive-streams-1.0.2.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/auth/2.15.65/auth-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/eventstream/eventstream/1.0.1/eventstream-1.0.1.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/http-client-spi/2.15.65/http-client-spi-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/regions/2.15.65/regions-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/annotations/2.15.65/annotations-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/utils/2.15.65/utils-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/aws-core/2.15.65/aws-core-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/metrics-spi/2.15.65/metrics-spi-2.15.65.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/apache-client/2.15.65/apache-client-2.15.65.jar:/Users/cdornin/.m2/repository/org/apache/httpcomponents/httpclient/4.5.13/httpclient-4.5.13.j
ar:/Users/cdornin/.m2/repository/commons-codec/commons-codec/1.11/commons-codec-1.11.jar:/Users/cdornin/.m2/repository/org/apache/httpcomponents/httpcore/4.4.11/httpcore-4.4.11.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/netty-nio-client/2.15.65/netty-nio-client-2.15.65.jar:/Users/cdornin/.m2/repository/io/netty/netty-codec-http/4.1.53.Final/netty-codec-http-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-codec-http2/4.1.53.Final/netty-codec-http2-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-codec/4.1.53.Final/netty-codec-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-transport/4.1.53.Final/netty-transport-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-resolver/4.1.53.Final/netty-resolver-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-common/4.1.53.Final/netty-common-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-buffer/4.1.53.Final/netty-buffer-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-handler/4.1.53.Final/netty-handler-4.1.53.Final.jar:/Users/cdornin/.m2/repository/io/netty/netty-transport-native-epoll/4.1.53.Final/netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:/Users/cdornin/.m2/repository/io/netty/netty-transport-native-unix-common/4.1.53.Final/netty-transport-native-unix-common-4.1.53.Final.jar:/Users/cdornin/.m2/repository/com/typesafe/netty/netty-reactive-streams-http/2.0.4/netty-reactive-streams-http-2.0.4.jar:/Users/cdornin/.m2/repository/com/typesafe/netty/netty-reactive-streams/2.0.4/netty-reactive-streams-2.0.4.jar:/Users/cdornin/.m2/repository/software/amazon/awssdk/transcribestreaming/2.15.65/transcribestreaming-2.15.65.jar:/Users/cdornin/.m2/repository/com/amazonaws/aws-java-sdk-s3/1.11.939/aws-java-sdk-s3-1.11.939.jar:/Users/cdornin/.m2/repository/com/amazonaws/aws-java-sdk-kms/1.11.939/aws-java-sdk-kms-1.11.939.jar:/Users/cdornin/.m2/repository/com/amazonaws/aws-java-sdk-core/1.11.939/aws-java-sdk-core-1.11.939.jar:
/Users/cdornin/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/cdornin/.m2/repository/software/amazon/ion/ion-java/1.0.2/ion-java-1.0.2.jar:/Users/cdornin/.m2/repository/com/fasterxml/jackson/dataformat/jackson-dataformat-cbor/2.6.7/jackson-dataformat-cbor-2.6.7.jar:/Users/cdornin/.m2/repository/joda-time/joda-time/2.8.1/joda-time-2.8.1.jar:/Users/cdornin/.m2/repository/com/amazonaws/jmespath-java/1.11.939/jmespath-java-1.11.939.jar:/Users/cdornin/.m2/repository/com/amazonaws/aws-java-sdk-transcribe/1.11.939/aws-java-sdk-transcribe-1.11.939.jar:/Users/cdornin/.m2/repository/io/minio/minio/8.0.3/minio-8.0.3.jar:/Users/cdornin/.m2/repository/com/carrotsearch/thirdparty/simple-xml-safe/2.7.1/simple-xml-safe-2.7.1.jar:/Users/cdornin/.m2/repository/com/google/guava/guava/29.0-jre/guava-29.0-jre.jar:/Users/cdornin/.m2/repository/com/google/guava/failureaccess/1.0.1/failureaccess-1.0.1.jar:/Users/cdornin/.m2/repository/com/google/guava/listenablefuture/9999.0-empty-to-avoid-conflict-with-guava/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar:/Users/cdornin/.m2/repository/com/google/code/findbugs/jsr305/3.0.2/jsr305-3.0.2.jar:/Users/cdornin/.m2/repository/org/checkerframework/checker-qual/2.11.1/checker-qual-2.11.1.jar:/Users/cdornin/.m2/repository/com/google/errorprone/error_prone_annotations/2.3.4/error_prone_annotations-2.3.4.jar:/Users/cdornin/.m2/repository/com/google/j2objc/j2objc-annotations/1.3/j2objc-annotations-1.3.jar:/Users/cdornin/.m2/repository/com/squareup/okhttp3/okhttp/4.8.1/okhttp-4.8.1.jar:/Users/cdornin/.m2/repository/com/squareup/okio/okio/2.7.0/okio-2.7.0.jar:/Users/cdornin/.m2/repository/org/jetbrains/kotlin/kotlin-stdlib-common/1.3.70/kotlin-stdlib-common-1.3.70.jar:/Users/cdornin/.m2/repository/org/jetbrains/kotlin/kotlin-stdlib/1.3.72/kotlin-stdlib-1.3.72.jar:/Users/cdornin/.m2/repository/org/jetbrains/annotations/13.0/annotations-13.0.jar:/Users/cdornin/.m2/repository/com/fasterxml/jackson/core/j
ackson-annotations/2.11.2/jackson-annotations-2.11.2.jar:/Users/cdornin/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.11.2/jackson-core-2.11.2.jar:/Users/cdornin/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.11.2/jackson-databind-2.11.2.jar com.amazonaws.transcribe.AmazonTranscribeServiceImpl
log4j:WARN Continuable parsing error 2 and column 30
log4j:WARN Document root element "Configuration", must match DOCTYPE root "null".
log4j:WARN Continuable parsing error 2 and column 30
log4j:WARN Document is invalid: no grammar found.
log4j:ERROR DOM element is - not a <log4j:configuration> element.
log4j:WARN No appenders could be found for logger (com.amazonaws.AmazonWebServiceClient).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" com.amazonaws.services.transcribe.model.AmazonTranscribeException: null (Service: AmazonTranscribe; Status Code: 400; Error Code: null; Request ID: 6BBE51FDC2CA981B; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1372)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.transcribe.AmazonTranscribeClient.doInvoke(AmazonTranscribeClient.java:1995)
    at com.amazonaws.services.transcribe.AmazonTranscribeClient.invoke(AmazonTranscribeClient.java:1962)
    at com.amazonaws.services.transcribe.AmazonTranscribeClient.invoke(AmazonTranscribeClient.java:1951)
    at com.amazonaws.services.transcribe.AmazonTranscribeClient.executeStartTranscriptionJob(AmazonTranscribeClient.java:1712)
    at com.amazonaws.services.transcribe.AmazonTranscribeClient.startTranscriptionJob(AmazonTranscribeClient.java:1681)
    at com.amazonaws.transcribe.AmazonTranscribeServiceImpl.callTranscribeService(AmazonTranscribeServiceImpl.java:34)
    at com.amazonaws.transcribe.AmazonTranscribeServiceImpl.main(AmazonTranscribeServiceImpl.java:20)
Here is my Java code (add your AWS key and secret):
package com.amazonaws.transcribe;

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.transcribe.AmazonTranscribe;
import com.amazonaws.services.transcribe.AmazonTranscribeClientBuilder;
import com.amazonaws.services.transcribe.model.Media;
import com.amazonaws.services.transcribe.model.StartTranscriptionJobRequest;
import com.amazonaws.services.transcribe.model.StartTranscriptionJobResult;

/**
 * @author ravindu.s
 */
public class AmazonTranscribeServiceImpl {

    public static void main(String[] args) throws Exception {
        System.setProperty("aws.accessKeyId", "myKey");
        System.setProperty("aws.secretAccessKey", "mySecret");
        callTranscribeService("https://s3.us-east-1.amazonaws.com/csld8xmsdksdf8s9sk3mmdjsdifkjksdijsldk/Transcribe2.wav");
    }

    public static void callTranscribeService(String mediaFile) {
        ClientConfiguration clientConfig = new ClientConfiguration();
        clientConfig.setConnectionTimeout(60000);
        clientConfig.setMaxConnections(100);
        clientConfig.setSocketTimeout(60000);

        AmazonTranscribe transcribeClient = AmazonTranscribeClientBuilder.standard()
                .withCredentials(DefaultAWSCredentialsProviderChain.getInstance())
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(mediaFile, "us-east-1"))
                .withClientConfiguration(clientConfig)
                .build();

        StartTranscriptionJobRequest request = buildRequest(mediaFile);
        StartTranscriptionJobResult response = transcribeClient.startTranscriptionJob(request);
        System.out.println(response.getTranscriptionJob().getTranscriptionJobStatus());
    }

    private static StartTranscriptionJobRequest buildRequest(String mediaFile) {
        StartTranscriptionJobRequest request = new StartTranscriptionJobRequest();
        request.setMediaSampleRateHertz(16000);
        request.setMediaFormat("wav");
        request.setLanguageCode("en-US");
        request.setTranscriptionJobName("JOB-001");
        Media media = new Media();
        media.setMediaFileUri(mediaFile);
        request.setMedia(media);
        return request;
    }
}
Here is my pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>play</groupId>
    <artifactId>transcribeTest</artifactId>
    <version>1.0-SNAPSHOT</version>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>8</source>
                    <target>8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <!--<dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
            <version>5.4.2</version>
            <scope>test</scope>
        </dependency>-->
        <!-- https://mvnrepository.com/artifact/junit/junit -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.13.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-engine</artifactId>
            <version>5.4.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.platform</groupId>
            <artifactId>junit-platform-commons</artifactId>
            <version>1.4.0</version>
        </dependency>
        <dependency>
            <groupId>org.junit.platform</groupId>
            <artifactId>junit-platform-launcher</artifactId>
            <version>1.4.0</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>transcribe</artifactId>
            <version>2.15.65</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/software.amazon.awssdk/transcribestreaming -->
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>transcribestreaming</artifactId>
            <version>2.15.65</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-s3</artifactId>
            <version>1.11.939</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-transcribe -->
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-transcribe</artifactId>
            <version>1.11.939</version>
        </dependency>
        <dependency>
            <groupId>io.minio</groupId>
            <artifactId>minio</artifactId>
            <version>8.0.3</version>
        </dependency>
    </dependencies>
</project>
Answer
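I had a small mistake in my code. The following line wasn’t necessary: it overrode the Transcribe service endpoint with the media file’s S3 URL, so the StartTranscriptionJob request was sent to S3 instead of to Transcribe. When I removed it, it worked: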
withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(mediaFile, "us-east-1"))
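To see why that line produced the 400, recall that in the AWS SDK for Java 1.x the first argument of EndpointConfiguration is the service endpoint the client sends its requests to, not the location of the media file. The sketch below uses only java.net.URI to compare the host that the override pointed at with the regional Transcribe endpoint. The class and method names are hypothetical, and the withRegion("us-east-1") call in the trailing comment is an assumption for setups where no default region is configured.

```java
import java.net.URI;

// Hypothetical helper, not part of the AWS SDK: it only inspects URL hosts.
class EndpointOverrideCheck {

    // Regional Amazon Transcribe endpoint in the standard AWS partition.
    public static String transcribeEndpoint(String region) {
        return "https://transcribe." + region + ".amazonaws.com";
    }

    // True when the given endpoint URL points at a Transcribe host.
    public static boolean targetsTranscribe(String endpoint) {
        String host = URI.create(endpoint).getHost();
        return host != null && host.startsWith("transcribe.");
    }

    public static void main(String[] args) {
        String mediaFile =
            "https://s3.us-east-1.amazonaws.com/csld8xmsdksdf8s9sk3mmdjsdifkjksdijsldk/Transcribe2.wav";

        // The removed line passed mediaFile as the service endpoint, so the client called S3.
        System.out.println(targetsTranscribe(mediaFile)); // false

        // Without the override, the SDK resolves the Transcribe endpoint from the region.
        System.out.println(targetsTranscribe(transcribeEndpoint("us-east-1"))); // true

        // Corrected builder chain, shown as a comment because it needs
        // aws-java-sdk-transcribe on the classpath (assumption: region set explicitly):
        //   AmazonTranscribeClientBuilder.standard()
        //       .withCredentials(DefaultAWSCredentialsProviderChain.getInstance())
        //       .withRegion("us-east-1")
        //       .withClientConfiguration(clientConfig)
        //       .build();
    }
}
```

The media file URL still belongs in Media.setMediaFileUri, as in the question's buildRequest method; only the client's endpoint setting changes.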