Textract

Textract example code snippet

The following Node.js snippet shows how to use Amazon Textract to extract text from a PDF file stored in an Amazon S3 bucket:

readPDF.js

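const AWS = require('aws-sdk');

// Create the Textract client. The region is a placeholder; the S3 bucket
// should be in the same region as the Textract endpoint you call.
const textract = new AWS.Textract({ region: 'your-aws-region' });

// Define the S3 bucket and PDF file name
const s3Bucket = 'your-s3-bucket-name';
const pdfFile = 'your-pdf-file-name.pdf';

// Define the parameters for the Textract API
const params = {
  Document: {
    S3Object: {
      Bucket: s3Bucket,
      Name: pdfFile
    }
  }
};

// Call the Textract API to extract text from the PDF file
textract.detectDocumentText(params, (err, data) => {
  if (err) {
    // Report the failure on stderr, including the stack trace
    console.error(err, err.stack);
  } else {
    // data.Blocks holds the detected PAGE, LINE, and WORD blocks
    console.log(data);
  }
});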

In this code snippet, we first create a Textract client and define the S3 bucket and PDF file name. We then build the request parameters, which point Textract at the PDF through its S3 bucket and object name. Finally, we call the detectDocumentText method with those parameters; the callback receives either an error or a response whose Blocks array contains the detected text. Note that detectDocumentText is a synchronous call intended for single-page documents; multi-page PDFs go through the asynchronous StartDocumentTextDetection operation instead.
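The snippet above logs the raw response, which is usually more than you want to read. As a minimal sketch of turning that response into plain text, the helper below keeps only the LINE blocks and returns their Text values. The name extractLines and the sample data are assumptions made for illustration; only the Blocks, BlockType, and Text fields come from the Textract response format.

```javascript
// Sketch: pull the recognized lines of text out of a DetectDocumentText response.
// `extractLines` is a hypothetical helper name, not part of the AWS SDK.
function extractLines(response) {
  // Tolerate a missing response or a response without Blocks.
  const blocks = (response && response.Blocks) || [];
  return blocks
    .filter((block) => block.BlockType === 'LINE')
    .map((block) => block.Text);
}

// Hand-written sample shaped like a Textract response (not real output).
const sampleResponse = {
  Blocks: [
    { BlockType: 'PAGE', Id: 'page-1' },
    { BlockType: 'LINE', Id: 'line-1', Text: 'Invoice 1001' },
    { BlockType: 'WORD', Id: 'word-1', Text: 'Invoice' },
    { BlockType: 'WORD', Id: 'word-2', Text: '1001' },
    { BlockType: 'LINE', Id: 'line-2', Text: 'Total due: 42.00' },
  ],
};

// Print one detected line per row.
console.log(extractLines(sampleResponse).join('\n'));
```

Inside the detectDocumentText callback, you could pass data to this helper in place of logging the whole object. WORD blocks are skipped because each LINE block already carries the joined text of its words.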

Note that you'll need the AWS SDK for JavaScript (the aws-sdk package) installed and configured with your AWS credentials in order to run this code snippet. The IAM user or role behind those credentials also needs permission to read the PDF from the S3 bucket (s3:GetObject) and to call Textract (textract:DetectDocumentText).
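To make the permissions concrete, here is a rough sketch of an IAM policy document covering the two calls the snippet depends on: reading the object from S3 and running synchronous text detection. The helper name buildTextractPolicy is hypothetical, and the policy is an assumption about a reasonable minimum, so adapt it to your own account rules before attaching it to a user or role.

```javascript
// Sketch: build a minimal IAM policy document for the readPDF.js snippet.
// `buildTextractPolicy` is a hypothetical helper, not an AWS SDK function.
function buildTextractPolicy(bucketName) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        // Allow the synchronous text-detection call used in readPDF.js.
        Effect: 'Allow',
        Action: 'textract:DetectDocumentText',
        Resource: '*',
      },
      {
        // Allow reading objects from the bucket that holds the PDF.
        Effect: 'Allow',
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${bucketName}/*`,
      },
    ],
  };
}

// Print the policy as JSON so it can be pasted into the IAM console.
console.log(JSON.stringify(buildTextractPolicy('your-s3-bucket-name'), null, 2));
```

Scoping the S3 statement to a single bucket keeps the grant narrow; the bucket name shown is the same placeholder used in the snippet.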

Salesforce Integration Chatgpt Reference Guide