I want to automate data ingestion into an Amazon S3 bucket. I don't want to use software like FileZilla to move files from FTP to S3.
New files will arrive on an FTP server daily. I want to pick those files up from the FTP server and store them in Amazon S3 each day. Can I set up cron jobs or scripts to run in AWS in a cost-effective manner? Which AWS instances can help me achieve this?
The files are approximately 1 GB in size.
Amazon S3 is an object storage service. It cannot "pull" data from an external location.
Therefore, you will need a script or program that will:

- Download the files from the FTP server
- Upload the files to Amazon S3
It would be best to run such a script from the FTP server itself, so that the data can be sent to S3 without having to download from the FTP server first. If this is not possible, then you could run the script on any computer on the Internet, such as your own computer or an Amazon EC2 instance.
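Such a script could be sketched in Python using the standard library's ftplib for the download and the AWS CLI for the upload. The host name, credentials, bucket name, and file-naming scheme below are all placeholder assumptions, not anything from the question:

```python
#!/usr/bin/env python3
"""Sketch of a daily FTP-to-S3 transfer.

All specifics (FTP host, credentials, bucket, daily file name pattern)
are placeholders -- substitute your own values.
"""
import subprocess
from datetime import date
from ftplib import FTP
from pathlib import Path


def todays_filename(day: date) -> str:
    """Name of the daily file on the FTP server (assumed naming scheme)."""
    return f"data-{day.isoformat()}.csv"


def fetch_from_ftp(host: str, user: str, password: str,
                   remote_name: str, local_dir: Path) -> Path:
    """Download one file from the FTP server into local_dir."""
    local_path = local_dir / remote_name
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_path, "wb") as f:
            ftp.retrbinary(f"RETR {remote_name}", f.write)
    return local_path


def upload_to_s3(local_path: Path, bucket: str) -> None:
    """Upload via the AWS CLI, as recommended below."""
    subprocess.run(
        ["aws", "s3", "cp", str(local_path), f"s3://{bucket}/{local_path.name}"],
        check=True,
    )


# Example invocation (would contact the FTP server and S3):
#   name = todays_filename(date.today())
#   staged = fetch_from_ftp("ftp.example.com", "user", "secret", name, Path("/tmp"))
#   upload_to_s3(staged, "my-bucket")
#   staged.unlink()  # remove the staged copy once it is safely in S3
```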
The simplest way to upload to Amazon S3 is with the AWS Command-Line Interface (CLI). It has an aws s3 cp command to copy files, or, depending upon what needs to be copied, it might be easier to use the aws s3 sync command, which automatically copies only new or modified files.
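If the script is in Python, the sync command can be built and handed to the CLI like this (the staging directory and bucket prefix are placeholders):

```python
def build_sync_command(local_dir: str, bucket_uri: str) -> list[str]:
    # "aws s3 sync" copies only new or modified files, so re-running it
    # every day is cheap when most files are unchanged.
    return ["aws", "s3", "sync", local_dir, bucket_uri]


# Placeholders: substitute your staging directory and bucket/prefix.
cmd = build_sync_command("/data/ftp-staging", "s3://my-bucket/daily")
# The real script would then execute it, e.g.:
#   import subprocess
#   subprocess.run(cmd, check=True)
```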
The script could be triggered via a schedule (cron on Linux or a Scheduled Task on Windows).
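For example, a cron entry (the time and script path below are placeholder choices) that runs the transfer every day at 02:00 and appends its output to a log:

```
0 2 * * * /home/ec2-user/ftp_to_s3.py >> /var/log/ftp_to_s3.log 2>&1
```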
If you are using an Amazon EC2 instance, you could save money by turning off the instance when it is not required. The flow could be:

- An Amazon CloudWatch Events rule triggers an AWS Lambda function on a schedule
- The Lambda function calls StartInstances() to start the stopped EC2 instance
- A startup script on the instance downloads the files and uploads them to Amazon S3
- The script then stops the instance (sudo shutdown now -h)

This might seem like a lot of steps, but the CloudWatch Events rule and Lambda function are trivial to configure.
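The Lambda function in that flow can be a few lines of Python. The instance ID below is a placeholder; boto3 is preinstalled in the AWS Lambda Python runtime:

```python
def lambda_handler(event, context):
    """Start the stopped transfer instance.

    Triggered on a schedule by a CloudWatch Events rule.
    The instance ID is a placeholder -- substitute your own.
    """
    import boto3  # preinstalled in the AWS Lambda Python runtime
    ec2 = boto3.client("ec2")
    ec2.start_instances(InstanceIds=["i-0123456789abcdef0"])
```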
To execute a script every time a Linux instance starts, put it in: /var/lib/cloud/scripts/per-boot/
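For example, installing the transfer script there (the source path is a placeholder) and making it executable so cloud-init will run it on every boot:

```
sudo cp ftp_to_s3.py /var/lib/cloud/scripts/per-boot/ftp_to_s3.py
sudo chmod +x /var/lib/cloud/scripts/per-boot/ftp_to_s3.py
```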
See also: Auto-Stop EC2 instances when they finish a task - DEV Community